Weather Data
This is the complete dataset used to train and test our prediction model.
| Day | Outlook | Temp | Humidity | Wind | Play Tennis? |
|---|---|---|---|---|---|
| d1 | Sunny | hot | high | weak | no |
| d2 | Sunny | hot | high | strong | no |
| d3 | Overcast | hot | high | weak | yes |
| d4 | Rainy | mild | high | weak | yes |
| d5 | Rainy | cool | normal | weak | yes |
| d6 | Rainy | cool | normal | strong | no |
| d7 | Overcast | cool | normal | strong | yes |
| d8 | Sunny | mild | high | weak | no |
| d9 | Sunny | cool | normal | weak | yes |
| d10 | Rainy | mild | normal | weak | yes |
| d11 | Sunny | mild | normal | strong | yes |
| d12 | Overcast | mild | high | strong | yes |
| d13 | Overcast | hot | normal | weak | yes |
| d14 | Rainy | mild | high | strong | no |
How the Model is Trained
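The model learns from the data using a process called "supervised learning": each record pairs the input features (Outlook, Temp, Humidity, Wind) with a known answer (Play Tennis?), and the model learns to predict that answer from the features.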
Step 1: Train-Test Split
The dataset is split into two parts: a larger Training Set used to fit the model, and a smaller Testing Set held out to evaluate its accuracy on records it has not seen. The blue rows in the table above represent the testing data.
Full Dataset (14 records) → Training Set (10 records) + Testing Set (4 records)
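The 10/4 split described above can be sketched in a few lines of Python. This is only an illustration: the helper name `train_test_split` and the seed are assumptions, and a random shuffle will not necessarily pick the same four test rows that are highlighted in the table.

```python
import random

# Day labels d1..d14, matching the table above.
days = [f"d{i}" for i in range(1, 15)]

def train_test_split(records, test_size, seed):
    """Shuffle a copy of the records and hold out `test_size` of them for testing.

    Hypothetical helper for illustration; the seed only makes the shuffle repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    # First `test_size` shuffled records become the test set, the rest the training set.
    return shuffled[test_size:], shuffled[:test_size]

train_days, test_days = train_test_split(days, test_size=4, seed=0)
print(len(train_days), len(test_days))  # 10 4
```

Holding the test records out of training is what makes the accuracy measured on them a fair estimate of how the model behaves on new days.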
Step 2: Training the Random Forest
The model is a Random Forest, a collection of many individual Decision Trees. Each tree is trained on a random subset of the training data and features. To make a prediction, every tree "votes", and the majority outcome becomes the final prediction. Because the errors of individual trees tend to cancel out in the vote, the forest is usually more accurate and less prone to overfitting than a single tree.
Tree 1, Tree 2, Tree 3, ... (Many Trees)
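The voting idea can be sketched with a heavily simplified stand-in for a Random Forest. In this sketch each "tree" is only a one-level decision stump, trained on a bootstrap sample of the rows and a single randomly chosen feature; a real Random Forest grows deeper trees and samples features at every split. The function names, the number of trees, and the seed are assumptions for illustration, and because the highlighted test rows cannot be identified here, the example trains on all 14 rows.

```python
import random
from collections import Counter

FEATURES = ["outlook", "temp", "humidity", "wind"]

# (outlook, temp, humidity, wind, play) for d1..d14, copied from the table above.
ROWS = [
    ("Sunny", "hot", "high", "weak", "no"),
    ("Sunny", "hot", "high", "strong", "no"),
    ("Overcast", "hot", "high", "weak", "yes"),
    ("Rainy", "mild", "high", "weak", "yes"),
    ("Rainy", "cool", "normal", "weak", "yes"),
    ("Rainy", "cool", "normal", "strong", "no"),
    ("Overcast", "cool", "normal", "strong", "yes"),
    ("Sunny", "mild", "high", "weak", "no"),
    ("Sunny", "cool", "normal", "weak", "yes"),
    ("Rainy", "mild", "normal", "weak", "yes"),
    ("Sunny", "mild", "normal", "strong", "yes"),
    ("Overcast", "mild", "high", "strong", "yes"),
    ("Overcast", "hot", "normal", "weak", "yes"),
    ("Rainy", "mild", "high", "strong", "no"),
]

def train_stump(sample, feature_idx):
    """One-level tree: for each value of one feature, predict the majority label."""
    by_value = {}
    for row in sample:
        by_value.setdefault(row[feature_idx], Counter())[row[-1]] += 1
    # Fallback label for feature values that the bootstrap sample never contained.
    default = Counter(row[-1] for row in sample).most_common(1)[0][0]
    table = {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}
    return feature_idx, table, default

def train_forest(rows, n_trees, seed):
    """Train `n_trees` stumps, each on its own bootstrap sample and random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]   # sample rows with replacement
        feature_idx = rng.randrange(len(FEATURES))  # pick one feature for this stump
        forest.append(train_stump(sample, feature_idx))
    return forest

def predict(forest, x):
    """Let every stump vote and return the majority label."""
    votes = Counter()
    for feature_idx, table, default in forest:
        votes[table.get(x[feature_idx], default)] += 1
    return votes.most_common(1)[0][0]

forest = train_forest(ROWS, n_trees=25, seed=0)
print(predict(forest, ("Overcast", "hot", "high", "weak")))
```

Using an odd number of trees avoids tied votes when there are only two possible outcomes.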