
Microsoft Azure Machine Learning

By: Sumit Mund, Christina Storm


Model development


You have to predict whether a flight will be delayed. As you found when exploring the dataset, any flight that arrived more than 15 minutes late is labeled as delayed, and its ArrDelay15 value is 1. The ArrDelay15 column is therefore the target variable, and it contains only 0 and 1. This makes it a two-class classification problem.
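The labeling rule described above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample delay values are hypothetical, and treating exactly 15 minutes as not delayed is an assumption based on the phrase "more than 15 minutes".

```python
# Threshold taken from the text: a flight is "delayed" when it is
# more than 15 minutes late.
DELAY_THRESHOLD_MINUTES = 15


def arr_delay_15(arrival_delay_minutes: float) -> int:
    """Return 1 if the flight counts as delayed, otherwise 0.

    Assumption: a delay of exactly 15 minutes is NOT labeled as delayed.
    """
    return 1 if arrival_delay_minutes > DELAY_THRESHOLD_MINUTES else 0


# Hypothetical arrival delays in minutes (negative means early arrival).
delays = [-5, 0, 15, 16, 42]
labels = [arr_delay_15(d) for d in delays]
print(labels)
```

Because the label takes only the values 0 and 1, any two-class classifier can be trained against it.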

As you have already seen, ML Studio provides several two-class classification algorithms. For simplicity, we will build the model here with the Two-Class Boosted Decision Tree module, using the following parameters:

  • The Maximum number of leaves per tree option is set at 128

  • The Minimum number of samples per leaf node option is set at 50

  • The Learning rate option is set at 0.2

  • The Number of trees constructed option is set at 500
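For readers who want to reproduce a comparable configuration outside ML Studio, the four settings above can be written down as follows. This is a rough sketch, not the ML Studio implementation: it assumes scikit-learn is installed, and the mapping of the ML Studio options onto `HistGradientBoostingClassifier` arguments is an approximation chosen for illustration. The two boosted-tree implementations differ internally, so results will not match exactly.

```python
# Settings from the text, keyed by the ML Studio option names.
STUDIO_SETTINGS = {
    "Maximum number of leaves per tree": 128,
    "Minimum number of samples per leaf node": 50,
    "Learning rate": 0.2,
    "Number of trees constructed": 500,
}

# Assumed (approximate) scikit-learn equivalents of the options above.
SKLEARN_PARAMS = {
    "max_leaf_nodes": STUDIO_SETTINGS["Maximum number of leaves per tree"],
    "min_samples_leaf": STUDIO_SETTINGS["Minimum number of samples per leaf node"],
    "learning_rate": STUDIO_SETTINGS["Learning rate"],
    "max_iter": STUDIO_SETTINGS["Number of trees constructed"],  # boosting rounds
}


def build_model():
    """Create an untrained boosted-tree classifier with the settings above.

    The import is local so that the parameter mapping can be inspected
    even where scikit-learn is not available.
    """
    from sklearn.ensemble import HistGradientBoostingClassifier

    return HistGradientBoostingClassifier(**SKLEARN_PARAMS)
```

Calling `build_model()` returns an untrained estimator; it still has to be fitted on the training portion of the flight data.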

You are encouraged to try different algorithms and to use the Sweep Parameters module to find the best parameter values.
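To give a feel for what a parameter sweep over an entire grid involves, the sketch below enumerates every combination of a small set of candidate values. It does not reproduce the Sweep Parameters module itself. The candidate values and the parameter names are hypothetical, picked only so that the configuration used in this section is one of the combinations.

```python
from itertools import product

# Hypothetical candidate values for each parameter (names are illustrative).
param_grid = {
    "max_leaves_per_tree": [32, 128],
    "min_samples_per_leaf": [10, 50],
    "learning_rate": [0.1, 0.2],
    "number_of_trees": [100, 500],
}


def iter_grid(grid):
    """Yield one dict per combination of the candidate values."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


candidates = list(iter_grid(param_grid))
print(len(candidates))  # 2 * 2 * 2 * 2 = 16 combinations
```

In a full sweep, each candidate would be trained and scored on held-out data, and the combination with the best metric would be kept. The number of combinations grows multiplicatively with each added value, which is why a random sweep over a subset is often preferred for larger grids.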

To train the model, you need to split the dataset and use one...