Book Image

Machine Learning Quick Reference

By : Rahul Kumar
Book Image

Machine Learning Quick Reference

By: Rahul Kumar

Overview of this book

Machine learning makes it possible to learn about the unknowns and gain hidden insights into your datasets by mastering many tools and techniques. This book guides you to do just that in a very compact manner. After giving a quick overview of what machine learning is all about, Machine Learning Quick Reference jumps right into its core algorithms and demonstrates how they can be applied to real-world scenarios. From model evaluation to optimizing their performance, this book will introduce you to the best practices in machine learning. Furthermore, you will also look at the more advanced aspects such as training neural networks and work with different kinds of data, such as text, time-series, and sequential data. Advanced methods and techniques such as causal inference, deep Gaussian processes, and more are also covered. By the end of this book, you will be able to train fast, accurate machine learning models at your fingertips, which you can easily use as a point of reference.
Table of Contents (18 chapters)
Title Page
Copyright and Credits
About Packt
Contributors
Preface
Index

Boosting


When it comes to bagging, it can be applied to both classification and regression. However, there is another technique that is also part of the ensemble family: boosting. However, the underlying principle of these two are quite different. In bagging, each of the models runs independently and then the results are aggregated at the end. This is a parallel operation. Boosting acts in a different way, since it flows sequentially. Each model here runs and passes on the significant features to another model:

Gradient boosting

To explain gradient boosting, we will take the route of Ben Gorman, a great data scientist. He has been able to explain it in a mathematical yet simple way. Let's say that we have got nine training examples wherein we are required to predict the age of a person based on three features, such as whether they like gardening, playing video games, or surfing the internet. The data for this is as follows:

To build this model, the objective is to minimize the mean squared...