Mastering Machine Learning with scikit-learn

By: Gavin Hackeling

Overview of this book

This book examines machine learning models including logistic regression, decision trees, and support vector machines, and applies them to common problems such as categorizing documents and classifying images. It begins with the fundamentals of machine learning, introducing you to the supervised-unsupervised spectrum, the uses of training and test data, and evaluating models. You will learn how to use generalized linear models in regression problems, as well as solve problems with text and categorical features.

You will be acquainted with the use of logistic regression, regularization, and the various loss functions that are used by generalized linear models. The book will also walk you through an example project that prompts you to label the most uncertain training examples. You will also use an unsupervised Hidden Markov Model to predict stock prices.

By the end of the book, you will be an expert in scikit-learn and will be well versed in machine learning.

Kernels and the kernel trick


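Recall that the perceptron separates the instances of the positive class from the instances of the negative class using a hyperplane as a decision boundary. The decision boundary is given by the following equation, where $w$ is the weight vector, $b$ is the bias term, and the boundary is the set of points for which $f(x) = 0$:

$$f(x) = \langle w, x \rangle + b$$

Predictions are made using the following function:

$$h(x) = \operatorname{sign}(f(x))$$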
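The primal-form prediction can be sketched in a few lines of plain Python. This is a minimal illustration, not the book's code: it assumes the usual perceptron decision function $f(x) = \langle w, x \rangle + b$ with class labels in {-1, +1}, and the weights, bias, and test points are hypothetical values chosen for the example. Mapping $f(x) = 0$ to the positive class is also an assumption.

```python
def inner(u, v):
    """Inner product <u, v> of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def decision_function(w, b, x):
    """f(x) = <w, x> + b; f(x) == 0 exactly on the decision boundary."""
    return inner(w, x) + b


def predict(w, b, x):
    """h(x) = sign(f(x)); points on the boundary are assigned to +1 (an assumption)."""
    return 1 if decision_function(w, b, x) >= 0 else -1


# Hypothetical model parameters, chosen only for illustration.
w = [2.0, -1.0]
b = -1.0

print(predict(w, b, [2.0, 1.0]))  # f(x) = 4 - 1 - 1 = 2, positive side
print(predict(w, b, [0.0, 1.0]))  # f(x) = 0 - 1 - 1 = -2, negative side
```

The point `[1.0, 1.0]` lies on the hyperplane for these parameters, since $2 - 1 - 1 = 0$.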

Note that previously we expressed the inner product $\langle w, x \rangle$ as $w^T x$. To be consistent with the notational conventions used for support vector machines, we will adopt the former notation in this chapter.

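While the proof is beyond the scope of this chapter, we can write the model differently. The following expression of the model is called the dual form; the expression we used previously is the primal form:

$$f(x) = \langle w, x \rangle + b = \sum_{i} \alpha_i y_i \langle x_i, x \rangle + b$$

Here $x_i$ and $y_i$ are the feature vector and class label of the $i$-th training instance, and $\alpha_i$ is a coefficient learned for that instance.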

The most important difference between the primal and dual forms is that the primal form computes the inner product of the model parameters and the test instance's feature vector, while the dual form computes the inner product of each training instance's feature vector and the test instance's feature vector. Shortly, we will exploit...
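The equivalence of the two forms can be checked numerically. The sketch below is an illustration under stated assumptions, not the book's implementation: it trains a perceptron in dual form using the classic mistake-driven update with a learning rate of 1, so that each coefficient `alpha[i]` counts the mistakes made on training instance `i`. It then recovers the primal weights as $w = \sum_i \alpha_i y_i x_i$ and evaluates both forms on a test point. The toy data set, the function names, and the epoch limit are all hypothetical.

```python
def inner(u, v):
    """Inner product <u, v> of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def dual_decision(alpha, b, X, y, x):
    """Dual form: f(x) = sum_i alpha_i * y_i * <x_i, x> + b."""
    return sum(a * yi * inner(xi, x) for a, yi, xi in zip(alpha, y, X)) + b


def train_dual_perceptron(X, y, epochs=10):
    """Mistake-driven perceptron in dual form (learning rate 1, an assumption)."""
    alpha = [0] * len(X)
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for i, (xi, yi) in enumerate(zip(X, y)):
            # A non-positive margin counts as a mistake on instance i.
            if yi * dual_decision(alpha, b, X, y, xi) <= 0:
                alpha[i] += 1
                b += yi
                mistakes += 1
        if mistakes == 0:  # every training instance is classified correctly
            break
    return alpha, b


def primal_weights(alpha, X, y):
    """Recover the primal parameters: w = sum_i alpha_i * y_i * x_i."""
    n_features = len(X[0])
    return [
        sum(a * yi * xi[j] for a, yi, xi in zip(alpha, y, X))
        for j in range(n_features)
    ]


# Hypothetical, linearly separable toy data with labels in {-1, +1}.
X = [[2.0, 2.0], [1.0, 3.0], [-1.0, -1.0], [-2.0, 0.0]]
y = [1, 1, -1, -1]

alpha, b = train_dual_perceptron(X, y)
w = primal_weights(alpha, X, y)

x_test = [0.5, 1.5]
primal_value = inner(w, x_test) + b                 # uses <w, x>
dual_value = dual_decision(alpha, b, X, y, x_test)  # uses <x_i, x>
print(primal_value, dual_value)  # the two values agree
```

Note that `dual_decision` never touches `w`: the training and test instances enter the computation only through inner products between feature vectors, which is the property the text highlights.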