Machine Learning Algorithms

Overview of this book

In this book, you will learn the important machine learning algorithms that are commonly used in the field of data science. These algorithms can be used for supervised, unsupervised, semi-supervised, and reinforcement learning. The book covers linear regression, logistic regression, SVM, naïve Bayes, k-means, and random forest, along with TensorFlow and feature engineering. You will learn how these algorithms work and how to use them to solve your problems. This book will also introduce you to natural language processing and recommendation systems, which help you to run multiple algorithms simultaneously. On completion of the book, you will know how to pick the right machine learning algorithm for clustering, classification, or regression for your problem.

Controlled support vector machines


With real datasets, an SVM can extract a very large number of support vectors to increase accuracy, and that can slow down the whole process. To let you trade off accuracy against the number of support vectors, scikit-learn provides an implementation called NuSVC, where the parameter nu (bounded between 0, not included, and 1) controls both quantities at the same time: it is a lower bound on the fraction of support vectors (greater values increase their number) and an upper bound on the fraction of margin errors (lower values tighten the bound on training errors). Let's consider an example with a linear kernel and a simple dataset. In the following figure, there's a scatter plot of our set:

Let's start by checking the number of support vectors for a standard SVM:

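>>> from sklearn.svm import SVC
>>> # X, Y: the samples and labels of the dataset shown in the scatter plot
>>> svc = SVC(kernel='linear')
>>> svc.fit(X, Y)
>>> svc.support_vectors_.shape
(242, 2)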

So the model has found 242 support vectors. Let's now try to optimize this number using cross-validation. The default value of nu is 0.5,...
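As a rough illustration of how nu affects the number of support vectors, here is a minimal sketch. It does not use the dataset from the figure: the data comes from make_classification with arbitrarily chosen parameters, and the nu values tried are assumptions picked only for demonstration. For each value, it fits a NuSVC with a linear kernel, counts the support vectors, and estimates accuracy with 10-fold cross-validation.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC, NuSVC

# Hypothetical stand-in dataset (NOT the one shown in the scatter plot)
X, Y = make_classification(n_samples=500, n_features=2, n_informative=2,
                           n_redundant=0, n_clusters_per_class=1,
                           random_state=0)

# Baseline: number of support vectors found by a standard linear SVC
svc = SVC(kernel='linear').fit(X, Y)
print('SVC support vectors: %d' % svc.support_vectors_.shape[0])

# Try a few illustrative nu values (assumed, not taken from the text)
sv_counts = {}
cv_scores = {}
for nu in (0.05, 0.15, 0.5):
    nusvc = NuSVC(kernel='linear', nu=nu)

    # Mean accuracy over 10 stratified folds
    cv_scores[nu] = cross_val_score(nusvc, X, Y, scoring='accuracy', cv=10).mean()

    # Support vectors when training on the whole dataset
    sv_counts[nu] = nusvc.fit(X, Y).support_vectors_.shape[0]

    print('nu=%.2f -> support vectors: %d, CV accuracy: %.3f'
          % (nu, sv_counts[nu], cv_scores[nu]))
```

Because nu is a lower bound on the fraction of support vectors, the count should grow as nu grows. Comparing it with the cross-validation score helps you pick a value that keeps the model small without giving up too much accuracy.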