Book Image

Designing Machine Learning Systems with Python

By : David Julian
Book Image

Designing Machine Learning Systems with Python

By: David Julian

Overview of this book

Machine learning is one of the fastest growing trends in modern computing. It has applications in a wide range of fields, including economics, the natural sciences, web development, and business modeling. In order to harness the power of these systems, it is essential that the practitioner develops a solid understanding of the underlying design principles. There are many reasons why machine learning models may not give accurate results. By looking at these systems from a design perspective, we gain a deeper understanding of the underlying algorithms and the optimisational methods that are available. This book will give you a solid foundation in the machine learning design process, and enable you to build customised machine learning models to solve unique problems. You may already know about, or have worked with, some of the off-the-shelf machine learning models for solving common problems such as spam detection or movie classification, but to begin solving more complex problems, it is important to adapt these models to your own specific needs. This book will give you this understanding and more.
Table of Contents (16 chapters)
Designing Machine Learning Systems with Python
Credits
About the Author
About the Reviewer
www.PacktPub.com
Preface
Free Chapter
1
Thinking in Machine Learning
Index

Scikit-learn


This includes algorithms for the most common machine learning tasks, such as classification, regression, clustering, dimensionality reduction, model selection, and preprocessing.

Scikit-learn comes with several real-world data sets for us to practice with. Let's take a look at one of these—the Iris data set:

from sklearn import datasets
iris = datasets.load_iris()
iris_X = iris.data
iris_y = iris.target
iris_X.shape
(150, 4)

The data set contains 150 samples of three types of irises (Setosa, Versicolor, and Virginica), each with four features. We can get a description on the dataset:

iris.DESCR

We can see that the four attributes, or features, are sepal width, sepal length, petal length, and petal width in centimeters. Each sample is associated with one of three classes. Setosa, Versicolor, and Virginica. These are represented by 0, 1, and 2 respectively.

Let's look at a simple classification problem using this data. We want to predict the type of iris based on its features: the...