Python Machine Learning, Second Edition

By: Sebastian Raschka, Vahid Mirjalili

Overview of this book

Publisher's Note: This edition from 2017 is outdated and is not compatible with TensorFlow 2 or any of the most recent updates to Python libraries. A new third edition, updated for 2020 and featuring TensorFlow 2 and the latest in scikit-learn, reinforcement learning, and GANs, has now been published.

Machine learning is eating the software world, and now deep learning is extending machine learning. Understand and work at the cutting edge of machine learning, neural networks, and deep learning with this second edition of Sebastian Raschka’s bestselling book, Python Machine Learning. Using Python's open source libraries, this book offers the practical knowledge and techniques you need to create and contribute to machine learning, deep learning, and modern data analysis. Fully extended and modernized, Python Machine Learning Second Edition now includes the popular TensorFlow 1.x deep learning library. The scikit-learn code has also been fully updated to v0.18.1 to include improvements and additions to this versatile machine learning library. Sebastian Raschka and Vahid Mirjalili’s unique insight and expertise introduce you to machine learning and deep learning algorithms from scratch, and show you how to apply them to practical industry challenges using realistic and interesting examples. By the end of the book, you’ll be ready to meet the new data analysis opportunities.

If you’ve read the first edition of this book, you’ll be delighted to find a balance of classical ideas and modern insights into machine learning. Every chapter has been critically updated, and there are new chapters on key technologies. You’ll be able to learn and work with TensorFlow 1.x more deeply than ever before, and get essential coverage of the Keras neural network library, along with updates to scikit-learn 0.18.1.

Python Machine Learning Second Edition

Credits

About the Authors

About the Reviewers

www.PacktPub.com

Packt is Searching for Authors Like You

Preface

Giving Computers the Ability to Learn from Data

Building intelligent machines to transform data into knowledge

The three different types of machine learning

Introduction to the basic terminology and notations

A roadmap for building machine learning systems

Using Python for machine learning

Summary

Training Simple Machine Learning Algorithms for Classification

Artificial neurons – a brief glimpse into the early history of machine learning

Implementing a perceptron learning algorithm in Python

Adaptive linear neurons and the convergence of learning

Summary

A Tour of Machine Learning Classifiers Using scikit-learn

Choosing a classification algorithm

First steps with scikit-learn – training a perceptron

Modeling class probabilities via logistic regression

Maximum margin classification with support vector machines

Solving nonlinear problems using a kernel SVM

Decision tree learning

K-nearest neighbors – a lazy learning algorithm

Summary

Building Good Training Sets – Data Preprocessing

Dealing with missing data

Handling categorical data

Partitioning a dataset into separate training and test sets

Bringing features onto the same scale

Selecting meaningful features

Assessing feature importance with random forests

Summary

Compressing Data via Dimensionality Reduction

Unsupervised dimensionality reduction via principal component analysis

Supervised data compression via linear discriminant analysis

Using kernel principal component analysis for nonlinear mappings

Summary

Learning Best Practices for Model Evaluation and Hyperparameter Tuning

Streamlining workflows with pipelines

Using k-fold cross-validation to assess model performance

Debugging algorithms with learning and validation curves

Fine-tuning machine learning models via grid search

Looking at different performance evaluation metrics

Dealing with class imbalance

Summary

Combining Different Models for Ensemble Learning

Learning with ensembles

Combining classifiers via majority vote

Bagging – building an ensemble of classifiers from bootstrap samples

Leveraging weak learners via adaptive boosting

Summary

Applying Machine Learning to Sentiment Analysis

Preparing the IMDb movie review data for text processing

Introducing the bag-of-words model

Training a logistic regression model for document classification

Working with bigger data – online algorithms and out-of-core learning

Topic modeling with Latent Dirichlet Allocation

Summary

Embedding a Machine Learning Model into a Web Application

Serializing fitted scikit-learn estimators

Setting up an SQLite database for data storage

Developing a web application with Flask

Turning the movie review classifier into a web application

Deploying the web application to a public server

Summary

Predicting Continuous Target Variables with Regression Analysis

Introducing linear regression

Exploring the Housing dataset

Implementing an ordinary least squares linear regression model

Fitting a robust regression model using RANSAC

Evaluating the performance of linear regression models

Using regularized methods for regression

Turning a linear regression model into a curve – polynomial regression

Dealing with nonlinear relationships using random forests

Summary

Working with Unlabeled Data – Clustering Analysis

Grouping objects by similarity using k-means

Organizing clusters as a hierarchical tree

Locating regions of high density via DBSCAN

Summary

Implementing a Multilayer Artificial Neural Network from Scratch

Modeling complex functions with artificial neural networks

Classifying handwritten digits

Training an artificial neural network

About the convergence in neural networks

A few last words about the neural network implementation

Summary

Parallelizing Neural Network Training with TensorFlow

TensorFlow and training performance

Training neural networks efficiently with high-level TensorFlow APIs

Choosing activation functions for multilayer networks

Summary

Going Deeper – The Mechanics of TensorFlow

Key features of TensorFlow

TensorFlow ranks and tensors

Understanding TensorFlow's computation graphs

Placeholders in TensorFlow

Variables in TensorFlow

Building a regression model

Executing objects in a TensorFlow graph using their names

Saving and restoring a model in TensorFlow

Transforming Tensors as multidimensional data arrays

Utilizing control flow mechanics in building graphs

Visualizing the graph with TensorBoard

Summary

Classifying Images with Deep Convolutional Neural Networks

Building blocks of convolutional neural networks

Putting everything together to build a CNN

Implementing a deep convolutional neural network using TensorFlow

Summary

Modeling Sequential Data Using Recurrent Neural Networks

Introducing sequential data

RNNs for modeling sequences

Implementing a multilayer RNN for sequence modeling in TensorFlow

Project one – performing sentiment analysis of IMDb movie reviews using multilayer RNNs

Project two – implementing an RNN for character-level language modeling in TensorFlow

Chapter and book summary

Index

A roadmap for building machine learning systems

In previous sections, we discussed the basic concepts of machine learning and the three different types of learning. In this section, we will discuss the other important parts of a machine learning system accompanying the learning algorithm. The following diagram shows a typical workflow for using machine learning in predictive modeling, which we will discuss in the following subsections:

Preprocessing – getting data into shape

Let's begin the roadmap with preprocessing. Raw data rarely comes in the form and shape that is necessary for the optimal performance of a learning algorithm. Thus, the preprocessing of the data is one of the most crucial steps in any machine learning application. If we take the Iris flower dataset from the previous section as an example, we can think of the raw data as a series of flower images from which we want to extract meaningful features. Useful features could be the color, the hue, and the intensity of the flowers, as well as the height and the flower lengths and widths. Many machine learning algorithms also require that the selected features are on the same scale for optimal performance, which is often achieved by transforming the features into the range [0, 1] or by standardizing them to have zero mean and unit variance, as we will see in later chapters.
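
The two scaling approaches mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the book's own code: the helper names `min_max_scale` and `standardize` and the sample values are assumptions made for this example, and both helpers assume the feature is not constant (otherwise the divisor would be zero).

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift values to zero mean and scale them to unit variance."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical flower lengths in centimeters (illustrative values only)
flower_length = [4.0, 5.0, 6.0, 7.0, 8.0]

scaled = min_max_scale(flower_length)       # smallest value maps to 0, largest to 1
standardized = standardize(flower_length)   # mean 0, standard deviation 1
```

Min-max scaling keeps the values in a bounded interval, whereas standardization centers them around zero, which many gradient-based learning algorithms tend to handle better.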

Some of the selected features may be highly correlated and therefore redundant to a certain degree. In those cases, dimensionality reduction techniques are useful for compressing the features onto a lower-dimensional subspace. Reducing the dimensionality of our feature space has the advantage that less storage space is required, and the learning algorithm can run much faster. In certain cases, dimensionality reduction can also improve the predictive performance of a model if the dataset contains a large number of irrelevant features (or noise), that is, if the dataset has a low signal-to-noise ratio.
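
To make the idea of compressing correlated features concrete, the following sketch projects two features onto their direction of maximum variance, which corresponds to the first principal component. It is a simplified illustration restricted to exactly two features, using the closed-form orientation of the leading eigenvector of a 2x2 covariance matrix; the function name and the data are assumptions made for this example, not code from the book.

```python
from math import atan2, cos, sin
from statistics import mean, pvariance


def first_principal_component_2d(x, y):
    """Project two features onto their direction of maximum variance."""
    mx, my = mean(x), mean(y)
    xc = [v - mx for v in x]  # center both features
    yc = [v - my for v in y]
    n = len(x)
    var_x = sum(v * v for v in xc) / n
    var_y = sum(v * v for v in yc) / n
    cov_xy = sum(a * b for a, b in zip(xc, yc)) / n
    # Angle of the leading eigenvector of the 2x2 covariance matrix
    theta = 0.5 * atan2(2.0 * cov_xy, var_x - var_y)
    return [a * cos(theta) + b * sin(theta) for a, b in zip(xc, yc)]


# Hypothetical, perfectly correlated features: width is always twice the length
length_cm = [1.0, 2.0, 3.0, 4.0, 5.0]
width_cm = [2.0, 4.0, 6.0, 8.0, 10.0]

compressed = first_principal_component_2d(length_cm, width_cm)

# Fraction of the total variance that survives in the single new feature
total_variance = pvariance(length_cm) + pvariance(width_cm)
retained = pvariance(compressed) / total_variance
```

Because the two example features carry the same information, the single compressed feature retains essentially all of the variance. With real, noisier data the retained fraction would be lower.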

To determine whether our machine learning algorithm not only performs well on the training set but also generalizes well to new data, we also want to randomly divide the dataset into separate training and test sets. We use the training set to train and optimize our machine learning model, while we keep the test set until the very end to evaluate the final model.
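
A random split of this kind might look as follows. The helper below is a hand-written sketch using only the standard library; its name, the 30 percent test fraction, and the toy dataset are assumptions for illustration (scikit-learn ships its own `train_test_split` in `sklearn.model_selection`, which would normally be used instead).

```python
import random


def train_test_split(samples, test_fraction=0.3, seed=1):
    """Randomly partition samples into a training list and a test list."""
    rng = random.Random(seed)          # fixed seed makes the split reproducible
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_test = int(round(len(samples) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [samples[i] for i in train_idx], [samples[i] for i in test_idx]


# Hypothetical dataset of (feature, class label) pairs
dataset = [(i, i % 2) for i in range(10)]

train_set, test_set = train_test_split(dataset, test_fraction=0.3, seed=1)
```

Every sample ends up in exactly one of the two lists, so the test set stays untouched while the model is trained and tuned.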

Training and selecting a predictive model

As we will see in later chapters, many different machine learning algorithms have been developed to solve different problem tasks. An important point that can be summarized from David Wolpert's famous "No free lunch" theorems is that we can't get learning "for free" (The Lack of A Priori Distinctions Between Learning Algorithms, D.H. Wolpert, 1996; No free lunch theorems for optimization, D.H. Wolpert and W.G. Macready, 1997). Intuitively, we can relate this concept to the popular saying, "I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail" (Abraham Maslow, 1966). For example, each classification algorithm has its inherent biases, and no single classification model enjoys superiority if we don't make any assumptions about the task. In practice, it is therefore essential to compare at least a handful of different algorithms in order to train and select the best performing model. But before we can compare different models, we first have to decide upon a metric to measure performance. One commonly used metric is classification accuracy, which is defined as the proportion of correctly classified instances.
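
Classification accuracy as defined above, the number of correct predictions divided by the total number of predictions, can be computed directly. The function name and the label lists below are assumptions made for this short example.

```python
def accuracy(y_true, y_pred):
    """Return the proportion of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical true labels and model predictions
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

acc = accuracy(y_true, y_pred)  # 4 of the 5 predictions are correct
```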

One legitimate question to ask is this: how do we know which model performs well on the final test dataset and real-world data if we don't use this test set for the model selection, but keep it for the final model evaluation? In order to address the issue embedded in this question, different cross-validation techniques can be used where the training dataset is further divided into training and validation subsets in order to estimate the generalization performance of the model. Finally, we also cannot expect that the default parameters of the different learning algorithms provided by software libraries are optimal for our specific problem task. Therefore, in later chapters, we will make frequent use of hyperparameter optimization techniques that help us fine-tune the performance of our model. Intuitively, we can think of those hyperparameters as parameters that are not learned from the data but represent the knobs of a model that we can turn to improve its performance. This will become much clearer in later chapters when we see actual examples.
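
The interplay of cross-validation and hyperparameter tuning can be sketched with a toy example. Here, a very small k-nearest neighbors classifier on one-dimensional data stands in for "the model", and its number of neighbors is the hyperparameter being tuned. All function names, the data, and the candidate values are assumptions for illustration; shuffling before forming the folds is omitted for brevity.

```python
from collections import Counter


def knn_predict(train, x, n_neighbors):
    """Predict the majority label among the closest training samples."""
    nearest = sorted(train, key=lambda s: abs(s[0] - x))[:n_neighbors]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def k_fold_indices(n_samples, n_folds):
    """Yield (training indices, validation indices) for each fold."""
    for fold in range(n_folds):
        val = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, val


def cross_val_accuracy(data, n_neighbors, n_folds=5):
    """Average validation accuracy over all folds of the training data."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(data), n_folds):
        train = [data[i] for i in train_idx]
        hits = sum(
            knn_predict(train, data[i][0], n_neighbors) == data[i][1]
            for i in val_idx
        )
        scores.append(hits / len(val_idx))
    return sum(scores) / len(scores)


# Hypothetical training data: (feature value, class label)
training_data = [(float(x), 0 if x < 5 else 1) for x in range(10)]

# The hyperparameter "knob": try several settings, keep the best-scoring one
candidates = [1, 3, 5]
cv_scores = {k: cross_val_accuracy(training_data, k) for k in candidates}
best_k = max(cv_scores, key=cv_scores.get)
```

Note that only the training data is used here. The test set plays no role in choosing `best_k`, which is what keeps the final evaluation unbiased.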

Evaluating models and predicting unseen data instances

After we have selected a model that has been fitted on the training dataset, we can use the test dataset to estimate how well it performs on unseen data, that is, to estimate the generalization error. If we are satisfied with its performance, we can now use this model to predict new, future data. It is important to note that the parameters for the previously mentioned procedures, such as feature scaling and dimensionality reduction, are solely obtained from the training dataset, and the same parameters are later reapplied to transform the test dataset, as well as any new data samples. Otherwise, the performance measured on the test data may be overly optimistic.
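
The rule that preprocessing parameters come from the training data only can be illustrated with a small fit/transform helper. The class below is a hypothetical stand-in written for this example (its interface loosely mirrors the fit/transform convention used by scikit-learn transformers), and the feature values are made up.

```python
from statistics import mean, pstdev


class Standardizer:
    """Hypothetical helper that learns scaling parameters from training data."""

    def fit(self, values):
        # Estimate the parameters once, on the training data only
        self.mean_ = mean(values)
        self.std_ = pstdev(values)
        return self

    def transform(self, values):
        # Reuse the stored training parameters for any dataset
        return [(v - self.mean_) / self.std_ for v in values]


# Hypothetical feature values
train_feature = [2.0, 4.0, 6.0, 8.0]
test_feature = [5.0, 10.0]

scaler = Standardizer().fit(train_feature)
train_scaled = scaler.transform(train_feature)
test_scaled = scaler.transform(test_feature)  # no refitting on the test data
```

Calling `fit` again on the test data would leak information about the test set into the preprocessing step, which is the source of the overly optimistic estimates mentioned above.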
