Python Machine Learning

By: Sebastian Raschka

Overview of this book

Machine learning and predictive analytics are transforming the way businesses and other organizations operate. Being able to understand trends and patterns in complex data is critical to success, becoming one of the key strategies for unlocking growth in a challenging contemporary marketplace. Python can help you deliver key insights into your data – its unique capabilities as a language let you build sophisticated algorithms and statistical models that can reveal new perspectives and answer key questions that are vital for success. Python Machine Learning gives you access to the world of predictive analytics and demonstrates why Python is one of the world’s leading data science languages. If you want to ask better questions of data, or need to improve and extend the capabilities of your machine learning systems, this practical data science book is invaluable. Covering a wide range of powerful Python libraries, including scikit-learn, Theano, and Keras, and featuring guidance and tips on everything from sentiment analysis to neural networks, you’ll soon be able to answer some of the most important questions facing you and your organization.

Python Machine Learning

Credits

Foreword

About the Author

About the Reviewers

www.PacktPub.com

Preface

Giving Computers the Ability to Learn from Data

Building intelligent machines to transform data into knowledge

The three different types of machine learning

An introduction to the basic terminology and notations

A roadmap for building machine learning systems

Using Python for machine learning

Summary

Training Machine Learning Algorithms for Classification

Artificial neurons – a brief glimpse into the early history of machine learning

Implementing a perceptron learning algorithm in Python

Adaptive linear neurons and the convergence of learning

Summary

A Tour of Machine Learning Classifiers Using Scikit-learn

Choosing a classification algorithm

First steps with scikit-learn

Modeling class probabilities via logistic regression

Maximum margin classification with support vector machines

Solving nonlinear problems using a kernel SVM

Decision tree learning

K-nearest neighbors – a lazy learning algorithm

Summary

Building Good Training Sets – Data Preprocessing

Dealing with missing data

Handling categorical data

Partitioning a dataset in training and test sets

Bringing features onto the same scale

Selecting meaningful features

Assessing feature importance with random forests

Summary

Compressing Data via Dimensionality Reduction

Unsupervised dimensionality reduction via principal component analysis

Supervised data compression via linear discriminant analysis

Using kernel principal component analysis for nonlinear mappings

Summary

Learning Best Practices for Model Evaluation and Hyperparameter Tuning

Streamlining workflows with pipelines

Using k-fold cross-validation to assess model performance

Debugging algorithms with learning and validation curves

Fine-tuning machine learning models via grid search

Looking at different performance evaluation metrics

Summary

Combining Different Models for Ensemble Learning

Learning with ensembles

Implementing a simple majority vote classifier

Evaluating and tuning the ensemble classifier

Bagging – building an ensemble of classifiers from bootstrap samples

Leveraging weak learners via adaptive boosting

Summary

Applying Machine Learning to Sentiment Analysis

Obtaining the IMDb movie review dataset

Introducing the bag-of-words model

Training a logistic regression model for document classification

Working with bigger data – online algorithms and out-of-core learning

Summary

Embedding a Machine Learning Model into a Web Application

Serializing fitted scikit-learn estimators

Setting up a SQLite database for data storage

Developing a web application with Flask

Turning the movie classifier into a web application

Deploying the web application to a public server

Summary

Predicting Continuous Target Variables with Regression Analysis

Introducing a simple linear regression model

Exploring the Housing Dataset

Implementing an ordinary least squares linear regression model

Fitting a robust regression model using RANSAC

Evaluating the performance of linear regression models

Using regularized methods for regression

Turning a linear regression model into a curve – polynomial regression

Summary

Working with Unlabeled Data – Clustering Analysis

Grouping objects by similarity using k-means

Organizing clusters as a hierarchical tree

Locating regions of high density via DBSCAN

Summary

Training Artificial Neural Networks for Image Recognition

Modeling complex functions with artificial neural networks

Classifying handwritten digits

Training an artificial neural network

Developing your intuition for backpropagation

Debugging neural networks with gradient checking

Convergence in neural networks

Other neural network architectures

A few last words about neural network implementation

Summary

Parallelizing Neural Network Training with Theano

Building, compiling, and running expressions with Theano

Choosing activation functions for feedforward neural networks

Training neural networks efficiently using Keras

Summary

Index

Summary

In this chapter, we explored machine learning at a very high level and familiarized ourselves with the big picture and the major concepts that we are going to explore in more detail in the following chapters.

We learned that supervised learning is composed of two important subfields: classification and regression. While classification models allow us to categorize objects into known classes, we can use regression analysis to predict the continuous outcomes of target variables. Unsupervised learning not only offers useful techniques for discovering structures in unlabeled data, but it can also be useful for data compression in feature preprocessing steps.
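To make the distinction between the two supervised subfields concrete, here is a minimal sketch in plain Python. It is not taken from the book: the toy data, the nearest-centroid classifier, and the helper names (`fit_centroids`, `classify`, `fit_line`) are assumptions chosen only for illustration. The classifier returns a label from a known set of classes, whereas the regression model returns a continuous number.

```python
from statistics import mean

# Toy data, invented for illustration only.
# Classification: each sample is (feature, class label).
labeled = [(1.0, "small"), (1.2, "small"), (0.8, "small"),
           (3.0, "large"), (3.3, "large"), (2.9, "large")]


def fit_centroids(samples):
    """Return the mean feature value of each class (a nearest-centroid model)."""
    by_class = {}
    for x, label in samples:
        by_class.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_class.items()}


def classify(centroids, x):
    """Predict the class whose centroid lies closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


# Regression: each sample is (feature, continuous target).
measured = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]


def fit_line(samples):
    """Fit y = slope * x + intercept by ordinary least squares."""
    xs = [x for x, _ in samples]
    ys = [y for _, y in samples]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in samples)
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


centroids = fit_centroids(labeled)
slope, intercept = fit_line(measured)

predicted_class = classify(centroids, 1.1)   # a discrete class label
predicted_value = slope * 2.5 + intercept    # a continuous number
```

Both models learn from labeled examples, which is what makes them supervised; only the type of the target differs.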

We briefly went over the typical roadmap for applying machine learning to practical problems, which we will use as a foundation for deeper discussions and hands-on examples in the following chapters. Finally, we set up our Python environment and installed and updated the required packages to get ready to see machine learning in action.
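One way to confirm that an environment is ready is to ask Python which package versions it can see. The sketch below uses only the standard library; the package list is an assumption (this summary does not name the required packages), and `installed_versions` is a hypothetical helper, so adjust both to your own setup.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed package list; adjust to whatever your environment actually needs.
PACKAGES = ["numpy", "scipy", "scikit-learn", "matplotlib", "pandas"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name}: {ver or 'not installed'}")
```

Packages reported as not installed can then be added with your package manager of choice before moving on.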

In the following chapter, we will implement one of the earliest machine learning algorithms for classification, which will prepare us for Chapter 3, A Tour of Machine Learning Classifiers Using Scikit-learn, where we cover more advanced machine learning algorithms using the scikit-learn open source machine learning library. Since machine learning algorithms learn from data, it is critical that we feed them useful information, and in Chapter 4, Building Good Training Sets – Data Preprocessing, we will take a look at important data preprocessing techniques. In Chapter 5, Compressing Data via Dimensionality Reduction, we will learn about dimensionality reduction techniques that can help us to compress our dataset onto a lower-dimensional feature subspace, which can be beneficial for computational efficiency. An important aspect of building machine learning models is to evaluate their performance and to estimate how well they can make predictions on new, unseen data. In Chapter 6, Learning Best Practices for Model Evaluation and Hyperparameter Tuning, we will learn about the best practices for model tuning and evaluation. In certain scenarios, we may still not be satisfied with the performance of our predictive model, even after spending hours or days extensively tuning and testing it. In Chapter 7, Combining Different Models for Ensemble Learning, we will learn how to combine different machine learning models to build even more powerful predictive systems.

After we have covered all of the important concepts of a typical machine learning pipeline, we will implement a model for predicting emotions in text in Chapter 8, Applying Machine Learning to Sentiment Analysis, and in Chapter 9, Embedding a Machine Learning Model into a Web Application, we will embed it into a web application to share it with the world. In Chapter 10, Predicting Continuous Target Variables with Regression Analysis, we will then use machine learning algorithms for regression analysis that allow us to predict continuous output variables, and in Chapter 11, Working with Unlabeled Data – Clustering Analysis, we will apply clustering algorithms that will allow us to find hidden structures in data. The last two chapters in this book will cover artificial neural networks, which will allow us to tackle complex problems such as image and speech recognition, currently among the hottest topics in machine learning research.
