Python Machine Learning

Python Machine Learning

By : Sebastian Raschka

Buy this Book

Python Machine Learning

By: Sebastian Raschka

Buy this Book

Overview of this book

Machine learning and predictive analytics are transforming the way businesses and other organizations operate. Being able to understand trends and patterns in complex data is critical to success, becoming one of the key strategies for unlocking growth in a challenging contemporary marketplace. Python can help you deliver key insights into your data – its unique capabilities as a language let you build sophisticated algorithms and statistical models that can reveal new perspectives and answer key questions that are vital for success. Python Machine Learning gives you access to the world of predictive analytics and demonstrates why Python is one of the world’s leading data science languages. If you want to ask better questions of data, or need to improve and extend the capabilities of your machine learning systems, this practical data science book is invaluable. Covering a wide range of powerful Python libraries, including scikit-learn, Theano, and Keras, and featuring guidance and tips on everything from sentiment analysis to neural networks, you’ll soon be able to answer some of the most important questions facing you and your organization.

Python Machine Learning

Credits

Foreword

About the Author

About the Reviewers

www.PacktPub.com

Preface

Free Chapter

Giving Computers the Ability to Learn from Data

Building intelligent machines to transform data into knowledge

The three different types of machine learning

An introduction to the basic terminology and notations

A roadmap for building machine learning systems

Using Python for machine learning

Summary

Training Machine Learning Algorithms for Classification

Artificial neurons – a brief glimpse into the early history of machine learning

Implementing a perceptron learning algorithm in Python

Adaptive linear neurons and the convergence of learning

Summary

A Tour of Machine Learning Classifiers Using Scikit-learn

Choosing a classification algorithm

First steps with scikit-learn

Modeling class probabilities via logistic regression

Maximum margin classification with support vector machines

Solving nonlinear problems using a kernel SVM

Decision tree learning

K-nearest neighbors – a lazy learning algorithm

Summary

Building Good Training Sets – Data Preprocessing

Dealing with missing data

Handling categorical data

Partitioning a dataset in training and test sets

Bringing features onto the same scale

Selecting meaningful features

Assessing feature importance with random forests

Summary

Compressing Data via Dimensionality Reduction

Unsupervised dimensionality reduction via principal component analysis

Supervised data compression via linear discriminant analysis

Using kernel principal component analysis for nonlinear mappings

Summary

Learning Best Practices for Model Evaluation and Hyperparameter Tuning

Streamlining workflows with pipelines

Using k-fold cross-validation to assess model performance

Debugging algorithms with learning and validation curves

Fine-tuning machine learning models via grid search

Looking at different performance evaluation metrics

Summary

Combining Different Models for Ensemble Learning

Learning with ensembles

Implementing a simple majority vote classifier

Evaluating and tuning the ensemble classifier

Bagging – building an ensemble of classifiers from bootstrap samples

Leveraging weak learners via adaptive boosting

Summary

Applying Machine Learning to Sentiment Analysis

Obtaining the IMDb movie review dataset

Introducing the bag-of-words model

Training a logistic regression model for document classification

Working with bigger data – online algorithms and out-of-core learning

Summary

Embedding a Machine Learning Model into a Web Application

Serializing fitted scikit-learn estimators

Setting up a SQLite database for data storage

Developing a web application with Flask

Turning the movie classifier into a web application

Deploying the web application to a public server

Summary

Predicting Continuous Target Variables with Regression Analysis

Introducing a simple linear regression model

Exploring the Housing Dataset

Implementing an ordinary least squares linear regression model

Fitting a robust regression model using RANSAC

Evaluating the performance of linear regression models

Using regularized methods for regression

Turning a linear regression model into a curve – polynomial regression

Summary

Working with Unlabeled Data – Clustering Analysis

Grouping objects by similarity using k-means

Organizing clusters as a hierarchical tree

Locating regions of high density via DBSCAN

Summary

Training Artificial Neural Networks for Image Recognition

Modeling complex functions with artificial neural networks

Classifying handwritten digits

Training an artificial neural network

Developing your intuition for backpropagation

Debugging neural networks with gradient checking

Convergence in neural networks

Other neural network architectures

A few last words about neural network implementation

Summary

Parallelizing Neural Network Training with Theano

Building, compiling, and running expressions with Theano

Choosing activation functions for feedforward neural networks

Training neural networks efficiently using Keras

Summary

Index

Customer Reviews

5 star

4 star

3 star

2 star

1 star

Introducing the bag-of-words model

We remember from Chapter 4, Building Good Training Sets – Data Preprocessing, that we have to convert categorical data, such as text or words, into a numerical form before we can pass it on to a machine learning algorithm. In this section, we will introduce the bag-of-words model that allows us to represent text as numerical feature vectors. The idea behind the bag-of-words model is quite simple and can be summarized as follows:

We create a vocabulary of unique tokens—for example, words—from the entire set of documents.
We construct a feature vector from each document that contains the counts of how often each word occurs in the particular document.

Since the unique words in each document represent only a small subset of all the words in the bag-of-words vocabulary, the feature vectors will consist of mostly zeros, which is why we call them sparse. Do not worry if this sounds too abstract; in the following subsections, we will walk through the process of creating...

Python Machine Learning

By : Sebastian Raschka

Python Machine Learning

By: Sebastian Raschka

Overview of this book

Related Content you might be interested in

Current Title:

Python Machine Learning

Introducing the bag-of-words model