Machine Learning Algorithms - Second Edition

Overview of this book

Machine learning has gained tremendous popularity for its powerful and fast predictions on large datasets. The real force behind those results, however, is a set of complex algorithms, grounded in substantial statistical analysis, that churn through large datasets and extract insight from them. This second edition of Machine Learning Algorithms walks you through recent developments in machine learning algorithms that have made major contributions to the machine learning process, and helps you strengthen and master statistical interpretation across supervised, semi-supervised, and reinforcement learning. Once the core concepts of an algorithm have been covered, you’ll explore real-world examples based on the most widely used libraries, such as scikit-learn, NLTK, TensorFlow, and Keras. You will discover new topics such as principal component analysis (PCA), independent component analysis (ICA), Bayesian regression, discriminant analysis, advanced clustering, and Gaussian mixture. By the end of this book, you will have studied machine learning algorithms and be able to put them into production to make your machine learning applications more innovative.

Preface

Who this book is for

What this book covers

To get the most out of this book

Get in touch

A Gentle Introduction to Machine Learning

Introduction – classic and adaptive machines

Only learning matters

Beyond machine learning – deep learning and bio-inspired adaptive systems

Machine learning and big data

Summary

Important Elements in Machine Learning

Data formats

Learnability

Introduction to statistical learning concepts

Class balancing

Elements of information theory

Summary

Feature Selection and Feature Engineering

scikit-learn toy datasets

Creating training and test sets

Managing categorical data

Managing missing features

Data scaling and normalization

Feature selection and filtering

Principal Component Analysis

Independent Component Analysis

Atom extraction and dictionary learning

Visualizing high-dimensional datasets using t-SNE

Summary

Regression Algorithms

Linear models for regression

A bidimensional example

Linear regression with scikit-learn and higher dimensionality

Ridge, Lasso, and ElasticNet

Robust regression

Bayesian regression

Polynomial regression

Isotonic regression

Summary

Linear Classification Algorithms

Linear classification

Logistic regression

Implementation and optimizations

Stochastic gradient descent algorithms

Passive-aggressive algorithms

Finding the optimal hyperparameters through a grid search

Classification metrics

ROC curve

Summary

Naive Bayes and Discriminant Analysis

Bayes' theorem

Naive Bayes classifiers

Naive Bayes in scikit-learn

Discriminant analysis

Summary

Support Vector Machines

Linear SVM

SVMs with scikit-learn

Kernel-based classification

ν-Support Vector Machines

Support Vector Regression

Introducing semi-supervised Support Vector Machines (S3VM)

Summary

Decision Trees and Ensemble Learning

Binary Decision Trees

Decision Tree classification with scikit-learn

Decision Tree regression

Introduction to Ensemble Learning

Summary

Clustering Fundamentals

Clustering basics

k-NN

Gaussian mixture

K-means

Evaluation methods based on the ground truth

Summary

Advanced Clustering

DBSCAN

Spectral Clustering

Online Clustering

Biclustering

Summary

Hierarchical Clustering

Hierarchical strategies

Agglomerative Clustering

Summary

Introducing Recommendation Systems

Naive user-based systems

Content-based systems

Model-free (or memory-based) collaborative filtering

Model-based collaborative filtering

Summary

Introducing Natural Language Processing

NLTK and built-in corpora

The Bag-of-Words strategy

Part-of-Speech

A sample text classifier based on the Reuters corpus

Summary

Topic Modeling and Sentiment Analysis in NLP

Topic modeling

Introducing Word2vec with Gensim

Sentiment analysis

Summary

Introducing Neural Networks

Deep learning at a glance

MLPs with Keras

Summary

Advanced Deep Learning Models

Deep model layers

An example of a deep convolutional network with Keras

An example of an LSTM network with Keras

A brief introduction to TensorFlow

Summary

Creating a Machine Learning Architecture

Machine learning architectures

Scikit-learn tools for machine learning architectures

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think

What this book covers

Chapter 1, A Gentle Introduction to Machine Learning, introduces the world of machine learning, explaining the fundamental concepts of the most important approaches to creating intelligent applications and focusing on the different kinds of learning methods.

Chapter 2, Important Elements in Machine Learning, explains the mathematical concepts regarding the most common machine learning problems, including the concept of learnability and some important elements of information theory. This chapter contains theoretical elements, but it's extremely helpful if you are learning this topic from scratch because it provides an insight into the most important mathematical tools employed in the majority of algorithms.
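As an illustration of the kind of information-theory element this chapter refers to, the following is a minimal sketch of Shannon entropy computed over a list of discrete labels. It is not taken from the book: the function name `shannon_entropy` and the sample data are assumptions made for this example.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Return the Shannon entropy, in bits, of a sequence of discrete labels."""
    counts = Counter(labels)
    total = len(labels)
    # H = -sum(p * log2(p)) over the observed label frequencies p.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical sample data: a balanced and a skewed label distribution.
balanced = ["a", "b", "a", "b"]
skewed = ["a", "a", "a", "b"]

print(shannon_entropy(balanced))
print(shannon_entropy(skewed))
```

The balanced sample carries more uncertainty than the skewed one, so its entropy is higher; this is the intuition that entropy-based criteria rely on.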

Chapter 3, Feature Selection and Feature Engineering, describes the most important techniques for preprocessing a dataset, selecting the most informative features, and reducing the original dimensionality.

Chapter 4, Regression Algorithms, describes the linear regression algorithm and its regularized variants: Ridge, Lasso, and ElasticNet. It continues with more advanced models that can be employed to solve non-linear regression problems or to mitigate the effect of outliers.

Chapter 5, Linear Classification Algorithms, introduces the concept of linear classification, focusing on logistic regression, perceptrons, stochastic gradient descent algorithms, and passive-aggressive algorithms. The second part of the chapter covers the most important evaluation metrics, which are used to measure the performance of a model and find the optimal hyperparameter set.
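To make the idea of an evaluation metric concrete, here is a minimal sketch that computes precision, recall, and the F1 score for a binary classifier in plain Python. It is an illustration only, not the book's code: the helper name `precision_recall_f1` and the label lists are hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical ground truth and predictions.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

In this sample there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3.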

Chapter 6, Naive Bayes and Discriminant Analysis, explains Bayesian probability theory and describes the structure of the most common Naive Bayes classifiers. The second part examines linear and quadratic discriminant analysis with some concrete examples.

Chapter 7, Support Vector Machines, introduces the SVM family of algorithms, covering linear classification problems and, through the kernel trick, non-linear ones. The last part of the chapter covers support vector regression and more complex classification models.

Chapter 8, Decision Trees and Ensemble Learning, explains the concept of a hierarchical decision process and describes the concepts of decision tree classification, random forests, bootstrapped and bagged trees, and voting classifiers.

Chapter 9, Clustering Fundamentals, introduces the concept of clustering, describing the Gaussian mixture, K-Nearest Neighbors, and K-means algorithms. The last part of the chapter covers different approaches to determining the optimal number of clusters and measuring the performance of a model.

Chapter 10, Advanced Clustering, introduces more complex clustering techniques (DBSCAN, Spectral Clustering, and Biclustering) that can be employed when the dataset structure is non-convex. In the second part of the chapter, two online clustering algorithms (mini-batch K-means and BIRCH) are introduced.

Chapter 11, Hierarchical Clustering, continues the explanation of more complex clustering algorithms started in the previous chapter and introduces the concepts of agglomerative clustering and dendrograms.

Chapter 12, Introducing Recommendation Systems, explains the most widely used algorithms in recommender systems: content- and user-based strategies, collaborative filtering, and alternating least squares (ALS). A complete example based on Apache Spark shows how to process very large datasets using the ALS algorithm.

Chapter 13, Introducing Natural Language Processing, explains the concept of the Bag-of-Words strategy and introduces the most important techniques required to efficiently process natural language datasets (tokenizing, stemming, stop-word removal, tagging, and vectorizing). An example of a classifier based on the Reuters dataset is also discussed in the last part of the chapter.
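The Bag-of-Words strategy can be sketched in a few lines of plain Python. The example below is a simplified illustration, not the book's NLTK-based code: it covers only tokenizing, stop-word removal, and count vectorizing (stemming and tagging are omitted), and the names `STOP_WORDS`, `tokenize`, and `bag_of_words` as well as the sample sentences are assumptions made for this sketch.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "is", "on", "and"}


def tokenize(text):
    """Lowercase the text, split it into alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(documents):
    """Return a sorted vocabulary and one term-count vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({t for doc in tokenized for t in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        # Counter returns 0 for terms that do not occur in this document.
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


docs = [
    "The cat sat on the mat.",
    "The dog chased the cat and the cat ran.",
]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)
for vector in vectors:
    print(vector)
```

Each document becomes a fixed-length vector indexed by the shared vocabulary, which is the representation a downstream classifier would consume. Word order is discarded, which is the defining trade-off of the approach.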

Chapter 14, Topic Modeling and Sentiment Analysis in NLP, introduces the concept of topic modeling and describes the most important algorithms, such as latent semantic analysis (both deterministic and probabilistic) and latent Dirichlet allocation. The second part of the chapter covers word embedding and sentiment analysis, explaining the most common approaches to each.

Chapter 15, Introducing Neural Networks, introduces the world of deep learning, explaining the concept of neural networks and computational graphs. In the second part of the chapter, the high-level deep learning framework Keras is presented with a concrete example of a Multi-layer Perceptron.

Chapter 16, Advanced Deep Learning Models, explains the basic functionality of the most important deep learning layers, with Keras examples of deep convolutional networks and recurrent (LSTM) networks for time-series processing. In the second part of the chapter, the TensorFlow framework is briefly introduced, with examples that demonstrate some of its basic functionality.

Chapter 17, Creating a Machine Learning Architecture, explains how to define a complete machine learning pipeline, focusing on the peculiarities and drawbacks of each step.
