Hands-On Unsupervised Learning with Python

By: Giuseppe Bonaccorso

Overview of this book

Unsupervised learning is about making use of raw, untagged data and applying learning algorithms to it so that a machine can discover structure without labeled outcomes. With this book, you will use Python to explore the concept of unsupervised learning, clustering large sets of data and analyzing them iteratively until the desired outcome is found. The book starts with the key differences between supervised, unsupervised, and semi-supervised learning. You will be introduced to the most widely used libraries and frameworks from the Python ecosystem and address unsupervised learning in both the machine learning and deep learning domains. You will explore the various algorithms and techniques that are used to implement unsupervised learning in real-world use cases, and learn a variety of unsupervised learning approaches, including randomized optimization, clustering, feature selection and transformation, and information theory. You will get hands-on experience of how neural networks can be employed in unsupervised scenarios, and you will also explore the steps involved in building and training a GAN in order to process images. By the end of this book, you will have learned the art of unsupervised learning for different real-world challenges.

What this book covers

Chapter 1, Getting Started with Unsupervised Learning, offers an introduction to machine learning and data science from a very pragmatic perspective. The main concepts are discussed and a few simple examples are shown, with particular attention paid to unsupervised problem structures.

Chapter 2, Clustering Fundamentals, begins our exploration of clustering algorithms. The most common methods and evaluation metrics are analyzed, together with concrete examples that show how to tune the hyperparameters and assess performance from different viewpoints.
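
As a taste of the kind of material covered, here is a minimal sketch (not taken from the book) of tuning the number of clusters for K-means with scikit-learn; the synthetic dataset and the candidate range for k are illustrative assumptions:

# Minimal sketch: choosing k for K-means via the silhouette score.
# The dataset and the candidate range of k are illustrative assumptions.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=500, centers=4, cluster_std=1.0, random_state=42)

for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    print(f"k={k}: silhouette={silhouette_score(X, labels):.3f}")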

Chapter 3, Advanced Clustering, discusses some more complex algorithms. Many of the problems analyzed in Chapter 2, Clustering Fundamentals, are re-evaluated using more powerful and flexible methods that can be easily employed whenever the performance of basic algorithms doesn't meet requirements.
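
For example, density-based methods such as DBSCAN, one of the more flexible algorithms of this kind, can recover non-convex clusters where K-means fails. A minimal sketch, with illustrative (untuned) values for eps and min_samples:

# Minimal sketch: DBSCAN on non-convex data where K-means struggles.
# eps and min_samples are illustrative values, not tuned recommendations.
from sklearn.datasets import make_moons
from sklearn.cluster import DBSCAN

X, _ = make_moons(n_samples=400, noise=0.05, random_state=42)
labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)
print("Clusters found:", len(set(labels) - {-1}),
      "| noise points:", (labels == -1).sum())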

Chapter 4, Hierarchical Clustering in Action, is fully dedicated to a family of algorithms that can calculate a complete clustering hierarchy according to specific criteria. The most common strategies for this are analyzed, together with specific performance measures and algorithmic variants that can increase the effectiveness of the methods.
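
A minimal sketch of the general idea, using SciPy's agglomerative linkage on synthetic data (the 'ward' criterion and the cut into three flat clusters are illustrative choices):

# Minimal sketch: building a clustering hierarchy with agglomerative linkage.
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=100, centers=3, random_state=42)
Z = linkage(X, method='ward')                    # full merge hierarchy
labels = fcluster(Z, t=3, criterion='maxclust')  # cut into 3 flat clusters
print(labels[:10])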

Chapter 5, Soft Clustering and Gaussian Mixture Models, focuses on a few famous soft-clustering algorithms, with a particular emphasis on Gaussian mixtures, which make it possible to define generative models under quite reasonable assumptions.
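
A minimal sketch of this generative use, with scikit-learn's GaussianMixture; the number of components is an illustrative assumption:

# Minimal sketch: a Gaussian mixture as a generative model.
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=500, centers=3, random_state=42)
gm = GaussianMixture(n_components=3, random_state=42).fit(X)

probs = gm.predict_proba(X[:5])   # soft assignments (responsibilities)
X_new, _ = gm.sample(20)          # draw new samples from the learned model
print(probs.round(3))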

Chapter 6, Anomaly Detection, discusses a particular application of unsupervised learning: novelty and outlier detection. The goal is to analyze some common methods that can be effectively employed to understand whether a new sample can be considered valid or must be treated as anomalous and given particular attention.
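
A minimal sketch of one such method, Isolation Forest, on synthetic data; the contamination rate is an illustrative assumption:

# Minimal sketch: flagging outliers with Isolation Forest.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(42)
X = np.vstack([rng.normal(0, 1, size=(300, 2)),    # inliers
               rng.uniform(-6, 6, size=(15, 2))])  # scattered anomalies

iso = IsolationForest(contamination=0.05, random_state=42).fit(X)
pred = iso.predict(X)  # +1 = inlier, -1 = anomaly
print("Anomalies flagged:", (pred == -1).sum())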

Chapter 7, Dimensionality Reduction and Component Analysis, covers the most common and powerful methods for dimensionality reduction, component analysis, and dictionary learning. The examples show how it's possible to carry out such operations efficiently in different specific scenarios.
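
A minimal sketch of the simplest case, PCA with scikit-learn; keeping two components is an illustrative choice:

# Minimal sketch: dimensionality reduction with PCA.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)  # 64-dimensional digit images
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
print(X_2d.shape, "explained variance:",
      pca.explained_variance_ratio_.sum().round(3))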

Chapter 8, Unsupervised Neural Network Models, discusses some very important unsupervised neural models. In particular, the focus is directed both to networks that can learn the structure of a generic data-generating process and to networks that perform dimensionality reduction.
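
One canonical example of such a model is the autoencoder, which learns a compressed code by reconstructing its own input. A minimal Keras sketch, with an illustrative architecture and stand-in random data:

# Minimal sketch: a tiny autoencoder for unsupervised representation learning.
# The architecture and the random stand-in data are illustrative assumptions.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

X = np.random.rand(1000, 64).astype("float32")  # stand-in unlabeled data

autoencoder = keras.Sequential([
    layers.Input(shape=(64,)),
    layers.Dense(16, activation="relu"),    # bottleneck (learned code)
    layers.Dense(64, activation="sigmoid")  # reconstruction
])
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=5, batch_size=32, verbose=0)  # target = input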

Chapter 9, Generative Adversarial Networks and SOMs, continues the analysis of deep neural networks that can learn the structure of data-generating processes and output new samples drawn from those processes. Moreover, a special kind of network, the Self-Organizing Map (SOM), is discussed, and some practical examples are shown.
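
To give a flavor of the SOM idea, here is a minimal NumPy sketch of a single training step; the grid size, learning rate, and neighborhood width are illustrative assumptions:

# Minimal sketch: one training step of a Self-Organizing Map (SOM) in NumPy.
import numpy as np

rng = np.random.RandomState(42)
grid_h, grid_w, dim = 10, 10, 3
W = rng.rand(grid_h, grid_w, dim)  # map weights
coords = np.stack(np.meshgrid(np.arange(grid_h), np.arange(grid_w),
                              indexing='ij'), axis=-1)

def som_step(W, x, lr=0.5, sigma=2.0):
    # Find the best matching unit (BMU) for input vector x
    d = np.linalg.norm(W - x, axis=-1)
    bmu = np.unravel_index(d.argmin(), d.shape)
    # Gaussian neighborhood around the BMU on the grid
    dist2 = ((coords - np.array(bmu)) ** 2).sum(axis=-1)
    h = np.exp(-dist2 / (2 * sigma ** 2))[..., None]
    # Move the BMU and its neighbors toward x
    return W + lr * h * (x - W)

W = som_step(W, rng.rand(dim))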