Applied Unsupervised Learning with Python

By: Benjamin Johnston, Aaron Jones, Christopher Kruger
Overview of this book

Unsupervised learning is a useful and practical solution in situations where labeled data is not available. Applied Unsupervised Learning with Python guides you through best practices for using unsupervised learning techniques with Python libraries and for extracting meaningful information from unstructured data. The book begins by explaining how basic clustering finds similar data points in a set. Once you are well versed in the k-means algorithm and how it operates, you'll learn what dimensionality reduction is and where to apply it. As you progress, you'll learn various neural network techniques and how they can improve your model. While studying the applications of unsupervised learning, you will also learn how to mine topics that are trending on Twitter and Facebook and build a news recommendation engine for users. Finally, you will put your knowledge to work through activities such as performing a Market Basket Analysis and identifying relationships between different products. By the end of this book, you will have the skills you need to confidently build your own models using Python.

Clustering Refresher


Chapter 1, Introduction to Clustering, covered both the high-level intuition and the in-depth details of one of the most basic clustering algorithms: k-means. Although it is a simple approach, do not discount it; it will be a valuable addition to your toolkit as you continue to explore unsupervised learning. In many real-world use cases, companies make significant discoveries with the simplest methods, such as k-means or linear regression (for supervised learning). As a refresher, let's quickly walk through what clusters are and how k-means works to find them:

Figure 2.1: The attributes that separate supervised and unsupervised problems
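
As a reminder of the mechanics, the following is a minimal sketch of the usual k-means loop in plain Python: assign every point to its nearest centroid, move each centroid to the mean of its assigned points, and repeat until the assignments stop changing. It is not the implementation from Chapter 1. The function name `k_means`, the toy `data` list, and the choice to start from the first `k` points (made here only so the run is reproducible; practical implementations usually pick starting centroids at random) are all assumptions made for illustration.

```python
import math


def k_means(points, k, n_iter=100):
    """Basic k-means loop over a list of equal-length numeric tuples."""
    # Assumption for reproducibility: use the first k points as starting centroids.
    centroids = list(points[:k])
    labels = None
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


# Hypothetical toy data: two visibly separate groups of 2-D points.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, labels = k_means(data, k=2)
print(labels)
print(centroids)
```

On this toy data, the first three points end up sharing one label and the last three share the other, even though both starting centroids come from the same group. On less cleanly separated data, the result can depend on where the centroids start.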

If you were given a random collection of data without any guidance, you would likely start your exploration with basic statistics, such as the mean, median, and mode of each feature. Remember that, from a high-level data model that simply exists, knowing whether it is supervised...
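
A first statistical pass of the kind described above might look like the following sketch, which uses only Python's standard `statistics` module. The `features` dictionary and its column names are hypothetical toy data, not taken from the book.

```python
import statistics

# Hypothetical toy dataset: each key is a feature, each list holds its observed values.
features = {
    "height_cm": [150, 160, 160, 170, 180],
    "weight_kg": [50, 60, 65, 65, 80],
}

# Compute the mean, median, and mode of every feature.
summary = {
    name: {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }
    for name, values in features.items()
}

for name, stats in summary.items():
    print(name, stats)
```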