In this section, we will cover clustering methods. First, we'll look at what clustering is. Then we'll explain some of the mathematical tricks that we can use in clustering. Finally, we'll introduce our newest non-parametric algorithm, k-nearest neighbors (KNN).
Clustering is about as intuitive as it gets among machine learning models. The idea is that we can segment samples into groups based on their nearness to one another. The hypothesis is that samples that are closer together are more similar in some respect. There are two reasons we might want to cluster. The first is for discovery purposes: we usually do this when we make no assumptions about the underlying structure of the data, or when we don't have labels, so it is typically done in a purely unsupervised sense. But as this is a supervised learning book, we're going to focus on the second use case, which uses clustering as a classification technique.
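To make the nearness idea concrete, here is a minimal sketch of classifying a sample by its closest labeled neighbors. It is only an illustration under a few assumptions of our own: the function name `knn_predict`, the toy data, the Euclidean distance, and the default `k=3` are all hypothetical choices, not the implementation we build later.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_query, k=3):
    """Label x_query by majority vote among its k nearest training samples."""
    X_train = np.asarray(X_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)

    # Euclidean distance from the query point to every training sample
    # (an assumed metric; other distance measures are possible).
    distances = np.linalg.norm(X_train - x_query, axis=1)

    # Indices of the k training samples closest to the query.
    nearest = np.argsort(distances)[:k]

    # The most common label among those neighbors is the prediction.
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data (made up for illustration): two well-separated groups.
X_train = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
           [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
y_train = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(X_train, y_train, [1.1, 1.0]))
print(knn_predict(X_train, y_train, [4.9, 5.0]))
```

A query point near the first group picks up that group's label because all of its closest neighbors carry it, which is exactly the "closer means more similar" hypothesis put to work as a classifier.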