In all the methods we've seen so far, every sample or observation has its own target label or value. In other cases, the dataset is unlabeled and, to extract the structure of the data, you need an unsupervised approach. In this section, we're going to introduce two clustering methods, as they are among the most widely used methods for unsupervised learning.
Note
Keep in mind that the terms "clustering" and "unsupervised learning" are often treated as synonyms, although unsupervised learning is actually the broader concept.
The first method that we'll introduce, named K-means, is the most commonly used clustering algorithm despite its inevitable shortcomings. In signal processing, K-means is the equivalent of vector quantization, that is, the selection of the code word (from a given codebook) that best approximates the input observation (or word).
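To make the vector quantization analogy concrete, here is a minimal sketch of the assignment step alone: given a fixed codebook, each observation is mapped to its nearest code word. The `quantize` helper, the toy codebook, and the use of squared Euclidean distance are assumptions made for illustration; this is not a full K-means implementation, which would also have to learn the codebook from the data.

```python
import numpy as np

def quantize(observations, codebook):
    """Return, for each observation, the index of the nearest code word."""
    # Pairwise squared Euclidean distances, shape (n_observations, n_codewords)
    diffs = observations[:, np.newaxis, :] - codebook[np.newaxis, :, :]
    sq_dist = np.sum(diffs ** 2, axis=2)
    # The best code word is the one at minimum distance
    return np.argmin(sq_dist, axis=1)

# Hypothetical codebook with two code words, and four 2-D observations
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
observations = np.array([[1.0, 0.5], [9.0, 11.0], [-0.5, 0.2], [8.5, 9.5]])

labels = quantize(observations, codebook)
print(labels)  # [0 1 0 1]
```

In clustering terms, the code words play the role of cluster centers and the returned indices play the role of cluster labels.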
You must provide the algorithm with the K parameter, which is the...