In all the methods we've seen so far, every sample or observation has its own target label or value. In other cases, the dataset is unlabelled and, to extract the structure of the data, you need an unsupervised approach. In this section, we're going to introduce two clustering methods that are among the most widely used in unsupervised learning.
Keep in mind that the terms clustering and unsupervised learning are often treated as synonyms, although unsupervised learning has a broader meaning.
The first method that we'll introduce is named K-means. In signal processing, it is the equivalent of vector quantization, that is, the selection of the codeword (from a given codebook) that best approximates the input observation (or word).
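To make the codebook analogy concrete, here is a minimal sketch of the quantization step alone: each observation is mapped to the index of its closest codeword. The codebook, the observations, and the helper name `nearest_codeword` are hypothetical values chosen for illustration, and the Euclidean distance is an assumption; this is not the full K-means algorithm, which also has to learn the codewords (the centroids) from the data.

```python
import math

def nearest_codeword(observation, codebook):
    """Return the index of the codeword closest to the observation.

    Closeness is measured with the Euclidean distance (an assumption
    made for this sketch).
    """
    distances = [math.dist(observation, codeword) for codeword in codebook]
    return distances.index(min(distances))

# Hypothetical codebook with K = 2 codewords; in K-means terms,
# these play the role of the cluster centroids.
codebook = [(0.0, 0.0), (5.0, 5.0)]

# Hypothetical two-dimensional observations to quantize.
observations = [(0.5, -0.2), (4.1, 5.3), (2.0, 1.0)]

# Each observation is replaced by the index of its best codeword.
labels = [nearest_codeword(obs, codebook) for obs in observations]
print(labels)  # prints [0, 1, 0]
```

Here the index returned for each observation plays the role of a cluster label, and the size of the codebook corresponds to the number of clusters.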
You must provide the algorithm with the K parameter, which is the number of clusters. This can be a limitation, because you first have to investigate which value of K is right for the current dataset. K-means iterates...