This chapter presents some unsupervised learning techniques. When facing a business problem, these techniques allow us to identify hidden structures and patterns and to perform exploratory data analysis. In addition, unsupervised learning can simplify the problem, allowing us to build solutions that are more accurate and less elaborate. These techniques can also be used as part of the solution itself.
The two branches of techniques are clustering and dimensionality reduction, and most techniques apply to only one of these two contexts. This chapter covers some popular ones.
k-means is a centroid-based clustering technique. Given a set of objects, the algorithm identifies k homogeneous clusters. It is centroid-based in the sense that each cluster is defined by its centroid, which represents the cluster's average object.
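To make the idea of a centroid as an "average object" concrete, here is a minimal sketch in plain Python. The toy data, the cluster names "A" and "B", and the helper `centroid` are hypothetical and chosen only for illustration. The sketch assumes that each object is a tuple of numeric features and that the cluster memberships are already known.

```python
def centroid(objects):
    """Return the feature-wise mean of a non-empty list of numeric tuples."""
    n = len(objects)
    # zip(*objects) groups the values of each feature across all objects.
    return tuple(sum(values) / n for values in zip(*objects))


# Hypothetical toy data: two clusters of 2-D objects (memberships assumed known).
clusters = {
    "A": [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)],
    "B": [(8.0, 8.0), (9.0, 10.0), (10.0, 9.0)],
}

# Each cluster is summarized by a single point: the mean of its members.
centroids = {name: centroid(members) for name, members in clusters.items()}
print(centroids)  # {'A': (2.0, 2.0), 'B': (9.0, 9.0)}
```

Note that a centroid does not have to coincide with any object in the cluster: here (2.0, 2.0) is not one of the three members of cluster "A".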
The goal of the algorithm is to identify k centroids. k-means then assigns each object to its closest centroid, which defines the k clusters. The algorithm starts...