Up until now, we have worked only with data that was prelabeled, that is, with supervised learning. Based on that prelabeled data, we trained our machine learning models and predicted results. But what if the data is not labeled at all and we just get plain, raw data? Can we still carry out any useful analysis? Drawing insights from an unlabeled dataset is an example of unsupervised learning, where the machine learning algorithm infers structure from raw, unlabeled data instead of learning from known answers. One of the most popular approaches to analyzing unlabeled data is clustering: finding groups of similar items within a dataset. This grouping has several advantages and use cases, as we will see in this chapter.
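As a rough illustration of what finding groups of similar items can look like, here is a minimal sketch of k-means on a handful of two-dimensional points. It is not the implementation used later in this chapter; the function name `kmeans`, the sample `points`, and the choice to start from the first k points as centroids are all assumptions made for this example.

```python
import math

def kmeans(points, k, iterations=10):
    """Group 2-D points into k clusters with a plain k-means loop."""
    # Assumption for this sketch: the first k points act as starting centroids.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids

# Hypothetical unlabeled data: two loose groups, near (1, 1) and near (8, 8).
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (7.9, 8.2), (0.9, 1.1), (8.1, 7.9)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

No labels were supplied, yet the points near (1, 1) end up in one group and the points near (8, 8) in the other. The group numbers themselves carry no meaning; only the fact that similar points share a number does.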
In this chapter, we will cover the following topics:
The concepts of clustering and types of clustering, including k-means and bisecting k-means clustering
Advantages and use cases of clustering
Customer segmentation and...