Cluster analysis is the process of grouping objects together in a way that objects in one group are more similar than objects in other groups.
An example would be identifying and grouping clients with similar booking activities on a travel portal, as shown in the following figure.
In the preceding example, each group is called a cluster, and each member (data point) of the cluster behaves in a manner similar to its group members.
Cluster analysis is an unsupervised learning method. In supervised methods, such as regression analysis, we have input variables and response variables. We fit a statistical model to the input variables to predict the response variable. Whereas in unsupervised learning methods, however, we do not have any response variable to predict; we only have input variables. Instead of fitting a model to the input variables to predict the response variable, we just try to find patterns within the dataset. There are three popular clustering algorithms...