The k-means algorithm is a flat clustering algorithm. It works as follows:
Set the value of K.
Choose K data points from the dataset to serve as the initial centers of the individual clusters.
Calculate the distance from each data point to each of the chosen centers, and assign each point to the cluster whose center is closest to it.
Once all of the points are in one of the K clusters, calculate the center point of each cluster. This center point does not have to be an existing data point in the dataset; it is just an average.
Repeat the two preceding steps: assign each data point to the cluster whose center is closest to it, then recalculate the center of each cluster. Repetition continues until the center points no longer move.
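The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names (`assign_points`, `update_centers`, `k_means_basic`), the sample points, the use of Euclidean distance, and the choice of the first K points as initial centers are all assumptions made for the example. Keeping the old center when a cluster ends up empty is also just one possible convention.

```python
import math

def assign_points(points, centers):
    """Group every point under the index of its closest center."""
    clusters = [[] for _ in centers]
    for point in points:
        # Euclidean distance is assumed here; ties go to the lowest index.
        nearest = min(range(len(centers)),
                      key=lambda i: math.dist(point, centers[i]))
        clusters[nearest].append(point)
    return clusters

def update_centers(clusters, old_centers):
    """Average each cluster coordinate-wise; keep the old center if it is empty."""
    new_centers = []
    for cluster, old in zip(clusters, old_centers):
        if cluster:
            # The mean does not have to be an existing data point.
            new_centers.append(tuple(sum(axis) / len(cluster)
                                     for axis in zip(*cluster)))
        else:
            new_centers.append(old)
    return new_centers

def k_means_basic(points, k):
    """Repeat assignment and update until the centers stop moving."""
    centers = list(points[:k])  # assumption: first k points as initial centers
    while True:
        clusters = assign_points(points, centers)
        new_centers = update_centers(clusters, centers)
        if new_centers == centers:
            return centers, clusters
        centers = new_centers

# Hypothetical sample data: two visibly separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, clusters = k_means_basic(points, k=2)
print(centers)
print(clusters)
```

On this sample, the two initial centers both start in the lower-left group, yet after a few repetitions one of them migrates to the upper-right group. The loop here stops only when the centers are exactly unchanged, which is the idealized stopping rule described in the steps.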
To make sure that the k-means algorithm terminates, we need the following:
A tolerance value: we exit once the centroids move less than this value between repetitions
A maximum number of repetitions of shifting the center points
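Both termination guards can be added to the loop as follows. This is a sketch under stated assumptions: the function name `k_means_with_limits`, the parameter names `tolerance` and `max_iterations`, their default values, and the sample data are hypothetical and chosen only for illustration.

```python
import math

def k_means_with_limits(points, centers, tolerance=1e-4, max_iterations=100):
    """Run k-means until the largest center shift drops below `tolerance`
    or `max_iterations` repetitions have run (assumes max_iterations >= 1).
    Returns the final centers and the number of repetitions used."""
    for iteration in range(1, max_iterations + 1):
        # Assignment step: each point joins its closest center's cluster.
        clusters = [[] for _ in centers]
        for point in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(point, centers[i]))
            clusters[nearest].append(point)

        # Update step: coordinate-wise mean; an empty cluster keeps its center.
        new_centers = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else old
            for cluster, old in zip(clusters, centers)
        ]

        # Guard 1: stop when no center moved by at least the tolerance.
        shift = max(math.dist(old, new)
                    for old, new in zip(centers, new_centers))
        centers = new_centers
        if shift < tolerance:
            break
        # Guard 2: the range above caps the number of repetitions.
    return centers, iteration

# Hypothetical sample data and starting centers.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
start = [(1.0, 1.0), (1.5, 2.0)]

final_centers, used = k_means_with_limits(points, start)
capped_centers, capped_used = k_means_with_limits(points, start, max_iterations=1)
print(final_centers, used)
print(capped_centers, capped_used)
```

The second call shows the iteration cap cutting the run short: the centers it returns are whatever the algorithm had reached when the cap was hit, not necessarily a settled result.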
Due to the nature of the k-means algorithm...