The three unsupervised learning techniques share the same limitation: high computational complexity.
K-means has a computational complexity of O(iKnm), where i is the number of iterations, K is the number of clusters, n is the number of observations, and m is the number of features. Here are some remedies for the poor performance of the K-means algorithm:
Reducing the average number of iterations by seeding the centroids with a technique such as initialization by ranking the variance of the initial clusters, as described at the beginning of this chapter
Using a parallel implementation of K-means and leveraging a large-scale framework such as Hadoop or Spark
Reducing the number of outliers and features by filtering out the noise with a smoothing algorithm such as a discrete Fourier transform or a Kalman filter
Decreasing the dimensions of the model by following a two-step process:
Execute a first pass with a smaller number of clusters K and...
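The cost factors i, K, n, and m in O(iKnm) can be read directly off a naive K-means loop. The following is a minimal NumPy sketch for illustration only; it uses plain random seeding rather than the variance-ranked seeding mentioned above, and all names are hypothetical:

```python
import numpy as np

def kmeans(X, K, max_iters=100, seed=0):
    """Naive K-means. Each iteration costs O(K*n*m) for the
    distance computation, so the total cost is O(i*K*n*m)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Random seeding: pick K distinct observations as initial centroids.
    centroids = X[rng.choice(n, size=K, replace=False)]
    for _ in range(max_iters):                      # i iterations
        # O(K*n*m): distance of every observation to every centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(K)
        ])
        if np.allclose(new_centroids, centroids):   # converged
            break
        centroids = new_centroids
    return centroids, labels

# Usage: two well-separated synthetic blobs of 50 observations each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (50, 2)),
               rng.normal(5.0, 0.1, (50, 2))])
centroids, labels = kmeans(X, K=2)
```

Reducing any one of the four factors (fewer iterations through better seeding, fewer observations or features through filtering, or distributing the O(Knm) inner loop across workers) attacks the overall cost, which is the rationale behind the remedies listed above.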