# K-means

When we discussed the Gaussian mixture algorithm, we defined it as soft K-means. The reason is that each cluster was represented by three elements: mean, variance, and weight. Each sample always belongs to all clusters with a probability provided by the Gaussian distributions. This approach can be very useful when it's possible to manage the probabilities as weights, but in many other situations, it's preferable to determine a single cluster per sample.

Such an approach is called hard clustering and K-means can be considered the hard version of a Gaussian mixture. In fact, when all variances , the distributions degenerate to Dirac deltas , which represent perfect spikes centered at a specific point (even if they are not real functions but distributions). In this scenario, the only possibility to determine the most appropriate cluster is to find the shortest distance between a sample point and all the centers (from now on, we are going to call them centroids...