The Expectation Maximization (EM) algorithm is a probabilistic, model-based clustering algorithm. It relies on a mixture model, in which the data is modeled as a mixture of simple models, and it estimates the parameters of those models by Maximum Likelihood Estimation (MLE).
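Mixture models assume that the data results from a combination of several simple probability distributions. Given K distributions, where the jth distribution has the parameter $\theta_j$ and the mixture weight $w_j$ (with the weights summing to 1), and $\Theta$ is the set of the $\theta_j$ of all distributions, the density of a data object $x$ is:
$$p(x \mid \Theta) = \sum_{j=1}^{K} w_j \, p_j(x \mid \theta_j)$$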
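As an illustration of the mixture idea, the following sketch evaluates the density of a mixture of univariate Gaussians as a weighted sum of component densities. The choice of Gaussian components, the function names `gaussian_pdf` and `mixture_pdf`, and the numeric values are assumptions made for this example. The text itself does not fix a particular distribution family.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, stds):
    """Weighted sum of the K component densities at x."""
    return sum(
        w * gaussian_pdf(x, m, s)
        for w, m, s in zip(weights, means, stds)
    )


# Hypothetical two-component mixture; the values are illustrative only.
weights = [0.3, 0.7]  # mixture weights, must sum to 1
means = [0.0, 5.0]
stds = [1.0, 2.0]

density_at_1 = mixture_pdf(1.0, weights, means, stds)
```

Because the weights sum to 1 and each component is a valid density, the mixture is itself a valid density.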
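The EM algorithm proceeds as follows. In the first step, an initial set of model parameters is selected. The second step is the expectation step, which calculates, for each data object $x_i$ and each distribution j (with weight $w_j$ and parameter $\theta_j$), the probability:
$$P(\text{distribution } j \mid x_i, \Theta) = \frac{w_j \, p_j(x_i \mid \theta_j)}{\sum_{l=1}^{K} w_l \, p_l(x_i \mid \theta_l)}$$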
The previous equation gives the probability that each data object belongs to each distribution. The third step is the maximization step: using the result of the expectation step, update the parameter estimates with the ones that...
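The three steps can be sketched for a mixture of univariate Gaussians. This is a minimal illustration under stated assumptions, not a reference implementation: the Gaussian component family, the quantile-based initialization, the fixed iteration count, and the names `em_gmm_1d` and `gaussian_pdf` are all choices made for this example, and the maximization step uses the standard weighted-average updates for Gaussian mixtures.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def em_gmm_1d(data, k, n_iter=50):
    """Fit a k-component 1D Gaussian mixture with a fixed number of EM passes."""
    n = len(data)
    ordered = sorted(data)

    # Step 1: initial parameters (assumed scheme: evenly spaced quantiles).
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_std = math.sqrt(sum((x - overall_mean) ** 2 for x in data) / n)
    stds = [overall_std] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # Step 2 (expectation): probability of each object under each component.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, s)
                     for w, m, s in zip(weights, means, stds)]
            total = sum(joint)
            resp.append([p / total for p in joint])

        # Step 3 (maximization): re-estimate parameters from those probabilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj == 0.0:
                continue  # component received no mass; keep old parameters
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            stds[j] = max(math.sqrt(var), 1e-6)  # guard against collapse

    return weights, means, stds


# Synthetic data: two well-separated clusters (illustrative values only).
rng = random.Random(42)
data = [rng.gauss(0.0, 1.0) for _ in range(200)]
data += [rng.gauss(10.0, 1.0) for _ in range(200)]

weights, means, stds = em_gmm_1d(data, k=2)
```

In practice the loop would usually stop when the log-likelihood no longer improves rather than after a fixed number of passes. A hard cluster assignment can be read off by giving each object to the component with the largest probability from the expectation step.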