The mixture model is a model of a larger distribution family called latent variable models, in which some of the variables are not observed at all. The reason is usually to simplify the model by grouping all the variables into subgroups with a different meaning. Another reason is also to introduce a hidden process into the model, the real reason for the data generation process. In other words, we assume that we have a set of models and something hidden will select one of these models, and then generate a data point from the selected model.
When the data naturally exhibits clusters, it seems reasonable to say that each cluster is a small model.
The whole problem is then to find to what extent a submodel will participate in the data generation process and what the parameters for each sub model are. This is usually solved using the EM algorithm.
There are many ways to combine small models in order to make a bigger or more generic model. The approach generally used in mixture modeling...