Before we give a formal definition of the EM algorithm, let's discuss some basics about likelihood and maximum likelihood. This is necessary in order to understand the definitions. After satisfying these prerequisites, we will give a formal definition of the EM algorithm according to the problems of missing values imputation. This is where the EM algorithm originates (Dempster, Laird, and Rubin 1977). We will see that the complex formal definition is easy to catch with an introductory example.
Before we start with the EM algorithm we need to remind ourselves of some of the basics about likelihood and maximum likelihood.
For this we start by flipping a coin. If we have two possible outcomes, event A ('head') and A' ('tail'), our parameter of interest is , the probability of tossing the heads side of the coin. The probability of tossing the tails side of the coin is then .
Let's assume that we tossed the coin 10 times and the results were head, tail, head...