8.2 BAYES THEOREM
Consider a data set made up of two predictors, X = {X1, X2}, and a response variable Y that takes one of three possible class values: y1, y2, and y3. Our objective is to identify which of y1, y2, and y3 is the most likely for a particular combination of predictor variable values. Let us call this particular combination X* = {X1 = x1, X2 = x2}.
We can use Bayes Theorem to identify which class is the most likely for a particular combination of predictor variable values by:
- calculating the posterior probability for each of y1, y2, and y3, given the combination of predictor values x1 and x2, and
- selecting the value of y with the highest posterior probability.
Let y* be one of the three potential values of Y. Bayes Theorem tells us:
$$p(Y = y^* \mid X^*) = \frac{p(X^* \mid Y = y^*)\, p(Y = y^*)}{p(X^*)} \tag{8.1}$$
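To make Equation 8.1 concrete, the following minimal Python sketch computes the posterior for each of the three classes and selects the largest. The prior and likelihood values are hypothetical numbers chosen only for illustration, and the names `priors`, `likelihoods`, and `posteriors` are not from the text. The sketch also assumes that the denominator p(X*) is obtained by summing p(X* | Y = y) p(Y = y) over the three classes (the law of total probability).

```python
# Hypothetical values for illustration only; they do not come from the text.
priors = {"y1": 0.5, "y2": 0.3, "y3": 0.2}          # p(Y = y)
likelihoods = {"y1": 0.10, "y2": 0.40, "y3": 0.25}  # p(X* | Y = y)


def posteriors(priors, likelihoods):
    """Apply Bayes Theorem to every class value and return p(Y = y | X*)."""
    # Assumption: p(X*) is the sum over classes of p(X* | Y = y) * p(Y = y).
    evidence = sum(likelihoods[y] * priors[y] for y in priors)
    return {y: likelihoods[y] * priors[y] / evidence for y in priors}


post = posteriors(priors, likelihoods)

# Select the class value with the highest posterior probability.
best = max(post, key=post.get)

for y, p in post.items():
    print(f"p(Y = {y} | X*) = {p:.4f}")
print("Most likely class:", best)
```

Because p(X*) is the same for every class, it only rescales the posteriors so that they sum to one; the ranking of the classes is already determined by the numerators.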
Now, p(Y = y*) represents the knowledge we have about how likely the class value y* is, before we even begin...