Before moving on, there are two additional linear model topics that we need to discuss. The first is the inclusion of a qualitative feature, and the second is an interaction term; both are explained in the following sections.
A qualitative feature, also referred to as a factor, can take on two or more levels, such as Male/Female or Bad/Neutral/Good. If we have a feature with two levels, say gender, then we can create what is known as an indicator or dummy feature, arbitrarily assigning one level as 0 and the other as 1. If we create a model with just the indicator, our linear model still follows the same formulation as before, that is, Y = B0 + B1x + e. If we code the feature as male being equal to 0 and female equal to 1, then the expectation for male is just the intercept B0 (since x = 0), while for female it is B0 + B1 (since x = 1). In the situation where the feature has more than two levels, say n levels, you can create n-1 indicators; so, for three...