Learning Predictive Analytics with Python
Creating dummy variables is a method of creating a separate variable for each category of a categorical variable. Although a categorical variable contains plenty of information and might show a causal relationship with the output variable, it can't be used in predictive models such as linear and logistic regression without some processing.
In our dataset, sex is a categorical variable with two categories, male and female. We can create two dummy variables out of it, as follows:
dummy_sex = pd.get_dummies(data['sex'], prefix='sex')
The result of this statement is as follows:

Fig. 2.17: Dummy variable for the sex variable in the Titanic dataset
This process is called dummifying the variable. It creates two new variables that take a value of either 1 or 0, depending on the sex of the passenger. If the sex was female, sex_female would be 1 and sex_male would be 0. If the sex was male, sex_male would be 1 and sex_female would be 0. In general, all but one dummy variable...
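To see the 1/0 behavior without loading the Titanic data, here is a minimal self-contained sketch. The four-row DataFrame is a hypothetical stand-in for the real dataset, and the dtype=int argument is an assumption added here so that the output is 1/0, since recent pandas versions return True/False by default.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic data; only the 'sex' column matters here.
data = pd.DataFrame({'sex': ['male', 'female', 'female', 'male']})

# One indicator column per category, named '<prefix>_<category>'.
# dtype=int is an assumption added for this sketch: it forces 1/0 output,
# because recent pandas versions return boolean columns by default.
dummy_sex = pd.get_dummies(data['sex'], prefix='sex', dtype=int)

print(dummy_sex)

# Each passenger belongs to exactly one category, so every row sums to 1.
row_totals = dummy_sex.sum(axis=1)
print(row_totals.tolist())
```

Because each row sums to 1, knowing the value of sex_female is enough to determine the value of sex_male, and vice versa.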