At first glance, the features in the preceding dataset are categorical: for example, male or female, one of four age groups, one of the predefined site categories, and whether or not the user is interested in sports. Data of this type differs from the numerical feature data that we have worked with until now.
Categorical (also called qualitative) features represent characteristics, distinct groups, and a countable number of options. Categorical features may or may not have a logical order. For example, household income, from low through medium to high, is an ordinal feature, while the category of an ad is not ordinal. Numerical (also called quantitative) features, on the other hand, have mathematical meaning as measurements and are therefore ordered. For instance, term frequency and the tf-idf variant are, respectively, discrete and continuous numerical...
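To make the distinction between ordinal and non-ordinal categorical features concrete, here is a minimal sketch in plain Python. The category values and the helper names `encode_ordinal` and `encode_one_hot` are hypothetical and chosen only for illustration. The two encodings shown (a rank for the ordered feature, a binary indicator vector for the unordered one) are one common way to turn such features into numbers, not necessarily the treatment used elsewhere in this text.

```python
# Hypothetical category values, chosen only for illustration.
INCOME_LEVELS = ["low", "medium", "high"]      # ordinal: the order carries meaning
AD_CATEGORIES = ["auto", "fashion", "sports"]  # not ordinal: no inherent order


def encode_ordinal(value, ordered_levels):
    """Map an ordinal value to its rank (0 is the lowest level)."""
    return ordered_levels.index(value)


def encode_one_hot(value, categories):
    """Map an unordered categorical value to a binary indicator vector."""
    return [1 if value == category else 0 for category in categories]


income_codes = [encode_ordinal(v, INCOME_LEVELS) for v in ["high", "low", "medium"]]
ad_vectors = [encode_one_hot(v, AD_CATEGORIES) for v in ["sports", "auto"]]

print(income_codes)  # [2, 0, 1]
print(ad_vectors)    # [[0, 0, 1], [1, 0, 0]]
```

The rank encoding preserves the comparison low < medium < high, which is meaningful for income. Applying the same ranks to ad categories would suggest an ordering that does not exist, which is why the indicator vector is used there instead.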