Understanding the various levels of data is necessary to perform feature engineering. When it comes time to build new features, or fix old ones, we need a way to decide how to work with each column.
Here is a quick table to summarize what is and isn't possible at every level:
Level of Measurement | Properties | Examples | Descriptive statistics | Graphs |
Nominal | Discrete, orderless | Binary responses (True or False), names of people, colors of paint | Frequencies/percentages, mode | Bar, pie |
Ordinal | Ordered categories, comparisons | Likert scales, grades on an exam | Frequencies, mode, median, percentiles | Bar, pie, stem and leaf |
Interval | Differences between ordered values have meaning | Degrees Celsius or Fahrenheit, some Likert scales (must be specific) | Frequencies, mode, median, mean, standard deviation | Bar, pie, stem and leaf, box plot, histogram |
Ratio | Continuous, true 0 allows ratio statements (for example, $100 is twice as much as $50) | Money, weight | Mean, standard deviation | Histogram, box plot |
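To make the table concrete, here is a minimal sketch of how it could drive which summary statistics we compute for a column. The names `STATS_BY_LEVEL` and `describe`, and the sample values, are hypothetical and exist only for this illustration. The lookup simply mirrors the descriptive statistics column above (frequencies and percentiles are left out for brevity), and the ordinal data is assumed to be already encoded as integer ranks.

```python
from statistics import mean, median, median_low, mode, stdev

# Hypothetical lookup that mirrors the "Descriptive statistics" column above.
# median_low is used for ordinal data so the result is always an observed
# category rather than an average of two neighboring ranks.
STATS_BY_LEVEL = {
    "nominal": {"mode": mode},
    "ordinal": {"mode": mode, "median": median_low},
    "interval": {"mode": mode, "median": median, "mean": mean, "stdev": stdev},
    "ratio": {"mean": mean, "stdev": stdev},
}


def describe(values, level):
    """Compute only the statistics listed for the given level of measurement."""
    if level not in STATS_BY_LEVEL:
        raise ValueError(f"unknown level of measurement: {level!r}")
    return {name: fn(values) for name, fn in STATS_BY_LEVEL[level].items()}


# Made-up sample columns, one per level.
colors = ["red", "blue", "red", "green"]   # nominal
likert = [1, 2, 2, 4, 5]                   # ordinal, encoded as ranks
temps_c = [20, 22, 22, 25, 31]             # interval
prices = [50.0, 100.0, 150.0]              # ratio

print(describe(colors, "nominal"))
print(describe(likert, "ordinal"))
print(describe(temps_c, "interval"))
print(describe(prices, "ratio"))
```

Note that `describe(likert, "ordinal")` never returns a mean: the gaps between ranks carry no meaning at the ordinal level, so averaging them would be misleading.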
The following is a table showing the types of statistics allowed...