Generating frequencies for categorical variables
Many years ago, a very seasoned researcher said to me, "90% of what we're going to find, we'll see in the frequency distributions." That message has stayed with me. The more one-way and two-way frequency distributions (crosstabs) I do on a DataFrame, the better I understand it. We will do one-way distributions in this recipe, and crosstabs in subsequent recipes.
We continue our work with the NLS. We will also be doing a fair bit of column selection using filter methods. It is not necessary to review the recipe in this chapter on column selection, but it might be helpful.
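As a quick refresher on that kind of column selection, here is a minimal sketch using the pandas `DataFrame.filter` method. The small DataFrame and its column names are made up for illustration; they are assumptions, not the actual NLS columns.

```python
import pandas as pd

# Hypothetical stand-in data; column names are illustrative only.
df = pd.DataFrame({
    "gender": ["Female", "Male", "Female"],
    "weeksworked00": [46, 52, 0],
    "weeksworked01": [52, 52, 12],
    "colenroct00": ["1. Not enrolled", "3. 4-year college", "1. Not enrolled"],
})

# Select columns whose names contain a substring.
weeks = df.filter(like="weeksworked")

# Select columns whose names match a regular expression.
enroll = df.filter(regex="^colenr")

# Select columns by explicit name (names not present are silently skipped).
named = df.filter(items=["gender", "weeksworked00"])

print(weeks.columns.tolist())
print(enroll.columns.tolist())
print(named.columns.tolist())
```

`filter` selects on labels only, never on values, which makes it convenient for pulling out groups of similarly named columns before running frequencies on them.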
How to do it…
We use pandas tools to generate frequencies, particularly the very handy
- Load the pandas library and the NLS data. Also, convert the columns with object data type to category data type:
>>> import pandas as pd
>>> nls97 = pd.read_csv("data/nls97.csv...
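The conversion step and a one-way frequency can be sketched on a small made-up DataFrame. This is an illustration under stated assumptions, not the recipe's own code: the data and column names are hypothetical, and `Series.value_counts` is used here as one pandas tool that produces frequencies.

```python
import pandas as pd

# Hypothetical stand-in data; not the actual NLS file.
# dtype="object" is set explicitly so the example behaves the same
# across pandas versions with different default string dtypes.
df = pd.DataFrame({
    "maritalstatus": pd.Series(
        ["Married", "Never-married", "Married", None, "Divorced"],
        dtype="object",
    ),
    "gender": pd.Series(
        ["Female", "Male", "Female", "Male", "Female"],
        dtype="object",
    ),
    "birthyear": [1980, 1982, 1983, 1981, 1984],
})

# Find the object columns and convert each one to the category dtype.
obj_cols = df.select_dtypes(include=["object"]).columns.tolist()
df = df.astype({col: "category" for col in obj_cols})

# One-way frequency distribution; dropna=False keeps missing values visible.
counts = df["maritalstatus"].value_counts(dropna=False)

# The same distribution as proportions rather than counts.
shares = df["gender"].value_counts(normalize=True)

print(df.dtypes)
print(counts)
print(shares)
```

Passing a dictionary to `astype` returns a new DataFrame with only the listed columns converted, which avoids relying on in-place assignment preserving the new dtype.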