Python Data Cleaning Cookbook
During the first few days of working with a DataFrame, we try to get a good sense of the distribution of continuous variables and the counts for categorical variables. We also often do counts by selected groups. Although pandas and NumPy have many built-in methods for these purposes (describe, mean, value_counts, crosstab, and so on), data analysts often have preferences for how they work with these tools. If, for example, an analyst finds that she usually needs to see more percentiles than those generated by describe by default, she can use her own function instead. We will create user-defined functions for displaying summary statistics and frequencies in this recipe.
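As a rough illustration of the idea, the sketch below defines two small helpers: one that reports more percentiles than describe shows by default, and one that reports counts alongside proportions. The function names (summarize_continuous, frequencies), the default quantile list, and the toy DataFrame are assumptions made for this example; they are not the functions stored in the basicdescriptives module.

```python
import pandas as pd


def summarize_continuous(df, quantiles=(0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95)):
    """Return count, mean, min, max, and the requested quantiles
    for each numeric column (hypothetical helper, one row per column)."""
    numeric = df.select_dtypes(include="number")
    # Basic statistics, transposed so each variable is a row.
    stats = numeric.agg(["count", "mean", "min", "max"]).T
    # Quantiles, transposed the same way and given readable labels.
    qs = numeric.quantile(list(quantiles)).T
    qs.columns = [f"q{int(round(q * 100)):02d}" for q in quantiles]
    return stats.join(qs)


def frequencies(series):
    """Return counts and proportions for a categorical series,
    keeping missing values as their own category (hypothetical helper)."""
    counts = series.value_counts(dropna=False)
    return pd.DataFrame({"count": counts, "pct": counts / counts.sum()})


# Toy data standing in for a real survey extract.
toy = pd.DataFrame(
    {
        "wage": [10.0, 20.0, 30.0, 40.0, 50.0],
        "sex": ["F", "M", "F", "F", None],
    }
)

print(summarize_continuous(toy))
print(frequencies(toy["sex"]))
```

For a one-off check, describe also accepts a percentiles argument, so a wrapper like this mainly pays off when the same layout is wanted repeatedly across many DataFrames.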
We will be working with the basicdescriptives module again in this recipe. All of the functions we will define are saved in that module. We continue to work with the NLS data.
We will use functions we create to generate...