Book Image

Pandas 1.x Cookbook - Second Edition

By : Matt Harrison, Theodore Petrou
Book Image

Pandas 1.x Cookbook - Second Edition

By: Matt Harrison, Theodore Petrou

Overview of this book

The pandas library is massive, and it's common for frequent users to be unaware of many of its more impressive features. The official pandas documentation, while thorough, does not contain many useful examples of how to piece together multiple commands as one would do during an actual analysis. This book guides you, as if you were looking over the shoulder of an expert, through situations that you are highly likely to encounter. This new updated and revised edition provides you with unique, idiomatic, and fun recipes for both fundamental and advanced data manipulation tasks with pandas. Some recipes focus on achieving a deeper understanding of basic principles, or comparing and contrasting two similar operations. Other recipes will dive deep into a particular dataset, uncovering new and unexpected insights along the way. Many advanced recipes combine several different features across the pandas library to generate results.
Table of Contents (17 chapters)
15
Other Books You May Enjoy
16
Index

Data dictionaries

A crucial part of data analysis involves creating and maintaining a data dictionary. A data dictionary is a table of metadata and notes on each column of data. One of the primary purposes of a data dictionary is to explain the meaning of the column names. The college dataset uses a lot of abbreviations that are likely to be unfamiliar to an analyst who is inspecting it for the first time.

A data dictionary for the college dataset is provided in the following college_data_dictionary.csv file:

>>> pd.read_csv("data/college_data_dictionary.csv")
    column_name  description
0        INSTNM  Institut...
1          CITY  City Loc...
2        STABBR  State Ab...
3          HBCU  Historic...
4       MENONLY  0/1 Men ...
..          ...          ...
22      PCTPELL  Percent ...
23     PCTFLOAN  Percent ...
24      UG25ABV  Percent ...
25  MD_EARN_...  Median E...
26  GRAD_DEB...  Median d...

As you can see, it is immensely helpful in deciphering...