Learning pandas - Second Edition

By: Michael Heydt

Overview of this book

You will learn how to use pandas to perform data analysis in Python. You will start with an overview of data analysis and iteratively progress from modeling data, to accessing data from remote sources, performing numeric and statistical analysis, through indexing and performing aggregate analysis, and finally to visualizing statistical data and applying pandas to finance. With the knowledge you gain from this book, you will quickly learn pandas and how it can empower you in the exciting world of data manipulation, analysis and science.
Table of Contents (16 chapters)

Munging school grades

Now let's look at applying Categoricals to help us organize information based on categories instead of numbers. The problem we will examine is assigning a letter grade to a student based on their numeric grade.
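
As a minimal sketch of the starting point, we can build a small data frame of students and their raw scores. The names and scores below are hypothetical values invented for illustration, and the column names `Name` and `Grade` are assumptions rather than anything fixed by pandas:

```python
import pandas as pd

# Hypothetical students and raw scores, invented for this illustration.
scores = pd.DataFrame({
    "Name": ["Ivana", "Norris", "Ruth", "Lane", "Skye", "Sol"],
    "Grade": [95, 58, 73, 88, 100, 64],
})

print(scores)
```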

This data frame represents the raw score for each of the students. Next, we break down numeric grades into letter codes. The following code defines the bins for each grade and the associated letter grade for each bin:
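
One way to express such a breakdown is a list of bin edges and a list of labels. The edges and the simplified five-letter scale below are assumptions chosen for illustration, not necessarily the scale used in the book. Note that there must be exactly one more edge than there are labels, because each label names the interval between two neighboring edges:

```python
# Assumed bin edges. With pd.cut's default (right=True), each bin is the
# half-open interval (low, high], so a score of 59 falls into the first bin.
score_bins = [0, 59, 69, 79, 89, 100]

# Assumed, simplified letter scale, listed from lowest to highest.
letter_grades = ["F", "D", "C", "B", "A"]

# Show which interval each letter covers.
for low, high, letter in zip(score_bins[:-1], score_bins[1:], letter_grades):
    print(f"({low}, {high}] -> {letter}")
```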

Using these values, we can perform a cut that assigns the letter grade.
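
A sketch of the cut, using `pd.cut()` with its `labels` parameter, might look like the following. The data, bin edges, and the new `Letter` column name are hypothetical and repeated here only so that the example runs on its own:

```python
import pandas as pd

# Hypothetical data and bins, repeated so the example is self-contained.
scores = pd.DataFrame({
    "Name": ["Ivana", "Norris", "Ruth", "Lane", "Skye", "Sol"],
    "Grade": [95, 58, 73, 88, 100, 64],
})
score_bins = [0, 59, 69, 79, 89, 100]
letter_grades = ["F", "D", "C", "B", "A"]

# pd.cut assigns each score to the bin it falls in and labels it with the
# matching letter. The result is a Series with a categorical dtype.
scores["Letter"] = pd.cut(scores["Grade"], bins=score_bins, labels=letter_grades)

print(scores)
```

One thing to keep in mind: because the leftmost edge is excluded by default, a score of exactly 0 would not fall into any bin and would come back as NaN unless `include_lowest=True` is passed.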

Examining the underlying Categorical shows the codes that were created and how the letter grades are ordered by value:
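
The `.cat` accessor gives access to the pieces of the Categorical. In this sketch, which again uses hypothetical data and bins, `categories` lists the labels in their logical order, `codes` holds the integer position of each student's label within that list, and `ordered` reports whether comparisons between labels are meaningful:

```python
import pandas as pd

# Hypothetical raw scores and bins, repeated so the example is self-contained.
grades = pd.Series([95, 58, 73, 88, 100, 64])
score_bins = [0, 59, 69, 79, 89, 100]
letter_grades = ["F", "D", "C", "B", "A"]

letter_cats = pd.cut(grades, bins=score_bins, labels=letter_grades)

# The labels, in the order they were passed to pd.cut (lowest to highest).
print(letter_cats.cat.categories)

# One integer code per student: the index of that student's label.
print(letter_cats.cat.codes)

# pd.cut produces an ordered Categorical by default, so F < D < C < B < A.
print(letter_cats.cat.ordered)
```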

To determine how many students received each grade, we can call .value_counts() on the letter grade column:
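
A sketch of the count, on hypothetical data, is shown below. It calls `value_counts()` directly on the Series, since that is the form known to work across pandas versions. For categorical data the result includes every category, and `sort_index()` then arranges the counts in category order rather than by frequency:

```python
import pandas as pd

# Hypothetical raw scores and bins, repeated so the example is self-contained.
grades = pd.Series([95, 58, 73, 88, 100, 64])
score_bins = [0, 59, 69, 79, 89, 100]
letter_grades = ["F", "D", "C", "B", "A"]

letters = pd.cut(grades, bins=score_bins, labels=letter_grades)

# Count the students per letter grade, then order the rows by category
# (F up to A) instead of by count.
counts = letters.value_counts().sort_index()

print(counts)
```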

Since the letter grade Categorical has a logical ordering for the letter grades, we can use it to order the students from highest to lowest letter...
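
As a sketch of this kind of ordering, again on hypothetical data with an assumed `Letter` column, `sort_values()` respects the category order of an ordered Categorical rather than sorting the labels alphabetically:

```python
import pandas as pd

# Hypothetical data and bins, repeated so the example is self-contained.
scores = pd.DataFrame({
    "Name": ["Ivana", "Norris", "Ruth", "Lane", "Skye", "Sol"],
    "Grade": [95, 58, 73, 88, 100, 64],
})
score_bins = [0, 59, 69, 79, 89, 100]
letter_grades = ["F", "D", "C", "B", "A"]
scores["Letter"] = pd.cut(scores["Grade"], bins=score_bins, labels=letter_grades)

# Sort by the categorical column, highest letter grade first.
ranked = scores.sort_values(by="Letter", ascending=False)

print(ranked)
```

If the `Letter` column held plain strings instead, a descending sort would be alphabetical and would place F at the top, which is why the ordered Categorical matters here.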