Learning pandas - Second Edition

By: Michael Heydt

Overview of this book

You will learn how to use pandas to perform data analysis in Python. You will start with an overview of data analysis and iteratively progress from modeling data, to accessing data from remote sources, performing numeric and statistical analysis, through indexing and performing aggregate analysis, and finally to visualizing statistical data and applying pandas to finance. With the knowledge you gain from this book, you will quickly learn pandas and how it can empower you in the exciting world of data manipulation, analysis and science.
Table of Contents (16 chapters)

Categorical Data

A categorical variable is a type of variable in statistics that takes a limited, and often fixed, set of values. This is in contrast to continuous variables, which can take any of an infinite number of values. Common examples of categorical variables include gender (where there are two values, male and female) and blood type (which is one of a small set of values, such as A, B, and O).

pandas represents categorical variables using a type of pandas object known as a Categorical. These objects are designed to efficiently represent data that is grouped into a set of buckets, with each category identified by an integer code. The use of these underlying codes gives pandas the ability to efficiently represent sets of categories and to perform ordering and comparisons of data across multiple categorical variables...
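
As a minimal sketch of the idea described above, the following example builds two Categorical objects and inspects their categories and integer codes. The sample values and the variable names blood and sizes are made up for illustration and do not come from the book. The second object assumes an explicit category order so that comparisons are meaningful.

```python
import pandas as pd

# Hypothetical sample data: blood types for six people.
blood = pd.Categorical(["A", "O", "B", "O", "A", "AB"])

# The distinct categories; when inferred from the data they are sorted.
print(blood.categories)
# One integer code per value, each an index into blood.categories.
print(blood.codes)

# Hypothetical ordered categories, declared from smallest to largest.
sizes = pd.Categorical(
    ["small", "large", "medium", "small"],
    categories=["small", "medium", "large"],
    ordered=True,
)

# Because the Categorical is ordered, element-wise comparisons
# and min/max follow the declared category order.
print(sizes < "large")
print(sizes.min(), sizes.max())
```

Storing each value as a small integer code rather than a repeated string is what allows pandas to keep this kind of data compact, and the declared order is what the comparison and min/max calls rely on.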