Learning pandas - Second Edition

By: Michael Heydt

Overview of this book

You will learn how to use pandas to perform data analysis in Python. You will start with an overview of data analysis and iteratively progress from modeling data, to accessing data from remote sources, performing numeric and statistical analysis, through indexing and performing aggregate analysis, and finally to visualizing statistical data and applying pandas to finance. With the knowledge you gain from this book, you will quickly learn pandas and how it can empower you in the exciting world of data manipulation, analysis and science.
Table of Contents (16 chapters)

Data Aggregation

Data aggregation is the process of grouping data by meaningful categories of the information. Analysis is then performed on each group to report one or more summary statistics for it. Summarization here is a general term: it can literally be a summation (such as the total number of units sold) or a statistical calculation such as a mean or standard deviation.
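As a minimal sketch of this idea, the snippet below groups a small table by category and reports a sum, a mean, and a standard deviation for each group. The `sales` data and its `region` and `units` columns are hypothetical, invented purely for illustration.

```python
import pandas as pd

# Hypothetical sales data; the column names and values are illustrative only.
sales = pd.DataFrame({
    "region": ["east", "east", "west", "west", "west"],
    "units": [10, 20, 5, 15, 10],
})

# Group the rows by category, then compute several summary statistics
# for each group in a single pass.
summary = sales.groupby("region")["units"].agg(["sum", "mean", "std"])
print(summary)
```

The result has one row per region and one column per statistic, which is the "summary statistics for each group" shape described above.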

This chapter examines the facilities pandas provides for data aggregation. These include a powerful split-apply-combine pattern for grouping data, performing group-level transformations and analyses, and reporting the results from every group in a summary pandas object. Within this framework, we will examine several techniques for grouping data, applying functions at the group level, and filtering data in or out of the analysis.
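The split-apply-combine pattern can be sketched roughly as follows. This is only an illustration under assumed data, not necessarily the examples the chapter itself uses: the `df` table and its `sensor` and `reading` columns are hypothetical.

```python
import pandas as pd

# Hypothetical measurements; names and values are illustrative only.
df = pd.DataFrame({
    "sensor": ["a", "a", "b", "b", "c"],
    "reading": [1.0, 3.0, 10.0, 20.0, 7.0],
})

# Split: partition the rows by the values in the "sensor" column.
groups = df.groupby("sensor")

# Apply + combine (aggregation): one summary value per group.
means = groups["reading"].mean()

# Apply + combine (transformation): a result with the same length as the
# input, here each reading minus the mean of its own group.
centered = groups["reading"].transform(lambda s: s - s.mean())

# Filtering: keep only the rows that belong to groups with at least two rows.
kept = groups.filter(lambda g: len(g) >= 2)

print(means)
print(centered)
print(kept)
```

Aggregation collapses each group to a single value, transformation preserves the original row count, and filtering drops whole groups (here sensor `c`, which has only one reading).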

Specifically, in this...