Learning pandas - Second Edition

By: Michael Heydt

Overview of this book

You will learn how to use pandas to perform data analysis in Python. You will start with an overview of data analysis and iteratively progress from modeling data, to accessing data from remote sources, performing numeric and statistical analysis, through indexing and performing aggregate analysis, and finally to visualizing statistical data and applying pandas to finance. With the knowledge you gain from this book, you will quickly learn pandas and how it can empower you in the exciting world of data manipulation, analysis and science.
Table of Contents (16 chapters)

Summary

In this chapter, we took a tour of the how and why of pandas and of data manipulation, analysis, and science. We started with an overview of why pandas exists, what functionality it contains, and how it relates to the concepts of data manipulation, data analysis, and data science.

Then we covered a process for data analysis, which provides a framework for understanding why certain functions exist in pandas. The steps of this process are retrieving data, organizing and cleaning it, exploring it, building a formal model, presenting your findings, and sharing and reproducing the analysis.
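The first few steps of this process can be sketched in a few lines of pandas. This is only a minimal illustration, not code from the book: the inline CSV text, the column names `city` and `temp_c`, and the variable names are all hypothetical, standing in for data you would retrieve from a real file or remote source.

```python
import io

import pandas as pd

# Hypothetical raw data standing in for a retrieved file or remote source.
raw_csv = io.StringIO(
    "city,temp_c\n"
    "Oslo,4.0\n"
    "Oslo,\n"
    "Lima,19.5\n"
    "Lima,20.5\n"
)

# Retrieve: load the data into a DataFrame.
df = pd.read_csv(raw_csv)

# Organize and clean: drop rows where the measurement is missing.
clean = df.dropna(subset=["temp_c"])

# Explore: summary statistics for the numeric column.
summary = clean["temp_c"].describe()

# Aggregate: mean temperature per city.
mean_by_city = clean.groupby("city")["temp_c"].mean()

print(summary)
print(mean_by_city)
```

Modeling, presenting, and sharing the results are left out here because they depend heavily on the problem at hand.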

Next, we covered several concepts involved in data and statistical modeling. We introduced many common analysis techniques and concepts so that they will be familiar when we explore them in more detail in subsequent chapters.

pandas is also part of a larger Python ecosystem of libraries that are useful for data analysis and science. While this book focuses only on pandas, we introduced several of these other libraries so that you will recognize them when you come across them.

We are now ready to begin using pandas. In the next chapter, we will ease ourselves into pandas, starting with obtaining a Python and pandas environment and an overview of Jupyter notebooks, followed by a quick introduction to the pandas Series and DataFrame objects before delving into them in more depth in subsequent chapters.