Book Image

Learning pandas - Second Edition

By : Michael Heydt
Book Image

Learning pandas - Second Edition

By: Michael Heydt

Overview of this book

You will learn how to use pandas to perform data analysis in Python. You will start with an overview of data analysis and iteratively progress from modeling data, to accessing data from remote sources, performing numeric and statistical analysis, through indexing and performing aggregate analysis, and finally to visualizing statistical data and applying pandas to finance. With the knowledge you gain from this book, you will quickly learn pandas and how it can empower you in the exciting world of data manipulation, analysis and science.
Table of Contents (16 chapters)

pandas and Data Analysis

Welcome to Learning pandas! In this book, we will go on a journey that will see us learning pandas, an open source data analysis library for the Python programming language. The pandas library provides high-performance and easy-to-use data structures and analysis tools built with Python. pandas brings to Python many good things from the statistical programming language R, specifically data frame objects and R packages such as plyr and reshape2, and places them in a single library that you can use from within Python.

In this first chapter, we will take the time to understand pandas and how it fits into the bigger picture of data analysis. This will give the reader who is interested in pandas a feeling for its place in the bigger picture of data analysis instead of having a complete focus on the details of using pandas. The goal is that while learning pandas you also learn why those features exist in support of performing data analysis tasks.

So, let's jump in. In this chapter, we will cover:

  • What pandas is, why it was created, and what it gives you
  • How pandas relates to data analysis and data science
  • The processes involved in data analysis and how it is supported by pandas
  • General concepts of data and analytics
  • Basic concepts of data analysis and statistical analysis
  • Types of data and their applicability to pandas
  • Other libraries in the Python ecosystem that you will likely use with pandas