Learning pandas - Second Edition

By: Michael Heydt

Overview of this book

You will learn how to use pandas to perform data analysis in Python. You will start with an overview of data analysis and progress step by step from modeling data to accessing data from remote sources, performing numeric and statistical analysis, indexing and performing aggregate analysis, and finally visualizing statistical data and applying pandas to finance. With the knowledge you gain from this book, you will quickly learn pandas and see how it can empower you in the exciting world of data manipulation, analysis, and science.
Table of Contents (16 chapters)

Accessing Data

In almost any real-world data analysis, you need to load data from outside your program. Since pandas is built on Python, you can use any means available in Python to retrieve data. This makes it possible to access data from an almost unlimited set of sources, including but not limited to files, Excel spreadsheets, websites and services, databases, and cloud services.

However, when using standard Python functions to load data, you need to convert Python objects into pandas Series or DataFrame objects. This increases the complexity of your code. To help with managing this complexity, pandas offers a number of facilities to load data from various sources directly into pandas objects. We will examine many of these in this chapter.
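
To make the contrast concrete, here is a minimal sketch of both approaches. The sample data, column names, and variable names are hypothetical and inlined so that the example runs without an external file; the same idea should apply to a file path passed to `pd.read_csv`.

```python
import csv
import io

import pandas as pd

# Hypothetical sample data, inlined so the sketch needs no external file.
CSV_TEXT = "date,open,close\n2016-01-04,10.0,10.5\n2016-01-05,10.5,10.2\n"

# Approach 1: load with the standard-library csv module, then convert.
# csv.DictReader yields one dict of strings per row.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
manual_df = pd.DataFrame(rows)
# Every value is still a string, so numeric columns are converted by hand.
manual_df[["open", "close"]] = manual_df[["open", "close"]].astype(float)

# Approach 2: let pandas parse the text directly into a DataFrame.
# read_csv infers numeric column types on its own.
direct_df = pd.read_csv(io.StringIO(CSV_TEXT))

print(manual_df.dtypes)
print(direct_df.dtypes)
```

The first approach needs an explicit conversion step for each numeric column, which is the kind of extra code the pandas readers are meant to remove.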

Specifically, in this chapter, we will cover:

  • Reading a CSV file into a DataFrame
  • Specifying the index column when reading a CSV file
  • Data...