Book Image

Learning Pandas

By : Michael Heydt
Book Image

Learning Pandas

By: Michael Heydt

Overview of this book

Table of Contents (19 chapters)
Learning pandas
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Chapter 1. A Tour of pandas

In this chapter, we will take a look at pandas, which is an open source Python-based data analysis library. It provides high-performance and easy-to-use data structures and data analysis tools built with the Python programming language. The pandas library brings many of the good things from R, specifically the DataFrame objects and R packages such as plyr and reshape2, and places them in a single library that you can use in your Python applications.

The development of pandas was begun in 2008 by Wes McKinney when he worked at AQR Capital Management. It was opened sourced in 2009 and is currently supported and actively developed by various organizations and contributors. It was initially designed with finance in mind, specifically with its ability around time series data manipulation, but emphasizes the data manipulation part of the equation leaving statistical, financial, and other types of analyses to other Python libraries.

In this chapter, we will take a brief tour of pandas and some of the associated tools such as IPython notebooks. You will be introduced to a variety of concepts in pandas for data organization and manipulation in an effort to form both a base understanding and a frame of reference for deeper coverage in later sections of this book. By the end of this chapter, you will have a good understanding of the fundamentals of pandas and even be able to perform basic data manipulations. Also, you will be ready to continue with later portions of this book for more detailed understanding.

This chapter will introduce you to:

  • pandas and why it is important

  • IPython and IPython Notebooks

  • Referencing pandas in your application

  • The Series and DataFrame objects of pandas

  • How to load data from files and the Web

  • The simplicity of visualizing pandas data

Note

pandas is always lowercase by convention in pandas documentation, and this will be a convention followed by this book.