IPython Notebook Essentials

By: Luiz Felipe Martins

An example with a realistic dataset


In this section, we will work with a realistic dataset of moderate size: the World Development Indicators dataset, which the World Bank provides free of charge. It is large enough to be realistic, yet not so large or complex that it is hard to experiment with.

In any real application, we need to read data from some source, reformat it for our purposes, and save the reformatted data back to some storage system. pandas offers facilities for data retrieval and storage in multiple formats:

  • Comma-separated values (CSV) in text files

  • Excel

  • JSON

  • SQL

  • HTML

  • Stata

  • Clipboard data in text format

  • Python-pickled data

The list of formats supported by pandas keeps growing with each new update to the library. Please refer to http://pandas.pydata.org/pandas-docs/stable/io.html for a current list.
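
The read, reformat, and save cycle described above can be sketched in a few lines. This is only a minimal illustration, assuming pandas is installed; the toy data, the column names (country, year, value), and the file names subset.csv and subset.pkl are hypothetical and are not taken from the World Development Indicators dataset.

```python
import io
import os
import tempfile

import pandas as pd

# Hypothetical toy data standing in for a real source file.
raw_csv = "country,year,value\nA,2000,1.5\nB,2000,2.5\nC,2001,3.5\n"

# Retrieve: parse CSV text from an in-memory buffer.
df = pd.read_csv(io.StringIO(raw_csv))

# Reformat: keep a single year and index the rows by country.
df_2000 = df[df["year"] == 2000].set_index("country")

# Store: write the result in two formats inside a temporary directory.
with tempfile.TemporaryDirectory() as tmpdir:
    csv_path = os.path.join(tmpdir, "subset.csv")      # hypothetical file name
    pickle_path = os.path.join(tmpdir, "subset.pkl")   # hypothetical file name
    df_2000.to_csv(csv_path)
    df_2000.to_pickle(pickle_path)

    # Read both files back to confirm that the data survives the round trip.
    from_csv = pd.read_csv(csv_path, index_col="country")
    from_pickle = pd.read_pickle(pickle_path)

print(from_csv)
```

Most of the other formats in the list follow the same naming pattern, with a pd.read_* function for retrieval and a DataFrame.to_* method for storage, although some of them require additional packages to be installed.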

Covering all formats supported by pandas is not possible in a book of this scope. We will restrict examples to CSV files, a simple text format that is widely...