Mastering Python Scientific Computing

I/O operations


The pandas I/O API is a set of reader functions, each of which returns a pandas object, and the tools bundled with pandas make loading data straightforward. Data can be loaded into pandas data structures from records in various types of files and sources, such as comma-separated values (CSV), Excel, HDF, SQL, JSON, HTML, Google BigQuery, pickle, Stata format, and the clipboard. There is one reader function for each type of source, including read_csv, read_excel, read_hdf, read_sql, read_json, read_html, read_stata, read_clipboard, and read_pickle. After loading, the data is prepared for analysis, which involves deleting erroneous entries, normalization, grouping, transformation, and sorting.
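As a minimal sketch of the load-then-prepare workflow described above, the following example reads a small CSV with read_csv and then applies each of the preparation steps in turn. The sample data, the column names (name, city, score), and the choice of min-max scaling for the normalization step are assumptions made for illustration; they are not taken from the book's own program.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
csv_text = """name,city,score
alice,Paris,82
bob,London,
carol,Paris,91
dave,London,67
"""

# read_csv accepts a path or any file-like object and returns a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Deletion of erroneous entries: here, rows with a missing score.
clean = df.dropna(subset=["score"]).copy()

# Normalization: rescale score to the range [0, 1] (min-max scaling).
lo, hi = clean["score"].min(), clean["score"].max()
clean["score_norm"] = (clean["score"] - lo) / (hi - lo)

# Transformation: capitalize the names.
clean["name"] = clean["name"].str.title()

# Grouping: mean score per city.
mean_by_city = clean.groupby("city")["score"].mean()

# Sorting: highest score first.
ranked = clean.sort_values("score", ascending=False).reset_index(drop=True)

print(ranked)
print(mean_by_city)
```

The other reader functions follow the same pattern: each takes a source and returns a pandas object, so the preparation steps would not need to change if the data came from, say, read_json or read_excel instead.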

Working on CSV files

The next program demonstrates working with CSV files and performing various operations on them. This program uses the Book-Crossing dataset in CSV format, downloaded from http://www2.informatik.uni-freiburg.de/~cziegler/BX/. It contains three CSV files (BX-Books.csv, BX-Users.csv...