Learning pandas - Second Edition

By: Michael Heydt

Overview of this book

You will learn how to use pandas to perform data analysis in Python. You will start with an overview of data analysis and progress step by step from modeling data, through accessing data from remote sources, performing numeric and statistical analysis, indexing, and aggregate analysis, to visualizing statistical data and applying pandas to finance. With the knowledge you gain from this book, you will quickly learn pandas and how it can empower you in the exciting world of data manipulation, analysis, and science.
Table of Contents (16 chapters)

Relating the book to the process

The following table maps each step in the process to where you will learn about it in this book. Do not fret if steps that come earlier in the process appear in later chapters. The book walks you through the material in a logical progression for learning pandas, and you can refer back from the chapters to the relevant stage in the process.

Step in process

Place

Ideation

Ideation is the creative process in data science. You need to have the idea. The fact that you are reading this qualifies you, as you must be looking to analyze some data now or want to in the future.

Retrieval

Retrieval of data is primarily covered in Chapter 9, Accessing Data.

Preparation

Preparation of data is primarily covered in Chapter 10, Tidying Up your Data, but it is also a common thread running through most of the chapters.

Exploration

Exploration spans most of the book, from Chapter 3, Representing Univariate Data with the Series, through Chapter 15, Historical Stock Price Analysis. The chapters most focused on exploration are Chapter 14, Visualization, and Chapter 15, Historical Stock Price Analysis, in both of which we begin to see the results of data analysis.

Modeling

Modeling is the focus of Chapter 3, Representing Univariate Data with the Series, and Chapter 4, Representing Tabular and Multivariate Data with the DataFrame. It continues in Chapter 11, Combining, Relating, and Reshaping Data, through Chapter 13, Time-Series Modelling, and takes a specific focus on finance in Chapter 15, Historical Stock Price Analysis.

Presentation

Presentation is the primary purpose of Chapter 14, Visualization.

Reproduction

Reproduction flows throughout the book, as the examples are provided as Jupyter notebooks. By working in notebooks, you are by default using a tool for reproduction and have the ability to share notebooks in various ways.
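
To make the mapping above a little more concrete, here is a minimal sketch of how a few of the steps might look in pandas. It is not taken from the book's notebooks: the inline CSV, the column names (date, ticker, close), and the variable names are all hypothetical, chosen only for illustration. Presentation is left out because it would need a plotting library.

```python
import io

import pandas as pd

# Hypothetical data: a tiny inline CSV stands in for a remote source.
RAW_CSV = """date,ticker,close
2017-01-03,AAA,10.0
2017-01-04,AAA,
2017-01-05,AAA,10.6
2017-01-03,BBB,20.0
2017-01-04,BBB,20.4
2017-01-05,BBB,19.8
"""

# Retrieval: load the data into a DataFrame.
prices = pd.read_csv(io.StringIO(RAW_CSV), parse_dates=["date"])

# Preparation: drop the row whose closing price is missing.
tidy = prices.dropna(subset=["close"])

# Exploration: count and mean of the closing price per ticker.
summary = tidy.groupby("ticker")["close"].agg(["count", "mean"])

# Modeling: reshape so each ticker becomes a column indexed by date.
wide = tidy.pivot(index="date", columns="ticker", values="close")

print(summary)
print(wide)
```

Running the same cells in a Jupyter notebook gives the reproduction step for free, since the notebook records both the code and its output.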