Learning pandas - Second Edition

By: Michael Heydt

Overview of this book

You will learn how to use pandas to perform data analysis in Python. You will start with an overview of data analysis and iteratively progress from modeling data, to accessing data from remote sources, performing numeric and statistical analysis, through indexing and performing aggregate analysis, and finally to visualizing statistical data and applying pandas to finance. With the knowledge you gain from this book, you will quickly learn pandas and how it can empower you in the exciting world of data manipulation, analysis and science.

Correlation of stocks based on the daily percentage change of the closing price

Correlation is a measure of the strength of the linear association between two variables. A correlation coefficient of 1.0 means that every change in value in one set of data is matched by a proportionate change in the same direction in the other set of data, while a coefficient of -1.0 means that the changes are proportionate but in opposite directions. A 0.0 correlation means that the data sets have no linear relationship. The closer the coefficient is to 1.0 or -1.0, the better a change in one variable can be predicted from a change in the other.
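To make these values concrete, here is a minimal sketch using pandas' `Series.corr()`, which computes the Pearson correlation coefficient by default. The series names and numbers are made up purely for illustration and are not taken from the book's data.

```python
import pandas as pd

# Hypothetical toy data, chosen only to illustrate the extremes.
x = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

proportional = 2.0 * x + 3.0   # moves in lockstep with x, same direction
inverse = -0.5 * x             # moves in lockstep with x, opposite direction
symmetric = (x - 3.0) ** 2     # depends on x, but not in a linear way

# Series.corr() uses the Pearson method unless told otherwise.
r_proportional = x.corr(proportional)
r_inverse = x.corr(inverse)
r_symmetric = x.corr(symmetric)

print(r_proportional, r_inverse, r_symmetric)
```

The first two coefficients come out at the extremes of the range (1.0 and -1.0, up to floating-point rounding), and the third comes out at zero even though `symmetric` is fully determined by `x`. A coefficient near zero therefore rules out a linear relationship, not every kind of relationship.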

The scatter plot matrix gave us a quick visual idea of the correlation between pairs of stocks, but it did not give an exact number. The exact correlation between the columns of data in a DataFrame can be calculated using the .corr() method. This produces a matrix of all pairwise correlations between the variables represented by the columns.

The following example calculates the correlation in the daily percentage...
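As a minimal sketch of this kind of calculation, the code below builds synthetic closing prices for three hypothetical tickers (`AAA`, `BBB`, `CCC`), converts them to daily percentage changes with `.pct_change()`, and then calls `.corr()`. The tickers, dates, and random prices are assumptions made for illustration; they are not the stocks or data used in the book.

```python
import numpy as np
import pandas as pd

# Synthetic data only: a seeded generator keeps the sketch reproducible.
rng = np.random.default_rng(42)
dates = pd.bdate_range("2012-01-02", periods=250)

# AAA and BBB share a common "market" component; CCC is independent noise.
market = rng.normal(0.0, 0.01, size=len(dates))
returns = pd.DataFrame(
    {
        "AAA": market + rng.normal(0.0, 0.005, size=len(dates)),
        "BBB": market + rng.normal(0.0, 0.005, size=len(dates)),
        "CCC": rng.normal(0.0, 0.01, size=len(dates)),
    },
    index=dates,
)

# Turn the returns into a closing-price series starting near 100.
close = 100 * (1 + returns).cumprod()

# Daily percentage change of the closing price; the first row is NaN.
daily_pc = close.pct_change().dropna()

# Pairwise Pearson correlations between all columns.
corr_matrix = daily_pc.corr()
print(corr_matrix)
```

The result is a square DataFrame with one row and one column per ticker. The diagonal is 1.0 because each series is perfectly correlated with itself, and the matrix is symmetric, so only the entries above (or below) the diagonal carry distinct information.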