Python Data Analysis

By: Ivan Idris

Autocorrelation


Autocorrelation is correlation within a dataset and can indicate a trend.

Note

For a given time series with known mean and standard deviation, we can define the autocorrelation for times s and t using the expected value operator as follows:

$$R(s, t) = \frac{E\left[(X_t - \mu_t)(X_s - \mu_s)\right]}{\sigma_t \sigma_s}$$

This is, in essence, the formula for correlation applied to a time series and the same time series lagged.

For example, with a lag of one period, we can check whether the previous value influences the current value. If it does, the autocorrelation at that lag will be high, that is, close to 1.

In the previous chapter, Chapter 6, Data Visualization, we already used a pandas function that plots autocorrelation. In this example, we will use the NumPy correlate() function to calculate the actual autocorrelation values for the sunspots cycle. At the end, we need to normalize the values we receive. Apply the NumPy correlate() function as follows:

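import numpy as np

# data is assumed to be a 1-D array holding the sunspot series
y = data - np.mean(data)    # center the series on zero
norm = np.sum(y ** 2)       # lag-0 value, used to scale the result to 1
correlated = np.correlate(y, y, mode='full')/norm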
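To see what the normalized result looks like, here is a minimal self-contained sketch that applies the same three lines to a synthetic series. The noisy 11-period cycle is a hypothetical stand-in for the sunspot data, and the names `acf` and `best_lag` are introduced only for this illustration.

```python
import numpy as np

# Hypothetical stand-in for the sunspot series: a noisy cycle with period 11.
rng = np.random.default_rng(42)
n = 300
data = 50 + 40 * np.sin(2 * np.pi * np.arange(n) / 11) + rng.normal(scale=5, size=n)

y = data - np.mean(data)
norm = np.sum(y ** 2)
correlated = np.correlate(y, y, mode='full') / norm

# mode='full' returns 2*n - 1 values; index n - 1 corresponds to lag 0,
# so the slice below keeps only the non-negative lags.
acf = correlated[n - 1:]

# Lag (between 1 and 20) with the largest autocorrelation.
best_lag = 1 + int(np.argmax(acf[1:21]))
print("autocorrelation at lag 0:", acf[0])
print("strongest lag in 1..20:", best_lag)
```

Because of the normalization, the lag-0 value is 1, and the strongest non-zero lag should coincide with the period of the synthetic cycle.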

We are also...