A fundamental first step when dealing with any dataset is to get intimately familiar with it: without understanding what you are dealing with, you cannot build a successful statistical model.
To execute this recipe, you will need pandas, Statsmodels, and Matplotlib. No other prerequisites are required.
The fundamental statistics to check for any time series are the autocorrelation function (ACF), the partial autocorrelation function (PACF), and the spectral density (the ts_timeSeriesFunctions.py file):
import pandas as pd
import statsmodels.api as sm

# read the data (data_folder is assumed to be defined earlier in the script)
riverFlows = pd.read_csv(data_folder + 'combined_flow.csv',
    index_col=0, parse_dates=[0])

# autocorrelation function
acf = {}  # to store the results
for col in riverFlows.columns:
    acf[col] = sm.tsa.stattools.acf(riverFlows[col])

# partial autocorrelation function
pacf = {}
for col in riverFlows.columns:
    pacf[col] = sm.tsa.stattools.pacf(riverFlows[col])

# periodogram (spectral density...
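To make it clearer what these statistics measure, here is a minimal NumPy-only sketch that computes a sample ACF and a raw periodogram by hand. It is an illustration of the textbook definitions, not the Statsmodels implementation used in the recipe. The helper names `sample_acf` and `periodogram` and the synthetic series are assumptions made up for this example, and the PACF is left out to keep the sketch short.

```python
import numpy as np


def sample_acf(x, nlags=40):
    """Sample autocorrelation for lags 0..nlags (biased estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # work with deviations from the mean
    denom = np.dot(x, x)                  # lag-0 sum of squares
    n = len(x)
    return np.array(
        [np.dot(x[:n - k], x[k:]) / denom for k in range(nlags + 1)]
    )


def periodogram(x):
    """Raw periodogram at the non-negative Fourier frequencies."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    power = np.abs(np.fft.rfft(x)) ** 2 / n
    freqs = np.fft.rfftfreq(n, d=1.0)     # cycles per observation
    return freqs, power


# Synthetic "monthly" series (made-up data): a 12-step cycle plus noise
rng = np.random.default_rng(42)
t = np.arange(240)
series = 10 + 3 * np.sin(2 * np.pi * t / 12) + rng.normal(scale=0.5, size=t.size)

acf_values = sample_acf(series, nlags=24)
freqs, power = periodogram(series)

# Dominant period: skip the zero frequency, then invert the peak frequency
peak_index = np.argmax(power[1:]) + 1
peak_period = 1 / freqs[peak_index]

print("ACF at lags 6 and 12:", acf_values[6], acf_values[12])
print("Dominant period (observations):", peak_period)
```

For a series with a yearly cycle in monthly data, the ACF is strongly positive at lag 12 and strongly negative at lag 6, and the periodogram peaks at a frequency of one cycle per 12 observations. These are the kinds of patterns to look for in the river flow series.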