A pandas DataFrame is a matrix-like and dictionary-like data structure, similar to the data frame available in R. It is the central data structure in pandas, and you can apply all kinds of operations to it. It is quite common, for instance, to look at the correlation matrix of a portfolio, so let's do that.
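To make the "matrix and dictionary-like" description concrete, here is a minimal sketch. The column labels, row labels, and numbers are made-up toy values, not real market data; only standard pandas calls are used.

```python
import pandas as pd

# Hypothetical toy values standing in for log returns
toy = pd.DataFrame(
    {"AAA": [0.01, -0.02, 0.03], "BBB": [0.00, 0.01, -0.01]},
    index=["d1", "d2", "d3"],
)

# Dictionary-like: look up a whole column by its label
aaa_column = toy["AAA"]

# Matrix-like: a 2-D shape, positional access, and conversion to a NumPy array
toy_shape = toy.shape          # (rows, columns) -> (3, 2)
toy_cell = toy.iloc[1, 0]      # second row, first column
toy_array = toy.to_numpy()
```

Label-based lookup behaves like a dictionary of columns, while `shape`, `iloc`, and `to_numpy()` expose the same data as a two-dimensional array.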
First, we will create the DataFrame with pandas for each symbol's daily log returns. Then we will join these on the date. At the end, the correlation will be printed and a plot will appear:
To create the data frame, create a dictionary containing stock symbols as keys and the corresponding log returns as values. The data frame itself has the date as the index and the stock symbols as column labels:
data = {}

for i, symbol in enumerate(symbols):
    data[symbol] = np.diff(np.log(close[i]))

# Convention: import pandas as pd
df = pd.DataFrame(data, index=dates[0][:-1], columns=symbols)
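The snippet above depends on `symbols`, `close`, and `dates` being defined elsewhere. As a self-contained sketch of the same steps, the following version assumes synthetic closing prices in place of downloaded quotes; the ticker names, the date range, and the random-walk prices are all hypothetical and only illustrate the shape of the data.

```python
import numpy as np
import pandas as pd

# Assumption: synthetic random-walk prices stand in for real closing quotes
rng = np.random.default_rng(0)
symbols = ["AAA", "BBB", "CCC"]  # hypothetical tickers
dates = pd.date_range("2020-01-01", periods=6, freq="D")
close = [
    100 * np.exp(np.cumsum(rng.normal(0, 0.01, len(dates))))
    for _ in symbols
]

# One array of daily log returns per symbol; np.diff yields one value fewer
# than the number of prices
data = {
    symbol: np.diff(np.log(close[i]))
    for i, symbol in enumerate(symbols)
}

# Drop the last date so the index length matches the returns, as in the text
df = pd.DataFrame(data, index=dates[:-1], columns=symbols)

# Pairwise correlation of the columns (symbols)
corr = df.corr()
print(corr)
```

Because the prices here are random, the printed correlations carry no financial meaning; the point is only that the frame has dates as its index, symbols as its columns, and that `corr()` returns a square, symmetric matrix with ones on the diagonal.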
We can now perform operations...