Clustering is a type of machine learning algorithm that groups items based on their similarity. In this example, we will cluster the stocks in the Dow Jones Industrial Average (DJI or DJIA) index by their log returns. Most of the steps of this recipe have already been covered in previous chapters.
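As a quick refresher on log returns, the following is a minimal sketch that uses made-up closing prices for a single hypothetical stock. The variable names and the numbers are illustrative assumptions, not data from this recipe:

```python
import numpy as np

# Made-up closing prices for one hypothetical stock (illustrative only).
close = np.array([100.0, 110.0, 121.0])

# Log return between consecutive days: log(p[t]) - log(p[t-1]).
# np.diff returns one value fewer than the number of prices.
log_returns = np.diff(np.log(close))

print(log_returns)
```

Each price here is 10 percent above the previous one, so both log returns are equal to log(1.1).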
First, we will download the end-of-day (EOD) price data for those stocks from Yahoo! Finance. Then, we will calculate a square affinity matrix. Finally, we will cluster the stocks with the AffinityPropagation class.
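To make the idea of a square affinity matrix concrete, here is a small sketch on made-up log returns for three hypothetical stocks. It assumes that similarity is measured as the negative squared Euclidean distance between the return series; that choice is an assumption made for illustration and may differ from the measure used in this recipe:

```python
import numpy as np

# Made-up log returns: one row per hypothetical stock, one column per day.
log_returns = np.array([
    [0.01, -0.02, 0.03],
    [0.01, -0.02, 0.03],   # identical to the first row
    [-0.05, 0.04, 0.00],
])

# Pairwise differences via broadcasting: shape (n_stocks, n_stocks, n_days).
diff = log_returns[:, np.newaxis, :] - log_returns[np.newaxis, :, :]

# Assumed similarity: negative squared Euclidean distance.
# Values closer to zero mean the two stocks moved more alike.
affinity = -np.sum(diff ** 2, axis=2)

print(affinity.shape)  # one row and one column per stock
print(affinity)
```

The result is symmetric with zeros on the diagonal, because every stock is at distance zero from itself. A matrix of this form is the kind of input that scikit-learn's AffinityPropagation accepts when it is constructed with affinity="precomputed".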
Download price data for 2011 using the stock symbols of the DJI Index. In this example, we are only interested in the close price:
import datetime

# 2011 to 2012
start = datetime.datetime(2011, 1, 1)
end = datetime.datetime(2012, 1, 1)

# Dow Jones symbols
symbols = ["AA", "AXP", "BA", "BAC", "CAT", "CSCO", "CVX", "DD", "DIS", "GE", "HD", "HPQ", "IBM", "INTC", "JNJ", "JPM", "KO", "MCD", "MMM", "MRK", "MSFT", "PFE", "PG", "T",...