Data mining is a term that has been around since the 1990s. What exactly is data mining? It is the process of working with a large amount of data to gather insights and detect patterns. Analysts often use it when the data does not include a response variable, yet there is reason to believe that relationships or structure lie within the data. This chapter covers the following three introductory topics in data mining:
Explaining cluster analysis
Partitioning using k-means clustering
Clustering using hierarchical techniques
As in the previous chapters, you will learn through use cases. Two use cases are provided, each teaching data mining with a different cluster analysis approach. Before we begin, it is worthwhile to understand cluster analysis in context.