In the previous chapters, we discussed model-based clustering algorithms. In this chapter, we will discuss one more new algorithm that is implemented in Apache Mahout – Streaming K-means. The algorithms, which we used to build models, generally find patterns in the data and use this learned pattern to predict from the incoming data. In this scenario, we know that patterns in data are static, but the main difficulty occurs when the patterns in data are dynamic. In the scenario where patterns in data are dynamic, streaming algorithms comes to the rescue. We will discuss the following topics in this chapter:
Learning Streaming K-means
Using Mahout to run Streaming K-means