In this chapter, we covered Spark Streaming in detail, spending most of our time on the constructs of discretized streams, and introduced the new Structured Streaming API. As mentioned, the Structured Streaming API is still in alpha, and hence should not be used for production applications.
The next chapter deals with one of my favorite topics: Spark MLlib. Spark provides a rich API for predictive modeling, and the use of Spark MLlib is increasing every day. We'll look at the basics of machine learning before giving readers an insight into how the Spark framework supports predictive analytics. We'll cover building a machine-learning pipeline, feature engineering, classification and regression, clustering, and a few advanced topics, including identifying champion models and tuning a model for performance.