In this chapter, we covered the key points of Apache Spark from scratch. We saw how to download, install, and test Apache Spark, and how to run Spark applications. We also reviewed some Spark core concepts, such as the RDD and its operations (transformations and actions).
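To recap the distinction between transformations and actions, here is a minimal sketch in Scala. It assumes a local Spark installation on the classpath; the object name and variable names are illustrative only.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddBasics {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("rdd-basics").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Transformations (filter, map) are lazy: they only describe the computation.
    val numbers = sc.parallelize(1 to 10)
    val evens = numbers.filter(_ % 2 == 0)  // nothing executes yet
    val doubled = evens.map(_ * 2)          // still nothing executes

    // Actions (count, collect) trigger the actual execution on the cluster.
    println(doubled.count())                 // 5
    println(doubled.collect().mkString(",")) // 4,8,12,16,20

    sc.stop()
  }
}
```

Nothing runs until `count()` or `collect()` is invoked; this laziness is what lets Spark optimize the whole lineage of transformations before executing it.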
We also saw how to run Apache Spark in cluster mode, how to run the driver program, and how to achieve high availability.
Finally, we dove into Spark Streaming: stateless and stateful transformations, output operations, how to run it 24/7, and how to improve its performance.
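The stateless/stateful contrast can be sketched as follows. This is a minimal example, assuming a text source on `localhost:9999` and a writable checkpoint directory; the host, port, and path are illustrative assumptions.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingBasics {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("streaming-basics").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5))
    ssc.checkpoint("/tmp/spark-checkpoint") // required for stateful transformations

    val lines = ssc.socketTextStream("localhost", 9999)

    // Stateless: each 5-second batch is counted independently.
    val perBatchCounts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)

    // Stateful: updateStateByKey carries a running total across batches.
    val runningCounts = perBatchCounts.updateStateByKey[Int] {
      (newValues: Seq[Int], state: Option[Int]) =>
        Some(newValues.sum + state.getOrElse(0))
    }

    runningCounts.print() // an output operation: triggers the computation
    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that enabling checkpointing, as above, is also the first step toward running a streaming application 24/7, since it lets the driver recover state after a failure.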
In the following chapters, we will see how Apache Spark acts as the glue of our stack; in each chapter, we will examine how the technology at hand relates to Spark.