Spark is a fast, general-purpose execution engine for large-scale data processing. Running data processing jobs is one of the primary uses of large clusters, and Spark, which is part of the Berkeley Data Analytics Stack (BDAS), supports several forms of it: batch processing, iterative processing, near real-time processing, and stream processing. In this chapter, we will walk you through the steps of setting up Spark on a Mesos cluster, covering the following topics:
Introducing Spark
Spark job scheduling
Spark Standalone mode
Spark on Mesos
Tuning Spark on Mesos