Here we explain in detail how to run Apache Spark on Mesos.
We have two options:
- Upload the Spark binary package to a location accessible by Mesos and configure the Spark driver to connect to Mesos
- Install Spark at the same location on all the Mesos slaves and set spark.mesos.executor.home to point to that location
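For the second option, the driver-side setting can be sketched as a spark-defaults.conf fragment. The install path /opt/spark-2.0.0 is an assumed example location, not one prescribed by Spark:

```properties
# spark-defaults.conf (driver side)
# Points to where Spark is installed on every Mesos slave;
# /opt/spark-2.0.0 is an assumed example path for illustration.
spark.mesos.executor.home  /opt/spark-2.0.0
```

With this set, the slaves do not need to fetch a Spark binary package because each already has a local installation at that path.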
Follow these steps for the first option:
The first time we run a Mesos task on a Mesos slave, that slave must have the Spark binary package available so that it can run the Spark executor backend. The location can be any place accessible by Mesos, such as HDFS, HTTP, or S3.
At the time of writing this book, the latest Spark version is 2.0.0. To download it and upload it to HDFS, we use the following commands:
$ wget http://apache.mirrors.ionfish.org/spark/spark-2.0.0/spark-2.0.0-bin-hadoop2.7.tgz
$ hadoop fs -put spark-2.0.0-bin-hadoop2.7.tgz /
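Once the package is on HDFS, the driver tells the Mesos slaves where to fetch it from via the spark.executor.uri property. A minimal sketch as a spark-defaults.conf fragment; the NameNode host and port (namenode:8020) are assumptions for illustration:

```properties
# spark-defaults.conf (driver side)
# URI from which Mesos slaves download the Spark binary package;
# hdfs://namenode:8020 is an assumed NameNode address.
spark.executor.uri  hdfs://namenode:8020/spark-2.0.0-bin-hadoop2.7.tgz
```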
In the Spark driver program, give the master URL as the Apache Mesos master URL in the form:
- For a single-master Mesos cluster:
mesos://master-host:5050
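The master URL above has three parts: the mesos:// scheme, the master hostname, and the Mesos master port (5050 by default). A small sketch, using a hypothetical helper function that is not part of Spark, showing how such a URL is formed:

```python
# Hypothetical helper (not part of Spark) that builds a Mesos master URL
# for a single-master cluster from a host and the default master port.
def mesos_master_url(host, port=5050):
    return "mesos://{}:{}".format(host, port)

# The resulting string is what the Spark driver receives as its master URL,
# for example: spark-submit --master mesos://master-host:5050
print(mesos_master_url("master-host"))  # → mesos://master-host:5050
```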