The OpenStack Data Processing service lets users set up data processing clusters, such as Hadoop and Spark clusters. Users specify the cluster configuration, namely the version, topology, and nodes. With this information, the Data Processing service deploys the cluster in the cloud. The cluster is scalable: users can add or remove nodes on demand.
We will now discuss the procedure to install the Data Processing service, known as sahara, on the controller node. The steps are as follows:
Install the package for the Data Processing service:
apt-get install python-pip
pip install sahara
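As a quick sanity check after the installation step, the following minimal sketch tests whether a module can be imported by the current Python interpreter. It assumes that the pip package installs an importable top-level module named sahara; the helper name is_module_available is hypothetical and used only for illustration.

```python
import importlib.util


def is_module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    # find_spec returns None when no such top-level module is installed.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Assumption: the pip package exposes a top-level module named "sahara".
    print("sahara importable:", is_module_available("sahara"))
```

If the check reports False, the package was most likely installed for a different Python interpreter than the one running the script.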
Make the following changes to the /etc/sahara/sahara.conf configuration file:
Go to the [database] section and set the connection parameter to point to the database:
connection = mysql://sahara:SAHARA_DBPASS@controller/sahara
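The [database] change can also be scripted rather than made by hand. The sketch below is one possible approach using Python's standard configparser module, not part of sahara itself. The function name set_database_connection and its parameters are hypothetical; the user name, database name, and default host simply mirror the connection string in this step.

```python
import configparser
import os
import tempfile


def set_database_connection(conf_path: str, password: str,
                            host: str = "controller") -> str:
    """Set [database]/connection in a sahara.conf-style INI file."""
    # Assumption: user "sahara" and database "sahara", as in the step above.
    url = f"mysql://sahara:{password}@{host}/sahara"
    # interpolation=None keeps any '%' characters in the password literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(conf_path)  # a missing file is treated as empty
    if not parser.has_section("database"):
        parser.add_section("database")
    parser.set("database", "connection", url)
    with open(conf_path, "w") as handle:
        parser.write(handle)
    return url


if __name__ == "__main__":
    # Demonstrate on a scratch file rather than /etc/sahara/sahara.conf.
    scratch = os.path.join(tempfile.mkdtemp(), "sahara.conf")
    print(set_database_connection(scratch, "SAHARA_DBPASS"))
```

Note that configparser discards comments when it rewrites a file, so it is safer to try this on a copy of the configuration file first. A password containing characters such as @ or / would also need URL-encoding, which this sketch does not do.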
Next, in the [keystone_authtoken] section, set the auth_uri and identity_uri parameters...