In the preceding chapter, we looked at how a Hadoop cluster can be created in OpenStack by means of templates. In this chapter, we will put the cluster to work by executing Hadoop jobs on it. Keep in mind that the Elastic Data Processing (EDP) jobs you can run in Sahara depend on the provisioning plugin chosen in the previous chapter. This chapter will guide you through the following points:
Understanding the essential components to run an EDP job in Sahara
Discussing the data source workflow in Sahara
Configuring a job in Sahara
Putting the pieces together by executing a job in Sahara using Horizon
Enhancing EDP in Sahara using REST APIs
Executing a Spark job using the Sahara REST API
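As a preview of the last two points, the sketch below builds the URL path and JSON body that Sahara's v1.1 REST API expects when launching a job execution (`POST /v1.1/{project_id}/jobs/{job_id}/execute`). The identifiers and the Spark example class are placeholder assumptions; in practice you obtain the project, job, and cluster IDs from earlier API calls or from Horizon.

```python
import json

# Hypothetical identifiers for illustration only; in a real deployment these
# come from your OpenStack project and from previously registered Sahara objects.
PROJECT_ID = "b1a2c3d4e5f6"    # assumption: OpenStack project (tenant) ID
SPARK_JOB_ID = "1f0e9d8c7b6a"  # assumption: ID of a Spark job registered in Sahara
CLUSTER_ID = "0a1b2c3d4e5f"    # assumption: ID of the running Spark cluster


def build_job_execution_request(project_id, job_id, cluster_id, main_class, args):
    """Build the URL path and JSON body for Sahara's job-execute call."""
    path = f"/v1.1/{project_id}/jobs/{job_id}/execute"
    body = {
        "cluster_id": cluster_id,
        "job_configs": {
            # Spark jobs take an entry-point class plus program arguments
            "configs": {"edp.java.main_class": main_class},
            "args": args,
        },
    }
    return path, json.dumps(body)


path, body = build_job_execution_request(
    PROJECT_ID, SPARK_JOB_ID, CLUSTER_ID,
    main_class="org.apache.spark.examples.SparkPi", args=["10"],
)
print(path)
print(body)
```

Sending this body with an authenticated HTTP POST (for example via `curl` with an `X-Auth-Token` header) is what we will do step by step later in the chapter.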