OpenStack Sahara Essentials

By: Omar Khedher

Overview of this book

The Sahara project is a module that aims to simplify the building of data processing capabilities on OpenStack. The goal of this book is to provide a focused, fast-paced guide to installing, configuring, and getting started with integrating Hadoop with OpenStack using Sahara. The book explains how to deploy data-intensive Hadoop and Spark clusters on top of OpenStack. It also covers how to use the Sahara REST API, how to develop applications for Elastic Data Processing on OpenStack, and how to set up Hadoop or Spark clusters on OpenStack.
Table of Contents (14 chapters)

Chapter 3. Using OpenStack Sahara

In the previous chapter, we got the OpenStack environment up and running, with the Sahara project included and ready to use. In this chapter, we will start using Sahara to create a Hadoop cluster. Running an Apache Hadoop cluster on top of OpenStack calls for a few planning considerations, including the type of instances assigned to the Hadoop cluster, image types, network setup, and the storage backend. The following points will be highlighted:

  • Understanding node types in Sahara for a Hadoop cluster

  • Preparing an image for Hadoop nodes

  • Configuring a network for a Hadoop cluster

  • Creating and managing a Hadoop cluster using the CLI

  • Creating and managing a Hadoop cluster using Horizon