Mastering Hadoop

By: Sandeep Karanth

Amazon Elastic MapReduce (EMR)

Amazon Web Services (AWS) offers Hadoop as a platform as a service (PaaS) through Elastic MapReduce (EMR). Organizations and individuals can provision Hadoop clusters on demand, run their workloads, and download the results. Provisioning a Hadoop cluster with EMR takes only a few clicks and a few minutes.


Using any Amazon Web Service requires an Amazon account, which is free to register. A credit card is mandatory at registration, but it is charged only for usage beyond the free tier that Amazon offers. The registered e-mail address subsequently serves as the username.

The general steps to create and run workloads on EMR are as follows:

  1. The application is developed locally in Java using Hadoop's MapReduce APIs, Hive, Pig, or a language of the user's choice. Non-Java-based languages can be executed in a Hadoop cluster using Hadoop Streaming. A developer guide can be found at

  2. The application...
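
Step 1 mentions that non-Java languages can run on a Hadoop cluster through Hadoop Streaming. Streaming passes records to the mapper and reducer over standard input and expects tab-separated key/value lines on standard output. The following is a minimal word-count sketch in Python under those assumptions; the function names and the "map"/"reduce" command-line argument are hypothetical conventions chosen for this illustration, not part of Hadoop or EMR.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record per whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts for each word.

    Relies on the input being sorted by key, which Hadoop Streaming
    guarantees for the records handed to a reducer.
    """
    pairs = (
        line.rstrip("\n").split("\t", 1)
        for line in lines
        if "\t" in line  # skip blank or malformed records
    )
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def main(argv):
    # Hypothetical convention: the script takes "map" or "reduce" as its
    # only argument; with anything else it does nothing.
    if len(argv) != 2 or argv[1] not in ("map", "reduce"):
        return
    step = map_lines if argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)


if __name__ == "__main__":
    main(sys.argv)
```

Saved under a hypothetical name such as `wordcount.py`, the script can be checked locally by piping a text file through the map step, `sort`, and then the reduce step, which imitates the shuffle-and-sort phase. On a cluster, the same script would be supplied to the Streaming job as both the mapper command and the reducer command.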