Mastering Hadoop

By: Sandeep Karanth

Amazon Elastic MapReduce (EMR)


Amazon Web Services (AWS) offers Hadoop as a platform as a service (PaaS) through Elastic MapReduce (EMR). Organizations and individuals can provision Hadoop clusters on demand, run their workloads, and download the results. Provisioning a Hadoop cluster with EMR takes only a few clicks and a few minutes.

Note

Using any Amazon Web Service requires an Amazon account. Visit http://aws.amazon.com and register for a free account. A credit card is required to register, but it is charged only for usage beyond the free tier that Amazon offers. The registered e-mail address subsequently serves as the username.

The general steps to create and run workloads on EMR are as follows:

  1. The application is developed locally, either in Java using Hadoop's MapReduce APIs, with Hive or Pig, or in a language of the user's choice. Applications written in languages other than Java can run on a Hadoop cluster through Hadoop Streaming. A developer guide can be found at http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-what-is-emr.html.

  2. The application...
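
Step 1 mentions Hadoop Streaming as the route for non-Java applications. As a rough illustration, the following is a minimal word-count sketch in Python, not code from the book. It relies on the Hadoop Streaming convention that mappers and reducers read lines from standard input and write tab-separated key/value records to standard output. The function names `map_lines` and `reduce_lines`, and the `map`/`reduce` command-line argument, are hypothetical choices made for this example.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one tab-separated (word, 1) record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts for each word.

    Assumes the input is sorted by key, which Hadoop Streaming
    provides between the map and reduce phases.
    """
    # Skip blank lines, then split each record into (word, count).
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical convention: the first argument selects the role.
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

Saved under a hypothetical name such as `wordcount.py`, the script can be checked locally before it is uploaded, by simulating the shuffle phase with a sort: `cat input.txt | python wordcount.py map | sort | python wordcount.py reduce`.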