Hadoop MapReduce v2 Cookbook - Second Edition: RAW

Running Hadoop MapReduce v2 computations using Amazon Elastic MapReduce


Amazon Elastic MapReduce (EMR) provides on-demand managed Hadoop clusters in the Amazon Web Services (AWS) cloud for running your Hadoop MapReduce computations. EMR uses Amazon Elastic Compute Cloud (EC2) instances as the compute resources. It can read input data from Amazon Simple Storage Service (S3) and store the output data in S3 as well. EMR takes care of provisioning the cloud instances, configuring the Hadoop cluster, and executing our MapReduce computational flows.

In this recipe, we are going to execute the WordCount MapReduce sample (from the Writing a WordCount MapReduce application, bundling it, and running it using the Hadoop local mode recipe in Chapter 1, Getting Started with Hadoop v2) on Amazon EC2 using the Amazon Elastic MapReduce service.
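
As a reminder of what the WordCount computation produces, the following is a minimal plain-Java sketch of its map and reduce phases. It does not use the Hadoop API and is not the code from the Chapter 1 sample; the class name WordCountSketch and its methods are hypothetical names chosen for illustration, and splitting on whitespace is an assumed tokenization.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: plain Java, no Hadoop API involved.
class WordCountSketch {

    // "Map" phase: emit a (word, 1) pair for every whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(new SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // "Reduce" phase: sum the counts emitted for each distinct word.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    // Runs the map phase over every input line, then reduces all emitted pairs.
    static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        return reduce(emitted);
    }

    public static void main(String[] args) {
        // Two small input lines stand in for the input files stored in S3.
        System.out.println(countWords(Arrays.asList("hello hadoop", "hello emr")));
    }
}
```

On a real cluster, Hadoop distributes the map calls across the input splits and groups the emitted pairs by key before the reduce step; the sketch collapses all of that into a single process.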

Getting ready

Build the hcb-c1-samples.jar file by running the Gradle build in the chapter1 folder of the sample code repository.

How...