Hadoop MapReduce v2 Cookbook - Second Edition: RAW

Running MapReduce jobs on HBase


This recipe explains how to run a MapReduce job that reads its input directly from HBase storage and writes its output directly back to it.

HBase provides abstract mapper and reducer implementations that users can extend to read from and write to HBase directly. This recipe explains how to write a sample MapReduce application using these mappers and reducers.

We will use the United Nations Development Programme's Human Development Report (HDR) data, which lists the Gross National Income (GNI) per capita for each country. The dataset can be found at http://hdr.undp.org/en/statistics/data/. A sample of this dataset is available in the chapter7/resources/hdi-data.csv file in the sample source code repository. Using MapReduce, we will calculate the average value of GNI per capita, by country.
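
Before bringing HBase into the picture, it may help to see the averaging logic on its own. The following is a minimal plain-JDK sketch of the map, group, and reduce flow, not the recipe's actual code: it uses no Hadoop or HBase classes, and the class name GniAverageSketch, the method names, the grouping key "all", and the sample values are all hypothetical placeholders rather than figures from the HDR dataset.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a MapReduce averaging job; plain JDK only.
final class GniAverageSketch {

    // Map + shuffle: each record is {groupingKey, gniText}. Parse the GNI
    // field, skip records where it is missing or non-numeric, and collect
    // the parsed values under their grouping key.
    static Map<String, List<Double>> mapAndGroup(String[][] records) {
        Map<String, List<Double>> grouped = new LinkedHashMap<>();
        for (String[] record : records) {
            if (record.length < 2 || record[1] == null) {
                continue;
            }
            try {
                double gni = Double.parseDouble(record[1].trim());
                grouped.computeIfAbsent(record[0], k -> new ArrayList<>()).add(gni);
            } catch (NumberFormatException e) {
                // Malformed GNI value: ignore this record.
            }
        }
        return grouped;
    }

    // Reduce: average all values that arrived for one key.
    // An empty list yields NaN (0.0 / 0).
    static double reduce(List<Double> values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    public static void main(String[] args) {
        // Made-up placeholder values, not taken from hdi-data.csv.
        String[][] records = {
            {"all", "10.0"}, {"all", "20.0"}, {"all", "n/a"}, {"all", "30.0"}
        };
        mapAndGroup(records).forEach(
            (key, values) -> System.out.println(key + " -> " + reduce(values)));
    }
}
```

In an HBase-backed job, the map step would receive table rows instead of string arrays and the reduce step would write its result back to a table, but the grouping and averaging would follow the same shape.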

Getting ready

This recipe requires an Apache HBase installation integrated with a Hadoop YARN cluster. Make sure to start all the configured HBase Master and RegionServer processes before we begin.

How to do it...

This section demonstrates...