Book Image

Hadoop MapReduce v2 Cookbook - Second Edition: RAW

Book Image

Hadoop MapReduce v2 Cookbook - Second Edition: RAW

Overview of this book

Table of Contents (19 chapters)
Hadoop MapReduce v2 Cookbook Second Edition
Credits
About the Author
Acknowledgments
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Emitting data of different value types from a Mapper


Emitting data products belonging to multiple value types from a Mapper is useful when performing Reducer-side joins as well as when we need to avoid the complexity of having multiple MapReduce computations to summarize different types of properties in a dataset. However, Hadoop Reducers do not allow multiple input value types. In these scenarios, we can use the GenericWritable class to wrap multiple value instances belonging to different data types.

In this recipe, we reuse the HTTP server log entry analyzing the sample of the Implementing a custom Hadoop Writable data type recipe. However, instead of using a custom data type, in the current recipe, we output multiple value types from the Mapper. This sample aggregates the total number of bytes served from the web server to a particular host and also outputs a tab-separated list of URLs requested by the particular host. We use IntWritable to output the number of bytes from the Mapper and...