Hadoop MapReduce v2 Cookbook - Second Edition: RAW

Writing multiple outputs from a MapReduce computation


We can use Hadoop's MultipleOutputs feature to emit more than one output from a MapReduce computation. This is useful when we want to write different kinds of records to different files, or when a job needs to produce an extra output alongside its main output. MultipleOutputs also lets us specify a different OutputFormat for each named output.
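
To make the idea of named outputs concrete before looking at the Hadoop code, the following is a minimal plain-Java sketch of the routing concept. It is a conceptual model only and does not use or reproduce the Hadoop API: the class `NamedOutputsSketch`, its methods, and the sample log records are all hypothetical names introduced for illustration. It assumes that each output is declared by name first and that every record is then written, as a tab-separated key/value line, to exactly one declared output.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conceptual model only: this is NOT the Hadoop API. It mimics the idea of
// declaring named outputs up front and routing each record to one of them.
final class NamedOutputsSketch {
    // One in-memory "file" (a list of lines) per declared output name.
    private final Map<String, List<String>> outputs = new LinkedHashMap<>();

    // Declare an output name before any record is written to it.
    void addNamedOutput(String name) {
        outputs.putIfAbsent(name, new ArrayList<>());
    }

    // Append a tab-separated key/value line to the chosen named output.
    // Rejecting undeclared names is a design choice of this sketch.
    void write(String name, String key, String value) {
        List<String> lines = outputs.get(name);
        if (lines == null) {
            throw new IllegalArgumentException("Undeclared output: " + name);
        }
        lines.add(key + "\t" + value);
    }

    // Return the lines collected so far for one named output.
    List<String> lines(String name) {
        return outputs.getOrDefault(name, List.of());
    }

    public static void main(String[] args) {
        NamedOutputsSketch out = new NamedOutputsSketch();
        out.addNamedOutput("responsesizes");
        out.addNamedOutput("timestamps");

        // Hypothetical log records: URL, response size in bytes, timestamp.
        String[][] records = {
            {"/index.html", "2048", "2015-01-01T10:00:00"},
            {"/about.html", "512", "2015-01-01T10:00:05"},
        };
        for (String[] r : records) {
            out.write("responsesizes", r[0], r[1]); // first dataset
            out.write("timestamps", r[0], r[2]);    // second dataset
        }
        System.out.println(out.lines("responsesizes"));
        System.out.println(out.lines("timestamps"));
    }
}
```

In this sketch, one pass over the input produces two separate datasets, which is the same pattern the recipe applies with the "responsesizes" and "timestamps" outputs. In a real job, Hadoop takes care of the files and output formats; the sketch only illustrates the routing.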

How to do it...

The following steps show you how to use the MultipleOutputs feature to output two different datasets from a Hadoop MapReduce computation:

  1. Configure and name the multiple outputs in the Hadoop driver program:

    Job job = Job.getInstance(getConf(), "log-analysis");
    …
    FileOutputFormat.setOutputPath(job, new Path(outputPath));
    MultipleOutputs.addNamedOutput(job, "responsesizes", TextOutputFormat.class, Text.class, IntWritable.class);
    MultipleOutputs.addNamedOutput(job, "timestamps", TextOutputFormat.class, Text.class, Text.class);
  2. Write data to the different...