We can use the MultipleOutputs feature of Hadoop to emit multiple outputs from a MapReduce computation. This feature is useful when we want to write different outputs to different files, and when a job needs to emit an additional output alongside its main output. The MultipleOutputs feature also allows us to specify a different OutputFormat for each output.
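To illustrate the point about choosing an OutputFormat per output, here is a minimal sketch of a driver fragment. It assumes the new MapReduce API (the org.apache.hadoop.mapreduce packages) is on the classpath. The class name, the job name, and the output names textout and seqout are hypothetical and chosen only for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Hypothetical sketch: registers two named outputs that use different OutputFormats.
public class NamedOutputFormatsSketch {

    static Job configure(Configuration conf) throws IOException {
        // The job name is an assumption made for this example.
        Job job = Job.getInstance(conf, "named-output-formats-sketch");

        // Named output written as plain text lines (Text key, IntWritable value).
        MultipleOutputs.addNamedOutput(job, "textout",
                TextOutputFormat.class, Text.class, IntWritable.class);

        // Named output written as a SequenceFile with the same key and value types.
        MultipleOutputs.addNamedOutput(job, "seqout",
                SequenceFileOutputFormat.class, Text.class, IntWritable.class);

        return job;
    }
}
```

Each call to addNamedOutput carries its own OutputFormat class and key and value types, so records written to seqout would be stored as a SequenceFile while records written to textout would be stored as text. Named output names should contain only letters and digits.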
The following steps show you how to use the MultipleOutputs feature to output two different datasets from a Hadoop MapReduce computation:
Configure and name the multiple outputs using the Hadoop driver program:
Job job = Job.getInstance(getConf(), "log-analysis");
…
FileOutputFormat.setOutputPath(job, new Path(outputPath));
MultipleOutputs.addNamedOutput(job, "responsesizes", TextOutputFormat.class, Text.class, IntWritable.class);
MultipleOutputs.addNamedOutput(job, "timestamps", TextOutputFormat.class, Text.class, Text.class);