In this recipe, we are going to learn how to write MapReduce output to multiple output files. This is useful when the output needs to be classified and used for different purposes.
To perform this recipe, you should have a running Hadoop cluster as well as an IDE such as Eclipse.
Hadoop provides a class called MultipleOutputs, which allows us to write the output of a MapReduce program to multiple files. With it, we can write output to different files, file types, and locations. You can also choose the filename with this API. To use it, we will take a simple word count program and write its output to multiple output files.
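Before touching the Hadoop API, it can help to see the underlying idea in isolation. The following is a minimal plain-Java sketch, not Hadoop code: it counts words and then routes each result to a named output, writing one file per name. The class NamedOutputSketch, its methods, and the "frequent"/"rare" routing rule are all hypothetical names chosen for illustration; MultipleOutputs plays the corresponding role inside a real MapReduce job.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only: no Hadoop classes are used here.
public class NamedOutputSketch {

    // Count how often each whitespace-separated word occurs (sorted by word).
    public static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Hypothetical routing rule: words seen once go to "rare", all others to "frequent".
    public static String outputNameFor(int count) {
        return count > 1 ? "frequent" : "rare";
    }

    // Group "word<TAB>count" lines by the named output they belong to.
    public static Map<String, List<String>> route(Map<String, Integer> counts) {
        Map<String, List<String>> outputs = new TreeMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            outputs.computeIfAbsent(outputNameFor(e.getValue()), k -> new ArrayList<>())
                   .add(e.getKey() + "\t" + e.getValue());
        }
        return outputs;
    }

    // Write one file per named output, for example <dir>/frequent.txt and <dir>/rare.txt.
    public static void writeAll(Path dir, Map<String, List<String>> outputs) throws IOException {
        Files.createDirectories(dir);
        for (Map.Entry<String, List<String>> e : outputs.entrySet()) {
            Files.write(dir.resolve(e.getKey() + ".txt"), e.getValue());
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, List<String>> outputs = route(countWords("to be or not to be"));
        Path dir = Files.createTempDirectory("named-outputs");
        writeAll(dir, outputs);
        System.out.println("Wrote " + outputs.keySet() + " under " + dir);
    }
}
```

Each record's destination is decided at write time from the record itself, and each destination has its own name and file. That is the same choice a reducer makes when it writes to a named output instead of the job's default output.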
To do so, we need to add the named output files and their types in our Driver code, as shown here:
public class WordCount {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        if (args.length != 2) {...