The output of a MapReduce computation is often consumed by other applications. It is therefore important to store the result in a format that the target application can consume efficiently, and to organize the data in a location that the application can access efficiently. We can use the Hadoop OutputFormat interface to define the data storage format, the data storage location, and the organization of the output data of a MapReduce computation. An OutputFormat prepares the output location and provides a RecordWriter implementation to perform the actual serialization and storage of data.
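The division of work between an OutputFormat and its RecordWriter can be illustrated with a small, self-contained sketch. It does not use the Hadoop API: SimpleOutputFormat, SimpleRecordWriter, and LineOutputFormat are hypothetical stand-ins written only for illustration, and the tab separator and the part-NNNNN file naming are assumptions of this sketch. The existing-directory check is loosely modeled on the way file-based Hadoop output formats refuse to overwrite an existing output directory.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for a record writer: serializes and stores records.
interface SimpleRecordWriter<K, V> extends AutoCloseable {
    void write(K key, V value) throws IOException;

    @Override
    void close() throws IOException;
}

// Hypothetical stand-in for an output format: prepares the output
// location and hands out record writers for it.
interface SimpleOutputFormat<K, V> {
    void checkOutputLocation(Path outputDir) throws IOException;

    SimpleRecordWriter<K, V> getRecordWriter(Path outputDir, int taskId) throws IOException;
}

// Writes one record per line as key, tab, value (an assumption of this sketch).
final class LineOutputFormat<K, V> implements SimpleOutputFormat<K, V> {

    // Prepare the location: refuse to reuse an existing directory, then create it.
    @Override
    public void checkOutputLocation(Path outputDir) throws IOException {
        if (Files.exists(outputDir)) {
            throw new IOException("Output directory already exists: " + outputDir);
        }
        Files.createDirectories(outputDir);
    }

    // Each task gets its own file, so tasks never write to the same file.
    @Override
    public SimpleRecordWriter<K, V> getRecordWriter(Path outputDir, int taskId) throws IOException {
        Path file = outputDir.resolve(String.format("part-%05d", taskId));
        BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
        return new SimpleRecordWriter<K, V>() {
            @Override
            public void write(K key, V value) throws IOException {
                out.write(key + "\t" + value);
                out.write('\n');
            }

            @Override
            public void close() throws IOException {
                out.close();
            }
        };
    }
}
```

In this sketch, the caller first invokes checkOutputLocation once for the whole job, then asks for one writer per task and closes it when the task finishes. The format object decides where and how data is organized, while the writer only deals with serializing individual records.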
Hadoop uses the org.apache.hadoop.mapreduce.lib.output.TextOutputFormat<K,V> class as the default OutputFormat for MapReduce computations. TextOutputFormat writes the records of the output data to plain text files in HDFS using a separate line...