In this chapter, we will cover the following recipes:
Choosing appropriate Hadoop data types
Implementing a custom Hadoop Writable data type
Implementing a custom Hadoop key type
Emitting data of different value types from a Mapper
Choosing a suitable Hadoop InputFormat for your input data format
Adding support for new input data formats – implementing a custom InputFormat
Formatting the results of MapReduce computations – using Hadoop OutputFormats
Writing multiple outputs from a MapReduce computation
Hadoop intermediate data partitioning
Secondary sorting – sorting Reduce input values
Broadcasting and distributing shared resources to tasks in a MapReduce job – Hadoop DistributedCache
Using Hadoop with legacy applications – Hadoop Streaming
Adding dependencies between MapReduce jobs
Hadoop counters for reporting custom metrics