The driver class contains the main method. It is where the Hadoop job is created, where its mapper, reducer, and other configuration settings are declared, and where the job is launched.
We will name this class HitsByCountry and create it inside the learning.bigdata.main package. Your driver class should have the following signature:
public class HitsByCountry extends Configured implements Tool {
    // Here we will have the main method as well as the overridden implementation of the run method
}
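One practical reason for pairing Configured with Tool is that Hadoop's ToolRunner can parse generic options such as -D mapred.reduce.tasks=4 into the configuration before it calls run. The JDK-only sketch below imitates that flow so it can be run without a Hadoop installation. Everything in it (SimpleTool, HitsByCountrySketch, parseGenericOptions, runTool) is a hypothetical stand-in, not a Hadoop API, and it handles only the "-D key=value" form; a real driver would implement org.apache.hadoop.util.Tool and be launched through ToolRunner.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: JDK stand-ins that imitate the shape of the
// Configured/Tool pattern. None of these types are Hadoop classes.
public class DriverSketch {

    /** Hypothetical stand-in for Hadoop's Tool interface. */
    public interface SimpleTool {
        int run(Map<String, String> conf, String[] args);
    }

    /** Hypothetical stand-in for the HitsByCountry driver. */
    public static class HitsByCountrySketch implements SimpleTool {
        @Override
        public int run(Map<String, String> conf, String[] args) {
            if (args.length != 2) {
                System.err.println("Usage: HitsByCountry <input> <output>");
                return 2; // non-zero exit code signals a usage error
            }
            // A real driver would create the job here and register
            // its mapper and reducer; this sketch only reports settings.
            String reducers = conf.getOrDefault("mapred.reduce.tasks", "(not set)");
            System.out.println("input=" + args[0] + " output=" + args[1]
                    + " mapred.reduce.tasks=" + reducers);
            return 0;
        }
    }

    /** Moves "-D key=value" pairs into conf and returns the remaining arguments. */
    public static String[] parseGenericOptions(String[] args, Map<String, String> conf) {
        List<String> rest = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if ("-D".equals(args[i]) && i + 1 < args.length) {
                String[] kv = args[++i].split("=", 2);
                if (kv.length == 2) {
                    conf.put(kv[0], kv[1]);
                }
            } else {
                rest.add(args[i]);
            }
        }
        return rest.toArray(new String[0]);
    }

    /** Simplified imitation of a runner: parse options first, then call run. */
    public static int runTool(SimpleTool tool, String[] args) {
        Map<String, String> conf = new HashMap<>();
        String[] rest = parseGenericOptions(args, conf);
        return tool.run(conf, rest);
    }

    public static void main(String[] args) {
        String[] demo = {"-D", "mapred.reduce.tasks=4", "input", "output"};
        int exitCode = runTool(new HitsByCountrySketch(), demo);
        System.out.println("exit code: " + exitCode);
    }
}
```

The point of the design is that run never parses configuration flags itself: it receives an already populated configuration plus only the job-specific arguments, so the same driver can be tuned from the command line without code changes.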
The driver class extends the Configured class and implements the Tool interface. There are many Hadoop configurations you can set in the driver class while creating a job. For example, you can set the number of reducers using the mapred.reduce.tasks configuration, and you can set the separator between the key and value in your reducer output, when using TextOutputFormat,
with mapred.textoutputformat...