Counters collect statistics at the job level. They help with quality control, performance monitoring, and problem identification in Hadoop MapReduce jobs. Because the framework aggregates counter values across all tasks of a job, they can be analyzed directly, unlike logs, which must first be gathered from the individual tasks. Counters are organized into logical groups using the CounterGroup class, and every MapReduce job has sets of built-in counters.
The following example creates simple custom counters that categorize lines into three groups: lines with zero words, lines with five or fewer words, and lines with more than five words. When run on the grant proposal subset files, the program gives the following output:
14/04/13 23:27:00 INFO mapreduce.Job: Counters: 23
    File System Counters
        FILE: Number of bytes read=446021466
        FILE: Number of bytes written=114627807
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations...
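The line categorization behind such custom counters can be sketched in plain Java, without Hadoop dependencies, so that it runs on its own. This is a minimal illustration under stated assumptions, not the program that produced the output above: the class name LineCategoryCounters, the enum WordsNature, and its constant names are all hypothetical, and splitting on whitespace is an assumed definition of a word. In a real mapper, each tally would instead be recorded with context.getCounter(category).increment(1), and the framework would sum the values across tasks.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class LineCategoryCounters {

    // Hypothetical counter names; in Hadoop, an enum like this defines one
    // counter group with one counter per constant.
    public enum WordsNature {
        ZERO_WORDS,
        LESS_THAN_EQUAL_TO_FIVE_WORDS,
        MORE_THAN_FIVE_WORDS
    }

    // Decide which counter a line belongs to.
    // Assumption: words are runs of non-whitespace characters.
    public static WordsNature categorize(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return WordsNature.ZERO_WORDS;
        }
        int words = trimmed.split("\\s+").length;
        return words <= 5
                ? WordsNature.LESS_THAN_EQUAL_TO_FIVE_WORDS
                : WordsNature.MORE_THAN_FIVE_WORDS;
    }

    // Local stand-in for the framework's counters: one running total per
    // category. A mapper would call
    // context.getCounter(categorize(line)).increment(1) instead.
    public static Map<WordsNature, Long> count(Iterable<String> lines) {
        Map<WordsNature, Long> counters = new EnumMap<>(WordsNature.class);
        for (WordsNature category : WordsNature.values()) {
            counters.put(category, 0L);
        }
        for (String line : lines) {
            counters.merge(categorize(line), 1L, Long::sum);
        }
        return counters;
    }

    public static void main(String[] args) {
        // Made-up sample lines, one per category.
        List<String> sample = Arrays.asList(
                "",
                "grant proposal text",
                "a line with more than five words in it");
        System.out.println(count(sample));
    }
}
```

Using an enum keeps the counter names in one place and lets the compiler catch misspelled counter names, which is the main reason this form is usually preferred over string-named counters.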