Let's quickly recap the problem we discussed in Chapter 2, MapReduce.
Note
Problem statement
Given access logs, you need to count the number of hits to your website per country. The input access logs will be in the following form:
Date, Requesting-IP-Address(remote host)
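To make the problem statement concrete before we bring in Hadoop, here is a minimal plain-Java sketch of the counting logic. It is not the Hadoop solution itself. The class name, the method names, and above all the countryFor lookup are hypothetical: the hard-coded prefixes use reserved documentation IP ranges with made-up country codes, whereas a real job would consult a GeoIP database. We also assume that the IP address is the text after the last comma on each line.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class HitsPerCountrySketch {

    // Hypothetical lookup for illustration only. The prefixes are reserved
    // documentation ranges and the country codes are invented; a real
    // solution would query a GeoIP database instead.
    static String countryFor(String ip) {
        if (ip.startsWith("192.0.2.")) return "IN";
        if (ip.startsWith("198.51.100.")) return "US";
        return "UNKNOWN";
    }

    // "Map" step: turn one log line (Date, IP) into a country key.
    // Returns null for lines that do not contain a comma.
    static String mapLine(String line) {
        int comma = line.lastIndexOf(',');
        if (comma < 0) return null; // malformed line, skip it
        String ip = line.substring(comma + 1).trim();
        return countryFor(ip);
    }

    // "Reduce" step: add up one hit per line for each country.
    static Map<String, Integer> countHits(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String line : lines) {
            String country = mapLine(line);
            if (country == null) continue;
            Integer current = counts.get(country);
            counts.put(country, current == null ? 1 : current + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> logs = Arrays.asList(
                "2014-01-01, 192.0.2.10",
                "2014-01-01, 198.51.100.7",
                "2014-01-02, 192.0.2.44");
        System.out.println(countHits(logs)); // prints {IN=2, US=1}
    }
}
```

In the Hadoop version, the work done by mapLine and countHits is split between a mapper and a reducer, and the framework takes care of grouping the keys between the two.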
We are going to create a solution for this problem in Java, to be executed on Hadoop 2.2.0. In Chapter 9, Hadoop Streaming and Advanced Hadoop Customizations, we will see how to use Hadoop Streaming to write mappers and reducers in other languages, such as Python and Ruby.
Hadoop 2.2.0 requires Java 7 or a late update of Java 6 (such as Oracle 1.6.0_31). We recommend Java 7, preferably Oracle Java. You can refer to http://wiki.apache.org/hadoop/HadoopJavaVersions for more information on available JREs for Hadoop.
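If you are not sure which Java version your environment runs, a small check such as the following can help. This is only a sketch: the class and method names are our own, and the parsing assumes the usual java.version formats (the legacy 1.x.y_z scheme and the newer x.y.z scheme).

```java
public class JavaVersionCheck {

    // Extracts the major version from a java.version string.
    // Legacy scheme: "1.6.0_31" gives 6. Newer scheme: "17.0.1" gives 17.
    static int majorVersion(String version) {
        String[] parts = version.split("[._\\-+]");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    public static void main(String[] args) {
        // java.version is a standard JVM system property.
        String version = System.getProperty("java.version");
        int major = majorVersion(version);
        String verdict = major >= 7 ? "Java 7 or later" : "older than Java 7";
        System.out.println("Running on Java " + version + " (" + verdict + ")");
    }
}
```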
We use Eclipse as our preferred IDE, but you may use any other IDE you like. We also recommend you to...