Before we use Hadoop to run our MapReduce applications, there are some key terms we should understand.
These are briefly described as follows:
NameNode: The NameNode is responsible for keeping the directory tree of all the files stored in the system and tracking where the file data is stored in the cluster.
JobTracker: The JobTracker passes out MapReduce tasks to each Raspberry Pi in our cluster.
DataNode: The DataNodes in our cluster use the HDFS to store replicated data.
TaskTracker: A TaskTracker is a node in our Raspberry Pi cluster that accepts tasks, such as map and reduce operations, from the JobTracker.
Default configuration: A default configuration file provides the base settings that the site-specific files override or extend. An example is the core-default.xml file.
Site-specific configuration: You will be familiar with this from the previous chapter. This configuration contains specifics about our own development environment, such as our Raspberry Pi's IP address.
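To make the relationship between the default and site-specific configuration more concrete, the following is a minimal sketch in Python of how a site-specific file can be layered over a default file. It is a simplified model of the idea rather than Hadoop's own loading code. The XML snippets are trimmed stand-ins for core-default.xml and core-site.xml, the property values are illustrative, and the IP address and port are placeholders, not values taken from our cluster.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for core-default.xml (illustrative values only).
DEFAULT_XML = """<configuration>
  <property><name>fs.default.name</name><value>file:///</value></property>
  <property><name>hadoop.tmp.dir</name><value>/tmp/hadoop-${user.name}</value></property>
</configuration>"""

# Hypothetical core-site.xml; the IP address and port are placeholders.
SITE_XML = """<configuration>
  <property><name>fs.default.name</name><value>hdfs://192.168.1.10:54310</value></property>
</configuration>"""


def read_properties(xml_text):
    """Return a dict mapping each property name to its value."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def effective_config(default_xml, site_xml):
    """Start from the defaults, then let site-specific values override them."""
    config = read_properties(default_xml)
    config.update(read_properties(site_xml))
    return config


config = effective_config(DEFAULT_XML, SITE_XML)
for name, value in sorted(config.items()):
    print(name, "=", value)
```

In this sketch, fs.default.name takes its value from the site-specific snippet, while hadoop.tmp.dir keeps its default because the site-specific snippet does not mention it.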