This chapter describes how to perform advanced administration steps on your Hadoop cluster, how to develop unit and integration tests for Hadoop MapReduce programs, and how to use the Java API of HDFS. It assumes that you have followed the first chapter and have installed Hadoop in a clustered or pseudo-distributed setup.
Note
Sample code and data
The sample code files for this book are available on GitHub at https://github.com/thilg/hcb-v2. The chapter3 folder of the code repository contains the sample source code files for this chapter.
The sample code can be compiled and built by issuing the gradle build command in the chapter3 folder of the code repository. Project files for the Eclipse IDE can be generated by running the gradle eclipse command in the main folder of the code repository. Project files for the IntelliJ IDEA IDE can be generated by running the gradle idea command in the main folder of the code repository.