Learning Hadoop 2

Running the examples

The source code of all examples is available at

Gradle scripts and configurations are provided to compile most of the Java code. The gradlew script included with the examples bootstraps Gradle, which then fetches the dependencies and compiles the code.

JAR files can be created by invoking the jar task via the gradlew script, as follows:

$ ./gradlew jar

Jobs are usually executed by submitting a JAR file using the hadoop jar command, as follows:

$ hadoop jar example.jar <MainClass> [-libjars $LIBJARS] arg1 arg2 … argN

The optional -libjars parameter takes a comma-separated list of third-party JAR files that the job needs at runtime and ships them to the remote nodes.
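
As a minimal sketch of how the parts of this command line fit together, the helper below prints (rather than runs) a hadoop jar invocation and adds -libjars only when a dependency list is supplied. The function name hadoop_jar_cmd, the class name, and the JAR paths are made up for illustration; the one point it is meant to show is that -libjars sits after the main class and before the job's own arguments.

```shell
# Hypothetical helper: print a "hadoop jar" command line without running it.
# Usage: hadoop_jar_cmd <jar> <main-class> <libjars-or-empty> [args...]
hadoop_jar_cmd() {
  jar=$1 main=$2 libjars=$3
  shift 3
  cmd="hadoop jar $jar $main"
  # Add -libjars only when a non-empty, comma-separated list is given.
  if [ -n "$libjars" ]; then
    cmd="$cmd -libjars $libjars"
  fi
  # Append the job's own arguments after the -libjars option.
  for arg in "$@"; do
    cmd="$cmd $arg"
  done
  printf '%s\n' "$cmd"
}

# Example with made-up paths and class name.
hadoop_jar_cmd example.jar com.example.Main "lib/a.jar,lib/b.jar" in out
# prints: hadoop jar example.jar com.example.Main -libjars lib/a.jar,lib/b.jar in out
```

Passing an empty string as the third argument leaves -libjars out entirely, which matches the optional form shown in the command above.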


Some of the frameworks we will work with, such as Apache Spark, come with their own build and package management tools. Additional information and resources will be provided for these particular cases.

The copyJar Gradle task can be used to download third-party dependencies into build/libjars/<example>/lib, as follows:

$ ./gradlew copyJar
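
To hand the downloaded JARs to -libjars, they need to be joined into a single comma-separated list. One possible way to do that is sketched below; the helper name build_libjars is hypothetical, and it assumes the dependencies sit directly in the directory passed as its argument.

```shell
# Hypothetical helper: join every JAR in a directory into the
# comma-separated list that -libjars expects.
build_libjars() {
  dir=$1
  list=""
  for jar in "$dir"/*.jar; do
    # Skip the unexpanded pattern when the directory holds no JARs.
    [ -e "$jar" ] || continue
    # Prefix a comma only when the list already has an entry.
    list="${list:+$list,}$jar"
  done
  printf '%s\n' "$list"
}

# Assumed usage, once copyJar has populated the directory:
#   LIBJARS=$(build_libjars "build/libjars/<example>/lib")
```

The resulting $LIBJARS value can then be passed to the hadoop jar command through the -libjars option.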

For convenience, we provide a fatJar Gradle task that bundles the example classes and their dependencies into a single JAR file. Although this approach is discouraged in favor of using -libjars, it might come in handy when dealing with dependency issues.

The following command will generate build/libs/<example>-all.jar:

$ ./gradlew fatJar