Programming MapReduce with Scalding
Deploying an application involves two steps: using a build tool to package the application into a JAR file, and copying that JAR to a client node of the Hadoop cluster. Execution is then straightforward and very similar to submitting any other JAR file to a Hadoop cluster, as shown in the following command:
$ hadoop jar myjar.jar com.twitter.scalding.Tool mypackage.MyJob --hdfs --input /data/set1/ --output /output/res1/
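Here, mypackage.MyJob is the fully qualified name of the job class packaged in myjar.jar, and the --hdfs flag tells com.twitter.scalding.Tool to run against the cluster rather than in local mode. As a minimal sketch of what such a class might contain (the word-count logic and field names are illustrative, not taken from this chapter), consider:

package mypackage

import com.twitter.scalding._

class MyJob(args: Args) extends Job(args) {
  // --input and --output arrive through the Args object parsed by Tool.
  TextLine(args("input"))
    .flatMap('line -> 'word) { line: String => line.split("""\s+""") }
    .groupBy('word) { _.size }
    .write(Tsv(args("output")))
}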
The submitted job runs with the same HDFS permissions as the user who submitted it. If the required read and write permissions are in place, it will process the input and store the resulting data.
When storing data in HDFS, a Scalding application writes to the output folder defined by a sink in the job. Any existing content in that folder is purged every time the job begins execution.
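Because of this purge behavior, one defensive pattern (a sketch only; the class name and run-suffix scheme are hypothetical, not from this chapter) is to derive a fresh output folder for each run:

package mypackage

import com.twitter.scalding._

class TimestampedOutputJob(args: Args) extends Job(args) {
  // Hypothetical safeguard: append a run-specific suffix so a re-run
  // writes to a new folder instead of purging the previous results.
  val outDir = args("output") + "/run-" + System.currentTimeMillis

  // Copy the input through unchanged; any real pipeline would go here.
  Tsv(args("input"))
    .read
    .write(Tsv(outDir))
}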
Internally, the JAR file is submitted to the JobTracker service, which orchestrates the execution of the map and reduce phases. The actual tasks are executed...