Rapid - Apache Mahout Clustering designs

Launching the Mahout job on the cluster


Mahout ships with a launcher script, mahout, in the bin folder of the installation. Note the following excerpt, which starts at about line 120 of that script:

# CLASSPATH initially contains $MAHOUT_CONF_DIR, or defaults to $MAHOUT_HOME/src/conf
CLASSPATH=${CLASSPATH}:$MAHOUT_CONF_DIR

if [ "$MAHOUT_LOCAL" != "" ]; then
  echo "MAHOUT_LOCAL is set, so we don't add HADOOP_CONF_DIR to classpath."
elif [ -n "$HADOOP_CONF_DIR" ]; then
  echo "MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath."
  CLASSPATH=${CLASSPATH}:$HADOOP_CONF_DIR
fi
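To see the effect of this branch in isolation, the same logic can be wrapped in a small function and tried with different settings. This is only a sketch: the function name mahout_classpath and the sample paths are made up for illustration and are not part of the Mahout script, and unlike the real script it starts from an empty CLASSPATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of bin/mahout): reproduces the branch above
# and prints the resulting classpath instead of launching anything.
# Simplification: the real script appends to any existing CLASSPATH.
mahout_classpath() {
  local cp="$MAHOUT_CONF_DIR"
  if [ "$MAHOUT_LOCAL" != "" ]; then
    :  # local mode: HADOOP_CONF_DIR is deliberately left out
  elif [ -n "$HADOOP_CONF_DIR" ]; then
    cp="$cp:$HADOOP_CONF_DIR"
  fi
  echo "$cp"
}

# Sample paths are placeholders, not real installation locations.
MAHOUT_CONF_DIR=/opt/mahout/conf
HADOOP_CONF_DIR=/opt/hadoop/conf

MAHOUT_LOCAL=true
mahout_classpath   # prints /opt/mahout/conf

MAHOUT_LOCAL=
mahout_classpath   # prints /opt/mahout/conf:/opt/hadoop/conf
```

Any non-empty value of MAHOUT_LOCAL selects local mode, so the variable has to be empty or unset, not merely set to false, for the Hadoop configuration to be picked up.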

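To launch a Mahout job (algorithm) on the Hadoop cluster, set HADOOP_HOME and HADOOP_CONF_DIR and leave MAHOUT_LOCAL unset. As the script excerpt shows, HADOOP_CONF_DIR is added to the classpath only when MAHOUT_LOCAL is empty.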

Just before running the algorithm command, set these two environment variables with the export command:

export HADOOP_HOME=<your Hadoop location>
export HADOOP_CONF_DIR=$HADOOP_HOME/conf
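Before submitting a long-running job, it can help to confirm that the environment really points at the cluster. The following is a minimal sketch of such a check; check_cluster_env is a hypothetical helper, not something Mahout provides, and the commented usage lines keep the placeholders from the text above.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of Mahout): confirms the environment
# would send the job to the cluster rather than run it locally.
check_cluster_env() {
  if [ -n "$MAHOUT_LOCAL" ]; then
    echo "MAHOUT_LOCAL is set; the job would run locally." >&2
    return 1
  fi
  if [ -z "$HADOOP_HOME" ] || [ -z "$HADOOP_CONF_DIR" ]; then
    echo "HADOOP_HOME and HADOOP_CONF_DIR must both be set." >&2
    return 1
  fi
  if [ ! -d "$HADOOP_CONF_DIR" ]; then
    echo "HADOOP_CONF_DIR does not exist: $HADOOP_CONF_DIR" >&2
    return 1
  fi
  echo "cluster"
}

# Typical use, with your own paths and algorithm command:
#   unset MAHOUT_LOCAL
#   export HADOOP_HOME=<your Hadoop location>
#   export HADOOP_CONF_DIR=$HADOOP_HOME/conf
#   check_cluster_env && $MAHOUT_HOME/bin/mahout <algorithm command>
```

The directory test is there because the location of the Hadoop configuration folder can differ between installations, so it is worth verifying that the exported path exists.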

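The Mahout launcher script can therefore run a job either locally (when MAHOUT_LOCAL is set) or on a cluster (when MAHOUT_LOCAL is unset and HADOOP_CONF_DIR points to the Hadoop configuration).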