Apache Mahout is a scalable machine learning library for Big Data. It runs on top of the Hadoop stack and already implements a wide range of machine learning algorithms. In this recipe, we will outline the steps to configure Apache Mahout.
Before we install Mahout, we need to make sure Hadoop has been properly installed.
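One way to confirm this precondition is to check that the hadoop launcher is on the PATH of the current user. The following is only a minimal sketch: the helper name require_cmd is hypothetical, and it assumes the Hadoop bin directory has been added to PATH.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop or Mahout):
# report whether a command can be found on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1 found"
    else
        echo "missing: $1 not on PATH"
        return 1
    fi
}

# Check for the Hadoop launcher; "|| true" keeps the script going if it is absent.
require_cmd hadoop || true
```

If the launcher is found, running hadoop version additionally prints the installed release.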
Download Mahout from the mirror site with the following command on the master node:
wget http://www.eng.lsu.edu/mirrors/apache/mahout/0.7/mahout-distribution-0.7.tar.gz -P ~/repo
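Mirror downloads are sometimes truncated, so it can be worth testing the archive before installing it. The sketch below assumes the file was saved under ~/repo as in the command above; the helper name archive_ok is hypothetical, and the check only covers the gzip layer of the archive.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file exists
# and passes gzip's built-in integrity test.
archive_ok() {
    [ -f "$1" ] && gzip -t "$1" 2>/dev/null
}

if archive_ok "$HOME/repo/mahout-distribution-0.7.tar.gz"; then
    echo "archive looks intact"
else
    echo "archive missing or corrupt; download it again"
fi
```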
Use the following recipe to install Mahout:
Log in to the master node from the Hadoop administrator machine as hduser with the following command:
ssh hduser@master
Copy the archive to /usr/local with the following command:
sudo wget ftp://hadoop.admin/repo/mahout-distribution-0.7.tar.gz -P /usr/local
Decompress the Mahout archive with the following commands:
cd /usr/local
sudo tar xvf mahout-distribution-0.7.tar...