Apache Mahout Essentials
In this section, we will have a quick look at Apache Mahout.
Do you know how Mahout got its name?

A mahout is a person who drives an elephant, as the Mahout logo depicts. Hadoop's logo is an elephant, so the name signals that Mahout's goal is to use Hadoop in the right manner.
The following are the features of Mahout:
For those of you who are curious: what problems is Mahout trying to solve? They are the following:
The amount of available data is growing drastically.
The computer hardware market is geared toward providing better performance. Machine learning algorithms are computationally expensive, yet there was no framework sufficient to harness the power of that hardware (multicore computers) to run them faster.
A parallel programming framework is needed to speed up machine learning algorithms.
Mahout provides a general parallelization approach for machine learning algorithms (the parallelization method is not algorithm-specific).
No specialized optimizations are required to improve the performance of each algorithm; you just need to add some more cores.
Speedup is roughly linear in the number of cores.
Each algorithm, such as Naïve Bayes, K-Means, and Expectation-maximization, is expressed in summation form. (I will explain this in detail in future chapters.)
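To give a rough idea of what summation form means, here is a minimal sketch in plain Java. It is not Mahout code and does not use the Mahout API. It uses simple linear regression as the example because its result depends only on a handful of sums over the data points. The data is split into chunks, each chunk computes its partial sums independently (the map step), and the partial sums are added together (the reduce step). The class name SummationFormSketch, its method names, and the use of Java parallel streams as a stand-in for MapReduce are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

/** Illustrative only: a hypothetical class, not part of the Mahout API. */
class SummationFormSketch {

    /** Map step: sums over one chunk [from, to): n, sum x, sum y, sum x*x, sum x*y. */
    static double[] partialSums(double[] x, double[] y, int from, int to) {
        double[] s = new double[5];
        for (int i = from; i < to; i++) {
            s[0] += 1;
            s[1] += x[i];
            s[2] += y[i];
            s[3] += x[i] * x[i];
            s[4] += x[i] * y[i];
        }
        return s;
    }

    /** Reduce step: element-wise addition of two partial-sum arrays. */
    static double[] combine(double[] a, double[] b) {
        double[] r = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = a[i] + b[i];
        }
        return r;
    }

    /** Fits y = intercept + slope * x from the combined sums; returns {intercept, slope}. */
    static double[] fitLine(double[] x, double[] y, int chunks) {
        int n = x.length;
        // Each chunk is processed independently, so the chunks can run on separate cores.
        double[] s = IntStream.range(0, chunks)
            .parallel()
            .mapToObj(c -> partialSums(x, y, c * n / chunks, (c + 1) * n / chunks))
            .reduce(new double[5], SummationFormSketch::combine);
        double slope = (s[0] * s[4] - s[1] * s[2]) / (s[0] * s[3] - s[1] * s[1]);
        double intercept = (s[2] - slope * s[1]) / s[0];
        return new double[] {intercept, slope};
    }

    public static void main(String[] args) {
        double[] x = {0, 1, 2, 3, 4, 5, 6, 7};
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            y[i] = 1.0 + 2.0 * x[i]; // points on the line y = 1 + 2x
        }
        System.out.println(Arrays.toString(fitLine(x, y, 4)));
    }
}
```

Because the fit depends only on the combined sums, changing the number of chunks changes how the work is divided but not the result (up to floating-point rounding). That is the property that lets the same parallelization scheme serve many different algorithms.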
For more information, please read Map-Reduce for Machine Learning on Multicore, which can be found at http://www.cs.stanford.edu/people/ang/papers/nips06-mapreducemulticore.pdf.
Download the latest release of Mahout from https://mahout.apache.org/general/downloads.html.
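If you are referencing Mahout as a Maven project, add the following dependency to the pom.xml file (${mahout.version} is a Maven property that you define in the <properties> section of pom.xml, or replace with a concrete version number):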
<dependency>
  <groupId>org.apache.mahout</groupId>
  <artifactId>mahout-core</artifactId>
  <version>${mahout.version}</version>
</dependency>
If required, add the following Maven dependencies as well:
<dependency>
  <groupId>org.apache.mahout</groupId>
  <artifactId>mahout-math</artifactId>
  <version>${mahout.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.mahout</groupId>
  <artifactId>mahout-integration</artifactId>
  <version>${mahout.version}</version>
</dependency>
Downloading the example code
You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
More details on setting up a Maven project can be found at http://maven.apache.org/.
Follow the instructions given at https://mahout.apache.org/developers/buildingmahout.html to build Mahout from the source.
The Mahout command-line launcher is located at bin/mahout.