Hadoop MapReduce v2 Cookbook - Second Edition: RAW

Getting started with Apache Mahout


Mahout is an effort to implement well-known machine learning and data mining algorithms on top of the Hadoop MapReduce framework. You can use Mahout's algorithm implementations in your data processing applications without the complexity of implementing these algorithms from scratch on Hadoop MapReduce.
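
To give a sense of the kind of algorithm involved, the following is a minimal single-machine sketch of K-means clustering, the algorithm this recipe runs as its sample computation. It is not Mahout's implementation and does not use any Mahout or Hadoop API; the class name `KMeansSketch`, its method names, and the sample points are hypothetical and chosen only for illustration. A distributed version would additionally have to partition the points across map tasks and merge the partial sums in reducers, which is the work Mahout saves you.

```java
import java.util.Arrays;

// Illustrative only: a single-machine K-means loop, not Mahout's API.
final class KMeansSketch {

    // Refines the given initial centroids for at most maxIterations passes.
    static double[][] cluster(double[][] points, double[][] initialCentroids, int maxIterations) {
        int k = initialCentroids.length;
        int dims = points[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = Arrays.copyOf(initialCentroids[c], dims);
        }

        for (int iter = 0; iter < maxIterations; iter++) {
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];

            // Assignment step: add each point to the running sum of its nearest centroid.
            for (double[] p : points) {
                int nearest = nearestCentroid(p, centroids);
                counts[nearest]++;
                for (int d = 0; d < dims; d++) {
                    sums[nearest][d] += p[d];
                }
            }

            // Update step: move each non-empty centroid to the mean of its points.
            boolean changed = false;
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous centroid
                }
                for (int d = 0; d < dims; d++) {
                    double mean = sums[c][d] / counts[c];
                    if (mean != centroids[c][d]) {
                        changed = true;
                    }
                    centroids[c][d] = mean;
                }
            }
            if (!changed) {
                break; // no centroid moved, so the clustering has converged
            }
        }
        return centroids;
    }

    // Returns the index of the centroid with the smallest squared Euclidean distance to p.
    static int nearestCentroid(double[] p, double[][] centroids) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0;
            for (int d = 0; d < p.length; d++) {
                double diff = p[d] - centroids[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: two well-separated groups of 2D points.
        double[][] points = {{1.0, 1.0}, {1.0, 3.0}, {9.0, 9.0}, {9.0, 7.0}};
        double[][] start = {{0.0, 0.0}, {10.0, 10.0}};
        System.out.println(Arrays.deepToString(cluster(points, start, 10)));
    }
}
```

Even this small version has to handle details such as empty clusters and a convergence check, and it assumes that all points fit in memory on one machine.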

This recipe explains how to get started with Mahout.

To install Mahout, we recommend that you use one of the freely available commercial Hadoop distributions described in Chapter 1, Getting Started with Hadoop v2. Alternatively, you can use Apache Bigtop to install Mahout. Refer to the Bigtop-related recipe in Chapter 1, Getting Started with Hadoop v2, for the steps to install Mahout using the Apache Bigtop distribution.

How to do it...

This section demonstrates how to get started with Mahout by running a sample KMeans clustering computation. You can run the sample and verify your Mahout installation by carrying out the following steps:

  1. Download the...