Apache Mahout Essentials

By: Jayani Withanawasam

Overview of this book

If you are a Java developer or data scientist, haven't worked with Apache Mahout before, and want to get up to speed on implementing machine learning on big data, then this is the perfect guide for you.

Machine learning libraries

Machine learning libraries can be categorized using different criteria, which are explained in the sections that follow.

Open source or commercial

Free and open source libraries are cost-effective solutions, and most of them provide a framework that allows you to implement new algorithms on your own. Support for these libraries is generally not as good as the support available for proprietary libraries, although some open source libraries have very active mailing lists that help to address this issue.

Apache Mahout, OpenCV, Spark MLlib, and Mallet are some open source libraries.

MATLAB is a commercial numerical environment that contains a machine learning library.

Scalability

Machine learning algorithms are resource-intensive operations in terms of CPU, memory, and storage, and most of the time they are applied to large volumes of data. Decentralization (for example, of data and algorithms), distribution, and replication techniques are therefore used to scale out a system. The following libraries address scalability in different ways:

  • Apache Mahout (data distributed over clusters and parallel algorithms)
  • Spark MLlib (distributed memory-based Spark architecture)
  • MLPACK (low memory or CPU requirements due to the use of C++)
  • GraphLab (multicore parallelism)
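
The idea of distributing data and running the same algorithm on each part in parallel can be sketched in a few lines. This is only an illustration and not how Apache Mahout or any other library listed above is implemented: threads on one machine stand in for cluster nodes, and the function names (`partition`, `local_stats`, `distributed_mean`) are hypothetical. Each worker summarizes only its own chunk, and the partial results are then combined.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, n_parts):
    """Split data into n_parts contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks take one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def local_stats(chunk):
    """Map step: a worker sees only its own chunk and returns (sum, count)."""
    return sum(chunk), len(chunk)


def distributed_mean(data, n_parts=4):
    """Reduce step: combine per-chunk partial results into a global mean."""
    chunks = partition(data, n_parts)
    # Threads are an assumption standing in for separate cluster nodes.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(local_stats, chunks))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


values = list(range(1, 101))
print(distributed_mean(values))
```

Because each worker returns only a small summary (a sum and a count), the amount of data moved between workers stays small even when the chunks themselves are large, which is the property that lets this style of computation scale out.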

Languages used

Most machine learning libraries are implemented in languages such as Java, C#, C++, Python, and Scala.

Algorithm support

Machine learning libraries, such as R packages and Weka, implement many machine learning algorithms. However, they are not scalable. Among scalable machine learning libraries, Apache Mahout has better algorithm support than Spark MLlib at the moment, as Spark MLlib is relatively young.

Batch processing versus stream processing

Stream processing mechanisms, for example, Jubatus and Samoa, use incremental learning to update a model as soon as new data is received.

In batch processing, data is collected over a period of time and then processed together. In the context of machine learning, the model is updated after collecting data for a period of time. The batch processing mechanism (for example, Apache Mahout) is mostly suitable for processing large volumes of data.
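
The difference between the two update styles can be illustrated with a deliberately tiny "model": the mean of the values seen so far. This is a minimal sketch under that assumption, not the API of Jubatus, Samoa, or Apache Mahout, and the names `IncrementalMean` and `batch_mean` are hypothetical. The incremental version updates its state on every new value without storing past data, while the batch version waits until all the data has been collected.

```python
class IncrementalMean:
    """Stream-style model: updated immediately after each new value."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, x):
        # Incremental update: only the current mean and count are kept.
        self.n += 1
        self.mean += (x - self.mean) / self.n
        return self.mean


def batch_mean(batch):
    """Batch-style model: computed once over data collected for a period."""
    return sum(batch) / len(batch)


readings = [2.0, 4.0, 6.0, 8.0]

stream_model = IncrementalMean()
# The stream model is usable after every single reading.
history = [stream_model.update(x) for x in readings]

# The batch model exists only after the whole batch has been collected.
batch_model = batch_mean(readings)

print(history)
print(batch_model)
```

Both approaches arrive at the same final value here; what differs is when the model becomes available and how much past data has to be kept around to compute it.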

LIBSVM implements support vector machines and it is specialized for that purpose.

A comparison of some of the popular machine learning libraries is given in the following table.

Table 1: Comparison between popular machine learning libraries

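| Machine learning library | Open source or commercial | Scalable? | Language used | Algorithm support |
|--------------------------|---------------------------|-----------|---------------|-------------------|
| MATLAB                   | Commercial                | No        | Mostly C      | High              |
| R packages               | Open source               | No        | R             | High              |
| Weka                     | Open source               | No        | Java          | High              |
| Sci-Kit Learn            | Open source               | No        | Python        |                   |
| Apache Mahout            | Open source               | Yes       | Java          | Medium            |
| Spark MLlib              | Open source               | Yes       | Scala         | Low               |
| Samoa                    | Open source               | Yes       | Java          |                   |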