Mastering Clojure Data Analysis

By: Eric Richard Rochester

Using the Weka machine learning library


We're going to test a couple of machine learning algorithms that are commonly used for sentiment analysis. Some of them are implemented in the OpenNLP library, but it has nothing for the other algorithms. So instead, we'll use the Weka machine learning library (http://www.cs.waikato.ac.nz/ml/weka/). Weka doesn't have the classes to tokenize or segment the data that a natural language processing application requires, but it does have a more complete palette of machine learning algorithms.

All of the classes in the Weka library also have a standard, consistent interface. These classes are really designed to be used from the command line, so each takes its options as an array of strings with a command-line-like syntax. For example, the array for a naive Bayesian classifier may have a flag to indicate that it should use the kernel density estimator rather than the normal distribution. This would be indicated by the -K flag being included...
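To make the options-array convention concrete, here is a minimal sketch of passing the -K flag to Weka's naive Bayes classifier from Clojure. It assumes that the Weka JAR is on the classpath. The namespace name and the helper functions `->options` and `naive-bayes` are hypothetical names introduced for this illustration; only `NaiveBayes`, `setOptions`, and `getUseKernelEstimator` come from Weka itself.

```clojure
(ns sentiment.weka-options   ; hypothetical namespace, for illustration only
  (:import [weka.classifiers.bayes NaiveBayes]))

(defn ->options
  "Converts a sequence of option strings into the String[] that
  Weka's setOptions methods expect. (Hypothetical helper.)"
  [opts]
  (into-array String opts))

(defn naive-bayes
  "Creates a NaiveBayes classifier configured from command-line-style
  options, such as [\"-K\"]. (Hypothetical helper.)"
  [opts]
  (doto (NaiveBayes.)
    (.setOptions (->options opts))))

;; -K asks the classifier to use a kernel density estimator
;; rather than a normal distribution for numeric attributes.
(def kernel-nb (naive-bayes ["-K"]))

;; Read the setting back to check that the flag was applied.
(.getUseKernelEstimator kernel-nb)
```

Wrapping the array conversion in a small function keeps the rest of the code working with ordinary Clojure vectors of strings, which are easier to build and inspect than Java arrays.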