Clojure Data Analysis Cookbook - Second Edition

By: Eric Richard Rochester

Classifying data with the Naive Bayesian classifier


Bayesian classification is a way of updating your estimate of the probability that an item is in a given category, based on what you already know about that item, the category, and the world at large. A Naive Bayesian system assumes that all of an item's features are independent of one another. Real features often violate this assumption: elevation and average snowfall are not independent (higher elevations tend to have more snow), whereas elevation and median income should be. Despite this simplification, the algorithm has proven useful in a number of areas, such as spam detection in emails, automatic language detection, and document classification. In this recipe, we'll apply it to the mushroom dataset that we looked at in the Classifying data with decision trees recipe.
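
To make the independence assumption concrete, here is a minimal from-scratch sketch in plain Clojure. It is not the Weka-based approach this recipe uses. The toy rows and the names train, score, and classify are invented for illustration and do not come from the mushroom dataset. Because the features are treated as independent, the score for a category is just the log prior plus one log likelihood per feature, here with add-one (Laplace) smoothing.

```clojure
;; Toy training data, invented for illustration: each row is [features label].
(def training
  [[{:odor :foul   :cap :flat}   :poisonous]
   [{:odor :foul   :cap :convex} :poisonous]
   [{:odor :almond :cap :flat}   :edible]
   [{:odor :almond :cap :convex} :edible]
   [{:odor :none   :cap :convex} :edible]])

(defn train
  "Counts label occurrences and [label feature value] co-occurrences."
  [rows]
  {:total  (count rows)
   :labels (frequencies (map second rows))
   :counts (frequencies
             (for [[features label] rows
                   [k v] features]
               [label k v]))
   ;; Distinct values seen for each feature, needed for smoothing.
   :values (reduce (fn [m [features _]]
                     (reduce-kv (fn [m k v]
                                  (update m k (fnil conj #{}) v))
                                m features))
                   {} rows)})

(defn score
  "Unnormalized log posterior for label:
  log P(label) + sum over features of log P(value | label)."
  [{:keys [total labels counts values]} features label]
  (let [n (labels label)]
    (reduce-kv
      (fn [acc k v]
        (+ acc
           (Math/log (double (/ (inc (get counts [label k v] 0))
                                (+ n (count (values k))))))))
      (Math/log (double (/ n total)))
      features)))

(defn classify
  "Returns the label with the highest score for the given features."
  [model features]
  (apply max-key #(score model features %) (keys (:labels model))))

(def model (train training))

(classify model {:odor :foul :cap :flat})
```

Working in log space avoids numeric underflow when many small probabilities are multiplied together, and the smoothing keeps a feature value that was never seen with a category from forcing that category's probability to zero.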

Getting ready

First, we'll need to use the dependencies that we specified in the project.clj file in the Loading CSV and ARFF files into Weka recipe. We'll also use the defanalysis macro from the Discovering...