Book Image

Haskell Data Analysis Cookbook

By : Nishant Shukla
Book Image

Haskell Data Analysis Cookbook

By: Nishant Shukla

Overview of this book

Table of Contents (19 chapters)
Haskell Data Analysis Cookbook
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Implementing a decision tree classifier


A decision tree is a model for classifying data effectively. Each child of a node in the tree represents a feature about the item we are classifying. Traversing down the tree to leaf nodes represent an item's classification. It's often desirable to create the smallest possible tree to represent a large sample of data.

In this recipe, we implement the ID3 decision tree algorithm in Haskell. It is one of the easiest to implement and produces useful results. However, ID3 does not guarantee an optimal solution, may be computationally inefficient compared to other algorithms, and only supports discrete data. While these issues can be addressed by a more complicated algorithm such as C4.5, the code in this recipe is enough to get up and running with a working decision tree.

Getting ready

Create a CSV file representing samples of data. The last column should be the classification. Name this file input.csv.

The weather data is represented with four attributes...