Haskell Data Analysis Cookbook

By: Nishant Shukla

Implementing a k-Nearest Neighbors classifier


One simple way to classify an item is to look only at the data points nearest to it. The k-Nearest Neighbors algorithm finds the k items closest to the item in question and assigns it the most common classification among those k neighbors. Despite its simplicity, this heuristic has proven effective for a wide variety of classification tasks.
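To make the voting idea concrete before any tree is involved, here is a minimal brute-force sketch. It is not the implementation this recipe builds: the names `Point`, `sqDist`, `majority`, and `knnClassify` are hypothetical, and it assumes a non-empty training set, k of at least 1, and points of equal dimension.

```haskell
import Data.List (group, maximumBy, sort, sortOn)
import Data.Ord (comparing)

-- A point is a list of coordinates (illustrative representation).
type Point = [Double]

-- Squared Euclidean distance; the square root is not needed for ranking.
sqDist :: Point -> Point -> Double
sqDist a b = sum [(x - y) * (x - y) | (x, y) <- zip a b]

-- Most frequent label in a non-empty list.
majority :: Ord l => [l] -> l
majority = head . maximumBy (comparing length) . group . sort

-- Sort every training item by distance to the query, keep the k closest,
-- and return the most common label among them.
knnClassify :: Ord l => Int -> [(Point, l)] -> Point -> l
knnClassify k training query =
  majority [label | (_, label) <- take k (sortOn (sqDist query . fst) training)]
```

For example, given three points labeled "human" clustered near the origin and three labeled "bot" clustered near (10, 10), `knnClassify 3` labels the query `[1, 1]` as "human". Sorting the whole training set on every query is the cost that a spatial index is meant to avoid.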

In this recipe, we will implement the k-Nearest Neighbors algorithm using a k-d tree, a binary tree that recursively partitions points in a k-dimensional space so that nearest-neighbor lookups can skip large parts of the data. Note that the k in k-d tree is the number of dimensions, not the number of neighbors.
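The following hand-rolled sketch illustrates how a k-d tree supports this kind of lookup. It is an assumption-laden illustration and not the API of the KdTree package: the type `KdTree` and the functions `build` and `nearest` are hypothetical names, all points are assumed to be non-empty and of equal dimension, and only the single nearest neighbor is returned.

```haskell
import Data.List (sortOn)

type Point = [Double]

-- Each node stores one point and splits the rest along one axis.
data KdTree = Leaf | Node KdTree Point KdTree

sqDist :: Point -> Point -> Double
sqDist a b = sum [(x - y) * (x - y) | (x, y) <- zip a b]

-- Build by sorting along the current axis and splitting at the median.
-- The axis cycles through the dimensions as depth increases.
build :: Int -> [Point] -> KdTree
build _ [] = Leaf
build depth pts = Node (build (depth + 1) left) median (build (depth + 1) right)
  where
    axis = depth `mod` length (head pts)
    sorted = sortOn (!! axis) pts
    (left, median : right) = splitAt (length sorted `div` 2) sorted

-- Nearest-neighbor search: descend into the side containing the query,
-- then visit the other side only if the splitting plane is closer than
-- the best point found so far.
nearest :: KdTree -> Point -> Maybe Point
nearest tree q = go 0 tree Nothing
  where
    closer p Nothing = Just p
    closer p (Just b) = if sqDist q p < sqDist q b then Just p else Just b

    go _ Leaf best = best
    go depth (Node l p r) best =
      let axis = depth `mod` length p
          diff = q !! axis - p !! axis
          (near, far) = if diff < 0 then (l, r) else (r, l)
          best1 = go (depth + 1) near (closer p best)
          mustCheckFar = case best1 of
            Nothing -> True
            Just b -> diff * diff < sqDist q b
      in if mustCheckFar then go (depth + 1) far best1 else best1
```

A call such as `nearest (build 0 points) query` starts at depth 0, so the first split is on the first coordinate. Extending this to the k nearest neighbors would mean tracking a bounded collection of candidates instead of a single best point.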

Imagine we have a web server for our hip new website. Every time someone requests a web page, our web server fetches the file and presents the page. However, bots can easily hammer a web server with thousands of requests, which can amount to a denial-of-service attack. In this recipe, we will classify whether a web request is being made by a human or a bot.

Getting ready

Install the KdTree, CSV, and iproute...