Haskell Data Analysis Cookbook

By: Nishant Shukla

Implementing the k-means clustering algorithm


The k-means clustering algorithm partitions data into k groups called clusters. Each cluster has a location, and these locations are adjusted iteratively: we compute the arithmetic mean of all the points assigned to a cluster to obtain a centroid, which then replaces the cluster's previous location.

Hopefully, after this succinct explanation, the name k-means clustering no longer sounds completely foreign. One of the best places to learn more about this algorithm is on Coursera: https://class.coursera.org/ml-003/lecture/78.
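
As a rough illustration of the update described above, here is a minimal sketch of a single iteration: every point is assigned to its nearest cluster location, and each location is then replaced by the arithmetic mean of its assigned points. The names `sqDist`, `centroid`, `nearest`, and `step` are hypothetical helpers chosen for this sketch and are not the recipe's own definitions; it also assumes that all points have the same dimension and uses squared distance, which selects the same nearest cluster as Euclidean distance.

```haskell
import Data.List (minimumBy, transpose)
import qualified Data.Map as Map
import Data.Ord (comparing)

type Point = [Double]

-- Squared Euclidean distance (hypothetical helper for this sketch).
sqDist :: Point -> Point -> Double
sqDist a b = sum [(x - y) ^ 2 | (x, y) <- zip a b]

-- Arithmetic mean of a non-empty group of points, coordinate by coordinate.
centroid :: [Point] -> Point
centroid ps = map (/ fromIntegral (length ps)) (map sum (transpose ps))

-- Index of the cluster location closest to the given point.
nearest :: [Point] -> Point -> Int
nearest centers p =
  fst (minimumBy (comparing (sqDist p . snd)) (zip [0 ..] centers))

-- One iteration: group points by nearest location, then move each
-- location to the mean of its group. A location with no assigned
-- points is left where it is (an assumption of this sketch).
step :: [Point] -> [Point] -> [Point]
step centers points =
  [maybe c centroid (Map.lookup i groups) | (i, c) <- zip [0 ..] centers]
  where
    groups = Map.fromListWith (++) [(nearest centers p, [p]) | p <- points]

main :: IO ()
main = print (step [[0, 0], [10, 10]] [[1, 1], [2, 2], [9, 9], [11, 11]])
-- prints [[1.5,1.5],[10.0,10.0]]
```

Repeating `step` until the locations stop changing would give the iterative adjustment; how many iterations to run and how to pick the starting locations are left open here.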

How to do it…

Create a new file, which we call Main.hs, and perform the following steps:

  1. Import the following built-in libraries:

    import Data.Map (Map)
    import qualified Data.Map as Map
    import Data.List (minimumBy, sort, transpose)
    import Data.Ord (comparing)

  2. Define a type synonym for points as follows:

    type Point = [Double]

  3. Define the Euclidean distance function between two points:

    dist :: Point -> Point -...