Computing the Manhattan distance
Defining a distance between two items allows us to easily interpret clusters and patterns. The Manhattan distance is one of the easiest to implement and is used primarily due to its simplicity.
The Manhattan distance (or Taxicab distance) between two items is the sum of the absolute differences of their coordinates. So if we are given two points (1, 1) and (5, 4), then the Manhattan distance will be |1-5| + |1-4| = 4 + 3 = 7.
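As a minimal sketch of this definition, the formula can be written directly in Haskell. The names Point, manhattan, and manhattanN are illustrative choices made here and are not taken from the recipe's own code:

```haskell
-- A 2D point with integer coordinates (illustrative type alias).
type Point = (Int, Int)

-- Manhattan distance: the sum of the absolute coordinate differences.
manhattan :: Point -> Point -> Int
manhattan (x1, y1) (x2, y2) = abs (x1 - x2) + abs (y1 - y2)

-- The same idea for points of any dimension, given as coordinate lists.
-- zipWith stops at the shorter list, so both lists should have equal length.
manhattanN :: [Int] -> [Int] -> Int
manhattanN xs ys = sum (zipWith (\x y -> abs (x - y)) xs ys)

main :: IO ()
main = print (manhattan (1, 1) (5, 4))  -- prints 7
```

The list-based version is included only to show that the definition does not depend on the number of coordinates.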
We can use this distance metric to detect whether an item is unusually far away from everything else. In this recipe, we will detect outliers using the Manhattan distance. Because the calculation involves only addition, subtraction, and absolute values, each distance is cheap to compute, which helps when working with a very large amount of data.
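One possible way to turn this idea into code is to flag every point whose nearest neighbor is farther away than some cutoff. The sketch below is an assumption about how such a check could look, not the recipe's own method; the names nearestDist and outliers, the threshold of 5, and the sample points are all hypothetical:

```haskell
type Point = (Int, Int)

manhattan :: Point -> Point -> Int
manhattan (x1, y1) (x2, y2) = abs (x1 - x2) + abs (y1 - y2)

-- Distance from a point to the closest point in a non-empty list.
nearestDist :: Point -> [Point] -> Int
nearestDist p = minimum . map (manhattan p)

-- Points whose nearest neighbor is farther away than the threshold.
-- Each point is compared against all the others, identified by position
-- so that duplicate coordinates are still treated as separate items.
outliers :: Int -> [Point] -> [Point]
outliers threshold ps =
  [ p
  | (i, p) <- indexed
  , let others = [ q | (j, q) <- indexed, j /= i ]
  , not (null others)
  , nearestDist p others > threshold
  ]
  where
    indexed = zip [0 :: Int ..] ps

main :: IO ()
main = print (outliers 5 [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10)])
-- prints [(10,10)]
```

This compares every pair of points, so the number of distance computations grows quadratically with the input size; each individual computation stays cheap.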
Getting ready
Create a list of comma-separated points. We will compute the smallest distance between these points and a test point:
$ cat input.csv
0,0
10,0
0,10
10,10
5,5
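To make the goal concrete, here is a small sketch that reads input.csv and prints the smallest Manhattan distance to a test point. It uses only the base library and is independent of the recipe's own steps; the helper names parsePoint and parsePoints and the test point (2, 1) are assumptions made for illustration:

```haskell
import Data.Maybe (mapMaybe)

type Point = (Int, Int)

manhattan :: Point -> Point -> Int
manhattan (x1, y1) (x2, y2) = abs (x1 - x2) + abs (y1 - y2)

-- Parse a single "x,y" entry; malformed entries yield Nothing.
parsePoint :: String -> Maybe Point
parsePoint s = case break (== ',') s of
  (xs, ',' : ys) -> case (reads xs, reads ys) of
    ([(x, "")], [(y, "")]) -> Just (x, y)
    _                      -> Nothing
  _ -> Nothing

-- Split the input on whitespace and keep the entries that parse.
parsePoints :: String -> [Point]
parsePoints = mapMaybe parsePoint . words

main :: IO ()
main = do
  contents <- readFile "input.csv"
  let points    = parsePoints contents
      testPoint = (2, 1)  -- hypothetical test point
  if null points
    then putStrLn "No points found"
    else print (minimum (map (manhattan testPoint) points))
```

With the five points listed above and the test point (2, 1), the closest point is (0, 0), so the program prints 3.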
How to do it...
Create a new file, which we will call Main.hs
...