Computing the Euclidean distance
Defining a distance between two items allows us to easily interpret clusters and patterns. The Euclidean distance is one of the most geometrically natural forms of distance to implement. It uses the Pythagorean formula to compute how far away two items are, which is similar to measuring the distance with a physical ruler.
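As a minimal sketch of the Pythagorean formula described above, the distance between two points in the plane can be written as follows. The `Point` type alias and the function name `euclidean` are illustrative choices, not names taken from the recipe.

```haskell
-- A point in the plane as (x, y) coordinates (hypothetical alias).
type Point = (Double, Double)

-- Euclidean distance: the square root of the sum of the squared
-- coordinate differences, i.e. the Pythagorean formula.
euclidean :: Point -> Point -> Double
euclidean (x1, y1) (x2, y2) = sqrt ((x1 - x2) ^ 2 + (y1 - y2) ^ 2)
```

For example, the points (0,0) and (3,4) form the legs of a 3-4-5 right triangle, so `euclidean (0, 0) (3, 4)` evaluates to 5.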
We can use this distance metric to detect whether an item is unusually far away from everything else. In this recipe, we will detect outliers using the Euclidean distance. The Euclidean distance is slightly more computationally expensive than the Manhattan distance, since it involves multiplications and a square root; however, depending on the dataset, it may provide more accurate results.
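To make the comparison concrete, here is a rough sketch that defines both metrics and uses either one to find the point that is farthest from its nearest neighbor. This is one simple way to flag an outlier, not necessarily the approach the recipe takes. The names `manhattan`, `nearestDist`, and `mostIsolated` are hypothetical, and the sketch assumes the list holds at least two distinct points.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

type Point = (Double, Double)

-- Euclidean distance: needs two multiplications and a square root.
euclidean :: Point -> Point -> Double
euclidean (x1, y1) (x2, y2) = sqrt ((x1 - x2) ^ 2 + (y1 - y2) ^ 2)

-- Manhattan distance: only subtractions, absolute values, and an addition.
manhattan :: Point -> Point -> Double
manhattan (x1, y1) (x2, y2) = abs (x1 - x2) + abs (y1 - y2)

-- Distance from p to its nearest neighbor under the given metric.
-- Points equal to p are skipped; fails if no other distinct point exists.
nearestDist :: (Point -> Point -> Double) -> [Point] -> Point -> Double
nearestDist dist ps p = minimum [dist p q | q <- ps, q /= p]

-- The point whose nearest neighbor is farthest away (assumed outlier).
mostIsolated :: (Point -> Point -> Double) -> [Point] -> Point
mostIsolated dist ps = maximumBy (comparing (nearestDist dist ps)) ps
```

Passing the metric as an argument lets the same code run with `euclidean` or `manhattan`, which makes it easy to check whether the two metrics agree on a given dataset.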
Getting ready
Create a list of comma-separated points. We will compute the smallest distance between these points and a test point.
$ cat input.csv
0,0
10,0
0,10
10,10
5,5
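As a preview of the computation, the following sketch parses text in this format and returns the smallest Euclidean distance to a test point. It uses only the standard library rather than a CSV package, which is an assumption made to keep the example self-contained. The names `parsePoint` and `smallestDistance` are hypothetical, and the test point is left to the caller.

```haskell
import Data.Maybe (mapMaybe)

type Point = (Double, Double)

-- Parse one "x,y" line; returns Nothing for malformed lines
-- (including lines with stray whitespace or extra fields).
parsePoint :: String -> Maybe Point
parsePoint line =
  case break (== ',') line of
    (xs, ',' : ys) ->
      case (reads xs, reads ys) of
        ([(x, "")], [(y, "")]) -> Just (x, y)
        _                      -> Nothing
    _ -> Nothing

-- Smallest Euclidean distance from the test point to any point in the
-- CSV text. Malformed lines are skipped; fails if no line parses.
smallestDistance :: Point -> String -> Double
smallestDistance (tx, ty) contents =
  minimum [ sqrt ((x - tx) ^ 2 + (y - ty) ^ 2)
          | (x, y) <- mapMaybe parsePoint (lines contents) ]
```

In GHCi, something like `readFile "input.csv" >>= print . smallestDistance (2, 1)` would print the result for a test point at (2,1); that test point is chosen here purely for illustration.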
How to do it...
Create a new file, which we will call Main.hs, and perform the following steps:
Import...