Computing the Jaro-Winkler distance between two strings
The Jaro-Winkler distance measures string similarity as a real number between 0 and 1. A value of 0 means no similarity, and a value of 1 means an exact match.
Getting ready
The algorithm behind the function comes from the following mathematical formula, presented in the Wikipedia article about the Jaro-Winkler distance (http://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance):
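For reference, the standard Jaro similarity, which is assumed here to be the formula meant, can be written as follows. The symbol d_j and the notation |s| for the length of a string are assumptions of this sketch; m and t are the quantities the recipe defines.

```latex
d_j =
\begin{cases}
0 & \text{if } m = 0 \\[4pt]
\dfrac{1}{3}\left(\dfrac{m}{|s_1|} + \dfrac{m}{|s_2|} + \dfrac{m - t}{m}\right) & \text{otherwise}
\end{cases}
```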
The variables in the preceding formula are as follows:
s1 is the first string.
s2 is the second string.
m is the number of matching characters, that is, identical characters whose positions in s1 and s2 differ by at most half the length of the longer string (the Wikipedia definition gives the exact window as ⌊max(|s1|, |s2|)/2⌋ − 1).
t is half the number of matching characters that are not at the same index in both strings. In other words, it is half the number of transpositions.
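Once m and t are known, the score is plain arithmetic. The following is a minimal sketch, assuming the standard Jaro formula (1/3)(m/|s1| + m/|s2| + (m − t)/m), with a score of 0 when m is 0. The name jaroFromCounts is hypothetical and is not part of the recipe; it takes the counts as inputs and does not find the matching characters itself.

```haskell
-- Hypothetical helper (not from the recipe): Jaro similarity computed
-- from precomputed counts, assuming the standard Jaro formula.
--   len1, len2: lengths of s1 and s2
--   m: number of matching characters
--   t: half the number of matching characters at different indices
jaroFromCounts :: Int -> Int -> Int -> Int -> Double
jaroFromCounts len1 len2 m t
  | m == 0    = 0
  | otherwise = (m' / l1 + m' / l2 + (m' - t') / m') / 3
  where
    l1 = fromIntegral len1
    l2 = fromIntegral len2
    m' = fromIntegral m
    t' = fromIntegral t
```

For the strings "MARTHA" and "MARHTA", all six characters match and the pair H/T is out of order, so m = 6 and t = 1. Evaluating jaroFromCounts 6 6 6 1 gives (1 + 1 + 5/6) / 3, which is approximately 0.944.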
How to do it...
We will need access to the elemIndices function, which is imported as follows:
import Data.List (elemIndices)
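As a quick illustration of what elemIndices provides, it returns every 0-based position at which an element occurs in a list. The sketch below shows one way it could be used to test whether a character has a candidate match within a given distance. The helper nearMatch and its parameters are hypothetical and are not necessarily how the recipe uses the function.

```haskell
import Data.List (elemIndices)

-- Hypothetical helper (an assumption, not the recipe's code):
-- does element c occur in s at some index within distance d of index i?
nearMatch :: Eq a => a -> Int -> Int -> [a] -> Bool
nearMatch c i d s = any (\j -> abs (i - j) <= d) (elemIndices c s)

-- Positions of 'a' in "banana", as returned by elemIndices.
examplePositions :: [Int]
examplePositions = elemIndices 'a' "banana"   -- [1,3,5]
```

Here nearMatch 'H' 3 1 "MARTHA" is True, because 'H' sits at index 4, one position away from index 3, while nearMatch 'H' 0 1 "MARTHA" is False.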
Define...