Using a high-performance hash table
Haskell already comes with a
Data.Map
module based on size-balanced binary trees. There exist better-optimized hash table libraries such as Data.HashMap
from the unordered-containers package.
For example, both Data.Map
and Data.HashMap
have insertion and lookup time complexities of O(log n); however, the latter uses a large base, so in practice these operations are constant time. More documentation on Data.HashMap
can be found at http://hackage.haskell.org/package/unordered-containers-0.2.4.0/docs/Data-HashMap-Lazy.html.
In this recipe, we will use the unordered-contains library from Hackage to create a mapping of word size to a set of words of that size.
Getting ready
Download a large corpus of text and name the file big.txt
as follows:
$ wget norvig.com/big.txt
Install the Data.HashMap
package using Cabal as follows:
$ cabal install unordered-containers
How to do it…
Import the
HashMap
package as follows:import Data.HashMap.Lazy import Data.Set (Set) import...