Haskell Data Analysis Cookbook

By: Nishant Shukla

Calculating a moving median


The median of a list of numbers is the value that has an equal number of values less than and greater than it. The naive way to calculate the median is to sort the list and pick the middle number. However, on a very large dataset, especially one where the median is needed again as new values arrive, sorting the entire list each time is inefficient.
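For comparison, the naive approach can be sketched in a few lines. This is only an illustration, not code from the recipe: the name `naiveMedian` is hypothetical, and averaging the two middle values for an even-length list is an assumption about how ties in the middle should be handled.

```haskell
import Data.List (sort)

-- | Median by sorting the whole list (hypothetical helper, not from the recipe).
-- Costs O(n log n) on every call; returns Nothing for an empty list.
naiveMedian :: [Double] -> Maybe Double
naiveMedian [] = Nothing
naiveMedian xs
  | odd n     = Just (sorted !! mid)
  -- Assumption: for an even count, average the two middle values.
  | otherwise = Just ((sorted !! (mid - 1) + sorted !! mid) / 2)
  where
    sorted = sort xs
    n      = length xs
    mid    = n `div` 2
```

Every call sorts the full list again, which is the cost that becomes a problem when the median is requested repeatedly over a growing dataset.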

Another approach to finding a moving median is to use a combination of a minheap and a maxheap to keep the values partially ordered while running through the data. We can insert numbers into either heap as they are seen and, whenever needed, calculate the median by adjusting the heaps to be of equal or near-equal size. Once the heaps are balanced this way, the middle values sit at the tops of the heaps, so the median is simple to find.
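The two-heap idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the recipe's own code: the names `Halves`, `insertValue`, `rebalance`, `currentMedian`, and `runningMedians` are hypothetical, the module is imported qualified as `H` for clarity, and the sketch rebalances after every insertion rather than only when the median is requested. It assumes the `heap` package's `Data.Heap` module with `MinHeap`, `MaxHeap`, `empty`, `insert`, `view`, `viewHead`, and `size`.

```haskell
import qualified Data.Heap as H
import Data.Maybe (mapMaybe)

-- | Lower half of the data in a max-heap, upper half in a min-heap.
-- (Hypothetical type alias for this sketch.)
type Halves = (H.MaxHeap Double, H.MinHeap Double)

emptyHalves :: Halves
emptyHalves = (H.empty, H.empty)

-- | Put a value in the half it belongs to, then rebalance.
insertValue :: Double -> Halves -> Halves
insertValue x (lo, hi) = rebalance $ case H.viewHead lo of
  Just top | x > top -> (lo, H.insert x hi)  -- larger than the lower half's maximum
  _                  -> (H.insert x lo, hi)  -- otherwise it goes in the lower half

-- | Move one element across if the sizes differ by more than one.
rebalance :: Halves -> Halves
rebalance (lo, hi)
  | H.size lo > H.size hi + 1 = case H.view lo of
      Just (x, lo') -> (lo', H.insert x hi)
      Nothing       -> (lo, hi)
  | H.size hi > H.size lo + 1 = case H.view hi of
      Just (x, hi') -> (H.insert x lo, hi')
      Nothing       -> (lo, hi)
  | otherwise = (lo, hi)

-- | Median of everything inserted so far; Nothing if no values were inserted.
currentMedian :: Halves -> Maybe Double
currentMedian (lo, hi) = case compare (H.size lo) (H.size hi) of
  GT -> H.viewHead lo
  LT -> H.viewHead hi
  -- Assumption: with equal sizes, average the two heap tops.
  EQ -> (\a b -> (a + b) / 2) <$> H.viewHead lo <*> H.viewHead hi

-- | Median after each value is seen, in input order.
runningMedians :: [Double] -> [Double]
runningMedians =
  mapMaybe currentMedian . drop 1 . scanl (flip insertValue) emptyHalves
```

Each insertion touches only the heap tops, so it costs O(log n) instead of a full sort. With the ten sample values used in this recipe (3, 4, 2, 5, 6, 4, 2, 6, 4, 1), `currentMedian` after the last insertion gives `Just 4.0`.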

Getting ready

Create a file, input.txt, with some numbers:

$ cat input.txt

3
4
2
5
6
4
2
6
4
1

Also, install a library for dealing with heaps using Cabal as follows:

$ cabal install heap

How to do it…

  1. Import the heap data structure, along with fromJust from Data.Maybe:

    import Data.Heap
    import Data.Maybe (fromJust)
  2. Convert the raw input as a list...