Haskell Data Analysis Cookbook

By: Nishant Shukla

Splitting a string on lines, words, or arbitrary tokens


Useful data is often interspersed between delimiters, such as commas or spaces, making string splitting vital for most data analysis tasks.
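
As a rough illustration of the idea, the following minimal sketch uses only the Prelude. The helper `splitOnChar` is a hypothetical name introduced here for illustration and is not part of any library; the recipe itself relies on `splitOn` from the split package instead. The sample strings are taken from the input file used in this recipe.

```haskell
-- Hypothetical helper (not from any library): split a string on a single
-- delimiter character, keeping empty fields between adjacent delimiters.
splitOnChar :: Char -> String -> [String]
splitOnChar delim str =
  case break (== delim) str of
    (field, [])       -> [field]                          -- no delimiter left
    (field, _ : rest) -> field : splitOnChar delim rest   -- drop delimiter, recurse

main :: IO ()
main = do
  let text = "first line\nwords are split by space\ncomma,separated,values"
  -- Prelude's lines breaks on newline characters
  print (lines text)
  -- ["first line","words are split by space","comma,separated,values"]
  -- Prelude's words breaks on runs of whitespace
  print (words "words are split by space")
  -- ["words","are","split","by","space"]
  -- The illustrative helper breaks on an arbitrary character
  print (splitOnChar ',' "comma,separated,values")
  -- ["comma","separated","values"]
```

Unlike this single-character helper, a general splitter also has to handle multi-character delimiters, which is one reason to prefer a library function for real work.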

Getting ready

Create an input.txt file similar to the following one:

$ cat input.txt

first line
second line
words are split by space
comma,separated,values
or any delimiter you want

Install the split package using Cabal as follows:

$ cabal install split

How to do it...

  1. The only function we need to import is splitOn; lines and words are already available from the Prelude:

    import Data.List.Split (splitOn)
  2. First we split the string into lines, as shown in the following code snippet:

    main :: IO ()
    main = do
      input <- readFile "input.txt"
      let ls = lines input  -- one list element per line of the file
      print ls
  3. The lines are printed as a list (wrapped here for readability):

    [ "first line","second line"
    , "words are split by space"
    , "comma,separated,values"
    , "or any delimiter you want"]
    
  4. Next, still inside the do block of main, we split the third line (index 2) into words on whitespace as follows:

      let ws = words $ ls !! 2
      print ws
  5. The words are printed in a list...