Haskell Data Analysis Cookbook

By: Nishant Shukla


Using a Markov chain to generate text


A Markov chain is a model in which the probability of each next state depends only on the current state. We can train a Markov chain on a corpus of text and then generate new text by repeatedly sampling the next state.

A graphical representation of a chain is shown in the following figure:

Node E has a 70% probability of moving to node A, and a 30% probability of remaining in place.
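
As a minimal sketch of the idea, a chain can be stored as a table from each state to its weighted successors. Only the row for node E comes from the caption above; the row for node A is an assumption added so that the table has more than one entry, and the names `Transitions` and `step` are hypothetical. To keep the sketch free of extra packages, the random roll is passed in as a number in [0, 1) instead of being drawn from a generator.

```haskell
import qualified Data.Map as Map

-- Each state maps to a list of (next state, probability) pairs.
type Transitions = Map.Map Char [(Char, Double)]

transitions :: Transitions
transitions = Map.fromList
  [ ('E', [('A', 0.7), ('E', 0.3)])  -- from the figure caption
  , ('A', [('E', 1.0)])              -- hypothetical row, not from the figure
  ]

-- Choose the next state by walking the cumulative probabilities
-- until the roll (a number in [0, 1)) falls inside a bucket.
-- Unknown states and exhausted rows leave the state unchanged.
step :: Transitions -> Char -> Double -> Char
step table state roll =
  case Map.lookup state table of
    Nothing      -> state
    Just choices -> pick roll choices
  where
    pick _ []       = state
    pick _ [(s, _)] = s
    pick r ((s, p) : rest)
      | r < p     = s
      | otherwise = pick (r - p) rest

main :: IO ()
main = do
  print (step transitions 'E' 0.25)  -- roll falls in the 70% bucket
  print (step transitions 'E' 0.95)  -- roll falls in the 30% bucket
```

In a real program the roll would come from a random generator, such as the one created with mkStdGen later in this recipe.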

Getting ready

Install the markov-chain library using cabal as follows:

$ cabal install markov-chain

Download a big corpus of text, and name it big.txt. In this recipe, we will be using the text downloaded from http://norvig.com/big.txt.

How to do it…

  1. Import the following modules:

    import Data.MarkovChain
    import System.Random (mkStdGen)
  2. Train a Markov chain on a big input of text and then run it as follows:

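    main :: IO ()
    main = do
      rawText <- readFile "big.txt"
      -- Fixed seed, so every run produces the same output
      let g = mkStdGen 100
      putStrLn "Character by character: \n"
      -- run: context size, training sequence, start index, random generator
      putStrLn $ take 100 $ run 3 rawText 0 g
      putStrLn "\nWord by word: \n"
      putStrLn $ unwords $ take 100 $ run...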
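
To make the role of the context size more concrete, the following is a rough sketch of how a chain with a context of n items could be trained and walked. It is not the markov-chain library's implementation; `train` and `walk` are hypothetical names, and the random choices are replaced by an explicit list of integers so that the example depends only on modules that ship with GHC.

```haskell
import qualified Data.Map as Map
import Data.List (tails)

-- Map every context of n consecutive items to the items that
-- followed it somewhere in the training sequence.
train :: Ord a => Int -> [a] -> Map.Map [a] [a]
train n xs = Map.fromListWith (flip (++))
  [ (take n window, [window !! n])
  | window <- tails xs
  , length (take (n + 1) window) == n + 1  -- skip windows that are too short
  ]

-- Walk the table from a starting context. Each number in picks selects
-- one successor (modulo the number of candidates). The walk stops when
-- the picks run out or the current context was never seen in training.
walk :: Ord a => Map.Map [a] [a] -> [a] -> [Int] -> [a]
walk table context picks = context ++ go context picks
  where
    go _ [] = []
    go ctx (p : ps) =
      case Map.lookup ctx table of
        Nothing    -> []
        Just nexts ->
          let x = nexts !! (p `mod` length nexts)
          in x : go (drop 1 ctx ++ [x]) ps

main :: IO ()
main = do
  let table = train 2 "abcabd"
  print (Map.toList table)
  putStrLn (walk table "ab" [0, 0, 0, 1])
```

The same functions work on a list of words instead of characters, which is the difference between the two outputs printed by the recipe.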