Haskell Data Analysis Cookbook

By: Nishant Shukla

Identifying key words in a corpus of text


One way to predict the topic of a paragraph or sentence is to identify what its words mean. While parts of speech give some insight into each word, they don't reveal what kind of thing the word refers to. In this recipe, we will use a Haskell library to tag words with topics such as PERSON, CITY, DATE, and so on.
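To make the idea of topic tags concrete, here is a minimal sketch that labels words using a tiny hand-written lookup table. It does not use sequor and is not how sequor works internally; the names `gazetteer`, `tagWord`, and `tagSentence`, the sample entries, and the "O" label for untagged words are all assumptions made for illustration.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical lookup table mapping known words to topic tags.
-- A real tagger such as sequor learns its labels from training data instead.
gazetteer :: [(String, String)]
gazetteer =
  [ ("Alice",  "PERSON")
  , ("Paris",  "CITY")
  , ("Monday", "DATE")
  ]

-- Pair a word with its tag; "O" marks a word with no known topic.
tagWord :: String -> (String, String)
tagWord w = (w, fromMaybe "O" (lookup w gazetteer))

-- Split a sentence on whitespace and tag each word independently.
tagSentence :: String -> [(String, String)]
tagSentence = map tagWord . words

main :: IO ()
main = mapM_ print (tagSentence "Alice flew to Paris on Monday")
```

A fixed table like this cannot handle unseen names or ambiguous words, which is why the recipe relies on a trained tagger instead.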

Getting ready

This recipe requires an Internet connection to download the sequor package.

Install it using cabal:

$ cabal install sequor --prefix=`pwd`

Otherwise, follow these directions to install it manually:

  1. Obtain the latest version of the sequor library by opening a browser and visiting http://hackage.haskell.org/package/sequor.

  2. Under the Downloads section, download the cabal source package.

  3. Extract the contents:

    • On Windows, it is easiest to use 7-Zip, an easy-to-use file archiver. Install it on your machine from http://www.7-zip.org, and then use it to extract the contents of the tarball.

    • On other operating...