Natural Language Processing with Java and LingPipe Cookbook


Hidden Markov Models (HMM) – part-of-speech


This recipe introduces the first hard-core linguistic capability of LingPipe: identifying the grammatical category, or part of speech (POS), of each word. What are the verbs, nouns, adjectives, and so on in a text?

How to do it...

Let's jump right in and drag ourselves back to those awkward middle-school years in English class, or its equivalent:

  1. As always, head over to your friendly command prompt and type the following:

    java -cp lingpipe-cookbook.1.0.jar:lib/lingpipe-4.1.0.jar: com.lingpipe.cookbook.chapter9.PosTagger 
    
  2. The system will respond with a prompt, at which we will enter a Jorge Luis Borges quote:

    INPUT> Reality is not always probable, or likely.
    
  3. The system will respond delightfully to this quote with:

    Reality_nn is_bez not_* always_rb probable_jj ,_, or_cc likely_jj ._. 
    

Each token has an underscore (_) appended, followed by its part-of-speech tag: nn is a noun, rb is an adverb, jj is an adjective, and so on. The complete tag set and a description of the corpus used to train the tagger can be found at http...
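
If you want to work with the tagger's output programmatically, the token_tag format is easy to pull apart. The following is a minimal sketch that uses only the Java standard library; the TaggedOutputParser and TaggedToken names are hypothetical and are not part of LingPipe or the cookbook's source. It assumes that items are separated by whitespace and that the tag follows the last underscore in each item:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of LingPipe: splits "token_tag" output into pairs.
public class TaggedOutputParser {

    /** One token paired with its part-of-speech tag. */
    public static final class TaggedToken {
        public final String token;
        public final String tag;

        public TaggedToken(String token, String tag) {
            this.token = token;
            this.tag = tag;
        }

        @Override
        public String toString() {
            return token + "\t" + tag;
        }
    }

    /** Parses a line of whitespace-separated token_tag items. */
    public static List<TaggedToken> parse(String line) {
        List<TaggedToken> result = new ArrayList<>();
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return result;
        }
        for (String item : trimmed.split("\\s+")) {
            // Split on the LAST underscore so that a token containing '_' stays intact.
            int sep = item.lastIndexOf('_');
            if (sep < 0) {
                throw new IllegalArgumentException("No tag separator in: " + item);
            }
            result.add(new TaggedToken(item.substring(0, sep), item.substring(sep + 1)));
        }
        return result;
    }

    public static void main(String[] args) {
        // The output line shown in step 3 of this recipe.
        String output = "Reality_nn is_bez not_* always_rb probable_jj ,_, or_cc likely_jj ._. ";
        for (TaggedToken t : parse(output)) {
            System.out.println(t); // one token and its tag per line, tab-separated
        }
    }
}
```

Splitting on the last underscore rather than the first is a design choice: it keeps punctuation items such as ,_, and ._. well formed and does not break if a token itself contains an underscore.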