Natural Language Processing with Java and LingPipe Cookbook

N-best word tagging


The certainty-driven nature of computer science is not reflected in the vagaries of linguistics, where reasonable PhDs can agree or disagree, at least until Chomsky's henchmen show up. This recipe uses the same HMM trained in the preceding recipe, but provides a ranked list of possible tags for each word.

Where might this be helpful? Imagine a search engine that searches for a word together with a tag, not necessarily a part-of-speech tag. The search engine can index each word along with its top n-best tags, which allows a query to match on a tag that is not the first-best one. This can help increase recall.
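To make the recall argument concrete, here is a minimal sketch of such an index in plain Java. It does not use LingPipe. The class name `NBestTagIndex`, the `word/tag` key format, and the `NN`/`VB` ranking for "run" are all assumptions made for illustration, and the ranked tag list is simply handed in rather than produced by a tagger.

```java
import java.util.*;

// Hypothetical illustration: an inverted index keyed on word/tag pairs,
// storing the top-n tags for each word instead of only the first-best tag.
public class NBestTagIndex {
    // Maps a "word/tag" key to the IDs of documents containing that pairing.
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final int maxTags;

    public NBestTagIndex(int maxTags) {
        this.maxTags = maxTags;
    }

    // rankedTags is assumed to be ordered best-first, as an n-best tagger would return it.
    public void add(int docId, String word, List<String> rankedTags) {
        int limit = Math.min(maxTags, rankedTags.size());
        for (int i = 0; i < limit; i++) {
            postings.computeIfAbsent(key(word, rankedTags.get(i)), k -> new TreeSet<>())
                    .add(docId);
        }
    }

    // Returns the IDs of documents where the word was indexed with the given tag.
    public Set<Integer> search(String word, String tag) {
        return postings.getOrDefault(key(word, tag), Collections.emptySet());
    }

    private static String key(String word, String tag) {
        return word + "/" + tag;
    }

    public static void main(String[] args) {
        // Assumed ranking for "run" in document 7: NN first, VB second.
        List<String> ranked = Arrays.asList("NN", "VB");

        NBestTagIndex firstBestOnly = new NBestTagIndex(1);
        NBestTagIndex topTwo = new NBestTagIndex(2);
        firstBestOnly.add(7, "run", ranked);
        topTwo.add(7, "run", ranked);

        System.out.println(firstBestOnly.search("run", "VB")); // prints []
        System.out.println(topTwo.search("run", "VB"));        // prints [7]
    }
}
```

With only the first-best tag indexed, a query for "run" as `VB` misses document 7. Indexing the top two tags recovers it, at the cost of a larger index and some loss of precision.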

How to do it...

N-best analyses push the sophistication boundaries of NLP developers. What used to be a single answer is now a ranked list, but this is where the next level of performance occurs. Let's get started by performing the following steps:

  1. Put your copy of Syntactic Structures face down and type out the following:

    java -cp lingpipe-cookbook.1.0.jar:lib/lingpipe-4.1.0.jar: com.lingpipe.cookbook.chapter4.NbestPosTagger...