Words and tokens are the focus of this chapter. The more common extraction technologies, such as named entity recognition, build on the concepts presented here, but they will have to wait until Chapter 5, Finding Spans in Text – Chunking. We start with the easy task of finding interesting sets of tokens. Then we move on to hidden Markov models (HMM) and finish with one of the most complex components of LingPipe: conditional random fields (CRF). As usual, we show you how to evaluate tagging and train your own taggers.
Natural Language Processing with Java and LingPipe Cookbook