Natural Language Processing with Java and LingPipe Cookbook

Modifying CRFs


The power and appeal of CRFs come from rich feature extraction, so work with an evaluation harness that gives you feedback on your explorations. This recipe details how to create more complex features.

How to do it...

We will not train and run a CRF; instead, we will print out the features. Substitute this feature extractor for the one in the previous recipe to see them at work. Perform the following steps:

  1. Go to a command line and type:

    java -cp lingpipe-cookbook.1.0.jar:lib/lingpipe-4.1.0.jar: com.lingpipe.cookbook.chapter4.ModifiedCrfFeatureExtractor
    
  2. For each token in the training data, the feature extractor class prints the truth tagging that is used for learning:

    -------------------
    Tagging:  John/PN
    
  3. This reflects the training tagging for the token John as determined by src/com/lingpipe/cookbook/chapter4/TinyPosCorpus.java.

  4. The node features come next. They consist of the top three POS tags from our Brown corpus HMM tagger and the TOK_John feature:

    Node Feats:{nps=2.0251355582754984E-4,...
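
The shape of the node features in step 4 can be pictured with a small self-contained sketch. It does not use the LingPipe API and is not the cookbook's ModifiedCrfFeatureExtractor. The class name NodeFeatureSketch, the method nodeFeatures, the tag probabilities, and the value 1.0 for the token feature are all assumptions made for illustration. Only the TOK_ prefix and the idea of keeping the top three POS tags come from the recipe.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: plain Java, no LingPipe classes involved.
class NodeFeatureSketch {

    // Builds node features for one token: the three most probable tags
    // (tag -> probability) followed by a TOK_<token> indicator feature.
    // The indicator value 1.0 is an assumption, not taken from the recipe.
    static Map<String, Double> nodeFeatures(String token, Map<String, Double> tagProbs) {
        Map<String, Double> feats = new LinkedHashMap<>();
        tagProbs.entrySet().stream()
            // Highest probability first.
            .sorted(Map.Entry.<String, Double>comparingByValue(Comparator.reverseOrder()))
            .limit(3)
            .forEach(e -> feats.put(e.getKey(), e.getValue()));
        feats.put("TOK_" + token, 1.0);
        return feats;
    }

    public static void main(String[] args) {
        // Made-up probabilities standing in for a POS tagger's output.
        Map<String, Double> tagProbs = new LinkedHashMap<>();
        tagProbs.put("np", 0.90);
        tagProbs.put("nn", 0.06);
        tagProbs.put("nps", 0.03);
        tagProbs.put("vb", 0.01);
        System.out.println("Node Feats:" + nodeFeatures("John", tagProbs));
    }
}
```

In the recipe, the tag scores come from the Brown corpus HMM tagger rather than from a hand-written map, so the printed values will differ from anything this toy example produces.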