The power and appeal of CRFs come from rich feature extraction, so proceed with an evaluation harness that provides feedback on your explorations. This recipe details how to create more complex features.
We will not train and run a CRF; instead, we will print out the features. Substitute this feature extractor for the one in the previous recipe to see them at work. Perform the following steps:
Go to a command line and type:
java -cp lingpipe-cookbook.1.0.jar:lib/lingpipe-4.1.0.jar: com.lingpipe.cookbook.chapter4.ModifiedCrfFeatureExtractor
For each token in the training data, the feature extractor class outputs the truth tagging that is being used to learn:
------------------- Tagging: John/PN
This reflects the training tagging for the token John, as determined by src/com/lingpipe/cookbook/chapter4/TinyPosCorpus.java. The node features follow, consisting of the top-three POS tags from our Brown corpus HMM tagger and the TOK_John feature:
Node Feats:{nps=2.0251355582754984E-4,...
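To make the shape of that node feature map concrete, here is a minimal sketch in plain Java that does not use the LingPipe API. The class name NodeFeatureSketch, the method nodeFeatures, the example tag scores, and the value 1.0 for the token feature are all assumptions made for illustration; in the recipe, the tag scores come from the Brown corpus HMM tagger rather than a hand-built map.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the cookbook's ModifiedCrfFeatureExtractor.
public class NodeFeatureSketch {

    // Builds a node feature map for the token at position n.
    // tagScores stands in for the per-tag scores an HMM tagger would supply.
    static Map<String, Double> nodeFeatures(List<String> tokens, int n,
                                            Map<String, Double> tagScores) {
        Map<String, Double> feats = new LinkedHashMap<>();
        // Keep only the three highest-scoring tags as features.
        tagScores.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .limit(3)
                .forEach(e -> feats.put(e.getKey(), e.getValue()));
        // Add the token identity feature (the value 1.0 is an assumption).
        feats.put("TOK_" + tokens.get(n), 1.0);
        return feats;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("John", "ran");
        // Made-up scores for the first token, for demonstration only.
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("np", 0.9);
        scores.put("nn", 0.05);
        scores.put("nps", 0.03);
        scores.put("vb", 0.02);
        System.out.println("Node Feats:" + nodeFeatures(tokens, 0, scores));
    }
}
```

Running main prints a map with the three best-scoring tags followed by TOK_John, which mirrors the layout of the Node Feats line above.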