There is another view into the tagging probabilities, one that reports the probability assignments at the level of individual words. The code reads the underlying TagLattice
and shows how confident the tagger is in the tag it assigns to each token.
This recipe focuses on the probability estimates for individual tokens. Perform the following steps:
Type in the following on the command line or IDE equivalent:
java -cp lingpipe-cookbook.1.0.jar:lib/lingpipe-4.1.0.jar: com.lingpipe.cookbook.chapter4.ConfidenceBasedTagger
Then, enter the following:
INPUT> Colorless green ideas sleep furiously.
It yields the following output:
CONFIDENCE
# Token (Prob:Tag)*
0 Colorless 0.991:jj 0.006:np$ 0.002:np
1 green 0.788:jj 0.208:nn 0.002:nns
2 ideas 1.000:nns 0.000:rb 0.000:jj
3 sleep 0.821:vb 0.101:rb 0.070:nn
4 furiously 1.000:rb 0.000:ql ...
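One way to read this output is to compare the top probability for each token against a cutoff and treat anything below it as uncertain. The following is a minimal, standalone sketch of that idea and not the recipe's own code: it does not call LingPipe, and the class name TokenConfidenceSketch, the TokenScores holder, the lowConfidence method, and the 0.9 cutoff are all hypothetical choices made for illustration. The scores are copied from the output above, and only the two candidates shown for "furiously" are included.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of LingPipe or the cookbook source.
class TokenConfidenceSketch {

    // Holds one token with its candidate tags, best tag first.
    static final class TokenScores {
        final String token;
        final String[] tags;
        final double[] probs;

        TokenScores(String token, String[] tags, double[] probs) {
            this.token = token;
            this.tags = tags;
            this.probs = probs;
        }
    }

    // Scores copied from the sample output shown above.
    static List<TokenScores> sample() {
        return Arrays.asList(
            new TokenScores("Colorless", new String[] {"jj", "np$", "np"},
                            new double[] {0.991, 0.006, 0.002}),
            new TokenScores("green", new String[] {"jj", "nn", "nns"},
                            new double[] {0.788, 0.208, 0.002}),
            new TokenScores("ideas", new String[] {"nns", "rb", "jj"},
                            new double[] {1.000, 0.000, 0.000}),
            new TokenScores("sleep", new String[] {"vb", "rb", "nn"},
                            new double[] {0.821, 0.101, 0.070}),
            // The third candidate is elided in the output, so it is omitted here.
            new TokenScores("furiously", new String[] {"rb", "ql"},
                            new double[] {1.000, 0.000}));
    }

    // Renders one row in the same "index token prob:tag ..." layout.
    static String format(int index, TokenScores s) {
        StringBuilder sb = new StringBuilder();
        sb.append(index).append(' ').append(s.token);
        for (int i = 0; i < s.tags.length; i++) {
            sb.append(' ')
              .append(String.format(Locale.ROOT, "%.3f:%s", s.probs[i], s.tags[i]));
        }
        return sb.toString();
    }

    // Returns the tokens whose best tag scores below the cutoff.
    static List<String> lowConfidence(List<TokenScores> rows, double cutoff) {
        List<String> flagged = new ArrayList<>();
        for (TokenScores s : rows) {
            if (s.probs[0] < cutoff) {
                flagged.add(s.token);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        List<TokenScores> rows = sample();
        for (int i = 0; i < rows.size(); i++) {
            System.out.println(format(i, rows.get(i)));
        }
        // 0.9 is an arbitrary cutoff chosen for this illustration.
        System.out.println("Low confidence: " + lowConfidence(rows, 0.9));
    }
}
```

With the 0.9 cutoff, "green" (0.788) and "sleep" (0.821) are flagged, which matches the intuition that both words are plausibly more than one part of speech. A suitable cutoff depends on the application and would need to be tuned on real data.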