Python 3 Text Processing with NLTK 3 Cookbook - Second Edition
TnT stands for Trigrams'n'Tags. It is a statistical tagger based on second-order Markov models. The details are beyond the scope of this book, but you can read more about the original implementation at http://www.coli.uni-saarland.de/~thorsten/tnt/.
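To give a rough feel for what "second order" means here, the following is a minimal sketch, not NLTK's or the original TnT implementation: it counts how often each tag follows the two tags before it. The toy corpus, the "BOS" start marker, and the helper names count_tag_trigrams and trigram_prob are all assumptions made for illustration, and no smoothing is applied.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: each sentence is a list of (word, tag) pairs.
toy_sents = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("a", "DT"), ("dog", "NN"), ("sleeps", "VBZ")],
]

def count_tag_trigrams(sents):
    """Count how often each tag follows each pair of preceding tags."""
    counts = defaultdict(Counter)
    for sent in sents:
        # Pad with two start markers (an assumed convention) so that the
        # first real tag also has a two-tag history.
        history = ("BOS", "BOS")
        for _word, tag in sent:
            counts[history][tag] += 1
            history = (history[1], tag)
    return counts

def trigram_prob(counts, history, tag):
    """Relative frequency of tag given the two previous tags (unsmoothed)."""
    total = sum(counts[history].values())
    return counts[history][tag] / total if total else 0.0

tag_trigrams = count_tag_trigrams(toy_sents)
p_vbz = trigram_prob(tag_trigrams, ("DT", "NN"), "VBZ")
print(p_vbz)
```

A real tagger would combine counts like these with word-given-tag frequencies and some form of smoothing; this sketch only shows the two-tag history that makes the model second order.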
The TnT tagger has a slightly different API from the previous taggers we've encountered. You must explicitly call the train() method after you've created it. Here's a basic example:
>>> from nltk.tag import tnt
>>> tnt_tagger = tnt.TnT()
>>> tnt_tagger.train(train_sents)
>>> tnt_tagger.evaluate(test_sents)
0.8756313403842003
It's quite a good tagger all by itself, only slightly less accurate than the BrillTagger class from the previous recipe. But if you do not call train() before evaluate(), you'll get an accuracy of 0%.
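The 0% figure is easier to see with a small sketch of how tagging accuracy can be computed. This is an illustration only, not NLTK's evaluate() code: tag_accuracy is a hypothetical helper, the sentences are invented, and it assumes that a tagger with no training data labels every word with a placeholder tag such as "Unk".

```python
def tag_accuracy(predicted_sents, gold_sents):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    correct = 0
    total = 0
    for predicted, gold in zip(predicted_sents, gold_sents):
        for (_p_word, p_tag), (_g_word, g_tag) in zip(predicted, gold):
            total += 1
            if p_tag == g_tag:
                correct += 1
    return correct / total if total else 0.0

# Hypothetical gold-standard sentence.
gold = [[("the", "DT"), ("dog", "NN"), ("barks", "VBZ")]]

# Assumed output of an untrained tagger: a placeholder tag for every word.
untrained_output = [[(word, "Unk") for word, _tag in sent] for sent in gold]

# Assumed output of a trained tagger that gets two of three tags right.
partly_right = [[("the", "DT"), ("dog", "NN"), ("barks", "NN")]]

untrained_score = tag_accuracy(untrained_output, gold)
partial_score = tag_accuracy(partly_right, gold)
print(untrained_score, partial_score)
```

Because a placeholder tag never matches a gold tag, every token counts as wrong, which is consistent with the 0% accuracy described above.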
The TnT tagger maintains a number of internal FreqDist and ConditionalFreqDist instances based on the training data. These frequency...