The BrillTagger class is a transformation-based tagger. It is the first tagger that is not a subclass of SequentialBackoffTagger. Instead, the BrillTagger class uses a series of rules to correct the results of an initial tagger. Each rule is scored by the number of errors it corrects minus the number of new errors it introduces.
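To make the scoring idea concrete, here is a minimal, self-contained sketch in plain Python. It does not use NLTK and is not how BrillTagger is implemented internally; the helper names apply_rule and score_rule, the toy tag sequences, and the single rule shape ("change tag A to tag B when the previous tag is C") are all assumptions made for illustration.

```python
def apply_rule(tags, from_tag, to_tag, prev_tag):
    """Return a copy of tags with from_tag changed to to_tag wherever
    the preceding tag in the input sequence is prev_tag."""
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out


def score_rule(initial_tags, gold_tags, from_tag, to_tag, prev_tag):
    """Score a rule as (errors fixed) minus (new errors introduced)."""
    corrected = apply_rule(initial_tags, from_tag, to_tag, prev_tag)
    triples = list(zip(initial_tags, corrected, gold_tags))
    # An error is fixed when the tag was wrong before and is right after.
    fixed = sum(1 for before, after, gold in triples
                if before != gold and after == gold)
    # An error is introduced when the tag was right before and is wrong after.
    broken = sum(1 for before, after, gold in triples
                 if before == gold and after != gold)
    return fixed - broken


# Toy data for "the race to race": a hypothetical initial tagger
# labels every "race" as NN, but the second one should be VB.
gold = ["DT", "NN", "TO", "VB"]
initial = ["DT", "NN", "TO", "NN"]

good_score = score_rule(initial, gold, "NN", "VB", "TO")  # fixes one error
bad_score = score_rule(initial, gold, "NN", "VB", "DT")   # breaks a correct tag
print(good_score, bad_score)
```

Under these assumptions, the rule conditioned on a preceding TO gets a positive score and the rule conditioned on a preceding DT gets a negative one, so a trainer that ranks rules by this score would keep the first and discard the second.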
Here's a function from tag_util.py that trains a BrillTagger using BrillTaggerTrainer. It requires an initial_tagger and train_sents.
from nltk.tag import brill, brill_trainer

def train_brill_tagger(initial_tagger, train_sents, **kwargs):
    templates = [
        brill.Template(brill.Pos([-1])),
        brill.Template(brill.Pos([1])),
        brill.Template(brill.Pos([-2])),
        brill.Template(brill.Pos([2])),
        brill.Template(brill.Pos([-2, -1])),
        brill.Template(brill.Pos([1, 2])),
        brill.Template(brill.Pos([-3, -2, -1])),
        brill.Template(brill.Pos([1, 2, 3])),
        brill.Template(brill.Pos([-1]), brill.Pos([1])),
        ...