In addition to UnigramTagger, there are two more NgramTagger subclasses: BigramTagger and TrigramTagger. BigramTagger uses the previous tag as part of its context, while TrigramTagger uses the previous two tags. An ngram is a subsequence of n items, so the BigramTagger looks at two items (the previous tag and the current word), and the TrigramTagger looks at three items (the previous two tags and the current word).
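To make the notion of context concrete, here is a minimal pure-Python sketch of how such a context could be assembled. The function ngram_context, the sample words, and the tags are hypothetical names chosen for illustration; this is not NLTK's source code, only an approximation of the idea described above.

```python
def ngram_context(words, previous_tags, index, n):
    """Return (tags of the n-1 preceding words, current word)."""
    # Clamp at 0 so words near the start of a sentence get a shorter context.
    start = max(0, index - (n - 1))
    return tuple(previous_tags[start:index]), words[index]


words = ["we", "tag", "words"]
previous_tags = ["PRP", "VB"]  # tags already assigned to "we" and "tag"

# Context for the third word, "words", under each model.
bigram_context = ngram_context(words, previous_tags, 2, n=2)
trigram_context = ngram_context(words, previous_tags, 2, n=3)

print(bigram_context)   # (('VB',), 'words')
print(trigram_context)  # (('PRP', 'VB'), 'words')
```

The bigram context holds one previous tag and the trigram context holds two, each paired with the current word.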
These two taggers are good at handling words whose part-of-speech tag is context dependent. Many words have a different part of speech depending on how they are used. For example, we have been talking about taggers that "tag" words. In this case, "tag" is used as a verb. But the result of tagging is a part-of-speech tag, so "tag" can also be a noun. The idea with the NgramTagger subclasses is that by looking at the preceding part-of-speech tags along with the current word, we can better guess the part-of-speech tag for the current word.
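The verb and noun uses of "tag" can be told apart with a small experiment. The following sketch trains a BigramTagger on two hand-written sentences; the sentences and their tags are made up for illustration, and a tagger trained on so little data is only a demonstration, not a usable model.

```python
from nltk.tag import BigramTagger

# Hypothetical training data: "tag" appears once as a verb and once as a noun.
train_sents = [
    [("we", "PRP"), ("tag", "VB"), ("words", "NNS")],
    [("the", "DT"), ("tag", "NN"), ("is", "VBZ"), ("short", "JJ")],
]

tagger = BigramTagger(train_sents)

verb_use = tagger.tag(["we", "tag", "words"])
noun_use = tagger.tag(["the", "tag", "is", "short"])

print(verb_use[1])  # ('tag', 'VB'): the previous tag was PRP
print(noun_use[1])  # ('tag', 'NN'): the previous tag was DT
```

The same word receives a different tag in each sentence because the tag assigned to the preceding word differs. A context that never occurred in the training data yields a tag of None, which is one reason these taggers are usually combined with a backoff tagger.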