Python 3 Text Processing with NLTK 3 Cookbook - Second Edition
The following is a table of all the part-of-speech tags that occur in the treebank corpus distributed with NLTK. The tags and counts shown here were acquired using the following code:
>>> from nltk.probability import FreqDist
>>> from nltk.corpus import treebank
>>> fd = FreqDist()
>>> for word, tag in treebank.tagged_words():
...     fd[tag] += 1
>>> fd.items()
The FreqDist fd holds the count for every tag in the treebank corpus, as shown here. You can look up the count for an individual tag with fd[tag], for example, fd['DT']. Punctuation tags are also shown, along with special tags such as -NONE-, which signifies that the part-of-speech tag is unknown. Descriptions of most of the tags can be found at the following link:
http://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
| Part-of-speech tag | Frequency of occurrence |
|---|---|
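As a rough illustration of how tag counts like these can be gathered and laid out in the two columns above, here is a minimal sketch that runs without the corpus installed. It relies on the fact that in NLTK 3 FreqDist is a subclass of collections.Counter, so it uses Counter directly. The names count_tags, format_rows, and sample_tagged_words are hypothetical, and the tiny sample is made up for the example rather than taken from the treebank corpus.

```python
from collections import Counter

# Hypothetical hand-tagged sample standing in for treebank.tagged_words();
# the words and tags are illustrative, not taken from the corpus.
sample_tagged_words = [
    ("The", "DT"), ("cat", "NN"), ("sat", "VBD"), ("on", "IN"),
    ("the", "DT"), ("mat", "NN"), (".", "."),
]


def count_tags(tagged_words):
    """Count how often each part-of-speech tag occurs in (word, tag) pairs."""
    counts = Counter()
    for _word, tag in tagged_words:
        counts[tag] += 1
    return counts


def format_rows(counts):
    """Return table rows, most frequent tag first, ties broken alphabetically."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ["| {} | {} |".format(tag, n) for tag, n in ordered]


fd = count_tags(sample_tagged_words)
print(fd["DT"])  # look up the count for a single tag
for row in format_rows(fd):
    print(row)
```

With NLTK and the treebank corpus installed, passing treebank.tagged_words() to count_tags in place of the sample should give the same kind of counts as the session shown earlier. A tag that never occurs simply has a count of 0 rather than raising an error.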