Part-of-speech (POS) tagging is one of the important tasks of text analysis. It assigns each word a tag for the role it plays in a sentence, such as noun, verb, or preposition, based on the surrounding context.
Let's see how to perform part-of-speech tagging using nltk:
>>> pos_word_data = nltk.pos_tag(word_data)
>>> pos_word_data[:10]
[('Pundits', 'NNS'), ('and', 'CC'), ('critics', 'NNS'), ('like', 'IN'), ('to', 'TO'), ('blame', 'VB'), ('the', 'DT'), ('twin', 'NN'), ('successes', 'NNS'), ('of', 'IN')]
You can see tags such as NNS, CC, IN, TO, DT, and NN. Let's see what they mean using this code:
>>> nltk.help.upenn_tagset('NNS')
NNS: noun, common, plural
    undergraduates scotches bric-a-brac products bodyguards facets coasts
    divestitures storehouses designs clubs fragrances averages
    subjectivists apprehensions muses factory-jobs
>>> nltk.help.upenn_tagset('NN')
NN: noun, common, singular or mass
    common-carrier cabbage knuckle-duster Casino afghan shed...
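Once the words are tagged, a common next step is to summarize the tags. The following is a minimal sketch, not part of the original walkthrough: it copies the ten (word, tag) pairs shown above into a literal list so that it runs without nltk or its data files, then counts the tags and groups the words by tag. The names tag_counts and words_by_tag are illustrative assumptions.

```python
from collections import Counter, defaultdict

# The first ten (word, tag) pairs from the pos_tag output shown above,
# copied here so the sketch runs without nltk installed.
pos_word_data = [
    ('Pundits', 'NNS'), ('and', 'CC'), ('critics', 'NNS'), ('like', 'IN'),
    ('to', 'TO'), ('blame', 'VB'), ('the', 'DT'), ('twin', 'NN'),
    ('successes', 'NNS'), ('of', 'IN'),
]

# Count how often each tag occurs (tag_counts is an illustrative name).
tag_counts = Counter(tag for _, tag in pos_word_data)

# Collect the words seen under each tag (words_by_tag is an illustrative name).
words_by_tag = defaultdict(list)
for word, tag in pos_word_data:
    words_by_tag[tag].append(word)

print(tag_counts.most_common(2))  # [('NNS', 3), ('IN', 2)]
print(words_by_tag['NNS'])        # ['Pundits', 'critics', 'successes']
```

With the full output of nltk.pos_tag in place of the literal list, the same two structures give a quick profile of a text, for example how many plural nouns it contains and which words they are.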