Parts of speech tagging
In many cases, NLP depends on determining the parts of speech of the words in the text. For example, in sentence classification, we sometimes use the parts of speech of the words as features that are input to the classifier. In this recipe, we will again consider the NLTK and spaCy algorithms.
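As a rough illustration of the classifier-feature idea, the sketch below turns an already tagged sentence into a fixed-length vector of tag counts. The sample sentence, its hand-assigned tags, the TAGSET list, and the pos_count_features helper are all hypothetical names made up for this example. In practice the tags would come from a tagger such as NLTK or spaCy, and the feature design would depend on the classifier.

```python
from collections import Counter

# Hypothetical, hand-tagged input: (word, tag) pairs with Universal POS-style labels.
# In a real pipeline these pairs would be produced by a tagger, not typed in.
tagged_sentence = [
    ("The", "DET"),
    ("old", "ADJ"),
    ("detective", "NOUN"),
    ("lit", "VERB"),
    ("his", "PRON"),
    ("pipe", "NOUN"),
    (".", "PUNCT"),
]

# Assumed tag inventory, in a fixed order so every sentence maps to
# a vector of the same length.
TAGSET = ["ADJ", "DET", "NOUN", "PRON", "PUNCT", "VERB"]


def pos_count_features(tagged_tokens, tagset=TAGSET):
    """Return the tag counts ordered by `tagset`; tags outside it are ignored."""
    counts = Counter(tag for _, tag in tagged_tokens)
    return [counts[tag] for tag in tagset]


features = pos_count_features(tagged_sentence)
print(features)  # [1, 1, 2, 1, 1, 1]
```

A vector like this could be appended to other features, such as word counts, before being passed to a classifier.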
Getting ready
For this recipe, we will be using the same text, from the book The Adventures of Sherlock Holmes. You can find the whole text in the book's GitHub repository. For this recipe, we will need just the beginning of the book, which can be found in the sherlock_holmes_1.txt file.
In order to do this task, you will need the spaCy package, described in the Technical requirements section.
How to do it…
In this recipe, we will use the spaCy package to label words with their parts of speech, and we will show that it is superior to NLTK in this task.
The process is as follows:
- Import the spacy package:
  import spacy
- Read in the book...