Getting a corpus is a challenging task, but in this section I will give you some links from which you can download free corpora and use them to build NLP applications.
The nltk library provides several built-in corpora. To list all the corpus names, execute the following commands:
import nltk.corpus
dir(nltk.corpus)         # Python shell
print(dir(nltk.corpus))  # PyCharm IDE syntax
In Figure 2.2, you can see the output of the preceding code; the highlighted part shows the names of the corpora that are available in nltk:
Figure 2.2: List of all available corpora in nltk
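The raw dir() output is long and mixes corpus names with helper classes and private names. As a minimal sketch, you could filter it before printing. The helper name public_names is my own and is not part of nltk, and the try/except only guards against nltk not being installed in your environment:

```python
import importlib


def public_names(module):
    """Return the sorted attribute names of a module that do not start with an underscore."""
    return sorted(name for name in dir(module) if not name.startswith("_"))


try:
    # Same module as in the commands above, imported by name.
    corpus_module = importlib.import_module("nltk.corpus")
except ImportError:
    corpus_module = None  # nltk is not installed in this environment

if corpus_module is not None:
    names = public_names(corpus_module)
    print(len(names), "public names in nltk.corpus")
    print(names[:10])  # show only the first few names
```

The filtered list will still contain some non-corpus entries, such as reader classes, so treat it as a quick overview rather than an exact list of corpora.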
If you want to use an IDE to develop NLP applications in Python, you can use the PyCharm Community edition. You can follow its installation steps at the following URL: https://github.com/jalajthanaki/NLPython/blob/master/ch2/Pycharm_installation_guide...