This chapter shows how to run NLP algorithms over Spark. It covers the following recipes:
Installing NLTK on Linux
Installing Anaconda on Linux
Anaconda for cluster management
POS tagging with PySpark on an Anaconda cluster
Named Entity Recognition with IPython over Spark
Implementing OpenNLP - chunker over Spark
Implementing OpenNLP - sentence detector over Spark
Implementing Stanford NLP - lemmatization over Spark
Implementing sentiment analysis using Stanford NLP over Spark