Index
A
- access token, Facebook
- URL / Facebook
- associative arrays/memories
- about / Dictionaries
B
- Bag of words (BOW) representation
- about / Text classification
- bar plot
- about / A bar plot
- Boolean retrieval
- about / Boolean retrieval
- BOW (Bag of Words) representation
- about / Sampling
- Brill tagger
- about / Brill tagger
C
- CART
- about / Decision trees
- chart parser
- about / A chart parser
- Chrome / The Scrapy shell
- chunking
- about / Chunking
- classification
- about / Machine learning
- complex matrix operations, NumPy
- performing / Complex matrix operations
- reshaping / Reshaping and stacking
- stacking / Reshaping and stacking
- random numbers, generating / Random numbers
- Conditional Random Field (CRF)
- about / Machine learning based tagger
- css() method
- about / The Scrapy shell
D
- 3D plot
- about / 3D plots
- D3
- about / Geovisualization
- data collection
- about / Data collection
- Twitter / Twitter
- data extraction
- about / Data extraction
- trending topics, searching in Twitter / Trending topics
- data flow, Scrapy
- about / Data flow in Scrapy
- Scrapy shell / The Scrapy shell
- items / Items
- data munging
- about / What is text wrangling?
- data skewness
- about / Sampling
- decision trees
- about / Decision trees
- deep parsing
- versus shallow parsing / Shallow versus deep parsing
- dependencies
- about / Dependency parsing
- dependency parser
- about / Stemming
- dependency parsing (DP)
- about / Dependency parsing
- dialog systems
- about / Dialog systems
- dictionaries
- about / Dictionaries
- dir() function
- about / Helping yourself
- direct translation
- about / Machine translation
E
- eigenvalues
- about / eigenvalues and eigenvectors
- eigenvectors
- about / eigenvalues and eigenvectors
- exploratory data analysis (EDA)
- about / Diving into NLTK
- extract() method
- about / The Scrapy shell
F
- Facebook SDK
- Firebug / The Scrapy shell
G
- gensim
- installing / Installing gensim
- URL / Installing gensim
- geomap
- about / Influencers detection
- geovisualization
- about / Geovisualization
- influencers detection, in Twitter / Influencers detection
- Facebook / Facebook
- influencer friends, searching in social media / Influencer friends
- Google news
- URL / The Scrapy shell
H
- Hadoop
- scikit-learn / Scikit-learn on Hadoop
- help() function
- about / Helping yourself
- Hidden Markov Model (HMM)
- about / Machine learning based tagger
- Hindi stemmer
- reference link / Stemming
- Hive/Pig UDF
- about / Hive/Pig UDF
- Hive UDF
- used, for running NLTK on Hadoop / A UDF
I
- IE engine
- about / Information extraction
- importance score
- calculating / Building your first NLP application
- information extraction
- about / Information extraction
- named-entity recognition (NER) / Named-entity recognition (NER)
- information extraction (IE)
- about / Information extraction
- rule-based extraction / Information extraction
- machine learning based / Information extraction
- information retrieval (IR)
- about / Information retrieval
- Boolean retrieval / Boolean retrieval
- vector space model (VSM) / Vector space model
- probabilistic model / The probabilistic model
- reference link / The probabilistic model
- inverse document frequency (IDF)
- about / Vector space model
- inverted index
- about / Information retrieval
- item pipeline
- building / The item pipeline
- items
J
- JSON parser
- URL / Data extraction
K
- K-means clustering
- about / K-means
- KLOUT
- URL / Influencers detection
L
- language detection
- about / Language detection
- Latent Dirichlet allocation (LDA)
- about / Topic modeling in text
- latent Dirichlet allocation (LDA)
- about / Topic modeling
- latent semantic indexing (LSI)
- about / Topic modeling
- lemmatization
- about / What is text wrangling?, Lemmatization
- linear algebra
- about / Linear algebra
- reference link / Linear algebra
- Linguistic Data Consortium (LDC)
- about / Diving deep into a tagger
- URL / Diving deep into a tagger
- lists
- about / Lists
- logistic regression
- about / Logistic regression
M
- machine learning
- about / Machine learning
- supervised learning / Machine learning
- unsupervised learning / Machine learning
- semi-supervised learning / Machine learning
- reinforcement learning / Machine learning
- machine learning based extraction
- about / Information extraction
- machine learning based tagger
- about / Machine learning based tagger
- machine translation
- about / Machine translation
- direct translation / Machine translation
- syntactic transfer / Machine translation
- MapReduce
- reference link / Python streaming
- matplotlib
- about / matplotlib
- subplot / Subplot
- axis, adding / Adding an axis
- scatter plot / A scatter plot
- bar plot / A bar plot
- 3D plot / 3D plots
- URL / External references
- maximum entropy (MaxEnt)
- about / Stochastic gradient descent
- Maximum Entropy Classifier (MEC)
- about / Machine learning based tagger
- ML (Machine learning)
- about / Text classification
N
- N-gram tagger
- about / N-gram tagger
- Naive Bayes
- about / Naive Bayes
- reference link / Naive Bayes
- named-entity recognition (NER)
- about / Named-entity recognition (NER)
- ndarray
- about / ndarray
- indexing / Indexing
- data, extracting / Extracting data from an array
- NER
- about / Stemming, Named-entity recognition (NER)
- NER tagger
- about / NER tagger
- reference link / NER tagger
- NetworkX
- about / Influencer friends
- URL / Influencer friends
- NLP
- need for / Why learn NLP?
- tools / Why learn NLP?
- NLP application
- building / Building your first NLP application
- other applications / Other NLP applications
- machine translation / Machine translation
- statistical machine translation (SMT) / Statistical machine translation
- information retrieval (IR) / Information retrieval
- speech recognition / Speech recognition
- text classification / Text classification
- information extraction (IE) / Information extraction
- question answering (QA) systems / Question answering systems
- dialog systems / Dialog systems
- word sense disambiguation (WSD) / Word sense disambiguation
- topic modeling / Topic modeling
- language detection / Language detection
- optical character recognition (OCR) / Optical character recognition
- NLTK
- about / Why learn NLP?, Diving into NLTK
- URL / Why learn NLP?
- example / Diving into NLTK
- NLTK, on Hadoop
- using / NLTK on Hadoop
- Hive UDF, using / A UDF
- Python, streaming / Python streaming
- noun phrase (NP)
- about / Chunking
- NumPy
- about / NumPy
- ndarray / ndarray
- basic operations / Basic operations
- complex matrix operations / Complex matrix operations
- URL / External references
- NumPy array
- about / Decision trees
O
- optical character recognition (OCR)
- about / Optical character recognition
P
- pandas
- about / pandas
- data, reading / Reading data
- series data / Series data
- column transformation / Column transformation
- noisy data / Noisy data
- URL / External references
- parsers
- about / Different types of parsers
- recursive descent parser / A recursive descent parser
- shift-reduce parser / A shift-reduce parser
- chart parser / A chart parser
- regex parser / A regex parser
- parsing
- rule-based approach / The two approaches in parsing
- probabilistic approach / The two approaches in parsing
- need for / Why we need parsing
- dependency parsing (DP) / Dependency parsing
- part of speech (POS) tagging
- about / What is Part of speech tagging, Diving deep into a tagger
- reference link / What is Part of speech tagging, Machine learning based tagger
- Stanford tagger / Stanford tagger
- sequential tagger / Sequential tagger
- Brill tagger / Brill tagger
- machine learning based tagger / Machine learning based tagger
- Part of Speech tagger (POS)
- about / Stemming
- petabytes
- about / Why learn NLP?
- phonemes
- about / Speech recognition
- phrase structure parsing
- about / Dependency parsing
- Porter stemmer
- about / Stemming
- probabilistic approach, parsing
- about / The two approaches in parsing
- probabilistic context-free grammar (PCFG)
- about / Shallow versus deep parsing
- probabilistic dependency parser
- about / Dependency parsing
- probabilistic model
- about / The probabilistic model
- projective dependency parser
- about / Dependency parsing
- PySpark
- Python
- URL / Why learn NLP?, Let's start playing with Python!
- using / Let's start playing with Python!
- lists / Lists
- help() function / Helping yourself
- dir() function / Helping yourself
- regular expression / Regular expressions
- dictionaries / Dictionaries
- reference link / Writing functions
- streaming, for running NLTK on Hadoop / Python streaming
- Python, on Hadoop
- using / Different ways of using Python on Hadoop
- Python, streaming / Python streaming
- Hive/Pig UDF / Hive/Pig UDF
- wrappers, streaming / Streaming wrappers
- reference link / Streaming wrappers
Q
- question answering (QA) systems
- about / Question answering systems
R
- random forest
- about / The Random forest algorithm
- rare word
- removing / Rare word removal
- re() method
- about / The Scrapy shell
- recursive descent parser
- about / A recursive descent parser
- regex parser
- about / A regex parser
- regex tagger
- about / Regex tagger
- regression
- about / Machine learning
- regular expression
- about / Regular expressions
- reinforcement learning
- about / Machine learning
- rule-based approach, parsing
- about / The two approaches in parsing
- rule-based extraction
- about / Information extraction
S
- sampling
- about / Sampling
- reference link / Sampling
- example / Sampling
- Naive Bayes / Naive Bayes
- decision trees / Decision trees
- stochastic gradient descent (SGD) / Stochastic gradient descent
- logistic regression / Logistic regression
- support vector machines (SVM) / Support vector machines
- scatter plot
- about / A scatter plot
- scikit-learn
- URL, for scikit classes / Naive Bayes
- on Hadoop / Scikit-learn on Hadoop
- SciPy
- about / SciPy
- linear algebra / Linear algebra
- eigenvalues / eigenvalues and eigenvectors
- eigenvectors / eigenvalues and eigenvectors
- sparse matrix / The sparse matrix
- optimization / Optimization
- URL / External references
- Scrapy
- installing / Writing your first crawler
- URL / Writing your first crawler
- data flow / Data flow in Scrapy
- external references / External references
- Scrapy shell
- about / The Scrapy shell
- using / The Scrapy shell
- semi-supervised learning
- about / Machine learning
- sentence splitter
- about / Sentence splitter
- sequential taggers
- about / Sequential tagger
- N-gram tagger / N-gram tagger
- regex tagger / Regex tagger
- shallow parsing
- versus deep parsing / Shallow versus deep parsing
- shift-reduce parser
- about / A shift-reduce parser
- singular value decomposition (SVD)
- about / eigenvalues and eigenvectors
- Sitemap spider
- about / The Sitemap spider
- Snowball stemmers
- about / Stemming
- sparse matrix
- about / The sparse matrix
- DOK (Dictionary of keys) / The sparse matrix
- LOL (List of lists) / The sparse matrix
- COL (Coordinate list) / The sparse matrix
- CRS/CSR (Compressed row storage) / The sparse matrix
- URL / The sparse matrix
- CSC (sparse column) / The sparse matrix
- specific preprocessing
- about / What is text wrangling?
- speech recognition
- about / Speech recognition
- spell correction
- with spellchecker / Spell correction
- Stanford parser
- URL / Dependency parsing
- Stanford tagger
- about / Stanford tagger
- Stanford tools
- about / Stanford tagger
- reference link / Stanford tagger
- statistical machine translation (SMT)
- about / Statistical machine translation
- stemming
- about / What is text wrangling?, Stemming
- reference link / Stemming
- stochastic gradient descent (SGD)
- about / Stochastic gradient descent
- stop word removal
- about / What is text wrangling?, Stop word removal
- implementing / Stop word removal
- string functions
- split / Helping yourself
- strip / Helping yourself
- upper/lower / Helping yourself
- replace / Helping yourself
- reference link / Helping yourself
- subplot
- about / Subplot
- summarization
- supervised learning
- about / Machine learning
- classification / Machine learning
- regression / Machine learning
- support vector machines (SVM)
- about / Support vector machines
- syntactic parser
- about / Why we need parsing
- syntactic transfer
- about / Machine translation
T
- term-document matrix
- about / Sampling
- term doc matrix (TDM)
- about / Text classification
- term frequencies (tf)
- about / Sampling
- term frequency (TF)
- about / Vector space model
- term frequency-inverse document frequency (tf-idf)
- text-processing
- reference link / Tokenization
- text classification
- about / Text classification, Text classification
- text cleansing
- about / What is text wrangling?, Text cleansing
- text clustering
- about / Text clustering
- K-means clustering / Text clustering, K-means
- hierarchical clustering / Text clustering
- text wrangling
- about / What is text wrangling?
- TF-IDF corpus
- about / Installing gensim
- tokenization
- about / What is text wrangling?, Tokenization
- topic modeling
- about / Topic modeling, Topic modeling in text
- gensim, installing / Installing gensim
- tuple
- about / What is Part of speech tagging
- Tweepy
- Twitter
- about / Twitter
- data, gathering / Twitter
- trending topics, searching / Trending topics
- influencers, detecting / Influencers detection
- Twitter libraries
- URL / Twitter
U
- Udacity
- URL / Web crawlers
- unsupervised learning
- about / Machine learning
- user defined function (UDF)
- about / Hive/Pig UDF
V
- vector space model (VSM)
- about / Vector space model
- verb phrase (VP)
- about / Chunking
W
- web crawler
- about / Web crawlers
- writing / Writing your first crawler
- word sense disambiguation (WSD)
- about / Word sense disambiguation
- World Wide Web (WWW)
- about / Web crawlers
- wrappers
- streaming / Streaming wrappers
X
- XPath
- about / The Scrapy shell
- xpath() method
- about / The Scrapy shell