Index
A
- AbiWord
- about / Getting ready
- above_score(score_fn, min_score) function / Scoring ngrams
- absolute link / There's more...
- AbstractLazySequence class / How it works...
- accuracy
- evaluating, of tagger / Evaluating accuracy
- a ClassifierBasedTagger class
- training / How to do it...
- AffixTagger class / Affix tagging
- min_stem_length keyword / Working with min_stem_length
- affix tagging
- about / How to do it..., How it works...
- anchor tag / How it works...
- antonym replacement
- AntonymReplacer class
- about / How it works...
- antonyms
- about / Antonyms, Replacing negations with antonyms
- negations, replacing with / Replacing negations with antonyms, How it works...
- antonyms() method / Antonyms
- append_line() function / How to do it..., How it works...
- ASCII
- converting / Converting to ASCII
- Aspell
- about / Getting ready
- URL / Getting ready
- astimezone() method / How it works...
- atomic, Redis operations
- Automatic Content Extraction / How to do it...
B
- backoff tagging
- about / See also
- taggers, combining with / Combining taggers with backoff tagging, How it works...
- backoff_tagger function / How it works..., There's more...
- backreference
- about / Getting ready
- Bag of words model
- about / Bag of words feature extraction
- bag_of_bigrams_words() function / Including significant bigrams
- bag_of_words() function / How to do it..., How it works...
- base name
- about / How to do it...
- Bayes theorem / Training a Naive Bayes classifier
- BeautifulSoup
- about / Introduction
- HTML entities, converting with / Converting HTML entities with BeautifulSoup, How to do it...
- URL, for installation / Getting ready
- URL, for usage / There's more...
- used, for extracting URLs / Extracting URLs with BeautifulSoup
- BigramCollocationFinder / How it works...
- BigramTagger class / Training and combining ngram taggers
- binary classifier
- about / Introduction, How to do it...
- binary classifiers / Classifying with multiple binary classifiers
- binary named entity extraction / Binary named entity extraction
- block reader functions, nltk.corpus.reader.util
- read_whitespace_block() / Block reader functions
- read_wordpunct_block() / Block reader functions
- read_line_block() / Block reader functions
- read_blankline_block() / Block reader functions
- read_regexp_block() / Block reader functions
- Brill tagger
- training / Training a Brill tagger, How it works...
- BrillTagger class
- about / Training a Brill tagger
- BrillTaggerTrainer class
- about / How it works...
- trace parameter, passing / Tracing
C
- cardinal
- about / Swapping noun cardinals
- Cardinal number (CD) / Tagging with regular expressions
- categorized chunk corpus reader
- creating / Creating a categorized chunk corpus reader, Getting ready, How it works..., There's more...
- CategorizedChunkedCorpusReader class / How it works...
- categorized Conll chunk corpus reader
- categorized corpora
- about / Categorized corpora
- CategorizedCorpusReader class
- about / How to do it...
- CategorizedPlaintextCorpusReader
- about / How it works...
- CategorizedPlaintextCorpusReader class
- about / How to do it...
- categorized tagged corpus reader
- about / Categorized tagged corpus reader
- categorized text corpus
- category file
- about / Category file
- cess_cat corpora / The cess_esp and cess_cat treebank
- cess_esp corpora / The cess_esp and cess_cat treebank
- channel
- about / Distributed tagging with execnet
- character encoding
- detecting / Detecting and converting character encodings, How to do it...
- converting / Detecting and converting character encodings, How to do it...
- UnicodeDammit / UnicodeDammit conversion
- charade
- about / Introduction, Getting ready
- URL / Getting ready
- ChinkRule class / How it works...
- chinks
- chi_sq() function / There's more...
- choose_tag() method / How it works...
- chunk
- chunked corpus
- analyzing / Analyzing a chunked corpus
- ChunkedCorpusReader class / How to do it..., How it works...
- chunked phrase corpus
- about / See also
- creating / Getting ready, How it works...
- chunker
- training, with NLTK-Trainer / Training a chunker with NLTK-Trainer, How to do it..., How it works...
- analyzing, against chunked corpus / Analyzing a chunker against a chunked corpus
- chunk extraction
- about / Introduction
- chunk patterns
- about / Chunking and chinking with regular expressions
- defining, with regular expressions / Getting ready, How to do it..., How it works..., There's more...
- alternative patterns, parsing / Parsing alternative patterns
- ChunkRule class / How it works...
- ChunkRule pattern / How to do it...
- chunk rules
- creating, with context / Chunk rule with context
- looping / Looping and tracing chunk rules
- tracing / Looping and tracing chunk rules
- chunks
- about / Introduction
- merging, with regular expressions / Merging and splitting chunks with regular expressions, How to do it..., How it works...
- splitting, with regular expressions / Merging and splitting chunks with regular expressions, How to do it..., How it works...
- rule descriptions, specifying / Specifying rule descriptions
- expanding, with regular expressions / Expanding and removing chunks with regular expressions, How it works..., There's more...
- removing, with regular expressions / Expanding and removing chunks with regular expressions, How it works..., There's more...
- ChunkScore metrics
- about / The ChunkScore metrics
- ChunkString / How it works...
- chunk transformations
- chaining / Chaining chunk transformations, How it works...
- chunk transforms
- about / Introduction
- chunk tree
- converting, to text / Converting a chunk tree to text, How it works...
- chunk types
- parsing / Parsing different chunk types
- chunk_tree_to_sent() function / How it works...
- class-imbalance problem / There's more...
- classification-based chunking
- about / Classification-based chunking
- performing / How to do it...
- classification probability
- about / Classification probability
- classifier-based tagging
- about / Classifier-based tagging, There's more...
- features, detecting with custom feature detector / Detecting features with a custom feature detector
- cutoff probability, setting / Setting a cutoff probability
- pre-trained classifier, using / Using a pre-trained classifier
- ClassifierBasedPOSTagger class / Classifier-based tagging
- ClassifierBasedTagger class / Classifier-based tagging, How it works...
- ClassifierChunker class
- creating / Classification-based chunking
- classifiers
- combining, with voting / Combining classifiers with voting, How to do it..., How it works...
- training, with NLTK-Trainer / Training a classifier with NLTK-Trainer, How it works...
- classify() method / How to do it...
- Cleaner class
- about / There's more...
- URL / There's more...
- clean_html() function / How to do it...
- clear() method / How it works..., How it works...
- collocations
- about / Discovering word collocations
- concatenated corpus view
- about / Concatenated corpus view
- conditional exponential classifier
- conditional frequency distribution
- storing, in Redis / Storing a conditional frequency distribution in Redis, How to do it..., How it works...
- CoNLL
- about / CoNLL2000 corpus
- CoNLL2000
- about / Introduction
- CoNLL2000 corpus
- about / CoNLL2000 corpus
- URL / CoNLL2000 corpus
- ContextTagger
- context model, overriding / Overriding the context model
- minimum frequency cutoff / Minimum frequency cutoff
- convert() function / How it works...
- convert_tree_labels() function / How to do it..., How it works...
- corpora
- about / Setting up a custom corpus
- corpus
- about / Introduction, Setting up a custom corpus
- editing, with file locking / Corpus editing with file locking, How to do it..., How it works...
- CorpusReader class / How to do it...
- corpus view
- about / Creating a custom corpus view
- corpus views / There's more...
- correct_verbs() function / How to do it..., How it works...
- cross-fold validation
- about / Cross-fold validation
- CSV synonym replacement
- about / CSV synonym replacement
- CsvWordReplacer class
- about / CSV synonym replacement
- custom corpus
- about / Setting up a custom corpus
- setting up / How to do it...
- YAML file, loading / Loading a YAML file
- training / Training on a custom corpus
- custom corpus view
- creating / Creating a custom corpus view, How to do it..., How it works...
- custom feature detector
- features, detecting with / Detecting features with a custom feature detector
- CustomSpellingReplacer class / Personal word lists
D
- data structure server
- dates
- parsing, with dateutil / Parsing dates and times with dateutil, How it works...
- dateutil
- about / Introduction
- times, parsing with / Parsing dates and times with dateutil, How it works...
- dates, parsing with / Parsing dates and times with dateutil, How it works...
- installing / Getting ready
- URL, for documentation / Getting ready
- decision tree classifier
- training / Training a decision tree classifier, How to do it..., How it works...
- uncertainty, controlling with entropy cutoff / Controlling uncertainty with entropy_cutoff
- tree depth, controlling with depth cutoff / Controlling tree depth with depth_cutoff
- decisions, controlling with support cutoff / Controlling decisions with support_cutoff
- DecisionTreeClassifier.train()
- about / There's more...
- DecisionTreeClassifier class
- about / Training a decision tree classifier, How to do it..., How it works...
- evaluating, with high information words / The DecisionTreeClassifier class with high information words
- deep tree
- flattening / Flattening a deep tree, How to do it..., How it works..., There's more...
- DefaultTagger class / Default tagging, How it works...
- default tagging
- about / Default tagging, How to do it..., How it works...
- delete command / How it works...
- depth_cutoff value
- detect() function / How it works...
- dict style feature / Bag of words feature extraction
- DictVectorizer object / How it works...
- different classifier builder
- different tagger classes
- using / Using different taggers
- distributed chunking
- used, with execnet / Distributed chunking with execnet, How to do it..., How it works..., There's more...
- Python subprocesses / Python subprocesses
- distributed tagging
- used, with execnet / Distributed tagging with execnet, How to do it..., How it works..., Creating multiple channels, Local versus remote gateways
- multiple channels, creating / Creating multiple channels
- local gateway versus remote gateway / Local versus remote gateways
- distributed word scoring
- used, with Redis / Distributed word scoring with Redis and execnet, How to do it..., How it works..., See also
- used, with execnet / Distributed word scoring with Redis and execnet, How to do it..., How it works..., See also
- dist_featx.py module
- about / How it works...
- done message
- about / How it works...
E
- ELE (Expected Likelihood Estimate)
- about / Training estimator
- ELEProbDist parameter
- about / Training estimator
- Enchant
- about / Spelling correction with Enchant
- spelling issues, correcting with / Getting ready, How it works...
- URL / Getting ready
- enchant.list_languages() method / There's more...
- English words corpus / English words corpus
- entropy
- entropy_cutoff value
- en_GB dictionary
- about / The en_GB dictionary
- estimator
- training / Training estimator
- about / Training estimator
- evaluate() method / Evaluating accuracy
- execnet
- distributed tagging, used with / Distributed tagging with execnet, How to do it..., How it works..., Creating multiple channels, Local versus remote gateways
- about / Distributed tagging with execnet
- URL / Getting ready
- distributed chunking, used with / Distributed chunking with execnet, How to do it..., How it works..., There's more...
- parallel list processing, used with / Parallel list processing with execnet, How it works..., There's more...
- distributed word scoring, used with / Distributed word scoring with Redis and execnet, How to do it..., How it works..., See also
- ExpandLeftRule / How to do it...
- ExpandRightRule / How to do it...
F
- F-measure
- about / F-measure
- false negatives
- false positives
- features
- detecting, with custom feature detector / Detecting features with a custom feature detector
- feature set
- about / Introduction
- feature_probdist constructor / How it works...
- feature_probdist variable
- about / Manual training
- file locking
- corpus, editing with / Corpus editing with file locking, How to do it..., How it works...
- filter_insignificant() function / How it works...
- first_chunk_index() function / How to do it..., How to do it..., How it works..., How it works...
- flatten_childtrees() function / How to do it..., How it works...
- flatten_deeptree() function / How it works..., How it works...
- frequency analysis
- URL, for details / Replacing synonyms
- frequency distribution
- storing, in Redis / Storing a frequency distribution in Redis, How to do it..., How it works..., There's more...
- fromstring() function / How to do it..., Parsing HTML from URLs or files
G
- gateway
- about / Distributed tagging with execnet
- gateways, API documentation
- GIS (General Iterative Scaling) / How it works...
- gis algorithm / How to do it...
H
- hash maps
- higher order function / How to do it...
- high information words
- calculating / Calculating high information words, How to do it..., How it works...
- about / Calculating high information words
- used, for evaluating MaxentClassifier class / The MaxentClassifier class with high information words
- used, for evaluating DecisionTreeClassifier class / The DecisionTreeClassifier class with high information words
- used, for evaluating SklearnClassifier class / The SklearnClassifier class with high information words
- high_information_words() function
- about / How it works...
- HTML
- URLs extracting, lxml used / Extracting URLs from HTML with lxml, How it works...
- parsing, from URLs / Parsing HTML from URLs or files
- cleaning / Cleaning and stripping HTML, How it works...
- stripping / Cleaning and stripping HTML, How it works...
- HTML entities
- converting, with BeautifulSoup / Converting HTML entities with BeautifulSoup, How to do it...
- hypernyms
- working with / Working with hypernyms
- hypernym tree / Calculating WordNet Synset similarity
- hypernym_paths() method / Working with hypernyms
- hyponyms
- about / Working with hypernyms
I
- ieer corpus / Training a named entity chunker
- IgnoreHeadingCorpusView class
- about / How it works...
- IIS (Improved Iterative Scaling) / How it works...
- infinitive phrases
- about / Swapping infinitive phrases
- swapping / How to do it..., How it works...
- Information Extraction: Entity Recognition / Training a named entity chunker
- insignificant words
- filtering, from sentence / Getting ready, How to do it...
- instance
- about / Introduction
- International Organization for Standardization (ISO) / Timezone lookup and conversion
- IOB tags
- about / There's more...
- items() method / How it works..., How it works...
- iterlinks() method / How to do it...
J
- jaccard() function / There's more...
K
- keys() method / How it works..., How it works...
L
- labeled feature set
- about / Introduction
- labeled feature sets
- about / Introduction
- LabelEncoder object / How it works...
- label_feats_from_corpus() function / How to do it..., How it works...
- label_probdist constructor / How it works...
- label_probdist variable
- about / Manual training
- LancasterStemmer class
- Lancaster stemming algorithm
- about / There's more...
- languages
- sentences, tokenizing in / Tokenizing sentences in other languages
- LazyCorpusLoader class / Lazy corpus loading
- about / How to do it...
- lazy corpus loading
- about / Lazy corpus loading, How to do it...
- leaves() method / Tree leaves
- lemmas
- about / Looking up lemmas and synonyms in WordNet
- looking up / How to do it..., How it works...
- finding, with WordNetLemmatizer class / How to do it...
- lemmas() method / How to do it...
- lemmatization
- about / Lemmatizing words with WordNet
- versus stemming / There's more...
- stemming, combining with / Combining stemming with lemmatization
- less informative features
- about / Most informative features
- LinearSVC
- training with / Training with LinearSVC
- about / Training with LinearSVC
- local gateway
- versus remote gateway / Local versus remote gateways
- LocationChunker class / How to do it..., How it works...
- location chunks
- extracting / Extracting location chunks, How to do it...
- lockfile library / Getting ready
- URL, for documentation / Getting ready
- logistic regression
- about / Training with logistic regression
- training with / Training with logistic regression
- logistic regression classifier
- log likelihood / How it works...
- low information words
- lxml
- about / Introduction, Getting ready
- used, for extracting URLs from HTML / Extracting URLs from HTML with lxml, How it works...
- URL, for installation / Getting ready, Getting ready
- URL, for tutorial / How it works...
M
- masi distance / How to do it...
- MaxentClassifier class
- about / Training a maximum entropy classifier
- evaluating, with high information words / The MaxentClassifier class with high information words
- maximum entropy classifier
- max_iter variable / How it works...
- megam algorithm
- about / Megam algorithm
- URL / Megam algorithm
- MergeRule class
- Message Passing Interface (MPI)
- about / Distributed tagging with execnet
- min_lldelta variable / How it works...
- min_stem_length keyword
- working with / Working with min_stem_length
- model
- creating, of likely word tags / How to do it..., How it works...
- MongoDB
- about / Getting ready
- URL, for installation / Getting ready
- MongoDB-backed corpus reader
- MongoDBCorpusReader class / How it works...
- creating / There's more...
- more informative features
- about / Most informative features
- most informative features
- about / Most informative features
- most_informative_features() method
- about / Most informative features
- movie_reviews corpus / Getting ready
- multi-label classifier
- about / Introduction
- multilabel classifier / Classifying with multiple binary classifiers
- creating / Classifying with multiple binary classifiers
- MultinomialNB / How it works...
- multiple binary classifiers
- classifying with / Classifying with multiple binary classifiers, How to do it...
- multiple channels
- creating / Creating multiple channels
- multi_metrics() function / How to do it...
N
- Naive Bayes algorithms
- comparing / Comparing Naive Bayes algorithms
- Naive Bayes classifier
- NaiveBayesClassifier.train() method / How it works...
- NaiveBayesClassifier class / Training a Naive Bayes classifier
- NaiveBayesClassifier constructor / How it works...
- NAME chunker / How it works...
- named entities
- extracting / Extracting named entities, How it works...
- named entity chunker
- named entity recognition
- about / Extracting named entities
- NamesTagger class / How to do it...
- names wordlist corpus / Names wordlist corpus
- National Institute of Standards and Technology (NIST) / How to do it...
- negations
- replacing, with antonyms / Replacing negations with antonyms, How it works...
- negative feature sets / How it works...
- ngram
- NgramTagger class / Quadgram tagger
- ngram taggers
- training / Training and combining ngram taggers, How it works...
- combining / Training and combining ngram taggers, How it works...
- NLTK
- about / Introduction, Introduction
- URL, for installation instructions / Getting ready
- URL, for data installation / Getting ready
- URL, for starting Python console / Getting ready
- NLTK-Trainer
- about / Training a tagger with NLTK-Trainer
- URL, for documentation / Training a tagger with NLTK-Trainer
- tagger, training with / Training a tagger with NLTK-Trainer, How to do it..., How it works...
- URL, for installation instructions / Training a tagger with NLTK-Trainer
- pickled tagger, saving / Saving a pickled tagger
- training, on custom corpus / Training on a custom corpus
- used, for training chunker / Training a chunker with NLTK-Trainer, How to do it..., How it works...
- used, for training classifier / Training a classifier with NLTK-Trainer, How it works...
- pickled classifier, saving / Saving a pickled classifier
- training instances, using / Using different training instances
- most informative features / The most informative features
- Maxent classifier / The Maxent and LogisticRegression classifiers
- LogisticRegression classifier / The Maxent and LogisticRegression classifiers
- SVM classifiers / SVMs
- classifiers, combining / Combining classifiers
- high information words / High information words and bigrams
- cross-fold validation / Cross-fold validation
- classifier, analyzing / Analyzing a classifier
- nltk.chunk functions / How it works...
- nltk.corpus
- treebank corpora, defining / There's more...
- nltk.corpus.treebank_chunk corpus / Treebank chunk corpus
- nltk.data.load() function / How it works...
- nltk.metrics module / See also
- nltk.metrics package
- about / How it works...
- nltk.tag.untag() function / There's more...
- NLTK functionality
- URL, for demos / Introduction
- noun cardinals
- swapping / Swapping noun cardinals, How it works...
- Noun Phrase (NP)
- about / Creating a chunked phrase corpus
- Noun Phrases (NP)
- about / CoNLL2000 corpus
- NumPy package
- URL / Getting ready
- n_ii parameter / How it works...
- n_ix parameter / How it works...
- n_xi parameter / How it works...
- n_xx parameter / How it works...
O
- ordered dictionary
- storing, in Redis / Storing an ordered dictionary in Redis, Getting ready, How to do it..., How it works..., See also
P
- P(features) parameter / Training a Naive Bayes classifier
- P(features | label) parameter / Training a Naive Bayes classifier
- P(label) parameter / Training a Naive Bayes classifier
- P(label | features) parameter / Training a Naive Bayes classifier
- paragraph block reader
- customizing / Customizing the paragraph block reader
- parallel list processing
- used, with execnet / Parallel list processing with execnet, How it works..., There's more...
- parsed_docs() method / How it works...
- parse trees
- training / Training on parse trees
- part-of-speech tagged word corpus
- creating / Getting ready, How it works...
- part-of-speech tagging
- about / Introduction
- partial parsing
- about / Introduction
- performing, with regular expressions / Partial parsing with regular expressions, How it works...
- part of speech tagging
- Path and Leacock Chodorow (LCH) similarity / Path and Leacock Chodorow (LCH) similarity
- pattern creation
- about / Parsing alternative patterns
- Penn Treebank
- about / Introduction
- Penn Treebank corpus
- URL / See also
- Penn Treebank Project
- URL / Treebank chunk corpus
- personal word lists / Personal word lists
- PersonChunker class / There's more...
- phi_sq() function
- about / How it works...
- phrases
- about / Introduction
- pickle corpus view
- about / Pickle corpus view
- pickled chunker
- saving / Saving a pickled chunker
- pickled tagger
- trained tagger, loading with / Saving and loading a trained tagger with pickle
- saving / Saving a pickled tagger
- pivot point
- about / How to do it...
- PlaintextCorpusReader class
- about / How to do it...
- plural nouns
- singularizing / Singularizing plural nouns, How it works...
- pmi() function / There's more...
- PorterStemmer class
- about / How it works...
- Porter stemming algorithm
- about / Stemming words
- positive feature sets / How it works...
- POS tag
- about / Part of speech (POS)
- pre-trained classifier
- using / Using a pre-trained classifier
- precision
- precision and recall, MaxentClassifier class
- calculating / There's more...
- precision and recall, NaiveBayesClassifier class
- calculating / How to do it...
- precision_recall() function
- about / How to do it..., How it works...
- Prepositional Phrases (PP)
- about / CoNLL2000 corpus
- proper names
- tagging / Tagging proper names, How it works...
- proper noun chunks
- extracting / Extracting proper noun chunks, There's more...
- PunktSentenceTokenizer class / How it works..., There's more...
- PunktWordTokenizer
- about / PunktWordTokenizer
- pure function
- about / How it works...
- pure module
- about / How it works...
- PyEnchant library
- about / Getting ready
- URL / Getting ready
- PyMongo documentation
- URL / Getting ready
- Python subprocesses, distributed chunking / Python subprocesses
- PyYAML
- URL, for downloading / YAML synonym replacement
Q
- Quadgram tagger
- about / Quadgram tagger
R
- recall
- Redis
- frequency distribution, storing / Storing a frequency distribution in Redis, How to do it..., How it works..., There's more...
- URL / Getting ready
- conditional frequency distribution, storing / Storing a conditional frequency distribution in Redis, How to do it..., How it works...
- ordered dictionary, storing / Storing an ordered dictionary in Redis, How to do it..., There's more..., See also
- distributed word scoring, used with / Distributed word scoring with Redis and execnet, How to do it..., How it works..., See also
- redis-py homepage
- URL / Getting ready
- Redis commands
- URL / How it works...
- reference set
- about / How it works...
- RegexpParser class / How it works...
- about / How it works...
- RegexpReplacer class
- about / Replacement before tokenization
- RegexpStemmer class
- about / The RegexpStemmer class
- RegexpTagger class / How to do it...
- RegexpTokenizer class / How it works...
- regular expressions
- used, for tokenizing sentences / Tokenizing sentences using regular expressions, There's more...
- words, tagging with / Tagging with regular expressions, How it works...
- used, for defining chunk patterns / Chunking and chinking with regular expressions, How to do it..., How it works..., There's more...
- used, for merging chunks / Merging and splitting chunks with regular expressions, How to do it..., How it works...
- used, for splitting chunks / How to do it..., How it works...
- used, for expanding chunks / Expanding and removing chunks with regular expressions, How it works..., There's more...
- used, for removing chunks / Expanding and removing chunks with regular expressions, How it works..., There's more...
- used, for partial parsing / Partial parsing with regular expressions, How it works...
- relative link / There's more...
- remote gateway
- versus local gateway / Local versus remote gateways
- remove_line() function / How to do it..., How it works...
- repeating characters
- removing / Removing repeating characters, How it works..., There's more...
- RepeatReplacer class
- about / How it works...
- replace() method / How to do it..., How it works..., How it works...
- replacement technique
- before tokenization / Replacement before tokenization
- replace_negations() method / How it works...
- reuters_high_info_words() function / How it works...
- reuters_train_test_feats() function / How it works...
S
- scikit-learn
- scikit-learn classifiers
- scikit-learn model / There's more...
- score_ngrams(score_fn) function / Scoring ngrams
- score_words() function
- about / How it works...
- sense disambiguation
- URL, for details / Introduction
- sentence
- insignificant words, filtering from / Getting ready, How to do it...
- sentences
- text, tokenizing into / Getting ready, How to do it...
- tokenizing, in other languages / Tokenizing sentences in other languages
- tokenizing, into words / Tokenizing sentences into words, There's more...
- tokenizing, regular expressions used / Tokenizing sentences using regular expressions, There's more...
- tagging / Tagging sentences
- sentences, tokenizing into words
- contractions, separating / Separating contractions
- PunktWordTokenizer / PunktWordTokenizer
- WordPunctTokenizer / WordPunctTokenizer
- sentence tokenizer
- training / Training a sentence tokenizer, How to do it..., How it works...
- customizing / Customizing the sentence tokenizer
- sent_tokenize function / How it works...
- SequentialBackoffTagger class / How it works..., How to do it..., How to do it...
- shallow tree
- creating / Creating a shallow tree, How to do it...
- shallow_tree() function / How to do it..., How it works...
- show_most_informative_features() method
- about / Most informative features
- significant bigrams
- including / Including significant bigrams
- significant words
- singularize_plural_noun() function / How to do it...
- SklearnClassifier class
- using / Getting ready
- training / How to do it...
- working / How it works...
- evaluating, with high information words / The SklearnClassifier class with high information words
- SnowballStemmer class
- about / The SnowballStemmer class
- spelling issues
- correcting, with Enchant / Getting ready, How it works...
- SpellingReplacer class
- about / How it works...
- SplitRule class
- about / How to do it...
- split_label_feats() function / How to do it..., How it works...
- squared Pearson correlation coefficient
- reference link / How it works...
- stem() method / How to do it...
- stemming
- about / Stemming words
- versus lemmatization / There's more...
- combining, with lemmatization / Combining stemming with lemmatization
- stopwords
- about / Filtering stopwords in a tokenized sentence
- filtering, in tokenized sentence / Filtering stopwords in a tokenized sentence
- filtering / Filtering stopwords
- stopwords corpus / How it works..., See also
- about / There's more...
- StreamBackedCorpusView class / Creating a custom corpus view
- subtrees
- about / Creating a chunked phrase corpus
- sub_leaves() method / See also
- Support Vector Machines (SVM)
- about / Training with LinearSVC
- URL / Training with LinearSVC
- support_cutoff value
- swap_infinitive_phrase() function / How to do it...
- swap_noun_cardinal() function / How to do it...
- swap_verb_phrase() function / How to do it..., There's more..., How it works...
- synonyms
- looking up / How to do it..., How it works...
- about / All possible synonyms
- words, replacing with / Replacing synonyms, How it works...
- Synset
T
- tag
- about / Introduction
- tag() method / How to do it...
- tagged corpus
- tagger, analyzing against / Analyzing a tagger against a tagged corpus
- analyzing / Analyzing a tagged corpus
- TaggedCorpusReader class / How to do it..., How it works..., Creating a custom corpus view
- tagged sentence
- untagging / Untagging a tagged sentence
- tagger
- accuracy, evaluating of / Evaluating accuracy
- training, with NLTK-Trainer / Training a tagger with NLTK-Trainer, How to do it..., How it works...
- training, with universal tags / Training with universal tags
- analyzing, against tagged corpus / Analyzing a tagger against a tagged corpus
- tagger-based chunker
- taggers
- combining, with backoff tagging / Combining taggers with backoff tagging, How it works...
- tagging
- WordNet, using for / Using WordNet for tagging, How to do it..., How it works...
- tags
- converting, to universal tagset / Converting tags to a universal tagset
- tags, for treebank corpus
- URL / There's more...
- tag separator
- customizing / Customizing the tag separator
- tagset
- tag suffixes
- passing / There's more...
- tag_equals() function / How to do it...
- tag_pattern2re_pattern() function
- about / Getting ready
- tag_sents() method / Tagging sentences
- tag_startswith() function / How to do it..., How to do it...
- test set
- about / How it works...
- text
- tokenizing, into sentences / How to do it...
- chunk tree, converting to / Converting a chunk tree to text, How it works...
- text classification
- about / Introduction
- text feature extraction
- about / Bag of words feature extraction
- working / How it works...
- text indexing
- URL, for details / Replacing synonyms
- times
- parsing, with dateutil / Parsing dates and times with dateutil, How it works...
- timezone
- obtaining / Timezone lookup and conversion, How to do it..., How it works...
- converting / Timezone lookup and conversion, How to do it..., How it works...
- local timezone, searching / Local timezone
- custom offset, creating / Custom offsets
- TnT tagger
- about / Training the TnT tagger
- training / How to do it..., How it works...
- optional keyword arguments / There's more...
- beam search, controlling / Controlling the beam search
- capitalization, significance / Significance of capitalization
- token
- about / Tokenizing text into sentences
- tokenization
- tokenized sentence
- stopwords, filtering in / Filtering stopwords in a tokenized sentence
- train() class method
- about / How it works...
- trained tagger
- saving / Saving and loading a trained tagger with pickle
- loading, with pickle / Saving and loading a trained tagger with pickle
- train_binary_classifiers() function / How it works...
- train_chunker.py script / There's more...
- train_classifier.py script
- about / How it works..., There's more...
- transform_chunk() function / How to do it..., How it works...
- Treebank chunk corpus / Treebank chunk corpus
- TreebankWordTokenizer class / Separating contractions
- treebank_chunk corpus
- using / How to do it...
- tree labels
- converting / Converting tree labels, Getting ready, How to do it...
- tree leaves
- about / Tree leaves
- tree transforms
- about / Introduction
- TrigramTagger class / Training and combining ngram taggers
- true negative
- true positive
U
- unambiguous antonyms / There's more...
- UnChunkRule pattern / How to do it...
- UnicodeDammit
- about / Introduction, UnicodeDammit conversion
- unigram
- unigram part-of-speech tagger
- training / How to do it..., How it works...
- UnigramTagger
- about / How to do it...
- UnigramTagger class
- model, constructing for / There's more...
- universal tags
- tagger, training with / Training with universal tags
- universal tagset
- tags, converting to / Converting tags to a universal tagset
- about / Converting tags to a universal tagset
- unlabeled feature set
- about / Introduction
- URLs
- extracting, from HTML with lxml / Extracting URLs from HTML with lxml, How it works...
- extracting, directly / Extracting links directly
- HTML, parsing from / Parsing HTML from URLs or files
- extracting, with xpath() method / Extracting links with XPaths
- extracting, BeautifulSoup used / Extracting URLs with BeautifulSoup
V
- values() method / How it works...
- verb forms
- correcting / Getting ready, How to do it..., How it works...
- verb phrases
- swapping / How to do it..., How it works...
- Verb Phrases (VP)
- about / CoNLL2000 corpus
W
- whitespace tokenizer / Simple whitespace tokenizer
- word collocations
- discovering / Discovering word collocations, How to do it...
- word collocations, discovering
- functions, scoring / Scoring functions
- ngrams, scoring / Scoring ngrams
- wordlist corpus
- creating / Getting ready, How it works...
- WordListCorpusReader class / Creating a wordlist corpus, How to do it...
- WordNet
- about / Introduction, Looking up Synsets for a word in WordNet
- use cases / Introduction
- looking up for Synset / Looking up Synsets for a word in WordNet, How it works...
- looking up for lemmas / How to do it..., There's more...
- looking up for synonyms / How to do it..., How it works...
- words, lemmatizing with / Getting ready, How it works...
- using, for tagging / Using WordNet for tagging, How to do it..., How it works...
- WordNetLemmatizer class
- used, for finding lemmas / How to do it...
- about / How it works...
- WordNet Synset similarity
- calculating / How to do it..., How it works...
- verbs, comparing / Comparing verbs
- Path and Leacock Chodorow (LCH) similarity / Path and Leacock Chodorow (LCH) similarity
- WordNetTagger class / How it works...
- WordPunctTokenizer
- about / WordPunctTokenizer
- WordReplacer class / How it works...
- words
- sentences, tokenizing into / Tokenizing sentences into words, There's more...
- stemming / Stemming words, How to do it...
- lemmatizing, with WordNet / Getting ready, How it works...
- replacing, with synonyms / Replacing synonyms, How it works...
- tagging, with regular expressions / Tagging with regular expressions, How it works...
- words matching regular expressions
- word tokenizer
- customizing / Customizing the word tokenizer
- word_tag_model() function / How it works...
- word_tokenize() function / How to do it..., How it works...
- Wu-Palmer Similarity
- about / How it works...
- wup_similarity method / How it works...
X
- xpath() method
- used, for extracting URLs / Extracting links with XPaths
- reference link / Extracting links with XPaths
Y
- YAML file
- loading / Loading a YAML file
- YAML synonym replacement
- about / YAML synonym replacement
Z
- zadd command / How it works...
- zcard command / How it works...
- zrem command / How it works...
- zrevrange command / How it works...
- zscore command / How it works...
- Zset
- about / How to do it...