Mastering spaCy

By: Duygu Altınok
Overview of this book

spaCy is an industrial-grade, efficient NLP Python library. It offers various pre-trained models and ready-to-use features. Mastering spaCy provides you with end-to-end coverage of spaCy's features and real-world applications. You'll begin by installing spaCy and downloading models, before progressing to spaCy's features and prototyping real-world NLP apps. Next, you'll get familiar with visualizing with displaCy, spaCy's popular visualizer. The book also equips you with practical illustrations of pattern matching and helps you advance into the world of semantics with word vectors. Statistical information extraction methods are also explained in detail. Later, you'll work through an interactive business case study that shows you how to combine spaCy's features to create a real-world NLP pipeline. You'll implement ML models for tasks such as sentiment analysis, intent recognition, and context resolution. The book then focuses on text classification with spaCy alongside popular frameworks such as TensorFlow's Keras API. You'll apply classifiers to popular tasks, including intent classification and sentiment analysis, train them on well-known datasets, and interpret the classification results. By the end of this book, you'll be able to confidently use spaCy, including its linguistic features, word vectors, and classifiers, to create your own NLP apps.
Table of Contents (15 chapters)

Section 1: Getting Started with spaCy
Section 2: spaCy Features
Section 3: Machine Learning with spaCy

Understanding lemmatization

A lemma is the base form of a token. You can think of a lemma as the form in which the token appears in a dictionary. For instance, the lemma of eating is eat; the lemma of eats is eat; ate similarly maps to eat. Lemmatization is the process of reducing word forms to their lemmas. The following code is a quick example of how to do lemmatization with spaCy:

 import spacy
 nlp = spacy.load("en_core_web_md")
 doc = nlp("I went there for working and worked for 3 years.")
 for token in doc:
     print(token.text, token.lemma_)
I -PRON-
went go
there there
for for
working work
and and
worked work
for for
3 3
years year
. .

By now, you should be familiar with what the first three lines of the code do. Recall that we import the spacy library, load an English model with spacy.load to create a pipeline, and apply that pipeline to the preceding sentence to get a Doc object. Here we iterated over the tokens to get each token's text and...
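To build intuition for what the lemmatizer is doing under the hood, here is a deliberately simplified sketch: a pure lookup-table lemmatizer covering only the words in the example sentence. This is illustrative only and is not spaCy's actual implementation (spaCy combines lookup tables with part-of-speech-aware rules, which is why it needs a loaded model), and the LEMMA_LOOKUP table here is a made-up toy dictionary:

```python
# Toy lookup-based lemmatizer for illustration only.
# LEMMA_LOOKUP is a hypothetical, hand-written table; real
# lemmatizers also use POS tags and morphological rules.
LEMMA_LOOKUP = {
    "went": "go",
    "working": "work",
    "worked": "work",
    "years": "year",
    "eating": "eat",
    "eats": "eat",
    "ate": "eat",
}

def lemmatize(token: str) -> str:
    """Return the lemma if the token is in the table,
    otherwise return the token unchanged (e.g. 'for' -> 'for')."""
    return LEMMA_LOOKUP.get(token.lower(), token)

for word in "I went there for working and worked".split():
    print(word, lemmatize(word))
```

A pure lookup table fails on any word it has never seen, which is one reason spaCy's lemmatizer also applies rules informed by the token's part of speech rather than relying on a dictionary alone.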