Mastering spaCy

By: Duygu Altınok

Overview of this book

spaCy is an industrial-grade, efficient Python library for NLP that offers a range of pre-trained models and ready-to-use features. Mastering spaCy gives you end-to-end coverage of spaCy's features and real-world applications.

You'll begin by installing spaCy and downloading models, then move on to spaCy's features and to prototyping real-world NLP apps. Next, you'll learn to visualize with displaCy, spaCy's popular visualizer. The book also gives practical illustrations of pattern matching and helps you advance into semantics with word vectors. Statistical information extraction methods are explained in detail.

Later, you'll work through an interactive business case study that shows how to combine spaCy's features into a real-world NLP pipeline, implementing ML models for sentiment analysis, intent recognition, and context resolution. The book then focuses on classification with spaCy together with popular frameworks such as TensorFlow's Keras API: you'll apply intent classification and sentiment analysis to popular datasets and interpret the classification results.

By the end of this book, you'll be able to confidently use spaCy, including its linguistic features, word vectors, and classifiers, to create your own NLP apps.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Section 1: Getting Started with spaCy

Chapter 1: Getting Started with spaCy

Technical requirements

Overview of spaCy

Installing spaCy

Installing spaCy's statistical models

Visualization with displaCy

Summary

Chapter 2: Core Operations with spaCy

Technical requirements

Overview of spaCy conventions

Introducing tokenization

Understanding lemmatization

spaCy container objects

More spaCy features

Summary

Section 2: spaCy Features

Chapter 3: Linguistic Features

Technical requirements

What is POS tagging?

Introduction to dependency parsing

Introducing NER

Merging and splitting tokens

Summary

Chapter 4: Rule-Based Matching

Token-based matching

PhraseMatcher

EntityRuler

Combining spaCy models and matchers

Summary

Chapter 5: Working with Word Vectors and Semantic Similarity

Technical requirements

Understanding word vectors

Using spaCy's pretrained vectors

Using third-party word vectors

Advanced semantic similarity methods

Summary

Chapter 6: Putting Everything Together: Semantic Parsing with spaCy

Technical requirements

Extracting named entities

Using dependency relations for intent recognition

Putting it all together

Summary

Section 3: Machine Learning with spaCy

Chapter 7: Customizing spaCy Models

Technical requirements

Getting started with data preparation

Annotating and preparing data

Updating an existing pipeline component

Training a pipeline component from scratch

Summary

Chapter 8: Text Classification with spaCy

Technical requirements

Understanding the basics of text classification

Training the spaCy text classifier

Sentiment analysis with spaCy

Text classification with spaCy and Keras

Summary

References

Chapter 9: spaCy and Transformers

Technical requirements

Transformers and transfer learning

Understanding BERT

Transformers and TensorFlow

Transformers and spaCy

Summary

Chapter 10: Putting Everything Together: Designing Your Chatbot with spaCy

Technical requirements

Introduction to conversational AI

Summary

Other Books You May Enjoy

Packt is searching for authors like you

Leave a review - let other readers know what you think

Summary

In this chapter, we explored how to customize spaCy's statistical models for our own domain and data. First, we learned the key points for deciding whether custom model training is really needed. Then we went through an essential part of statistical algorithm design: data collection and labeling.
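As a rough illustration of what labeling produces, the sketch below builds NER annotations as character offsets, in the `(text, {"entities": [(start, end, label)]})` shape that spaCy's training utilities accept. The helper name `annotate`, the sample sentence, and the `GPE` label are hypothetical choices for illustration, not taken from the chapter.

```python
def annotate(text, phrases):
    """Build one spaCy-style NER training tuple from (phrase, label) pairs.

    Hypothetical helper: each phrase is located by its first occurrence in
    the text and recorded as (start_char, end_char, label).
    """
    entities = []
    for phrase, label in phrases:
        start = text.find(phrase)
        if start == -1:
            raise ValueError(f"{phrase!r} not found in {text!r}")
        entities.append((start, start + len(phrase), label))
    return (text, {"entities": sorted(entities)})


# Invented navigation-style utterance with a single labeled span.
sample = annotate("Navigate to Munich", [("Munich", "GPE")])
print(sample)
```

Annotation tools export equivalent offsets. A helper like this only stands in for that step on toy data, and it would need extra care for phrases that occur more than once in a text.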

We also learned about two annotation tools, Prodigy and Brat. Next, we started model training by updating spaCy's NER component with samples from our navigation domain data. We learned the necessary model training steps: disabling the other pipeline components, creating Example objects to hold our examples, and feeding those examples to the training code.

Finally, we learned how to train an NER model from scratch on a small toy dataset and on a real medical domain dataset.
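The training steps recapped above can be sketched roughly as follows. This is a minimal sketch assuming spaCy v3 and a blank English pipeline. The three sentences, the `GPE` label, the epoch count, and the dropout value are invented for illustration and are not the book's dataset or settings.

```python
import random

import spacy
from spacy.training import Example

# Toy, hand-labeled data (hypothetical): character offsets plus a label.
TRAIN_DATA = [
    ("Navigate to Munich", {"entities": [(12, 18, "GPE")]}),
    ("Drive me to Berlin", {"entities": [(12, 18, "GPE")]}),
    ("Take me to Vienna", {"entities": [(11, 17, "GPE")]}),
]

# Start from a blank pipeline and add an untrained NER component.
nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")
for _, annotations in TRAIN_DATA:
    for _, _, label in annotations["entities"]:
        ner.add_label(label)

# Wrap each sample in an Example object, which holds the text and its gold labels.
examples = [
    Example.from_dict(nlp.make_doc(text), annotations)
    for text, annotations in TRAIN_DATA
]

optimizer = nlp.initialize(lambda: examples)

# Train only the NER component; any other components would be disabled here.
random.seed(0)
losses = {}
with nlp.select_pipes(enable="ner"):
    for epoch in range(10):
        random.shuffle(examples)
        losses = {}
        nlp.update(examples, sgd=optimizer, losses=losses, drop=0.2)

print(losses)
```

To update an existing pipeline instead of training from scratch, one would typically load a pretrained pipeline with `spacy.load` and obtain the optimizer from `nlp.resume_training()` rather than `nlp.initialize()`. A dataset this small is far too little for a usable model.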

With this chapter, we took a step into the statistical NLP playground. In the next chapter, we will take more steps in statistical modeling and learn about text classification...
