Chapter 2: Core Operations with spaCy

Mastering spaCy - Second Edition

By: Déborah Mesquita, Duygu Altinok

Overview of this book

Mastering spaCy, Second Edition is your comprehensive guide to building sophisticated NLP applications using the spaCy ecosystem. This revised edition builds on the expertise of Duygu Altinok, a seasoned NLP engineer and spaCy contributor, and introduces new chapters by Déborah Mesquita, a data science educator and consultant known for making complex concepts accessible. This edition embraces the latest advancements in NLP, featuring chapters on large language models with spacy-llm, transformer integration, and end-to-end workflow management with Weasel. You’ll learn how to enhance NLP tasks using LLMs, streamline workflows using Weasel, and integrate spaCy with third-party libraries like Streamlit, FastAPI, and DVC. From training custom Named Entity Recognition (NER) pipelines to categorizing emotions in Reddit posts, this book covers advanced topics such as text classification and coreference resolution. Starting with the fundamentals—tokenization, NER, and dependency parsing—you’ll explore more advanced topics like creating custom components, training domain-specific models, and building scalable NLP workflows. Through practical examples, clear explanations, tips, and tricks, this book will equip you to build robust NLP pipelines and seamlessly integrate them into web applications for end-to-end solutions.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Conventions used

Get in touch

Share Your Thoughts

Download a free PDF copy of this book

Part 1: Getting Started with spaCy

Chapter 1: Getting Started with spaCy

Technical requirements

Overview of spaCy

Installing spaCy

Installing spaCy’s language models

Visualization with displaCy

Summary

Chapter 2: Core Operations with spaCy

Technical requirements

Overview of spaCy conventions

Introducing Tokenization

Understanding lemmatization

spaCy container objects

More spaCy Token features

Summary

Part 2: Advanced Linguistic and Semantic Analysis

Chapter 3: Extracting Linguistic Features

Technical requirements

What is POS tagging?

Introduction to dependency parsing

Introducing NER

Merging and splitting tokens

Summary

Chapter 4: Mastering Rule-Based Matching

Technical requirements

Token-based matching

Creating patterns with PhraseMatcher

Creating patterns with SpanRuler

Combining spaCy models and matchers

Summary

Chapter 5: Extracting Semantic Representations with spaCy Pipelines

Technical requirements

Extracting named entities with SpanRuler

Extracting dependency relations with DependencyMatcher

Creating a pipeline component using extension attributes

Running the pipeline with large datasets

Summary

Chapter 6: Utilizing spaCy with Transformers

Technical requirements

Transformers and transfer learning

Text classification with spaCy

Using Hugging Face transformers in spaCy

Summary

Part 3: Customizing and Integrating NLP Workflows

Chapter 7: Enhancing NLP Tasks Using LLMs with spacy-llm

Technical requirements

LLMs and prompt engineering basics

Text summarization with LLMs and spacy-llm

Creating custom spacy-llm tasks

Summary

Chapter 8: Training an NER Component with Your Own Data

Technical requirements

Getting started with data preparation

Training an NER pipeline component

Combining multiple NER components in the same pipeline

Summary

Chapter 9: Creating End-to-End spaCy Workflows with Weasel

Technical requirements

Cloning and running a project template with Weasel

Modifying a project template for a different use case

Managing models with the DVC model registry

Summary

Chapter 10: Training an Entity Linker Model with spaCy

Technical requirements

Understanding the entity linking task

Best practices for creating a good NLP corpus

Training an EntityLinker component with spaCy

Training with a custom corpus reader

Summary

Chapter 11: Integrating spaCy with Third-Party Libraries

Technical requirements

Building spaCy-powered Apps with Streamlit

Building APIs for NLP models using FastAPI

Summary

Index

Why subscribe?

Other Books You May Enjoy

Packt is searching for authors like you

Share Your Thoughts

Download a free PDF copy of this book
