Mastering spaCy

By: Duygu Altınok

Overview of this book

spaCy is an industrial-grade, efficient NLP library for Python that offers a range of pre-trained models and ready-to-use features. Mastering spaCy provides end-to-end coverage of spaCy's features and real-world applications. You'll begin by installing spaCy and downloading models, then progress to spaCy's features and to prototyping real-world NLP apps. Next, you'll get familiar with displaCy, spaCy's popular visualizer. The book also provides practical illustrations of pattern matching and helps you advance into the world of semantics with word vectors. Statistical information extraction methods are explained in detail. Later, you'll work through an interactive business case study that shows how to combine spaCy's features into a real-world NLP pipeline, implementing ML models for tasks such as sentiment analysis, intent recognition, and context resolution. The book further focuses on classification with spaCy together with popular frameworks such as TensorFlow's Keras API. You'll apply intent classification and sentiment analysis to popular datasets and interpret the classification results. By the end of this book, you'll be able to confidently use spaCy, including its linguistic features, word vectors, and classifiers, to create your own NLP apps.
Table of Contents (15 chapters)
Section 1: Getting Started with spaCy
Section 2: spaCy Features
Section 3: Machine Learning with spaCy

What this book covers

Chapter 1, Getting Started with spaCy, begins your spaCy journey with an overview of NLP with Python. You'll install the spaCy library and spaCy language models and explore displaCy, spaCy's visualization tool.

Chapter 2, Core Operations with spaCy, teaches you the core operations of spaCy, such as creating a language pipeline, tokenizing text, and splitting text into sentences. It also covers the container classes Token, Doc, and Span in detail.
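
As a rough illustration of the operations named in this chapter (a minimal sketch, not code taken from the book), the following uses a blank English pipeline so that no model download is needed. It assumes spaCy v3 is installed; the sample sentence and variable names are made up for the example.

```python
import spacy

# A blank English pipeline: tokenizer only, no trained components (assumes spaCy v3).
nlp = spacy.blank("en")
# Rule-based sentence splitter, so doc.sents works without a dependency parser.
nlp.add_pipe("sentencizer")

doc = nlp("spaCy splits text into tokens. It also finds sentence boundaries.")

tokens = [token.text for token in doc]         # each element of a Doc is a Token
sentences = [sent.text for sent in doc.sents]  # each sentence is a Span
first_two = doc[0:2]                           # slicing a Doc also yields a Span

print(tokens)
print(sentences)
print(first_two.text)
```

A trained pipeline would normally set sentence boundaries with its own components; the sentencizer is swapped in here only to keep the sketch self-contained.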

Chapter 3, Linguistic Features, takes a deep dive into spaCy's full power. This chapter explores spaCy's most used linguistic features, such as the POS tagger, dependency parser, and named entity recognizer, as well as merging and splitting.

Chapter 4, Rule-Based Matching, teaches you how to extract information from text by matching patterns and phrases. You will use morphological features, POS tags, regular expressions, and other spaCy features to build patterns for spaCy's Matcher objects.
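
To give a feel for rule-based matching, here is a minimal sketch (not the book's own code) of a token pattern fed to a Matcher. It assumes spaCy v3 and uses a blank pipeline, so only text-based attributes such as LOWER and TEXT are available; POS tags and morphological features would require a trained pipeline. The rule name "BOOKING" and the sample sentence are hypothetical.

```python
import spacy
from spacy.matcher import Matcher

# Blank pipeline: tokenizer only, so the pattern relies on token text alone.
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# One dict per token: "book", then "a", then a token matching a regex.
pattern = [
    {"LOWER": "book"},
    {"LOWER": "a"},
    {"TEXT": {"REGEX": "^(flight|table)$"}},
]
matcher.add("BOOKING", [pattern])  # hypothetical rule name

doc = nlp("I want to book a flight and then book a table.")

# Each match is a (match_id, start, end) triple of token indices into the Doc.
matched = [doc[start:end].text for _, start, end in matcher(doc)]
print(matched)
```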

Chapter 5, Working with Word Vectors and Semantic Similarity, teaches you about word vectors and associated semantic similarity methods. This chapter includes word vector computations such as distance calculations, analogy calculations, and visualization.
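
The distance and analogy calculations mentioned here can be sketched in plain Python. The vectors below are hypothetical, hand-made 3-dimensional values chosen only to make the arithmetic easy to follow; real word vectors have hundreds of dimensions and come from a trained model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors, for illustration only.
toy_vectors = {
    "king": [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "man": [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "apple": [0.1, 0.5, 0.5],
}

# Analogy arithmetic: king - man + woman.
target = [
    k - m + w
    for k, m, w in zip(toy_vectors["king"], toy_vectors["man"], toy_vectors["woman"])
]

# Rank the remaining words by similarity to the target vector.
candidates = {
    word: vec
    for word, vec in toy_vectors.items()
    if word not in {"king", "man", "woman"}
}
best = max(candidates, key=lambda word: cosine_similarity(target, candidates[word]))
print(best)
```

Excluding the three input words from the candidates is a common convention in analogy tests, since otherwise one of the inputs often ranks first.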

Chapter 6, Putting Everything Together: Semantic Parsing with spaCy, is a fully hands-on chapter. It teaches you how to use spaCy to design the NLU for a ticket reservation system, based on the Airline Travel Information System (ATIS) dataset, a well-known airplane ticket reservation dataset.

Chapter 7, Customizing spaCy Models, teaches you how to train, store, and use custom statistical pipeline components. You will learn how to update an existing statistical pipeline component with your own data as well as how to create a statistical pipeline component from scratch with your own data and labels.

Chapter 8, Text Classification with spaCy, teaches you how to perform a basic and popular NLP task: text classification. This chapter explores text classification with spaCy's TextCategorizer component as well as with TensorFlow and Keras.

Chapter 9, spaCy and Transformers, explores the latest hot topic in NLP – transformers – and how to use them with TensorFlow and spaCy. You'll learn how to work with BERT and TensorFlow as well as transformer-based pretrained pipelines of spaCy v3.0.

Chapter 10, Putting Everything Together: Designing Your Chatbot with spaCy, takes you into the world of Conversational AI. You will perform entity extraction, intent recognition, and context handling on a real-world restaurant reservation dataset.