Python Natural Language Processing Cookbook - Second Edition

By: Zhenya Antić, Saurabh Chakravarty

Overview of this book

Harness the power of Natural Language Processing (NLP) to overcome real-world text analysis challenges with this recipe-based roadmap, written by two seasoned NLP experts with vast experience applying NLP across industries. You'll be able to make the most of the latest NLP advancements, including large language models (LLMs), and leverage their capabilities through Hugging Face transformers. Through a series of hands-on recipes, you'll master essential techniques such as extracting entities and visualizing text data. The authors will expertly guide you through building pipelines for sentiment analysis, topic modeling, and question answering using popular libraries like spaCy, Gensim, and NLTK. You'll also learn to implement RAG pipelines to draw out precise answers from a text corpus using LLMs. This second edition expands your skillset with new chapters on cutting-edge LLMs like GPT-4, Natural Language Understanding (NLU), and Explainable AI (XAI), fostering trust in your NLP models. By the end of this book, you'll be equipped to apply advanced text processing techniques, use pre-trained transformer models, and build custom NLP pipelines to extract valuable insights from text data that drive informed decision-making.
Table of Contents (13 chapters)

Using word embeddings

In this recipe, we will switch gears and learn how to represent words using word embeddings. Embeddings are powerful because they are the result of training a neural network to predict a word from the surrounding words in a sentence. Embeddings are also vectors, but usually much smaller ones, typically 200 or 300 dimensions. Words that occur in similar contexts end up with similar embedding vectors. Similarity is usually measured by calculating the cosine of the angle between two vectors in the 200- or 300-dimensional embedding space. We will use the embeddings to show these similarities.
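The cosine measure mentioned above can be sketched in a few lines of NumPy. The vectors below are toy three-dimensional stand-ins (real word2vec embeddings have 300 dimensions), chosen only to illustrate that vectors pointing in similar directions score close to 1:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy 3-dimensional "embeddings" (real word2vec vectors have 300 dimensions).
king = np.array([0.9, 0.8, 0.1])
queen = np.array([0.85, 0.82, 0.15])
apple = np.array([0.1, 0.2, 0.95])

print(cosine_similarity(king, queen))  # close to 1: similar directions
print(cosine_similarity(king, apple))  # much lower: dissimilar directions
```

Cosine similarity ranges from -1 to 1 and ignores vector length, which is why it is preferred over Euclidean distance for comparing embeddings.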

Getting ready

In this recipe, we will use a pretrained word2vec model, which can be found at https://github.com/mmihaltz/word2vec-GoogleNews-vectors. Download the model and unzip it into the data directory. You should now have the file at the path …/data/GoogleNews-vectors-negative300.bin.gz.

We will also use the gensim package to load and use the model. It should be installed...
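Once gensim is installed and the model file is in place, loading and querying it can be sketched as follows. The path below is an assumption (adjust it to wherever you unzipped the download), and the query words are only illustrative:

```python
from gensim.models import KeyedVectors

# Assumed path; adjust to your data directory.
model_path = "data/GoogleNews-vectors-negative300.bin.gz"

# binary=True because the GoogleNews vectors are in word2vec's binary format.
model = KeyedVectors.load_word2vec_format(model_path, binary=True)

# Cosine similarity between the embeddings of two words.
print(model.similarity("king", "queen"))

# The five words whose embeddings are closest to a given word.
print(model.most_similar("computer", topn=5))
```

Loading the full model takes a minute or two and several gigabytes of RAM; passing a `limit` argument to `load_word2vec_format` loads only the most frequent words if memory is tight.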
