Fine-Tuning BERT Models

Transformers for Natural Language Processing - Second Edition

By: Denis Rothman

Overview of this book

Transformers are...well...transforming the world of AI. There are many platforms and models out there, but which ones best suit your needs? Transformers for Natural Language Processing, 2nd Edition, guides you through the world of transformers, highlighting the strengths of different models and platforms, while teaching you the problem-solving skills you need to tackle model weaknesses. You'll use Hugging Face to pretrain a RoBERTa model from scratch, from building the dataset to defining the data collator to training the model. If you're looking to fine-tune a pretrained model, including GPT-3, then Transformers for Natural Language Processing, 2nd Edition, shows you how with step-by-step guides. The book investigates machine translation, speech-to-text, text-to-speech, question-answering, and many more NLP tasks. It provides techniques to solve hard language problems and may even help with fake news anxiety (read Chapter 13 for more details). You'll see how cutting-edge platforms, such as OpenAI, have taken transformers beyond language into computer vision tasks and code creation using DALL-E 2, ChatGPT, and GPT-4. By the end of this book, you'll know how transformers work and how to implement them and resolve issues like an AI detective.

Preface

Who this book is for

What this book covers

To get the most out of this book

Get in touch

What are Transformers?

The ecosystem of transformers

Optimizing NLP models with transformers

What resources should we use?

Summary

Questions

References

Getting Started with the Architecture of the Transformer Model

The rise of the Transformer: Attention is All You Need

Training and performance

Transformer models in Hugging Face

Summary

Questions

References

Fine-Tuning BERT Models

The architecture of BERT

Fine-tuning BERT

Summary

Questions

References

Pretraining a RoBERTa Model from Scratch

Training a tokenizer and pretraining a transformer

Building KantaiBERT from scratch

Next steps

Summary

Questions

References

Downstream NLP Tasks with Transformers

Transduction and the inductive inheritance of transformers

Transformer performances versus Human Baselines

Running downstream tasks

Summary

Questions

References

Machine Translation with the Transformer

Defining machine translation

Preprocessing a WMT dataset

Evaluating machine translation with BLEU

Translation with Google Translate

Translations with Trax

Summary

Questions

References

The Rise of Suprahuman Transformers with GPT-3 Engines

Suprahuman NLP with GPT-3 transformer models

The architecture of OpenAI GPT transformer models

Generic text completion with GPT-2

Training a custom GPT-2 language model

Running OpenAI GPT-3 tasks

Comparing the output of GPT-2 and GPT-3

Fine-tuning GPT-3

The role of an Industry 4.0 AI specialist

Summary

Questions

References

Applying Transformers to Legal and Financial Documents for AI Text Summarization

Designing a universal text-to-text model

Text summarization with T5

Summarization with GPT-3

Summary

Questions

References

Matching Tokenizers and Datasets

Matching datasets and tokenizers

Standard NLP tasks with specific vocabulary

Exploring the scope of GPT-3

Summary

Questions

References

Semantic Role Labeling with BERT-Based Transformers

Getting started with SRL

SRL experiments with the BERT-based model

Basic samples

Difficult samples

Questioning the scope of SRL

Summary

Questions

References

Let Your Data Do the Talking: Story, Questions, and Answers

Methodology

Method 0: Trial and error

Method 1: NER first

Method 2: SRL first

Next steps

Summary

Questions

References

Detecting Customer Emotions to Make Predictions

Getting started: Sentiment analysis transformers

The Stanford Sentiment Treebank (SST)

Predicting customer behavior with sentiment analysis

Sentiment analysis with GPT-3

Some Pragmatic I4.0 thinking before we leave

Summary

Questions

References

Analyzing Fake News with Transformers

Emotional reactions to fake news

A rational approach to fake news

Before we go

Summary

Questions

References

Interpreting Black Box Transformer Models

Transformer visualization with BertViz

LIT

Transformer visualization via dictionary learning

Exploring models we cannot access

Summary

Questions

References

From NLP to Task-Agnostic Transformer Models

Choosing a model and an ecosystem

The Reformer

DeBERTa

From Task-Agnostic Models to Vision Transformers

An expanding universe of models

Summary

Questions

References

The Emergence of Transformer-Driven Copilots

Prompt engineering

Copilots

Domain-specific GPT-3 engines

Transformer-based recommender systems

Computer vision

Humans and AI copilots in metaverses

Summary

Questions

References

The Consolidation of Suprahuman Transformers with OpenAI’s ChatGPT and GPT-4

Consolidating suprahuman NLP with ChatGPT and GPT-4 transformer models

Jump-starting the ChatGPT API

ChatGPT Plus writes and comments on a program

Getting started with the GPT-4 API

Advanced prompt engineering

Explainable AI (XAI)

Getting started with the DALL-E 2 API

Putting it all together

Summary

Questions

References

Other Books You May Enjoy

Index

Appendix I — Terminology of Transformer Models

Stack

Sublayer

Attention heads

Appendix II — Hardware Constraints for Transformer Models

The Architecture and Scale of Transformers

Why GPUs are so special

GPUs are designed for parallel computing

GPUs are also designed for matrix multiplication

Implementing GPUs in code

Testing GPUs with Google Colab

Google Colab Free with a CPU

Google Colab Pro with a GPU

Appendix III — Generic Text Completion with GPT-2

Step 1: Activating the GPU

Step 2: Cloning the OpenAI GPT-2 repository

Step 3: Installing the requirements

Step 4: Checking the version of TensorFlow

Step 5: Downloading the 345M-parameter GPT-2 model

Steps 6-7: Intermediate instructions

Steps 7b-8: Importing and defining the model

Step 9: Interacting with GPT-2

References

Appendix IV — Custom Text Completion with GPT-2

Training a GPT-2 language model

References

Appendix V — Answers to the Questions

Chapter 1, What are Transformers?

Chapter 2, Getting Started with the Architecture of the Transformer Model

Chapter 3, Fine-Tuning BERT Models

Chapter 4, Pretraining a RoBERTa Model from Scratch

Chapter 5, Downstream NLP Tasks with Transformers

Chapter 6, Machine Translation with the Transformer

Chapter 7, The Rise of Suprahuman Transformers with GPT-3 Engines

Chapter 8, Applying Transformers to Legal and Financial Documents for AI Text Summarization

Chapter 9, Matching Tokenizers and Datasets

Chapter 10, Semantic Role Labeling with BERT-Based Transformers

Chapter 11, Let Your Data Do the Talking: Story, Questions, and Answers

Chapter 12, Detecting Customer Emotions to Make Predictions

Chapter 13, Analyzing Fake News with Transformers

Chapter 14, Interpreting Black Box Transformer Models

Chapter 15, From NLP to Task-Agnostic Transformer Models

Chapter 16, The Emergence of Transformer-Driven Copilots

Chapter 17, The Consolidation of Suprahuman Transformers with OpenAI’s ChatGPT and GPT-4
