What Are Transformers? | Transformers for Natural Language Processing and Computer Vision


Transformers for Natural Language Processing and Computer Vision - Third Edition

By : Denis Rothman

4.2 (35)

Overview of this book

Transformers for Natural Language Processing and Computer Vision, Third Edition, explores Large Language Model (LLM) architectures, practical applications, and popular platforms (Hugging Face, OpenAI, and Google Vertex AI) used for Natural Language Processing (NLP) and Computer Vision (CV). The book guides you through a range of transformer architectures, from foundation models to generative AI. You’ll pretrain and fine-tune LLMs and work through different use cases, from summarization to question-answering systems leveraging embedding-based search. You’ll also implement Retrieval Augmented Generation (RAG) to enhance accuracy and gain greater control over your LLM outputs. Additionally, you’ll understand common LLM risks, such as hallucinations, memorization, and privacy issues, and implement mitigation strategies using moderation models alongside rule-based systems and knowledge integration. Dive into generative vision transformers and multimodal architectures, and build practical applications, such as image and video classification. Go further and combine different models and platforms to build AI solutions and explore AI agent capabilities. This book provides you with an understanding of transformer architectures, including strategies for pretraining, fine-tuning, and LLM best practices.

Preface

Who this book is for

What this book covers

To get the most out of this book

Get in touch

Making the Most Out of This Book – Get to Know Your Free Benefits

Unlock Your Book’s Exclusive Benefits

How to unlock these benefits in three easy steps

Need help?

What Are Transformers?

Foundation Models

A brief history of how transformers were born

The new role of AI professionals

The rise of seamless transformer APIs

Summary

Questions

References

Further reading

Getting Started with the Architecture of the Transformer Model

The rise of the Transformer: Attention Is All You Need

Training and performance

Hugging Face transformer models

Summary

Questions

References

Further reading

Emergent vs Downstream Tasks: The Unseen Depths of Transformers

The paradigm shift: What is an NLP task?

Investigating the potential of downstream tasks

Running downstream tasks

Summary

Questions

References

Further reading

Advancements in Translations with Google Trax, Google Translate, and Gemini

Defining machine translation

Evaluating machine translations

Translations with Google Trax

Translation with Google Translate

Translation with Gemini

Summary

Questions

References

Further reading

Diving into Fine-Tuning through BERT

The architecture of BERT

Fine-tuning BERT

Building a Python interface to interact with the model

Summary

Questions

References

Further reading

Pretraining a Transformer from Scratch through RoBERTa

Training a tokenizer and pretraining a transformer

Building KantaiBERT from scratch

Pretraining a Generative AI customer support model on X data

Next steps

Summary

Questions

References

Further reading

The Generative AI Revolution with ChatGPT

GPTs as GPTs

The architecture of OpenAI GPT transformer models

OpenAI models as assistants

Getting started with the GPT-4 API

Retrieval Augmented Generation (RAG) with GPT-4

Summary

Questions

References

Further reading

Fine-Tuning OpenAI GPT Models

Risk management

Fine-tuning a GPT model for completion (generative)

1. Preparing the dataset

2. Fine-tuning an original model

3. Running the fine-tuned GPT model

4. Managing fine-tuned jobs and models

Before leaving

Summary

Questions

References

Further reading

Shattering the Black Box with Interpretable Tools

Transformer visualization with BertViz

Interpreting Hugging Face transformers with SHAP

Transformer visualization via dictionary learning

Other interpretable AI tools

Summary

Questions

References

Further reading

Investigating the Role of Tokenizers in Shaping Transformer Models

Matching datasets and tokenizers

Exploring sentence and WordPiece tokenizers to understand the efficiency of subword tokenizers for transformers

Summary

Questions

References

Further reading

Leveraging LLM Embeddings as an Alternative to Fine-Tuning

LLM embeddings as an alternative to fine-tuning

Fundamentals of text embedding with NLTK and Gensim

Implementing question-answering systems with embedding-based search techniques

Transfer learning with Ada embeddings

Summary

Questions

References

Further reading

Toward Syntax-Free Semantic Role Labeling with ChatGPT and GPT-4

Getting started with cutting-edge SRL

Entering the syntax-free world of AI

Defining SRL

SRL experiments with ChatGPT with GPT-4

Questioning the scope of SRL

Redefining SRL

From task-specific SRL to emergence with ChatGPT

Summary

Questions

References

Further reading

Summarization with T5 and ChatGPT

Designing a universal text-to-text model

The rise of text-to-text transformer models

A prefix instead of task-specific formats

The T5 model

Text summarization with T5

From text-to-text to new word predictions with OpenAI ChatGPT

Summary

Questions

References

Further reading

Exploring Cutting-Edge LLMs with Vertex AI and PaLM 2

Architecture

Assistants

Vertex AI PaLM 2 API

Fine-tuning

Summary

Questions

References

Further reading

Guarding the Giants: Mitigating Risks in Large Language Models

The emergence of functional AGI

Cutting-edge platform installation limitations

Auto-BIG-bench

WandB

When will AI agents replicate?

Risk management

Risk mitigation tools with RLHF and RAG

Summary

Questions

References

Further reading

Beyond Text: Vision Transformers in the Dawn of Revolutionary AI

From task-agnostic models to multimodal vision transformers

ViT – Vision Transformer

CLIP

DALL-E 2 and DALL-E 3

GPT-4V, DALL-E 3, and divergent semantic association

Summary

Questions

References

Further reading

Hugging Face AutoTrain: Training Vision Models without Coding

Goal and scope of this chapter

Getting started

Uploading the dataset

Training models with AutoTrain

Deploying a model

Running our models for inference

Summary

Questions

References

Further reading

On the Road to Functional AGI with HuggingGPT and its Peers

Defining F-AGI

Installing and importing

Validation set

HuggingGPT

CustomGPT

Model Chaining with Runway Gen-2

Summary

Questions

References

Further reading

Other Books You May Enjoy

Index

Appendix A: Revolutionizing AI: The Power of Optimized Time Complexity in Transformer Models

How constant time complexity O(1) of an operation changed our lives forever

How one token sparked an AI revolution

Appendix B: Answers to the Questions

Chapter 1, What Are Transformers?

Chapter 2, Getting Started with the Architecture of the Transformer Model

Chapter 3, Emergent vs Downstream Tasks: The Unseen Depths of Transformers

Chapter 4, Advancements in Translations with Google Trax, Google Translate, and Gemini

Chapter 5, Diving into Fine-Tuning through BERT

Chapter 6, Pretraining a Transformer from Scratch through RoBERTa

Chapter 7, The Generative AI Revolution with ChatGPT

Chapter 8, Fine-Tuning OpenAI GPT Models

Chapter 9, Shattering the Black Box with Interpretable Tools

Chapter 10, Investigating the Role of Tokenizers in Shaping Transformer Models

Chapter 11, Leveraging LLM Embeddings as an Alternative to Fine-Tuning

Chapter 12, Toward Syntax-Free Semantic Role Labeling with ChatGPT and GPT-4

Chapter 13, Summarization with T5 and ChatGPT

Chapter 14, Exploring Cutting-Edge LLMs with Vertex AI and PaLM 2

Chapter 15, Guarding the Giants: Mitigating Risks in Large Language Models

Chapter 16, Beyond Text: Vision Transformers in the Dawn of Revolutionary AI

Chapter 17, Transcending the Image-Text Boundary with Stable Diffusion

Chapter 18, Hugging Face AutoTrain: Training Vision Models without Coding

Chapter 19, On the Road to Functional AGI with HuggingGPT and its Peers

Chapter 20, Beyond Human-Designed Prompts with Generative Ideation

Transformers for Natural Language Processing and Computer Vision, Third Edition: Take Generative AI and LLMs to the next level with Hugging Face, Google Vertex AI, ChatGPT, GPT-4V, and DALL-E 3
