Mastering Transformers

By: Savaş Yıldırım, Meysam Asgari-Chenaghlu
Overview of this book

Transformer-based language models have come to dominate natural language processing (NLP) research and now represent a new paradigm. With this book, you'll learn how to build a variety of transformer-based NLP applications using the Python Transformers library. The book introduces Transformers by showing you how to write your first hello-world program. You'll then learn how a tokenizer works and how to train your own tokenizer. As you advance, you'll explore the architecture of autoencoding models, such as BERT, and autoregressive models, such as GPT. You'll see how to train and fine-tune models for a variety of natural language understanding (NLU) and natural language generation (NLG) problems, including text classification, token classification, and text representation. This book also helps you learn about efficient models for challenging problems, such as long-context NLP tasks under limited computational capacity. You'll also work with multilingual and cross-lingual problems, optimize models by monitoring their performance, and discover how to deconstruct these models for interpretability and explainability. Finally, you'll be able to deploy your transformer models in a production environment. By the end of this NLP book, you'll know how to use state-of-the-art transformer models to solve advanced NLP problems.
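As a taste of the hello-world program the book opens with, the snippet below is a minimal sketch using the Transformers `pipeline` API. It assumes the `transformers` package (and a backend such as PyTorch) is installed; the sentiment model used is whatever default checkpoint the pipeline downloads, not one chosen by the authors.

```python
# Minimal "hello world" with the Hugging Face Transformers library.
# Assumes `transformers` and a backend (e.g. torch) are installed;
# the pipeline downloads its default sentiment-analysis checkpoint.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
result = classifier("Transformers have changed NLP.")
print(result)  # a list with one dict containing 'label' and 'score' keys
```

The `pipeline` abstraction hides tokenization, model inference, and post-processing behind a single call, which is why it makes a good first program before the book unpacks each of those stages.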
Table of Contents (16 chapters)

Section 1: Introduction – Recent Developments in the Field, Installations, and Hello World Applications
Section 2: Transformer Models – From Autoencoding to Autoregressive Models
Section 3: Advanced Topics

Summary

In this chapter, we covered a variety of introductory topics and also got our hands dirty with the hello-world transformer application. This chapter plays a crucial role in the book, since what has been learned here is applied throughout the upcoming chapters. So, what have we learned so far? We took the first small step by setting up the environment and installing the system requirements. In this context, the Anaconda package manager helped us install the necessary modules on the major operating systems. We also went through language models, community-provided models, and tokenization processes. Additionally, we introduced multitask (GLUE) and cross-lingual (XTREME) benchmarking, which pushes language models to become stronger and more accurate. We introduced the datasets library, which facilitates efficient access to community-provided NLP datasets. Finally, we learned how to evaluate the computational cost of a particular model in terms of memory usage and speed...
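The last point, evaluating a model's computational cost, can be sketched as follows. This is an illustrative example, not the book's exact code: it assumes `transformers` and `torch` are installed, and `distilbert-base-uncased` is just an example checkpoint standing in for whatever model you want to profile.

```python
# Sketch: estimating a model's size and inference speed.
# Assumes `transformers` and `torch` are installed; the checkpoint
# name below is an illustrative choice, not prescribed by the book.
import time
import torch
from transformers import AutoModel, AutoTokenizer

name = "distilbert-base-uncased"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)
model.eval()

# Memory proxy: total number of parameters.
n_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {n_params / 1e6:.1f}M")

# Speed proxy: wall-clock time of a single forward pass.
inputs = tokenizer("Benchmarking a transformer.", return_tensors="pt")
with torch.no_grad():
    start = time.perf_counter()
    model(**inputs)
    elapsed = time.perf_counter() - start
print(f"one forward pass: {elapsed * 1000:.1f} ms")
```

For serious benchmarking you would average over many runs after a warm-up pass, but counting parameters and timing a forward pass is enough to compare candidate models at a glance.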