Pretrain Vision and Large Language Models in Python
The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin … The only thing that matters in the long run is the leveraging of computation.
– Richard Sutton, “The Bitter Lesson,” 2019 (1)
In this chapter, you’ll be introduced to foundation models, the backbone of many artificial intelligence and machine learning systems today. In particular, we will dive into their creation process, also called pretraining, and understand when pretraining is a competitive option for improving the accuracy of your models. We will discuss the core transformer architecture underpinning state-of-the-art models such as Stable Diffusion, BERT, Vision Transformers, OpenChatKit, CLIP, Flan-T5, and more. You will learn about the encoder and decoder frameworks and how each addresses a different family of use cases.
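To make the encoder/decoder distinction concrete before we dig in, here is a minimal sketch, assuming the Hugging Face transformers library is installed; the model choices bert-base-uncased and gpt2 are illustrative, not prescriptive. An encoder maps text to dense representations, while a decoder generates new text token by token.

```python
# A minimal sketch contrasting an encoder and a decoder model.
# Assumes the Hugging Face transformers library; model names are illustrative.
from transformers import AutoModel, AutoModelForCausalLM, AutoTokenizer

# Encoder (BERT-style): maps text to dense representations,
# useful for classification, retrieval, and similar tasks.
enc_tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
inputs = enc_tokenizer("Pretraining builds foundation models.", return_tensors="pt")
embeddings = encoder(**inputs).last_hidden_state  # shape: (1, seq_len, 768)

# Decoder (GPT-style): generates text one token at a time,
# useful for open-ended generation and chat-style use cases.
dec_tokenizer = AutoTokenizer.from_pretrained("gpt2")
decoder = AutoModelForCausalLM.from_pretrained("gpt2")
prompt = dec_tokenizer("Foundation models are", return_tensors="pt")
output_ids = decoder.generate(**prompt, max_new_tokens=20)
print(dec_tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Encoder-style models such as BERT shine at representation tasks, decoder-style models such as GPT excel at generation, and models like Flan-T5 combine both in an encoder-decoder layout; we will return to all three patterns later in this chapter.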
In this chapter, we will cover the following topics: