Exploring GPT-3

By: Steve Tingiris

Overview of this book

Generative Pre-trained Transformer 3 (GPT-3) is a highly advanced language model from OpenAI that can generate written text that is virtually indistinguishable from text written by humans. Whether you have a technical or non-technical background, this book will help you understand and start working with GPT-3 and the OpenAI API. If you want to get hands-on with leveraging artificial intelligence for natural language processing (NLP) tasks, this easy-to-follow book will help you get started. Beginning with a high-level introduction to NLP and GPT-3, the book takes you through practical examples that show how to leverage the OpenAI API and GPT-3 for text generation, classification, and semantic search. You'll explore the capabilities of the OpenAI API and GPT-3 and find out which NLP use cases GPT-3 is best suited for. You’ll also learn how to use the API and optimize requests for the best possible results. With examples focusing on the OpenAI Playground and easy-to-follow JavaScript and Python code samples, the book illustrates the possible applications of GPT-3 in production. By the end of this book, you'll understand the best use cases for GPT-3 and how to integrate the OpenAI API in your applications for a wide array of NLP tasks.

Table of Contents (15 chapters)

Section 1: Understanding GPT-3 and the OpenAI API
Section 2: Getting Started with GPT-3
Section 3: Using the OpenAI API

Introducing Davinci, Babbage, Curie, and Ada

The massive dataset that is used for training GPT-3 is the primary reason why it's so powerful. However, bigger is only better when it's necessary—and more power comes at a cost. For those reasons, OpenAI provides multiple models to choose from. Today there are four primary models available, along with a model for content filtering and instruct models.

The available models, or engines as they're also called, are named Davinci, Babbage, Curie, and Ada. Of the four, Davinci is the largest and most capable, and it can perform any task that the other engines can perform. Curie is the next most capable engine and can do anything that Babbage or Ada can do. Ada is the least capable engine, but it is also the fastest and the lowest-cost.

When you're getting started, and when you're first testing a new prompt, you'll usually want to begin with Davinci and then try Ada, Babbage, or Curie to see whether one of them can complete the task faster or more cost-effectively. The following is an overview of each engine and the types of tasks each might be best suited for. Keep in mind, however, that you'll want to test each engine against your own prompts. Even though the smaller engines might not be trained with as much data, they are all still general-purpose models.
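
The comparison described above, checking whether a smaller engine is good enough for a prompt that already works on Davinci, can be sketched in Python. This is only an illustrative sketch under stated assumptions: `complete` and `is_acceptable` are hypothetical callables that you would supply yourself (`complete(engine, prompt)` stands in for whatever API call you use and returns the completion text), and the lowercase engine identifiers are assumed for illustration.

```python
import time
from typing import Callable, Optional

# Engines ordered from smallest/cheapest to largest/most capable.
# The lowercase identifiers are an assumption made for this sketch.
ENGINES_SMALL_TO_LARGE = ["ada", "babbage", "curie", "davinci"]


def smallest_acceptable_engine(
    prompt: str,
    complete: Callable[[str, str], str],
    is_acceptable: Callable[[str], bool],
) -> Optional[dict]:
    """Return details for the smallest engine whose completion passes the check.

    `complete(engine, prompt)` is a hypothetical wrapper around your API call.
    `is_acceptable(text)` is your own quality check for this prompt.
    """
    for engine in ENGINES_SMALL_TO_LARGE:
        start = time.perf_counter()
        text = complete(engine, prompt)
        elapsed = time.perf_counter() - start
        if is_acceptable(text):
            return {"engine": engine, "text": text, "seconds": elapsed}
    return None  # no engine produced an acceptable completion
```

In practice you would run this over several representative prompts rather than one, since a single passing completion says little about how reliably a smaller engine handles the task.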

Davinci

Davinci is the most capable model and can do anything that any other model can do, and much more—often with fewer instructions. Davinci is able to solve logic problems, determine cause and effect, understand the intent of text, produce creative content, explain character motives, and handle complex summarization tasks.

Curie

Curie tries to balance power and speed. It can do anything that Ada or Babbage can do, but it's also capable of handling more complex classification tasks and more nuanced tasks such as summarization, sentiment analysis, chatbot applications, and question answering.

Babbage

Babbage is a bit more capable than Ada but not quite as fast. It can perform all the same tasks as Ada, but it can also handle somewhat more involved classification tasks, and it's well suited to semantic search tasks that rank how well documents match a search query.

Ada

Ada is usually the fastest and least costly model. It's best for less nuanced tasks, such as parsing text, reformatting text, and simpler classification tasks. The more context you provide Ada, the better it will likely perform.
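
The four engine descriptions above can be condensed into a small lookup table for picking a starting point. This is a minimal sketch rather than an official mapping: the task labels, the lowercase engine names, and the `suggest_engine` helper are hypothetical names chosen for illustration, and the table only restates the guidance given in this section.

```python
# Suggested starting engine per task type, restating the descriptions above.
# Task labels and lowercase engine names are hypothetical, for illustration.
SUGGESTED_ENGINE = {
    "parsing": "ada",
    "reformatting": "ada",
    "simple classification": "ada",
    "moderate classification": "babbage",
    "semantic search": "babbage",
    "complex classification": "curie",
    "summarization": "curie",
    "sentiment analysis": "curie",
    "chatbot": "curie",
    "question answering": "curie",
    "logic problems": "davinci",
    "cause and effect": "davinci",
    "creative content": "davinci",
    "complex summarization": "davinci",
}


def suggest_engine(task: str) -> str:
    """Return a suggested starting engine, defaulting to the most capable one."""
    return SUGGESTED_ENGINE.get(task.strip().lower(), "davinci")
```

Unknown tasks fall back to Davinci, which matches the advice to begin with the most capable engine and only then test the smaller ones.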

Content filtering model

To help prevent inappropriate completions, OpenAI provides a content filtering model that is fine-tuned to recognize potentially offensive or hurtful language.
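
One way an application might use such a filter is to check each completion before showing it to a user. The sketch below is an assumption-heavy illustration, not the OpenAI API: `classify` is a hypothetical callable that wraps the content filtering model, and the `"safe"`, `"sensitive"`, and `"unsafe"` labels are placeholders, since this section does not describe the filter's actual output format.

```python
from typing import Callable

# Hypothetical labels; the real filter's output format may differ.
BLOCKED_LABELS = {"unsafe"}


def guarded_completion(
    text: str,
    classify: Callable[[str], str],
    fallback: str = "[completion withheld by content filter]",
) -> str:
    """Return `text` unless the classifier flags it with a blocked label."""
    label = classify(text)
    return fallback if label in BLOCKED_LABELS else text
```

Whether to also block a borderline label such as the assumed `"sensitive"` is an application decision; adding it to `BLOCKED_LABELS` makes the gate stricter.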

Instruct models

These models are built on top of the Davinci and Curie models. Instruct models are tuned to make it easier to tell the API what you want it to do. Given clear instructions, an instruct model can often produce better results than the associated core model.

A snapshot in time

A final note to keep in mind about all of the engines is that each one is a snapshot in time: the data used to train it was cut off when the model was built. So, GPT-3 is not working with up-to-the-minute or even up-to-the-day data; its data is likely weeks or months old. OpenAI is planning to add more continuous training in the future, but today this is a consideration to keep in mind.

All of the GPT-3 models are extremely powerful and capable of generating text that is indistinguishable from human-written text. This holds tremendous potential for all kinds of applications. In most cases, that's a good thing. However, not all potential use cases are good.