Generative AI with Python and PyTorch - Second Edition

By: Joseph Babcock, Raghav Bali

Discriminative and generative modeling, and Bayes’ theorem

Now, let us consider how these rules of conditional and joint probability relate to the kinds of predictive models that we build for various machine learning applications. In most cases, such as predicting whether an email is fraudulent or the dollar amount of a customer's future lifetime value, we are interested in the conditional probability P(Y|X=x), where Y is the outcome we are trying to model, X is the input features, and x is a particular value of those features. For example, we might calculate the probability that an email is fraudulent based on the set of words (the value x) in the message. This approach is known as discriminative modeling [15, 16, 17]. Discriminative modeling attempts to learn a direct mapping between the data, X, and the outcomes, Y.

Another way to understand discriminative modeling is in the context of Bayes’ theorem [18], which relates the conditional and joint probabilities of a dataset, as follows:

P(Y|X) = P(X|Y)P(Y)/P(X) = P(X, Y)/P(X)

As a side note, the theorem was published two years after its author’s death, and in a foreword, Richard Price described it as a mathematical argument for the existence of God, perhaps fitting given that Thomas Bayes served as a minister during his life. In the formula for Bayes’ theorem, P(X|Y) is known as the likelihood, the support that the observation X lends to the outcome Y; P(Y) is the prior, the baseline plausibility of the outcome; P(X) is the evidence, the overall probability of the observation; and P(Y|X) is the posterior, the probability of the outcome given all the data we have observed related to it thus far. Conceptually, Bayes’ theorem states that the probability of an outcome is proportional to the product of its baseline probability and the probability of the input data conditional on that outcome.
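To make the arithmetic concrete, here is a minimal sketch of Bayes’ theorem applied to the fraudulent-email example, using made-up probabilities (the word "winner" and all numbers are hypothetical, chosen only for illustration):

```python
# Hypothetical numbers for a toy spam filter (not from the book):
p_fraud = 0.02             # prior P(Y): 2% of all emails are fraudulent
p_word_given_fraud = 0.40  # likelihood P(X|Y): "winner" appears in 40% of fraud emails
p_word = 0.05              # evidence P(X): "winner" appears in 5% of all emails

# Bayes' theorem: P(Y|X) = P(X|Y) * P(Y) / P(X)
p_fraud_given_word = p_word_given_fraud * p_fraud / p_word
print(p_fraud_given_word)  # 0.16
```

Even though the word is twenty times more common in fraudulent emails, the low prior keeps the posterior at a modest 16%, which is exactly the kind of correction Bayes’ theorem provides.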

In the context of discriminative learning, we can thus see that a discriminative model directly computes the posterior; we could also have a model of the likelihood or the prior, but neither is required in this approach. Even though you may not have realized it, most of the models in the standard machine learning toolkit are discriminative, such as:

  • Linear regression
  • Logistic regression
  • Random forests [19, 20]
  • Gradient-boosted decision trees (GBDTs) [21]
  • Support vector machines (SVMs) [22]

The first two, linear and logistic regression, model the outcome Y conditional on the data X using a Normal or Gaussian (linear regression) or sigmoidal (logistic regression) probability function. In contrast, the last three have no formal probability model: they compute a function (an ensemble of trees for random forests or GBDTs, or an inner product distribution for SVMs) that maps X to Y, using a loss or error function to tune those estimates. Given this nonparametric nature, some authors have argued that these constitute a separate class of “non-model” or “non-parametric” discriminative algorithms [15].
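As a sketch of the discriminative approach, logistic regression fitted on synthetic data (the data here is invented for illustration, using scikit-learn rather than PyTorch for brevity) produces the posterior P(Y|X=x) directly, without ever modeling how X itself is distributed:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic toy data (assumption: not from the book): one feature,
# with class 1 centered at +1 and class 0 centered at -1.
rng = np.random.default_rng(0)
x0 = rng.normal(-1.0, 1.0, size=(100, 1))
x1 = rng.normal(+1.0, 1.0, size=(100, 1))
X = np.vstack([x0, x1])
y = np.array([0] * 100 + [1] * 100)

# A discriminative model learns P(Y|X) directly; it never models P(X).
clf = LogisticRegression().fit(X, y)
posterior = clf.predict_proba([[2.0]])[0, 1]  # P(Y=1 | X=2.0)
print(posterior)
```

A point far on the positive side of the feature axis receives a posterior well above 0.5, but the model gives us no way to generate new examples of X.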

In contrast, a generative model attempts to learn the joint distribution P(Y, X) of the labels and the input data. Recall that using the definition of joint probability:

P(X, Y) = P(X|Y)P(Y)

We can rewrite Bayes’ theorem as:

P(Y|X) = P(X, Y)/P(X)

Instead of learning a direct mapping of X to Y using P(Y|X), as in the discriminative case, our goal is to model the joint probabilities of X and Y using P(X, Y). While we can use the resulting joint distribution of X and Y to compute the posterior P(Y|X) and learn a “targeted” model, we can also use this distribution to sample new instances of the data by either jointly sampling new tuples (x, y), or sampling new data inputs using a target label Y with the expression:

P(X|Y=y) = P(X, Y)/P(Y)
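The sampling idea behind this formula can be sketched with a deliberately simple generative model: estimate P(Y) and a Gaussian P(X|Y) for each class from synthetic data (all data and parameters here are invented for illustration), then draw new inputs for a chosen label:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic toy data (assumption: not from the book): one feature per class.
x0 = rng.normal(-1.0, 1.0, 100)  # observations with label Y=0
x1 = rng.normal(+1.0, 1.0, 100)  # observations with label Y=1

# A generative model learns the joint P(X, Y) = P(X|Y) P(Y):
p_y = {0: 0.5, 1: 0.5}                  # prior P(Y), estimated from counts
params = {0: (x0.mean(), x0.std()),     # P(X|Y=0) modeled as a Gaussian
          1: (x1.mean(), x1.std())}     # P(X|Y=1) modeled as a Gaussian

def sample_x_given_y(y, n=5):
    """Draw n new inputs from P(X|Y=y) = P(X, Y) / P(Y)."""
    mu, sigma = params[y]
    return rng.normal(mu, sigma, n)

print(sample_x_given_y(1))  # five new synthetic points "like" class 1
```

The same fitted quantities could also compute the posterior P(Y|X) for classification, which is the sense in which a generative model subsumes the discriminative one.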

Examples of generative models include:

  • Naive Bayes classifiers
  • Gaussian mixture models
  • Latent Dirichlet allocation (LDA)
  • Hidden Markov models
  • Deep Boltzmann machines
  • VAEs
  • GANs

Naive Bayes classifiers, though used for the discriminative task of classification, utilize Bayes’ theorem to learn the joint distribution of X and Y under the assumption that the X variables are conditionally independent given Y. Similarly, Gaussian mixture models describe the likelihood of a data point belonging to one of a group of normal distributions using the joint probability of the label and these distributions. LDA represents a document through the joint probability of its words and a set of underlying topics (weighted keyword lists) that are used in the document. Hidden Markov models express the joint probability of a state and the next state of a piece of data, such as the weather on successive days of the week. The VAE and GAN models we cover in Chapters 3–6 also utilize joint distributions to map between complex data types; this mapping allows us to generate data from random vectors or transform one kind of data into another.

As mentioned previously, another view of generative models is that they allow us to generate samples of X if we know an outcome Y. In the first four models listed previously, this conditional probability is just a component of the model formula, with the posterior estimates still being the ultimate objective. However, in the last three examples, which are all deep neural network models, learning the conditional probability of X dependent upon a hidden or “latent” variable Z is actually the main objective, in order to generate new data samples. Using the rich structure allowed by multi-layered neural networks, these models can approximate the distribution of complex data types such as images, natural language, and sound. Also, instead of being a target value, Z is often a random number in these applications, serving merely as an input from which to generate a large space of hypothetical data points. To the extent we have a label (such as whether a generated image should be of a dog or dolphin, or the genre of a generated song), the model is P(X|Y=y, Z=z), where the label Y “controls” the generation of data that is otherwise unrestricted by the random nature of Z.
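A minimal sketch of this controlled generation, P(X|Y=y, Z=z), in PyTorch (the architecture and all dimensions here are invented for illustration, not a model from the book): the label y is embedded and concatenated with a random latent z before being mapped to a data sample.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Hypothetical conditional generator: models P(X | Y=y, Z=z)."""
    def __init__(self, n_classes=10, z_dim=16, x_dim=784):
        super().__init__()
        self.embed = nn.Embedding(n_classes, z_dim)  # label conditioning
        self.net = nn.Sequential(
            nn.Linear(z_dim * 2, 128), nn.ReLU(),
            nn.Linear(128, x_dim), nn.Tanh(),
        )

    def forward(self, y, z):
        # The label "controls" generation; the latent z supplies diversity.
        return self.net(torch.cat([self.embed(y), z], dim=1))

gen = ConditionalGenerator()
y = torch.tensor([3, 3, 7])  # requested labels for the generated samples
z = torch.randn(3, 16)       # random latent vectors
x = gen(y, z)                # three generated (untrained) samples
print(x.shape)               # torch.Size([3, 784])
```

Untrained, this network produces noise; the point of the sketch is the interface, where the same label with different latents yields distinct samples of the requested class.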
