Hands-On One-shot Learning with Python

By: Shruti Jadon, Ankush Garg

Overview of this book

One-shot learning has been an active field of research for scientists trying to develop a cognitive machine that mimics human learning. With this book, you'll explore key approaches to one-shot learning, such as metrics-based, model-based, and optimization-based techniques, all with the help of practical examples. Hands-On One-shot Learning with Python will guide you through the exploration and design of deep learning models that can obtain information about an object from one or just a few training samples. The book begins with an overview of deep learning and one-shot learning and then introduces you to the different methods you can use to achieve it, such as deep learning architectures and probabilistic models. Once you've got to grips with the core principles, you'll explore real-world examples and implementations of one-shot learning using PyTorch 1.x on datasets such as Omniglot and MiniImageNet. Finally, you'll explore generative modeling-based methods and discover the key considerations for building systems that exhibit human-level intelligence. By the end of this book, you'll be well versed in the different one- and few-shot learning methods and be able to use them to build your own deep learning models.
Table of Contents (11 chapters)
Section 1: One-shot Learning Introduction
Section 2: Deep Learning Architectures
Section 3: Other Methods and Conclusion

Overview of Bayesian learning

In this section, we will briefly discuss the idea behind Bayesian learning from a mathematical perspective, which is the core of the probabilistic models for one-shot learning. The overall goal of Bayesian learning is to model the distribution of the parameters, θ, given the training data, that is, to learn the distribution, P(θ | data).

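In the probabilistic view of machine learning, we try to solve the following equation (Bayes' rule):

$$P(\theta \mid \text{data}) = \frac{P(\text{data} \mid \theta)\, P(\theta)}{P(\text{data})}$$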

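In this setting, we try to find the best set of parameters, theta (θ), that would explain the data. Consequently, we maximize the posterior, P(θ | data), over θ:

$$\arg\max_{\theta} P(\theta \mid \text{data}) = \arg\max_{\theta} \frac{P(\text{data} \mid \theta)\, P(\theta)}{P(\text{data})}$$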
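As a rough illustration of maximizing the posterior over θ, the following sketch evaluates Bayes' rule on a grid for a coin-flip model. The Bernoulli likelihood, the Beta(2, 2) prior, the observed counts (7 heads, 3 tails), and all function names are assumptions chosen for this example; they are not taken from the book.

```python
import math


def bernoulli_likelihood(theta, heads, tails):
    """P(data | theta) for a coin-flip sequence with the given counts."""
    return theta ** heads * (1.0 - theta) ** tails


def beta_prior(theta, a, b):
    """Beta(a, b) density, used here as an assumed prior P(theta)."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * theta ** (a - 1) * (1.0 - theta) ** (b - 1)


def grid_posterior(heads, tails, a=2.0, b=2.0, steps=1000):
    """Approximate P(theta | data) on a uniform grid using Bayes' rule."""
    # Midpoints of each grid cell, so theta is never exactly 0 or 1.
    thetas = [(i + 0.5) / steps for i in range(steps)]
    width = 1.0 / steps
    # Numerator of Bayes' rule: P(data | theta) * P(theta).
    joint = [bernoulli_likelihood(t, heads, tails) * beta_prior(t, a, b)
             for t in thetas]
    # Denominator P(data), approximated by a Riemann sum over theta.
    evidence = sum(joint) * width
    posterior = [j / evidence for j in joint]
    return thetas, posterior


thetas, posterior = grid_posterior(heads=7, tails=3)
best_index = max(range(len(thetas)), key=lambda i: posterior[i])
theta_map = thetas[best_index]
print(f"theta maximizing the posterior: {theta_map:.4f}")
```

For this particular prior and likelihood, the posterior is a Beta(9, 5) distribution whose mode is 8/12, so the grid result should land within one grid step of that value.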

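We can take the logarithm of both sides, which does not affect the optimization problem because the logarithm is monotonically increasing, but makes the math simpler and more tractable:

$$\log P(\theta \mid \text{data}) = \log P(\text{data} \mid \theta) + \log P(\theta) - \log P(\text{data})$$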

We can drop the P(data) term from the right side of the equation, as it does not depend on θ, and consequently, the optimization problem...
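The effect of dropping P(data) can be checked numerically. The sketch below works in log space and compares the maximizer of the full log posterior with the maximizer of log P(data | θ) + log P(θ) alone. The coin-flip likelihood, the Beta(2, 2) prior, the counts, and the helper names are illustrative assumptions, not the book's own example.

```python
import math


def log_likelihood(theta, heads, tails):
    """log P(data | theta) for an assumed Bernoulli (coin-flip) model."""
    return heads * math.log(theta) + tails * math.log(1.0 - theta)


def log_prior(theta, a, b):
    """log of an assumed Beta(a, b) prior density, log P(theta)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return (log_norm
            + (a - 1) * math.log(theta)
            + (b - 1) * math.log(1.0 - theta))


def argmax(values):
    """Index of the largest element in a list."""
    return max(range(len(values)), key=values.__getitem__)


heads, tails, a, b = 7, 3, 2.0, 2.0
steps = 1000
# Cell midpoints keep theta strictly inside (0, 1), so the logs are finite.
thetas = [(i + 0.5) / steps for i in range(steps)]

# Right-hand side without the evidence: log P(data | theta) + log P(theta).
log_joint = [log_likelihood(t, heads, tails) + log_prior(t, a, b)
             for t in thetas]

# log P(data), approximated on the grid with a log-sum-exp for stability.
peak = max(log_joint)
log_evidence = peak + math.log(
    sum(math.exp(v - peak) for v in log_joint) / steps)

# Full log posterior: subtract the theta-independent constant log P(data).
log_posterior = [v - log_evidence for v in log_joint]

index_with_evidence = argmax(log_posterior)
index_without_evidence = argmax(log_joint)
closed_form_map = (heads + a - 1) / (heads + tails + a + b - 2)

print("same maximizer:", index_with_evidence == index_without_evidence)
print(f"grid estimate: {thetas[index_without_evidence]:.4f}")
print(f"closed-form Beta-Bernoulli mode: {closed_form_map:.4f}")
```

Because log P(data) is the same constant for every θ, subtracting it shifts the whole curve without moving its peak, which is why the two maximizers coincide.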