Hands-On One-shot Learning with Python

By: Shruti Jadon, Ankush Garg

Overview of this book

One-shot learning has been an active field of research for scientists trying to develop a cognitive machine that mimics human learning. With this book, you'll explore key approaches to one-shot learning, such as metrics-based, model-based, and optimization-based techniques, all with the help of practical examples. Hands-On One-shot Learning with Python will guide you through the exploration and design of deep learning models that can obtain information about an object from one or just a few training samples. The book begins with an overview of deep learning and one-shot learning and then introduces you to the different methods you can use to achieve it, such as deep learning architectures and probabilistic models. Once you've got to grips with the core principles, you'll explore real-world examples and implementations of one-shot learning using PyTorch 1.x on datasets such as Omniglot and MiniImageNet. Finally, you'll explore generative modeling-based methods and discover the key considerations for building systems that exhibit human-level intelligence. By the end of this book, you'll be well-versed with the different one- and few-shot learning methods and be able to use them to build your own deep learning models.
Table of Contents (11 chapters)
Section 1: One-shot Learning Introduction
Section 2: Deep Learning Architectures
Section 3: Other Methods and Conclusion

Understanding meta networks

Meta networks, as the name suggests, are a model-based meta-learning approach. In standard deep learning methods, the weights of a neural network are updated by stochastic gradient descent (SGD), which makes training slow. In its strict form, SGD performs one weight update for every training data point, so with a batch size of 1, each step makes only a small, noisy adjustment and the model needs many updates to converge. In other words, the weights are updated slowly.
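To make the cost of per-sample updates concrete, here is a minimal sketch in plain Python, assuming a toy one-dimensional linear regression rather than a neural network. The function name `sgd_linear_fit`, the learning rate, and the synthetic data are illustrative assumptions and do not come from the book.

```python
import random


def sgd_linear_fit(data, lr=0.1, epochs=500, seed=0):
    """Fit y = w * x + b with batch size 1: one weight update per data point."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    updates = 0
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)  # visit the points in a random order each epoch
        for x, y in samples:
            error = (w * x + b) - y
            # Gradient of 0.5 * error**2 with respect to w and b
            w -= lr * error * x
            b -= lr * error
            updates += 1
    return w, b, updates


# Synthetic, noise-free data drawn from y = 2x + 1 (assumed for illustration)
data = [(i / 10.0, 2.0 * (i / 10.0) + 1.0) for i in range(10)]
w, b, updates = sgd_linear_fit(data)
print(f"w={w:.3f}, b={b:.3f}, updates={updates}")
```

Even for this two-parameter model, the loop performs thousands of small updates before `w` and `b` settle near 2 and 1. This is the slow update behavior that the discussion of meta networks starts from.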

Meta networks address the problem of slow weights by training a second neural network in parallel with the original network to predict the parameters for the target task. The generated weights are called fast weights. If you recall, LSTM meta-learners (see Chapter 4, Optimization-Based Methods) are built on similar grounds to predict parameter updates of...
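The idea of generating fast weights can be sketched with a deliberately tiny example. This is not the architecture described in the book. It assumes a one-parameter base model `y = w * x`, a slow weight that is held fixed at 0, and a hypothetical weight generator `generate_fast_weight` that is just a learned scalar `m` applied to the loss gradient on a single support example. In a real meta network the generator would itself be a neural network, and the slow weights would be trained as well.

```python
import random


def support_gradient(w_slow, x, y):
    """Gradient of 0.5 * (w_slow * x - y)**2 with respect to w_slow."""
    return (w_slow * x - y) * x


def generate_fast_weight(m, grad):
    """Hypothetical generator: one learned scalar maps a gradient to a fast weight."""
    return m * grad


def predict(w_slow, w_fast, x):
    """The base model uses the sum of slow and fast weights."""
    return (w_slow + w_fast) * x


def meta_train(steps=2000, lr=0.05, seed=0):
    """Learn the generator parameter m across many random tasks y = a * x."""
    rng = random.Random(seed)
    w_slow, m = 0.0, 0.0  # w_slow is kept fixed in this sketch
    for _ in range(steps):
        a = rng.uniform(-2.0, 2.0)      # sample a task
        x_s = 1.0                       # single support input
        x_q = rng.uniform(-1.0, 1.0)    # query input
        grad = support_gradient(w_slow, x_s, a * x_s)
        w_fast = generate_fast_weight(m, grad)
        err = predict(w_slow, w_fast, x_q) - a * x_q
        # Gradient of 0.5 * err**2 with respect to m
        m -= lr * err * x_q * grad
    return w_slow, m


w_slow, m = meta_train()

# New task y = 1.5 * x: one support example, no iterative training on the task
grad = support_gradient(w_slow, 1.0, 1.5)
w_fast = generate_fast_weight(m, grad)
prediction = predict(w_slow, w_fast, 2.0)
print(f"m={m:.3f}, w_fast={w_fast:.3f}, prediction={prediction:.3f}")
```

After meta-training, the generator produces a usable weight for an unseen task from a single support example in one step, while the slow weight never changes. That contrast with the many per-sample updates of plain SGD is the point of the sketch.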