Hands-On Meta Learning with Python

By: Sudharsan Ravichandiran

Overview of this book

Meta learning is an exciting research trend in machine learning, which enables a model to understand the learning process. Unlike other ML paradigms, with meta learning you can learn from small datasets faster. Hands-On Meta Learning with Python starts by explaining the fundamentals of meta learning and helps you understand the concept of learning to learn. You will delve into various one-shot learning algorithms, like siamese, prototypical, relation and memory-augmented networks by implementing them in TensorFlow and Keras. As you make your way through the book, you will dive into state-of-the-art meta learning algorithms such as MAML, Reptile, and CAML. You will then explore how to learn quickly with Meta-SGD and discover how you can perform unsupervised learning using meta learning with CACTUs. In the concluding chapters, you will work through recent trends in meta learning such as adversarial meta learning, task agnostic meta learning, and meta imitation learning. By the end of this book, you will be familiar with state-of-the-art meta learning algorithms and able to enable human-like cognition for your machine learning models.

Title Page

Dedication

About Packt

Contributors

Preface

Introduction to Meta Learning

Meta learning

Types of meta learning

Learning to learn gradient descent by gradient descent

Optimization as a model for few-shot learning

Summary

Questions

Further reading

Face and Audio Recognition Using Siamese Networks

What are siamese networks?

Face recognition using siamese networks

Building an audio recognition model using siamese networks

Summary

Questions

Further reading

Relation and Matching Networks Using TensorFlow

Relation networks

Building relation networks using TensorFlow

Matching networks

The architecture of matching networks

Matching networks in TensorFlow

Summary

Questions

Further reading

Memory-Augmented Neural Networks

NTM

Copy tasks using NTM

Memory-augmented neural networks (MANN)

Summary

Questions

Further reading

MAML and Its Variants

MAML

Adversarial meta learning

CAML

Summary

Questions

Further reading

Meta-SGD and Reptile

Meta-SGD

Reptile

Summary

Questions

Further reading

Recent Advancements and Next Steps

Task agnostic meta learning (TAML)

Meta imitation learning

CACTUs

Learning to learn in concept space

Summary

Questions

Further reading

Assessments

Chapter 1: Introduction to Meta Learning

Chapter 2: Face and Audio Recognition Using Siamese Networks

Chapter 3: Prototypical Networks and Their Variants

Chapter 4: Relation and Matching Networks Using TensorFlow

Chapter 5: Memory-Augmented Neural Networks

Chapter 6: MAML and Its Variants

Chapter 7: Meta-SGD and Reptile Algorithms

Chapter 8: Gradient Agreement as an Optimization Objective

Chapter 9: Recent Advancements and Next Steps

Other Books You May Enjoy

Leave a review - let other readers know what you think

Index

Learning to learn gradient descent by gradient descent

Now, we will look at an interesting meta learning algorithm called learning to learn gradient descent by gradient descent. Isn't the name kind of daunting? In fact, it is one of the simplest meta learning algorithms. We know that, in meta learning, our goal is to learn the learning process. In general, how do we train a neural network? We compute the loss and minimize it through gradient descent, so gradient descent is the optimizer of our model. Instead of using a hand-designed gradient descent rule, can we learn this optimization process automatically?
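As a reference point for what is being replaced, here is a minimal sketch of ordinary gradient descent. The quadratic loss, its minimum at 3.0, and the function names are assumptions chosen for illustration; they do not come from the book's code.

```python
# Hypothetical one-parameter problem: minimize (theta - 3)^2.
def loss(theta):
    return (theta - 3.0) ** 2


def grad(theta):
    # Analytic derivative of the loss with respect to theta.
    return 2.0 * (theta - 3.0)


def gradient_descent(theta, learning_rate=0.1, steps=50):
    # The hand-designed update rule: step against the gradient.
    for _ in range(steps):
        theta = theta - learning_rate * grad(theta)
    return theta


theta = gradient_descent(theta=0.0)
print(f"theta after training: {theta:.5f}, loss: {loss(theta):.2e}")
```

The fixed rule `theta - learning_rate * grad(theta)` is the part that a learned optimizer would produce instead.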

But how can we learn it? We replace our traditional optimizer (gradient descent) with a Recurrent Neural Network (RNN). How can an RNN replace gradient descent? If you look closely, what are we really doing in gradient descent? It is basically a sequence of parameter updates, one per training step, each computed from the current gradient, and optimizers such as momentum also carry information from one step to the next in a state. So, we can use an RNN: it receives the gradient at each step, stores that history in its hidden state, and outputs the update.
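To make the "sequence of updates plus a state" view concrete, the sketch below writes two hand-designed optimizers as functions that take a gradient and a state and return an update and a new state. This is the same interface a recurrent cell would fill. The rule names, the quadratic objective, and the hyperparameters are illustrative assumptions.

```python
def sgd_rule(grad, state, learning_rate=0.1):
    # Plain gradient descent ignores the state and passes it through.
    return -learning_rate * grad, state


def momentum_rule(grad, state, learning_rate=0.1, beta=0.9):
    # Momentum keeps a running velocity: the state carries past gradients.
    velocity = beta * state - learning_rate * grad
    return velocity, velocity


def run(update_rule, theta=0.0, steps=200):
    # Hypothetical objective: (theta - 3)^2, whose gradient is 2 * (theta - 3).
    state = 0.0
    for _ in range(steps):
        grad = 2.0 * (theta - 3.0)
        update, state = update_rule(grad, state)
        theta = theta + update
    return theta


print(run(sgd_rule), run(momentum_rule))
```

A learned optimizer keeps the loop in `run` unchanged and swaps the hand-written rule for an RNN whose hidden state plays the role of `state`.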

So, the main idea of this algorithm is to replace gradient descent with an RNN. But how does the RNN learn? How can we optimize the RNN? To optimize the RNN, we use gradient descent. In a nutshell, we are learning to perform gradient descent through an RNN, and that RNN is itself optimized by gradient descent; that is what the name learning to learn gradient descent by gradient descent means.

We call our RNN the optimizer and our base network the optimizee. Let's say we have a model $f$ parameterized by some parameter $\theta$. We need to find the optimal parameter $\theta$ so that we can minimize the loss. In general, we find this optimal parameter through gradient descent, but now we use the RNN to find it. So the RNN (optimizer) proposes the parameter and sends it to the optimizee (base network); the optimizee uses this parameter, computes the loss, and sends the loss to the RNN. Based on the loss, the RNN optimizes itself through gradient descent and updates the model parameter $\theta$.

Confusing? The diagram for this section shows the loop: our optimizee (base network) is optimized through our optimizer (RNN). The optimizer sends the updated parameters, that is, the weights, to the optimizee; the optimizee uses these weights, calculates the loss, and sends the loss back to the optimizer; based on the loss, the optimizer improves itself through gradient descent.
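The loop between optimizer and optimizee can be sketched in plain Python. Everything here is a simplifying assumption rather than the book's implementation: the optimizee is a one-parameter quadratic, the "RNN" is a hypothetical scalar tanh cell with three weights instead of an LSTM, and the optimizer's gradient is estimated with finite differences instead of backpropagation through the unrolled steps.

```python
import math
import random


def optimizee_loss(theta, target):
    return (theta - target) ** 2


def optimizee_grad(theta, target):
    return 2.0 * (theta - target)


def rnn_optimizer_step(grad, hidden, phi):
    # Toy recurrent cell (assumed for illustration): scalar hidden state.
    w_grad, w_hidden, w_out = phi
    hidden = math.tanh(w_grad * grad + w_hidden * hidden)
    return w_out * hidden, hidden


def meta_loss(phi, targets, steps=10):
    # Average optimizee loss along the trajectories the optimizer produces.
    total = 0.0
    for target in targets:
        theta, hidden = 0.0, 0.0
        for _ in range(steps):
            grad = optimizee_grad(theta, target)
            update, hidden = rnn_optimizer_step(grad, hidden, phi)
            theta = theta + update            # optimizer updates the optimizee
            total += optimizee_loss(theta, target)  # optimizee reports its loss
    return total / (len(targets) * steps)


def meta_gradient(phi, targets, eps=1e-4):
    # Central finite differences stand in for backpropagation through time.
    grads = []
    for i in range(len(phi)):
        plus, minus = list(phi), list(phi)
        plus[i] += eps
        minus[i] -= eps
        grads.append((meta_loss(plus, targets) - meta_loss(minus, targets)) / (2 * eps))
    return grads


random.seed(0)
targets = [random.uniform(-1.0, 1.0) for _ in range(8)]  # a batch of optimizee tasks
phi = [0.1, 0.0, -0.1]                                   # initial optimizer weights
initial = meta_loss(phi, targets)
for _ in range(200):
    # The optimizer itself is trained by ordinary gradient descent.
    phi = [p - 0.05 * g for p, g in zip(phi, meta_gradient(phi, targets))]
final = meta_loss(phi, targets)
print(f"meta-loss before: {initial:.4f}, after: {final:.4f}")
```

The inner loop is the optimizer training the optimizee; the outer loop is gradient descent training the optimizer. A practical version would replace the toy cell with an LSTM and compute the outer gradient with automatic differentiation.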

Let's say our base network (optimizee) is parameterized by $\theta$ and our RNN (optimizer) is parameterized by $\phi$. What is the loss function of the optimizer? We know that the role of the optimizer (RNN) is to reduce the loss of the optimizee (base network). So the loss of our optimizer is the average loss of the optimizee, and it can be represented as follows:
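One common way to write this loss, assuming the notation of Andrychowicz et al. (2016), where $f$ is the optimizee's loss function, $\phi$ denotes the optimizer's parameters, and $\theta(f, \phi)$ denotes the optimizee parameters that the optimizer arrives at for that $f$:

$$
\mathcal{L}(\phi) = \mathbb{E}_{f}\left[ f\big(\theta(f, \phi)\big) \right]
$$

The expectation over $f$ is the "average" in the description above: the optimizer is scored by how low a loss the optimizee reaches, averaged over the optimizee problems it is trained on.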