Book Image

Hands-On Meta Learning with Python

By : Sudharsan Ravichandiran
Book Image

Hands-On Meta Learning with Python

By: Sudharsan Ravichandiran

Overview of this book

Meta learning is an exciting research trend in machine learning, which enables a model to understand the learning process. Unlike other ML paradigms, with meta learning you can learn from small datasets faster. Hands-On Meta Learning with Python starts by explaining the fundamentals of meta learning and helps you understand the concept of learning to learn. You will delve into various one-shot learning algorithms, like siamese, prototypical, relation and memory-augmented networks by implementing them in TensorFlow and Keras. As you make your way through the book, you will dive into state-of-the-art meta learning algorithms such as MAML, Reptile, and CAML. You will then explore how to learn quickly with Meta-SGD and discover how you can perform unsupervised learning using meta learning with CACTUs. In the concluding chapters, you will work through recent trends in meta learning such as adversarial meta learning, task agnostic meta learning, and meta imitation learning. By the end of this book, you will be familiar with state-of-the-art meta learning algorithms and able to enable human-like cognition for your machine learning models.
Table of Contents (17 chapters)
Title Page
Dedication
About Packt
Contributors
Preface
Index

Meta-SGD


Let's say we have some task, T. We use a model,

, parameterized by some parameter,

, and train the model to minimize the loss. We minimize the loss using gradient descent and find the optimal parameter

for the model.

Let's recall the update rule of a gradient descent:

So, what are the key elements that make up our gradient descent? Let's see:

  • Parameter
  • Learning rate
  • Update direction

We usually set the parameter

to some random value and try to find the optimal value during our training process, and we set the value of learning rate

to a small number or decay it over time and an update direction that follows the gradient. Can we learn all of these key elements of the gradient descent by meta learning so that we can learn quickly from a few data points? We've already seen, in the last chapter, how MAML finds the optimal initial parameter

that's generalizable across tasks. With the optimal initial parameter, we can take fewer gradient steps and learn quickly on a new task.

So, now can...