TensorFlow 1.x Deep Learning Cookbook

Overview of this book

Deep neural networks (DNNs) have achieved considerable success in computer vision, speech recognition, and natural language processing. This recipe-based guide takes you from DNN theory to practical implementations that solve real-life problems in the artificial intelligence domain. In this book, you will learn how to use TensorFlow, Google’s open source framework for deep learning, efficiently. You will implement different deep learning networks, such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Deep Q-learning Networks (DQNs), and Generative Adversarial Networks (GANs), with easy-to-follow standalone recipes. You will learn how to use Keras with TensorFlow as the backend, and how different DNNs perform on popular datasets such as MNIST, CIFAR-10, and YouTube-8M. You will learn about the mobile and embedded platforms supported by TensorFlow, as well as how to set up cloud platforms for deep learning applications. You will also get a sneak peek at the TPU architecture and how it will affect the future of DNNs. By working through crisp, no-nonsense recipes, you will become an expert in implementing deep learning techniques in growing real-world applications and research areas such as reinforcement learning, GANs, and autoencoders.

All you need is attention - another example of a seq2seq RNN

In this recipe, we present the attention methodology, a state-of-the-art solution for neural machine translation (NMT). The idea behind attention was introduced in the paper Neural Machine Translation by Jointly Learning to Align and Translate, by Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio (ICLR, 2015, https://arxiv.org/abs/1409.0473), and it consists of adding additional connections between the encoder and the decoder RNNs. Connecting the decoder only to the final state of the encoder imposes an information bottleneck, because everything the encoder acquired at earlier steps has to be squeezed into that single state. With attention, the decoder can instead draw on all of the encoder states at each output step. The solution adopted with attention is illustrated in the following figure:

An example of an attention model for NMT, as seen in https://github.com/lmthang/thesis/blob/master/thesis...
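
To make the extra encoder-decoder connections concrete, here is a minimal NumPy sketch of a single attention step using the additive scoring form from the Bahdanau et al. paper. It is not this recipe's TensorFlow code: the function name `additive_attention`, the parameter names `W_dec`, `W_enc`, and `v`, and the toy sizes are all assumptions made for illustration, and the weights are random rather than learned.

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def additive_attention(decoder_state, encoder_states, W_dec, W_enc, v):
    """One attention step (illustrative sketch, hypothetical names).

    decoder_state:  shape (d_dec,)    current decoder hidden state
    encoder_states: shape (T, d_enc)  one hidden state per source position
    W_dec: (d_att, d_dec), W_enc: (d_att, d_enc), v: (d_att,)
    """
    # score_j = v . tanh(W_dec s + W_enc h_j), one score per source position
    scores = np.tanh(encoder_states @ W_enc.T + W_dec @ decoder_state) @ v
    # Normalize the scores into alignment weights that sum to 1.
    weights = softmax(scores)
    # The context vector is the weighted sum of all encoder states.
    context = weights @ encoder_states
    return context, weights

# Toy sizes and random parameters (assumptions; real models learn these).
rng = np.random.default_rng(0)
T, d_enc, d_dec, d_att = 5, 4, 3, 6
encoder_states = rng.normal(size=(T, d_enc))
decoder_state = rng.normal(size=(d_dec,))
W_dec = rng.normal(size=(d_att, d_dec))
W_enc = rng.normal(size=(d_att, d_enc))
v = rng.normal(size=(d_att,))

context, weights = additive_attention(
    decoder_state, encoder_states, W_dec, W_enc, v
)
print(weights.shape, context.shape)
```

The point of the sketch is that the context vector is recomputed at every decoder step from all `T` encoder states, so the decoder no longer depends on a single final encoder state.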