The Deep Learning Workshop

By: Mirza Rahim Baig, Thomas V. Joseph, Nipun Sadvilkar, Mohan Kumar Silaparasetty, Anthony So

Overview of this book

Are you fascinated by how deep learning powers intelligent applications such as self-driving cars, virtual assistants, facial recognition devices, and chatbots to process data and solve complex problems? Whether you are familiar with machine learning or are new to this domain, The Deep Learning Workshop will make it easy for you to understand deep learning with the help of interesting examples and exercises throughout. The book starts by highlighting the relationship between deep learning, machine learning, and artificial intelligence and helps you get comfortable with the TensorFlow 2.0 programming structure using hands-on exercises. You’ll understand neural networks, the structure of a perceptron, and how to use TensorFlow to create and train models. The book will then let you explore the fundamentals of computer vision by performing image recognition exercises with convolutional neural networks (CNNs) using Keras. As you advance, you’ll be able to make your model more powerful by implementing text embedding and sequencing the data using popular deep learning solutions. Finally, you’ll get to grips with bidirectional recurrent neural networks (RNNs) and build generative adversarial networks (GANs) for image synthesis. By the end of this deep learning book, you’ll have learned the skills essential for building deep learning models with TensorFlow and Keras.

Summary

In this chapter, we started by understanding why plain RNNs are not practical for very long sequences: the main culprit is the vanishing gradient problem, which makes modeling long-range dependencies impractical. We saw that the LSTM is an improvement that performs extremely well on long sequences, but it is rather complicated and has a large number of parameters. The GRU is an excellent alternative that simplifies the LSTM and works well on smaller datasets.
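To make the parameter comparison concrete, here is a minimal sketch that counts the trainable weights of a single recurrent layer using the textbook formulations: one weight block for a plain RNN, four for an LSTM (three gates plus the candidate cell state), and three for a GRU (two gates plus the candidate state). The function names and the example sizes (`input_dim=32`, `units=64`) are assumptions chosen for illustration, not values from the book. Framework implementations can differ slightly; for example, the default Keras GRU in TensorFlow 2 keeps an extra set of biases, so its reported count is a little higher than the figure computed here.

```python
def simple_rnn_params(input_dim: int, units: int) -> int:
    """Weights of one plain RNN layer: input kernel, recurrent kernel, bias."""
    return units * (input_dim + units + 1)


def lstm_params(input_dim: int, units: int) -> int:
    """An LSTM has four such blocks: input, forget, output gates and the cell candidate."""
    return 4 * simple_rnn_params(input_dim, units)


def gru_params(input_dim: int, units: int) -> int:
    """A classic GRU has three blocks: update gate, reset gate, and the candidate state."""
    return 3 * simple_rnn_params(input_dim, units)


# Hypothetical layer sizes, chosen only to illustrate the ratio between the cells.
input_dim, units = 32, 64
for name, count_fn in [("SimpleRNN", simple_rnn_params),
                       ("LSTM", lstm_params),
                       ("GRU", gru_params)]:
    print(f"{name:>9}: {count_fn(input_dim, units)} parameters")
```

Under these formulas, a GRU layer always has three quarters of the parameters of an LSTM layer of the same size, which is one way to read the claim that the GRU is a simplification of the LSTM.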

Then, we looked at ways to extract more power from these RNNs by making them bidirectional and by stacking RNN layers. We also discussed attention mechanisms, a significant new approach that provides state-of-the-art results in translation but can also be employed on other sequence-processing tasks. All of these are extremely powerful models that have changed the way several tasks are performed and form the basis for models that produce state-of-the-art results. With active research in...