Applied Deep Learning with Keras

By: Ritesh Bhagwat, Mahla Abdolahnejad, Matthew Moocarme

Overview of this book

Though designing neural networks is a sought-after skill, it is not easy to master. With Keras, you can apply complex machine learning algorithms with minimal code. Applied Deep Learning with Keras starts by taking you through the basics of machine learning and Python, all the way to an in-depth understanding of how to apply Keras to develop efficient deep learning solutions. To help you grasp the difference between machine learning and deep learning, the book guides you through building a logistic regression model, first with scikit-learn and then with Keras. You will delve into Keras and its many models by creating prediction models for various real-world scenarios, such as disease prediction and customer churn. You'll learn how to evaluate, optimize, and improve your models to get the best possible performance from them. Next, you'll learn to evaluate your model by cross-validating it using the Keras wrapper and scikit-learn. Following this, you'll see how to apply L1, L2, and dropout regularization techniques to improve the accuracy of your model. Finally, you'll get to grips with evaluation metrics such as null accuracy, precision, and the AUC-ROC score for fine-tuning your model. By the end of this book, you will have the skills you need to use Keras when building high-level deep neural networks.

Long Short-Term Memory (LSTM)


LSTMs are a special kind of RNN, designed to overcome the vanishing and exploding gradient problems. Their architecture is built so that they can remember information over long intervals of time, which makes them capable of learning long-term dependencies. The following diagram displays a standard recurrent network in which the repeating module contains a single tanh activation function. This is a simple RNN; with this architecture, we often face the vanishing gradient problem:

Figure 9.12: A simple RNN model
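
As a concrete illustration, here is a minimal sketch of such a simple RNN in Keras. The layer width and input shape (sequences of 10 timesteps with 5 features each) are hypothetical, chosen only to show the structure:

from keras.models import Sequential
from keras.layers import SimpleRNN, Dense

model = Sequential()
# The repeating module applies a single tanh activation; unrolling it
# over many timesteps is what makes the gradients prone to vanishing
model.add(SimpleRNN(units=32, activation='tanh', input_shape=(10, 5)))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()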

The LSTM architecture is similar to that of a simple RNN, but its repeating module has different components, as shown in the following diagram:

Figure 9.13: The LSTM model architecture...
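
To make the comparison concrete, here is a minimal sketch of the equivalent model in Keras, with the SimpleRNN layer from the previous sketch swapped for an LSTM layer; the shapes are the same hypothetical ones. The gate structure shown in the diagram is handled internally by the LSTM layer:

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
# The LSTM's repeating module maintains a cell state and uses input,
# forget, and output gates, letting information persist over long intervals
model.add(LSTM(units=32, input_shape=(10, 5)))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()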