50 Algorithms Every Programmer Should Know - Second Edition

By: Imran Ahmad

Overview of this book

The ability to use algorithms to solve real-world problems is a must-have skill for any developer or programmer. This book will help you not only to develop the skills to select and use an algorithm to tackle problems in the real world but also to understand how it works. You'll start with an introduction to algorithms and discover various algorithm design techniques, before exploring how to implement different types of algorithms, with the help of practical examples. As you advance, you'll learn about linear programming, page ranking, and graphs, and will then work with machine learning algorithms to understand the math and logic behind them. Case studies will show you how to apply these algorithms optimally before you focus on deep learning algorithms and learn about different types of deep learning models along with their practical use. You will also learn about modern sequential models and their variants, algorithms, methodologies, and architectures that are used to implement Large Language Models (LLMs) such as ChatGPT. Finally, you'll become well versed in techniques that enable parallel processing, giving you the ability to use these algorithms for compute-intensive tasks. By the end of this programming book, you'll have become adept at solving real-world computational problems by using a wide range of algorithms.

GRU

GRUs represent an evolution of the basic RNN structure, specifically designed to address some of the challenges encountered with traditional RNNs, such as the vanishing gradient problem. The architecture of a GRU is illustrated in Figure 10.8:

Figure 10.11: GRU

Let us start the discussion of the GRU with the first activation function, annotated as A. At each timestep $t$, the GRU first calculates a hidden state using the tanh activation function, with the current input $x_t$ and the previous hidden state $h_{t-1}$ as inputs. The calculation is the same as the one that determines the hidden state in the original RNNs presented in the previous section, with one important difference: the output is only a candidate hidden state, which is calculated using Eq. 10.6:

$$\tilde{h}_t = \tanh(W_{hh} h_{t-1} + W_{hx} x_t + b_h) \tag{10.6}$$

where $\tilde{h}_t$ is the candidate value of the hidden layer, and $W_{hh}$, $W_{hx}$, and $b_h$ denote the learned weights and bias.
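
The candidate computation can be sketched in a few lines of plain Python. This is a minimal illustration, not the book's implementation: it assumes the candidate takes the same form as a basic RNN update, tanh(W_x x_t + W_h h_prev + b), and the function name `candidate_hidden_state`, the parameter names, and all toy values are hypothetical.

```python
import math

def candidate_hidden_state(x_t, h_prev, W_x, W_h, b):
    """Return the candidate hidden state tanh(W_x x_t + W_h h_prev + b).

    All names are illustrative assumptions. Vectors are lists of floats
    and matrices are lists of rows.
    """
    h_tilde = []
    for i in range(len(b)):
        # Contribution of the current input x_t to hidden unit i
        s = sum(W_x[i][j] * x_t[j] for j in range(len(x_t)))
        # Contribution of the previous hidden state h_{t-1} to hidden unit i
        s += sum(W_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        # tanh squashes the pre-activation into the open interval (-1, 1)
        h_tilde.append(math.tanh(s + b[i]))
    return h_tilde

# Toy example with 2 input features and 2 hidden units (made-up values)
x_t = [1.0, 0.5]
h_prev = [0.5, -0.5]
W_x = [[1.0, 0.0], [0.0, 0.0]]
W_h = [[1.0, 1.0], [0.0, 2.0]]
b = [0.0, 0.0]

h_tilde = candidate_hidden_state(x_t, h_prev, W_x, W_h, b)
print(h_tilde)
```

In this sketch the result is only a proposal for the new hidden state. In a GRU, a gating step then decides how much of this candidate to use in place of the previous hidden state.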

Now, instead of using the candidate hidden state straight away, the GRU takes a moment to decide whether to use it. Imagine it like someone pausing to think before making a decision. This pause-and-think step...