Summary
In this chapter, you learned about LSTM networks. First, we discussed what an LSTM is and its high-level architecture. We then delved into the detailed computations that take place in an LSTM and walked through them with an example.
We saw that an LSTM is composed mainly of five components:
Cell state: The internal cell state of an LSTM cell
Hidden state: The external hidden state used to calculate predictions
Input gate: This determines how much of the current input is read into the cell state
Forget gate: This determines how much of the previous cell state is sent into the current cell state
Output gate: This determines how much of the cell state is output into the hidden state
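The five components above can be sketched as a single LSTM time step. This is a minimal illustrative sketch in NumPy, not a library implementation; the weight names (`Wi`, `Ui`, and so on) and the `params` layout are assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step.

    params maps each of "i", "f", "o", "c" to a (W, U, b) triple;
    these names and shapes are illustrative assumptions.
    """
    Wi, Ui, bi = params["i"]  # input gate weights
    Wf, Uf, bf = params["f"]  # forget gate weights
    Wo, Uo, bo = params["o"]  # output gate weights
    Wc, Uc, bc = params["c"]  # candidate cell state weights

    i = sigmoid(x @ Wi + h_prev @ Ui + bi)  # how much of the input to read in
    f = sigmoid(x @ Wf + h_prev @ Uf + bf)  # how much of the old cell state to keep
    o = sigmoid(x @ Wo + h_prev @ Uo + bo)  # how much of the cell state to output

    c_tilde = np.tanh(x @ Wc + h_prev @ Uc + bc)  # candidate cell state
    c = f * c_prev + i * c_tilde                  # new (internal) cell state
    h = o * np.tanh(c)                            # new (external) hidden state
    return h, c
```

Note how the cell state `c` and hidden state `h` are kept separate: the gates decide what flows into `c` and how much of it is exposed through `h`.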
Having such a complex structure allows LSTMs to capture both short-term and long-term dependencies quite well.
We compared LSTMs to vanilla RNNs and saw that LSTMs are capable of learning long-term dependencies as an inherent part of their structure, whereas vanilla RNNs can fail to learn long-term dependencies.