Chapter 5

Recurrent Neural Networks

Feed-forward networks are stateless: they treat every observation independently. Language, however, requires memory. This chapter introduces RNNs as the first architecture that carries information forward through time, maps out the four sequence architecture types (seq-to-seq, seq-to-vector, vector-to-seq, encoder-decoder), explains the vanishing gradient problem that limits plain RNNs, and shows how LSTMs and GRUs mitigate it through gated memory.

1. Feed-Forward Networks Are Forgetful
2. Recurrent Neural Network (RNN)
3. Backpropagation Through Time and the Vanishing Gradient
4. Long Short-Term Memory (LSTM)
5. Gated Recurrent Unit (GRU)
6. Limitations and Opportunities