In this chapter, we discussed NLP models and state-of-the-art hardware accelerators. After reading this chapter, you now understand why NLP models are usually too large to train on a single GPU. You also know basic concepts such as the structure of an RNN model, a stacked RNN model, ELMo, BERT, and GPT.
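The single-GPU memory limit can be made concrete with a rough back-of-envelope estimate. The numbers below (a 1.5-billion-parameter model, fp32 weights, an Adam-style optimizer, and a 16 GB GPU) are illustrative assumptions, not figures from this chapter:

```python
# Rough estimate of training memory for a large NLP model.
# Assumptions (illustrative): 1.5B parameters, fp32 (4 bytes each),
# and an Adam-style optimizer that keeps gradients plus two extra
# states per parameter, i.e. roughly 4x the weight memory,
# before even counting activations.
params = 1_500_000_000
bytes_per_param = 4  # fp32

weights_gb = params * bytes_per_param / 1024**3
training_gb = weights_gb * 4  # weights + gradients + two optimizer states

gpu_memory_gb = 16  # assumed capacity of a single GPU

print(f"weights: {weights_gb:.1f} GB, training state: {training_gb:.1f} GB")
print("fits on one GPU" if training_gb <= gpu_memory_gb else "needs multiple GPUs")
```

Even before activations are counted, the optimizer state alone pushes such a model past the memory of a single device, which is what motivates the parallelism techniques in the next chapter.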
Regarding hardware, you now know about several state-of-the-art GPUs from NVIDIA and the high-bandwidth links that connect them.
In the next chapter, we will cover the details of model parallelism and some techniques to improve system efficiency.