Modern Computer Architecture and Organization - Third Edition
If you have ever entered a prompt into a chatbot and received a cogent, human-like response, you have experienced the tip of an immense computational iceberg. Behind that seemingly simple interaction lies a worldwide network of servers, memory, accelerators, and datacenters consuming gigawatts of power to train and operate large language models.
This chapter examines the computing architectures used for advanced artificial intelligence systems, with a focus on large language models (LLMs). To set the stage, we begin with a breakdown of early LLM designs, using GPT-2 as a case study to illustrate model structure and processing requirements. The discussion then expands to the computational architectures used for both LLM training and inference. The chapter concludes with an overview of the datacenter infrastructure underpinning today's largest and most sophisticated models, emphasizing the operational resources these facilities require, including...