GPU-Accelerated Computing with Python 3 and CUDA
In the previous chapter, we demonstrated how GPU-accelerated computing can be used for molecular simulations. Now, let's shift our focus to a more tangible topic: generating human-like text using a large language model (LLM). In this chapter, we will build our own language model using JAX to explore how this technology works. We will build a small Generative Pre-trained Transformer (GPT) model for text generation tasks and train it on a lightweight example dataset. Along the way, we will cover attention mechanisms, the transformer architecture, embeddings, tokenization, and autoregressive text generation step by step. The principles and techniques from these exercises also apply to larger and more complex models. To keep this chapter manageable, we'll use two additional libraries, Flax and Optax, built on top of JAX, to facilitate creating neural network layers and using more advanced optimization...