
LLVM Techniques, Tips, and Best Practices Clang and Middle-End Libraries

By: Min-Yih Hsu

Overview of this book

Every programmer or engineer, at some point in their career, works with compilers to optimize their applications. Compilers convert a high-level programming language into low-level machine-executable code. LLVM provides the infrastructure, reusable libraries, and tools needed for developers to build their own compilers. With LLVM’s extensive set of tooling, you can effectively generate code for different backends as well as optimize it. In this book, you’ll explore the LLVM compiler infrastructure and understand how to use it to solve different problems. You’ll start by looking at the structure and design philosophy of important components of LLVM and gradually move on to using Clang libraries to build tools that help you analyze high-level source code. As you advance, the book will show you how to process LLVM IR – a powerful way to transform and optimize the source program for various purposes. Equipped with this knowledge, you’ll be able to leverage LLVM and Clang to create a wide range of useful programming language tools, including compilers, interpreters, IDEs, and source code analyzers. By the end of this LLVM book, you’ll have developed the skills to create powerful tools using the LLVM framework to overcome different real-world challenges.
Table of Contents (18 chapters)
Section 1: Build System and LLVM-Specific Tooling
Section 2: Frontend Development
Section 3: "Middle-End" Development

Learning LLVM IR basics

LLVM IR is an alternative form of the program you want to optimize and compile. It is, however, structured differently from normal programming languages such as C/C++. LLVM IR is organized in a hierarchical fashion. The levels in this hierarchy – counting from the top – are module, function, basic block, and instruction. The following diagram shows their structure:

Figure 10.1 – Hierarchy structure of LLVM IR

A module represents a translation unit – usually a source file. Each module can contain multiple functions and global variables. Each function contains a list of basic blocks, and each basic block contains a list of instructions.
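To make the containment relationship concrete, here is a minimal sketch that models the four levels with plain C++ structs and walks them from the top down. The types ToyModule, ToyFunction, ToyBasicBlock, and ToyInstruction, as well as the helpers makeBlock, makeSampleModule, and countInstructions, are hypothetical names invented for this illustration. They are not the LLVM classes; LLVM's own llvm::Module, llvm::Function, llvm::BasicBlock, and llvm::Instruction carry far more state, but they nest in the same order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins for the four levels of the hierarchy. These are NOT the
// LLVM classes; they only mirror who contains whom.
struct ToyInstruction {
  std::string Text;
};

struct ToyBasicBlock {
  std::string Label;
  std::vector<ToyInstruction> Instructions;
};

struct ToyFunction {
  std::string Name;
  std::vector<ToyBasicBlock> Blocks;
};

struct ToyModule {
  std::string SourceFile;
  std::vector<ToyFunction> Functions;
  std::vector<std::string> GlobalVariables;
};

// Hypothetical helper: builds a block from a label and instruction texts.
ToyBasicBlock makeBlock(const std::string &Label,
                        const std::vector<std::string> &Texts) {
  ToyBasicBlock BB;
  BB.Label = Label;
  for (const std::string &T : Texts)
    BB.Instructions.push_back(ToyInstruction{T});
  return BB;
}

// Hypothetical sample: one module with one global and two functions.
ToyModule makeSampleModule() {
  ToyFunction Abs;
  Abs.Name = "abs_value";
  Abs.Blocks.push_back(makeBlock("entry", {"compare", "branch"}));
  Abs.Blocks.push_back(makeBlock("negate", {"subtract", "branch"}));
  Abs.Blocks.push_back(makeBlock("done", {"select", "return"}));

  ToyFunction Zero;
  Zero.Name = "get_zero";
  Zero.Blocks.push_back(makeBlock("entry", {"return"}));

  ToyModule M;
  M.SourceFile = "example.c";
  M.GlobalVariables.push_back("counter");
  M.Functions.push_back(Abs);
  M.Functions.push_back(Zero);
  return M;
}

// Walks the hierarchy top-down: module -> function -> basic block,
// adding up the instructions held by each block.
std::size_t countInstructions(const ToyModule &M) {
  std::size_t Count = 0;
  for (const ToyFunction &F : M.Functions)
    for (const ToyBasicBlock &BB : F.Blocks)
      Count += BB.Instructions.size();
  return Count;
}
```

Calling countInstructions(makeSampleModule()) visits two functions and four basic blocks in total. The nested loops are the point of the sketch: each level is reached only through the level above it.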

Quick refresher – basic block

A basic block represents a list of instructions with only one entry and one exit point. In other words, if a basic block is executed, the control flow is guaranteed to walk through every instruction in the block.
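The single-exit half of this definition can be checked mechanically. The sketch below assumes a tiny, made-up opcode set (ToyOpcode) and treats branches and returns as the only instructions that can transfer control; the names isTerminator and hasSingleExit are hypothetical helpers for this illustration and are not part of LLVM. It does not check the single-entry half, which would require knowing which branches target the block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical opcode set, for illustration only.
enum class ToyOpcode { Add, Compare, Branch, Return };

// In this toy model, only branches and returns transfer control.
bool isTerminator(ToyOpcode Op) {
  return Op == ToyOpcode::Branch || Op == ToyOpcode::Return;
}

// A block has a single exit point if control can leave it only at its
// last instruction: the block is non-empty, its last instruction is a
// terminator, and no earlier instruction is one.
bool hasSingleExit(const std::vector<ToyOpcode> &Block) {
  if (Block.empty())
    return false;
  for (std::size_t I = 0; I + 1 < Block.size(); ++I)
    if (isTerminator(Block[I]))
      return false;
  return isTerminator(Block.back());
}
```

For example, a block holding Compare followed by Branch passes the check, while a block with a Branch in the middle does not, because the instructions after that branch would not be guaranteed to run.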

Knowing the high...