Getting started with LLVM core libraries

Learning the frontend phases with Clang


To transform a source program into LLVM IR bitcode, the code must pass through a few intermediate steps. The following figure illustrates all of them, and they are the topics of this section:

Lexical analysis

The first frontend step processes the textual input of the source code, splitting language constructs into a set of words and tokens and discarding comments and whitespace such as spaces and tabs. Each word or token must belong to the language's vocabulary, and reserved language keywords are converted into internal compiler representations. The reserved words are defined in include/clang/Basic/TokenKinds.def. For example, see the definitions of the while reserved word and the < symbol, two well-known C/C++ tokens, highlighted in the TokenKinds.def excerpt here:

TOK(identifier)          // abcde123
// C++11 String Literals.
TOK(utf32_string_literal) // U"foo"
…
PUNCTUATOR(r_paren,             ")")
PUNCTUATOR(l_brace,             "{...
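
To make the role of a definition file such as TokenKinds.def more concrete, the following is a minimal sketch of the general idea, not Clang's actual implementation. It assumes a tiny, hypothetical token list (DEMO_TOKENS) written in the same TOK/PUNCTUATOR/KEYWORD macro style, and expands it three times: into an enum, into a name table, and into keyword and punctuator matching inside a toy lexer. The names DEMO_TOKENS, TokenKind, tokenName, and lex are invented for illustration, and the toy lexer only handles whitespace, // line comments, identifiers, decimal digits, and single-character punctuators.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token list in the macro style of TokenKinds.def.
// Illustrative only; the real file defines far more tokens.
#define DEMO_TOKENS(TOK, PUNCTUATOR, KEYWORD) \
  TOK(unknown)                                \
  TOK(eof)                                    \
  TOK(identifier)                             \
  TOK(numeric_constant)                       \
  PUNCTUATOR(l_paren, "(")                    \
  PUNCTUATOR(r_paren, ")")                    \
  PUNCTUATOR(less, "<")                       \
  KEYWORD(while)                              \
  KEYWORD(int)

// Helpers that expand to nothing, used to skip entry kinds we do not need.
#define SKIP1(X)
#define SKIP2(X, Y)

// First expansion: one enumerator per entry (keywords get a kw_ prefix).
#define AS_ENUM_TOK(X) X,
#define AS_ENUM_PUNCT(X, Y) X,
#define AS_ENUM_KW(X) kw_##X,
enum class TokenKind { DEMO_TOKENS(AS_ENUM_TOK, AS_ENUM_PUNCT, AS_ENUM_KW) };

// Second expansion: a printable name for each enumerator, in the same order.
#define AS_NAME_TOK(X) #X,
#define AS_NAME_PUNCT(X, Y) #X,
#define AS_NAME_KW(X) "kw_" #X,
const char *tokenName(TokenKind K) {
  static const char *const Names[] = {
      DEMO_TOKENS(AS_NAME_TOK, AS_NAME_PUNCT, AS_NAME_KW)};
  std::size_t Index = static_cast<std::size_t>(K);
  assert(Index < sizeof(Names) / sizeof(Names[0]) && "invalid token kind");
  return Names[Index];
}

struct Token {
  TokenKind Kind;
  std::string Text;
};

// Third expansion: match keywords and punctuators while scanning the input.
#define MATCH_KW(X) if (Word == #X) Kind = TokenKind::kw_##X;
#define MATCH_PUNCT(X, Y) if (Src.compare(I, 1, Y) == 0) Kind = TokenKind::X;

std::vector<Token> lex(const std::string &Src) {
  std::vector<Token> Out;
  std::size_t I = 0, N = Src.size();
  while (I < N) {
    unsigned char C = static_cast<unsigned char>(Src[I]);
    if (std::isspace(C)) { // Whitespace is discarded.
      ++I;
      continue;
    }
    if (C == '/' && I + 1 < N && Src[I + 1] == '/') { // Skip a line comment.
      while (I < N && Src[I] != '\n')
        ++I;
      continue;
    }
    if (std::isalpha(C) || C == '_') { // Identifier or reserved word.
      std::size_t Begin = I;
      while (I < N && (std::isalnum(static_cast<unsigned char>(Src[I])) ||
                       Src[I] == '_'))
        ++I;
      std::string Word = Src.substr(Begin, I - Begin);
      TokenKind Kind = TokenKind::identifier;
      DEMO_TOKENS(SKIP1, SKIP2, MATCH_KW)
      Out.push_back({Kind, Word});
      continue;
    }
    if (std::isdigit(C)) { // Decimal digits only in this sketch.
      std::size_t Begin = I;
      while (I < N && std::isdigit(static_cast<unsigned char>(Src[I])))
        ++I;
      Out.push_back({TokenKind::numeric_constant, Src.substr(Begin, I - Begin)});
      continue;
    }
    // Single-character punctuator, or unknown if nothing in the list matches.
    TokenKind Kind = TokenKind::unknown;
    DEMO_TOKENS(SKIP1, MATCH_PUNCT, SKIP1)
    Out.push_back({Kind, std::string(1, Src[I])});
    ++I;
  }
  Out.push_back({TokenKind::eof, ""});
  return Out;
}
```

Under these assumptions, calling lex("while (x < 10) // loop\n") yields seven tokens: kw_while, l_paren, identifier, less, numeric_constant, r_paren, and eof. The comment and all whitespace are dropped, and the word while is mapped to its own kind rather than to identifier. Keeping the token list in one place means that adding an entry updates the enum, the name table, and the matching code together, which is the main appeal of this macro pattern.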