LLVM Essentials

By: Mayur Pandey, Suyog Sarda, David Farago


LLVM tools and using them in the command line


Until now, we have seen what LLVM IR (in its human-readable form) is and how it can be used to represent a high-level language. Now we will take a look at some of the tools that LLVM provides, so that we can play around with this IR, converting it to other formats and back again to the original form. Let's look at these tools one by one, along with examples.

  • llvm-as: This is the LLVM assembler. It takes LLVM IR in assembly (human-readable) form and converts it to bitcode format. Using the preceding add.ll as an example, convert it into bitcode. To learn more about the LLVM bitcode file format, refer to http://llvm.org/docs/BitCodeFormat.html.

    $ llvm-as add.ll -o add.bc
    

    To view the content of this bitcode file, a tool such as hexdump can be used.

    $ hexdump -c add.bc
    
  • llvm-dis: This is the LLVM disassembler. It takes a bitcode file as input and outputs the LLVM assembly.

    $ llvm-dis add.bc -o add.ll
    

    If you check the regenerated add.ll and compare it with the previous version, it will be the same.
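
    Note that the command above regenerates (and overwrites) add.ll from the bitcode. If you want to compare the round-tripped IR against your original file, a simple sketch is to disassemble to a different name (add_dis.ll here is just an example) and diff the two; apart from comment lines such as the ModuleID header, they should match.

    $ llvm-dis add.bc -o add_dis.ll
    $ diff add.ll add_dis.ll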

  • llvm-link: llvm-link links two or more LLVM bitcode files and outputs a single LLVM bitcode file. To see a demo, write a main.c file that calls the function defined in the add.c file from earlier (recapped after the listing below).

    $ cat main.c
    #include <stdio.h>

    extern int add(int);

    int main() {
      int a = add(2);
      printf("%d\n", a);
      return 0;
    }
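
    For reference, main.c calls the add function from the add.c file used earlier in this chapter. If you no longer have that file around, it looks roughly like the following sketch; the initial value of globvar (12) is an assumption that matches the value 14 printed by lli later in this section.

    $ cat add.c
    int globvar = 12;

    int add(int a) {
      return globvar + a;
    }

    If you only have the C source, you can regenerate add.bc directly with clang -emit-llvm -c add.c instead of assembling add.ll with llvm-as.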
    

    Convert main.c to LLVM bitcode format using the following command; this produces main.bc.

    $ clang -emit-llvm -c main.c
    

    Now link main.bc and add.bc to generate output.bc.

    $ llvm-link main.bc add.bc -o output.bc
    
  • lli: lli directly executes programs in LLVM bitcode format using a just-in-time (JIT) compiler or an interpreter, if one is available for the current architecture. lli is not a virtual machine: it cannot execute IR built for a different architecture and can only JIT-compile or interpret code for the host architecture. Use the bitcode file generated by llvm-link as input to lli. It will print the result on standard output.

    $ lli output.bc
    14
    
  • llc: llc is the LLVM static compiler. It compiles LLVM input (in assembly or bitcode form) into assembly language for a specified architecture; a sketch of selecting the target explicitly with -march follows the listing below. In the following example it takes the output.bc file generated by llvm-link and generates the assembly file output.s.

    $ llc output.bc -o output.s
    

    Let's look at the content of the generated output.s assembly, specifically its two functions; the code is very similar to what a native compiler would have produced.

    Function main:
      .type  main,@function
    main:                                   # @main
      .cfi_startproc
    # BB#0:
      pushq  %rbp
    .Ltmp0:
      .cfi_def_cfa_offset 16
    .Ltmp1:
      .cfi_offset %rbp, -16
      movq  %rsp, %rbp
    .Ltmp2:
      .cfi_def_cfa_register %rbp
      subq  $16, %rsp
      movl  $0, -4(%rbp)
      movl  $2, %edi
      callq  add
      movl  %eax, %ecx
      movl  %ecx, -8(%rbp)
      movl  $.L.str, %edi
      xorl  %eax, %eax
      movl  %ecx, %esi
      callq  printf
      xorl  %eax, %eax
      addq  $16, %rsp
      popq  %rbp
      retq
    .Lfunc_end0:
    
    
    Function add:
    add:                                    # @add
      .cfi_startproc
    # BB#0:
      pushq  %rbp
    .Ltmp3:
      .cfi_def_cfa_offset 16
    .Ltmp4:
      .cfi_offset %rbp, -16
      movq  %rsp, %rbp
    .Ltmp5:
      .cfi_def_cfa_register %rbp
      movl  %edi, -4(%rbp)
      addl  globvar(%rip), %edi
      movl  %edi, %eax
      popq  %rbp
      retq
    .Lfunc_end1:
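
    By default llc picks the target from the module (or the host). To generate code for a specific architecture explicitly, you can pass the -march option; this is only a sketch, and the value used here (x86-64) must be one of the registered targets listed by llc --version.

    $ llc -march=x86-64 output.bc -o output.s
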
  • opt: This is the modular LLVM analyzer and optimizer. It takes an input file and runs the analyses or optimizations specified on the command line; whether it acts as an analyzer or an optimizer depends on the command-line options.

    opt [options] [input file name]

    When the -analyze option is provided, opt performs various analyses on the input (see the example after the following list). A set of analysis passes is already provided and can be selected on the command line, or you can write your own analysis pass and load its library into opt. Some of the useful analysis passes that can be specified with the following command-line arguments are:

    • basicaa: basic alias analysis

    • da: dependence analysis

    • instcount: counts the various instruction types

    • loops: information about loops

    • scalar-evolution: scalar evolution analysis
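
    As a quick sketch of invoking an analysis pass, the following runs the natural loop information pass on add.ll; since add.ll contains no loops, the printed report is essentially empty, but the invocation is the same for any module.

    $ opt -analyze -loops add.ll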

    When the -analyze option is not passed, the opt tool does the actual optimization work and tries to optimize the code depending on the command-line options passed. As in the preceding case, you can use some of the optimization passes already present or write your own optimization pass. Some of the useful optimization passes that can be specified with the following command-line arguments are (a usage example follows the list):

    • constprop: simple constant propagation

    • dce: dead code elimination

    • globalopt: global variable optimization

    • inline: function inlining

    • instcombine: combining redundant instructions

    • licm: loop invariant code motion

    • tailcallelim: tail call elimination
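
    As a sketch of running a transformation, the following applies instcombine to add.ll and writes the optimized module as readable IR (-S) to add_opt.ll; the output file name is just an example, and several passes can be listed on one command line, in which case they run in the order given.

    $ opt -S -instcombine add.ll -o add_opt.ll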

Note

Before going ahead, we must note that all the tools mentioned in this chapter are meant for compiler writers. An end user can directly use clang to compile C code without converting it into the intermediate representation first.
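
For instance, assuming the main.c and add.c shown earlier in this section, an end user would simply compile and run them in one step; a.out is clang's default output name, and the program should print the same value (14) that lli produced above.

    $ clang main.c add.c
    $ ./a.out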
