C++ High Performance

By: Björn Andrist, Viktor Sehr

Overview of this book

C++ is a highly portable language and can be used to write both large-scale applications and performance-critical code. It has evolved over the last few years to become a modern and expressive language. This book will guide you through optimizing the performance of your C++ apps by allowing them to run faster and consume fewer resources on the device they're running on without compromising the readability of your code base. The book begins by helping you measure and identify bottlenecks in a C++ code base. It then moves on to teaching you how to use modern C++ constructs and techniques. You'll see how this affects the way you write code. Next, you'll see the importance of data structure optimization and memory management, and how memory can be used efficiently with respect to CPU caches. After that, you'll see how STL algorithms and the composable Range V3 library should be used to achieve both faster execution and more readable code, followed by how to use STL containers and how to write your own specialized iterators. Moving on, you'll get hands-on experience in making use of modern C++ metaprogramming and reflection to reduce boilerplate code as well as in working with proxy objects to perform optimizations under the hood. After that, you'll learn concurrent programming and understand lock-free data structures. The book ends with an overview of parallel algorithms using STL execution policies, Boost Compute, and OpenCL to utilize both the CPU and the GPU.

C++ compared with other languages

A multitude of application types, platforms, and programming languages have emerged since C++ was first released. Still, C++ is a widely used language, and its compilers are available for most platforms. The major exception, as of today, is the web platform, where JavaScript and its related technologies are the foundation. However, the web platform is evolving into being able to execute what was previously only possible in desktop applications, and in that context C++ has found its way into web applications using technologies such as Emscripten/asm.js and WebAssembly.

Competing languages and performance

In order to understand how C++ achieves its performance compared to other programming languages, we'd like to discuss some fundamental differences between C++ and most other modern programming languages.

For simplicity, this section will focus on comparing C++ to Java, although most of the comparisons also apply to other programming languages based on a garbage collector, such as C# and JavaScript.

Firstly, Java compiles to bytecode, which is then compiled to machine code while the application is executing, whereas C++ compiles the source code directly to machine code. Although bytecode and just-in-time compilers may theoretically be able to achieve the same performance as precompiled machine code (or, theoretically, even better), as of today, they simply do not. To be fair though, they perform well enough for most cases.

Secondly, Java handles dynamic memory in a completely different manner from C++. In Java, memory is automatically deallocated by a garbage collector, whereas a C++ program handles memory deallocations by itself. The garbage collector prevents most memory leaks, but at the cost of performance and predictability.

Thirdly, Java places all its objects in separate heap allocations, whereas C++ allows the programmer to place objects both on the stack and on the heap. In C++ it's also possible to create multiple objects in one single heap allocation. This can be a huge performance gain for two reasons: objects can be created without always allocating dynamic memory, and multiple related objects can be placed adjacent to one another in memory.

Take a look at how memory is allocated in the following example. The C++ function uses the stack for both objects and integers; Java places the objects on the heap:

// C++
class Car {
public:
  Car(int doors)
  : doors_(doors) {}
private:
  int doors_{};
};

auto func() {
  auto num_doors = 2;
  auto car1 = Car{num_doors};
  auto car2 = Car{num_doors};
  // ...
}

// Java
class Car {
  public Car(int doors) {
    doors_ = doors;
  }
  private int doors_;

  static void func() {
    int numDoors = 2;
    Car car1 = new Car(numDoors);
    Car car2 = new Car(numDoors);
    // ...
  }
}

[Figure: C++ places everything on the stack]

[Figure: Java places the Car objects on the heap]

Now take a look at the next example and see how an array of Car objects is placed in memory when using C++ and Java respectively:



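// C++
auto car_list() {
  auto n = 7;
  auto cars =
    std::vector<Car>{};
  cars.reserve(n);
  for(auto i=0; i<n; ++i){
    cars.push_back(Car{2}); // two doors, as in the previous example
  }
  // ...
}

// Java
void carList() {
  int n = 7;
  ArrayList<Car> cars =
    new ArrayList<Car>();
  for(int i=0; i<n; i++) {
    cars.add(new Car(2)); // two doors, as in the previous example
  }
  // ...
}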
[Figure: how the Car objects are laid out in memory in C++]

[Figure: how the Car objects are laid out in memory in Java]

The C++ vector contains the actual Car objects placed in one contiguous memory block, whereas the equivalent in Java is a contiguous memory block of references to Car objects. In Java, the objects have been allocated separately, which means that they can be located anywhere in the heap.

This affects performance, as Java has to execute seven allocations instead of one. It also means that whenever the application iterates over the list, there is a performance win for C++, since accessing nearby memory locations is faster than accessing several random spots in memory.
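
The difference in layout can be observed directly from C++. The following is a minimal sketch rather than code from the book: the helper names make_cars, make_boxed_cars, and is_contiguous are hypothetical, and std::vector<std::unique_ptr<Car>> is used only as a rough stand-in for a Java-style list of references.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

class Car {
public:
  explicit Car(int doors) : doors_(doors) {}
  int doors() const { return doors_; }
private:
  int doors_{};
};

// Value layout: all Car objects live in one heap block owned by the vector.
inline std::vector<Car> make_cars(std::size_t n) {
  auto cars = std::vector<Car>{};
  cars.reserve(n);  // a single allocation with room for n objects
  for (std::size_t i = 0; i < n; ++i) {
    cars.emplace_back(2);
  }
  assert(cars.size() == n);
  return cars;
}

// Reference-like layout (assumed stand-in for the Java list): the vector
// holds pointers, and every Car is a separate heap allocation.
inline std::vector<std::unique_ptr<Car>> make_boxed_cars(std::size_t n) {
  auto cars = std::vector<std::unique_ptr<Car>>{};
  cars.reserve(n);
  for (std::size_t i = 0; i < n; ++i) {
    cars.push_back(std::make_unique<Car>(2));
  }
  return cars;
}

// Returns true if each element sits directly after the previous one,
// that is, the objects are adjacent in memory.
inline bool is_contiguous(const std::vector<Car>& cars) {
  for (std::size_t i = 1; i < cars.size(); ++i) {
    if (&cars[i] != &cars[i - 1] + 1) return false;
  }
  return true;
}
```

Calling is_contiguous(make_cars(7)) reports that the seven cars are adjacent, whereas the objects behind make_boxed_cars(7) may be scattered across the heap, which is the situation the Java list is in.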

Non-performance-related C++ language features

In some discussions about C++ versus other languages, it's concluded that C++ should only be used if performance is a major concern. Otherwise, it's said to just increase the complexity of the code base due to manual memory handling, which may result in memory leaks and hard-to-track bugs.

This may have been true several C++ versions ago, but a modern C++ programmer relies on the provided containers and smart pointer types, which are part of the STL.

Here we would like to highlight two powerful features of C++ that relate to robustness rather than performance, and that we think are easily overlooked: value semantics and const correctness.

Value semantics

C++ supports both value semantics and reference semantics. Value semantics lets us pass objects by value instead of just passing references to objects. In C++, value semantics is the default, which means that when you pass an instance of a class or struct, it behaves in the same way as passing an int, float, or any other fundamental type. To use reference semantics, we need to explicitly use references or pointers.

The C++ type system gives us the ability to explicitly state the ownership of an object. Compare the following implementations of a simple class in C++ and Java. We start with the C++ version:

// C++
class Bagel {
public:
  Bagel(const std::set<std::string>& ts) : toppings_(ts) {}
private:
  std::set<std::string> toppings_;
};

The corresponding implementation in Java could look like this:

// Java
class Bagel {
  public Bagel(TreeSet<String> ts) { toppings_ = ts; }
  private TreeSet<String> toppings_;
}

In the C++ version, the programmer states that the toppings are completely encapsulated by the Bagel class. Had the programmer intended the topping list to be shared among several bagels, it would have been declared as a pointer of some kind: std::shared_ptr, if the ownership is shared among several bagels, or a std::weak_ptr, if someone else owns the topping list and is supposed to modify it as the program executes.

In Java, objects reference each other with shared ownership. Therefore, it's not possible to distinguish whether the topping list is intended to be shared among several bagels, whether it is handled somewhere else, or whether it is, as in most cases, completely owned by the Bagel class.

Compare the following functions; as every object is shared by default in Java (and most other languages), programmers have to take precautions for subtle bugs such as this:



Note how the bagels do not share toppings:

// C++
auto t = std::set<std::string>{};
t.insert("salt");
auto a = Bagel{t};

// 'a' is not affected
// when adding pepper
t.insert("pepper");

// 'a' will have salt
// 'b' will have salt & pepper
auto b = Bagel{t};

// No bagel is affected
t.insert("oregano");

Note how both the bagels subtly share toppings:

// Java
TreeSet<String> t = new
  TreeSet<String>();
t.add("salt");
Bagel a = new Bagel(t);

// Now 'a' will subtly
// also have pepper
t.add("pepper");

// 'a' and 'b' share the
// toppings in 't'
Bagel b = new Bagel(t);

// Both bagels subtly
// also have "oregano"
t.add("oregano");
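
To make the contrast concrete within C++ itself, here is a minimal sketch, assuming a hypothetical SharedBagel class and has() member that are not part of the original text. It shows that sharing has to be requested explicitly through std::shared_ptr, whereas the plain Bagel keeps its own copy.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>

using Toppings = std::set<std::string>;

// Value semantics: each Bagel owns a private copy of its toppings.
class Bagel {
public:
  explicit Bagel(const Toppings& ts) : toppings_(ts) {}
  bool has(const std::string& t) const { return toppings_.count(t) > 0; }
private:
  Toppings toppings_;
};

// Explicitly shared ownership (hypothetical variant): every SharedBagel
// built from the same pointer observes later changes to the set.
class SharedBagel {
public:
  explicit SharedBagel(std::shared_ptr<Toppings> ts)
  : toppings_(std::move(ts)) { assert(toppings_ != nullptr); }
  bool has(const std::string& t) const { return toppings_->count(t) > 0; }
private:
  std::shared_ptr<Toppings> toppings_;
};
```

Creating a Bagel from a set and then inserting "pepper" into that set leaves the bagel unchanged, whereas a SharedBagel created from a std::shared_ptr to the set sees the insertion. The type of the member documents which behaviour is intended.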

Const correctness

Another powerful feature of C++ that Java and many other languages lack is the ability to write const-correct code. Const correctness means that each member function signature of a class explicitly tells the caller whether the object will be modified or not, and the code will not compile if the caller tries to modify an object declared const.

Here follows an example of how we can use const member functions to prevent unintentional modifications of objects. In the following Person class, the member function age() is declared const and is therefore not allowed to mutate the Person object; whereas set_age() mutates the object and cannot be declared const:

class Person {
public:
  auto age() const { return age_; }
  auto set_age(int age) { age_ = age; }
private:
  int age_{};
};

It's also possible to distinguish between returning mutable and immutable references to members. In the following Team class, the member function leader() const returns an immutable Person; whereas leader() returns a Person object that may be mutated:

class Team {
public:
  auto& leader() const { return leader_; }
  auto& leader() { return leader_; }
private:
  Person leader_{};
};

Now let's see how the compiler can help us find errors when we try to mutate immutable objects. In the following example, the function argument teams is declared const, explicitly showing that this function is not allowed to modify them:

auto nonmutating_func(const std::vector<Team>& teams) {
  auto tot_age = int{0};

  // Compiles, both leader() and age() are declared const
  for (const auto& team: teams)
    tot_age += team.leader().age();

  // Will not compile, set_age() requires a mutable object
  for (auto& team: teams)
    team.leader().set_age(20);
}

If we want to write a function which can mutate the teams object we simply remove const. This signals to the caller that this function may mutate the teams:

auto mutating_func(std::vector<Team>& teams) {
  auto tot_age = int{0};

  // Compiles, const functions can be called on mutable objects
  for (const auto& team: teams)
    tot_age += team.leader().age();

  // Compiles, teams is a mutable variable
  for (auto& team: teams)
    team.leader().set_age(20);
}
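
The pieces above can be combined into a single translation unit that compiles. The following is a sketch under stated assumptions: the function names total_leader_age and set_leader_ages are hypothetical, C++17 is assumed, and the static_assert lines express the rule that would otherwise show up as a compile error.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

class Person {
public:
  auto age() const { return age_; }
  auto set_age(int age) { age_ = age; }
private:
  int age_{};
};

class Team {
public:
  auto& leader() const { return leader_; }  // yields const Person&
  auto& leader() { return leader_; }        // yields Person&
private:
  Person leader_{};
};

// Through a const Team, leader() yields a const Person, so set_age()
// cannot be called on it. Both facts are checked at compile time.
static_assert(std::is_const_v<std::remove_reference_t<
    decltype(std::declval<const Team&>().leader())>>);
static_assert(!std::is_const_v<std::remove_reference_t<
    decltype(std::declval<Team&>().leader())>>);

// Read-only access: the const reference documents that teams are not modified.
inline int total_leader_age(const std::vector<Team>& teams) {
  auto tot_age = int{0};
  for (const auto& team : teams) {
    tot_age += team.leader().age();
  }
  return tot_age;
}

// Mutating access: the non-const reference signals that teams may change.
inline void set_leader_ages(std::vector<Team>& teams, int age) {
  assert(age >= 0);
  for (auto& team : teams) {
    team.leader().set_age(age);
  }
}
```

A caller holding a const std::vector<Team> can pass it to total_leader_age but not to set_leader_ages, so the signatures alone tell the reader which function may change the data.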

Object ownership and garbage collection in C++

Except in very rare situations, a C++ programmer should leave the memory handling to containers and smart pointers and never have to rely on manual memory handling.

To put it clearly, the garbage collection model in Java could almost be emulated in C++ by using std::shared_ptr for every object. Note that garbage-collecting languages don't use the same algorithm for allocation tracking as std::shared_ptr. The std::shared_ptr is a smart pointer based on a reference-counting algorithm that will leak memory if objects have cyclic dependencies. Garbage-collecting languages have more sophisticated methods that can handle and free cyclically dependent objects.

However, rather than relying on a garbage collector, enforcing strict ownership avoids the subtle bugs that may result from sharing objects by default, as in the case of Java.

If a programmer minimizes shared ownership in C++, the resulting code is easier to use and harder to abuse, as it can force the user of the class to use it as intended.
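
The cyclic-dependency problem of reference counting can be shown in a few lines. This is a minimal sketch, not code from the book: the types StrongNode and WeakNode and the helper cycle_is_freed are hypothetical names, and std::weak_ptr is used as the usual way to break such a cycle.

```cpp
#include <cassert>
#include <memory>

struct StrongNode {
  std::shared_ptr<StrongNode> other;  // owning link: a cycle keeps both nodes alive
};

struct WeakNode {
  std::weak_ptr<WeakNode> other;      // non-owning link: does not extend lifetime
};

// Builds a two-node cycle, drops all external owners, and reports whether
// the nodes were freed.
template <typename Node>
bool cycle_is_freed() {
  auto observer = std::weak_ptr<Node>{};
  {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    a->other = b;
    b->other = a;
    observer = a;
    assert(!observer.expired());
  }  // a and b go out of scope here
  const bool freed = observer.expired();
  // Break the cycle by hand so that the demonstration itself does not leak.
  if (auto a = observer.lock()) {
    a->other.reset();
  }
  return freed;
}
```

With StrongNode the helper returns false, because each node still holds a reference count on the other after the last external owner is gone. With WeakNode it returns true, since the links do not own the nodes they point to.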

Avoiding null objects using C++ references

In addition to strict ownership, C++ also has the concept of references, which differs from references in Java. Internally, a reference is typically implemented as a pointer that is not allowed to be null or repointed; therefore, the referenced object is not copied when it is passed to a function.

As a result, a function signature in C++ can explicitly restrict the programmer from passing a null object as a parameter. In Java the programmer must use documentation or annotations to indicate non-null parameters.

Take a look at these two Java functions for computing the volume of a sphere. The first one throws a runtime exception if a null object is passed to it; whereas the second one silently ignores null objects.

This first implementation in Java throws a runtime exception if passed a null object:

// Java
float getVolume1(Sphere s) {
  float cube = (float) Math.pow(s.radius(), 3);
  return (float) (Math.PI * 4 / 3) * cube;
}

This second implementation in Java silently handles null objects:

// Java
float getVolume2(Sphere s) {
  float rad = s == null ? 0.0f : s.radius();
  float cube = (float) Math.pow(rad, 3);
  return (float) (Math.PI * 4 / 3) * cube;
}

In both functions implemented in Java, the caller has to inspect the implementation of the function in order to determine whether null objects are allowed or not.

In C++, the first function signature explicitly accepts only initialized objects by using a reference, which cannot be null. The second version, which takes a pointer as its argument, explicitly shows that null objects are handled.

C++ arguments passed as references indicate that null values are not allowed:

auto get_volume1(const Sphere& s) {
  auto cube = std::pow(s.radius(), 3);
  auto pi = 3.14f;
  return (pi * 4 / 3) * cube;
}

C++ arguments passed as pointers indicate that null values are being handled:

auto get_volume2(const Sphere* s) {
  auto rad = s ? s->radius() : 0.0f;
  auto cube = std::pow(rad, 3);
  auto pi = 3.14f;
  return (pi * 4 / 3) * cube;
}

Being able to use references or values as arguments in C++ instantly informs the C++ programmer how the function is intended to be used. Conversely, in Java, the user must inspect the implementation of the function, as objects are always passed as pointers, and there's a possibility that they could be null.
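
For completeness, the two signatures can be tried out with a small stand-in type. The text does not show the Sphere class, so the definition below is an assumption made for illustration, and the pointer version is written here as a thin wrapper around the reference version rather than as a copy of the listing above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical Sphere type; the original text does not show its definition.
class Sphere {
public:
  explicit Sphere(float radius) : radius_(radius) { assert(radius >= 0.0f); }
  float radius() const { return radius_; }
private:
  float radius_{};
};

// A reference parameter cannot be null, so no null check is needed.
inline float get_volume1(const Sphere& s) {
  const auto cube = std::pow(s.radius(), 3.0f);
  const auto pi = 3.14159265f;
  return (pi * 4.0f / 3.0f) * cube;
}

// A pointer parameter signals that nullptr is an accepted input.
inline float get_volume2(const Sphere* s) {
  return s ? get_volume1(*s) : 0.0f;
}
```

A caller with a Sphere object writes get_volume1(sphere) and cannot pass a null value, while get_volume2(nullptr) is a valid call that yields zero. Delegating from the pointer version to the reference version keeps the null handling in one place.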

Drawbacks of C++

Comparing C++ with other programming languages wouldn't be fair without mentioning some of its drawbacks. As mentioned earlier, C++ has more concepts to learn, and is therefore harder to use correctly and to its full potential. However, if a programmer can master C++, the higher complexity turns into an advantage and the code base becomes more robust and performs better.

There are, nonetheless, some shortcomings of C++ that are simply shortcomings. The most severe of these are long compilation times, the manual handling of forward declarations and header/source files, and the complexity of importing libraries.

This is mainly a result of C++ relying on an outdated import system where imported headers are simply pasted into whatever includes them. At the time of writing this book, a modern module-based import system is up for standardization, but until the standardized C++ version becomes available, project management remains very tedious.

Another apparent drawback of C++ is the lack of provided libraries. While other languages usually come with all the libraries needed for most applications, such as graphics, user interfaces, networking, threading, resource handling, and so on, C++ provides, more or less, nothing more than the bare minimum of algorithms, threads, and, as of C++17, file system handling. For everything else, programmers have to rely on external libraries.

To summarize, although C++ has a steeper learning curve than most other languages, if used correctly, the robustness of C++ is an advantage compared to many other languages. So, despite the outdated import/library system of C++, we believe that C++ is a well-suited language for large-scale projects, even for projects where performance is not the highest priority.