Book Image

C++ Data Structures and Algorithms

By : Wisnu Anggoro
5 (1)
Book Image

C++ Data Structures and Algorithms

5 (1)
By: Wisnu Anggoro

Overview of this book

C++ is a general-purpose programming language which has evolved over the years and is used to develop software for many different sectors. This book will be your companion as it takes you through implementing classic data structures and algorithms to help you get up and running as a confident C++ programmer. We begin with an introduction to C++ data structures and algorithms while also covering essential language constructs. Next, we will see how to store data using linked lists, arrays, stacks, and queues. Then, we will learn how to implement different sorting algorithms, such as quick sort and heap sort. Along with these, we will dive into searching algorithms such as linear search, binary search and more. Our next mission will be to attain high performance by implementing algorithms to string datatypes and implementing hash structures in algorithm design. We'll also analyze Brute Force algorithms, Greedy algorithms, and more. By the end of the book, you'll know how to build components that are easy to understand, debug, and use in different applications.
Table of Contents (16 chapters)
Title Page
Copyright and Credits
Packt Upsell
Contributors
Preface
Index

Analyzing the algorithm


To create a good algorithm, we have to ensure that we have got the best performance from the algorithm. In this section, we are going to discuss the ways we can analyze the time complexity of basic functions.

Asymptotic analysis

Let's start with asymptotic analysis to find out the time complexity of the algorithms. This analysis omits the constants and the least significant parts. Suppose we have a function that will print a number from 0 to n. The following is the code for the function:

void Looping(int n)
{
    int i = 0;

    while(i < n)
    {
        cout << i << endl;
        i = i + 1;
    }
}

Now, let's calculate the time complexity of the preceding algorithm by counting each instruction of the function. We start with the first statement:

int i = 0;

The preceding statement is only executed once in the function, so its value is 1. The following is the code snippet of the rest statements in the Looping() function:

while(i < n)
{
    cout << i << endl;
    i = i + 1;
}

The comparison in the while loop is valued at 1. For simplicity, we can say that the value of the two statements inside the while loop scope is 3 since it needs 1 to print the i variable, and 2 for assignment (=) and addition (+).

However, how much of the preceding code snippet is executed depends on the value of n, so it will be (1 + 3) * n or 4n. The total instruction that has to be executed for the preceding Looping() function is 1 + 4n. Therefore, the complexity of the preceding Looping() function is:

Time Complexity(n) = 4n + 1

And here is the curve that represents its complexity:

In all curve graphs that will be represented in this book, the x axis represents Input Size (n) and the y axis represents Execution Time.

As we can see in the preceding graph, the curve is linear. However, since the time complexity also depends on the other parameters, such as hardware specification, we may have another complexity for the preceding Looping() function if we run the function on faster hardware. Let's say the time complexity becomes 2n + 0.5, so that the curve will be as follow:

As we can see, the curve is still linear for the two complexities. For this reason, we can omit a constant and the least significant parts in asymptotic analysis, so we can say that the preceding complexity is n, as found in the following notation:

Time Complexity(n) = n

Let's move on to another function. If we have the nested while loop, we will have another complexity, as we can see in the following code:

void Pairing(int n)
{
    int i = 0;

    while(i < n)
    {
        int j = 0;

        while(j < n)
        {
            cout << i << ", " << j << endl;
            j = j + 1;
        }
        i = i + 1;
    }
}

Based on the preceding Looping() function, we can say that the complexity of the inner while loop of the preceding Pairing() function is 4n + 1. We then calculate the outer while loop so it becomes 1 + (n * (1 + (4n + 1) + 2), which equals 1 + 3n + 4n2. Therefore, the complexity of the preceding code is:

Time Complexity(n) = 4n2 + 3n + 1

The curve for the preceding complexity will be as follows:

And if we run the preceding code in the slower hardware, the complexity might become twice as slow. The notation should be as follows:

Time Complexity(n) = 8n2 + 6n + 2

And the curve of the preceding notation should be as follows:

As shown previously, since the asymptotic analysis omits the constants and the least significant parts, the complexity notation will be as follows: 

Time Complexity(n) = n2

Worst, average, and best cases

In the previous section, we were able to define time complexity for our code using an asymptotic algorithm. In this section, we are going to determine a case of the implementation of an algorithm. There are three cases when implementing time complexity in an algorithm; they are worst, average, and best cases. Before we go through them, let's look at the following Search() function implementation:

int Search(int arr[], int arrSize, int x)
{
    // Iterate through arr elements
    for (int i = 0; i < arrSize; ++i)
    {
        // If x is found
        // returns index of x
        if (arr[i] == x)
            return i;
    }

    // If x is not found
    // returns -1
    return -1;
}

As we can see in the preceding Search() function, it will find an index of target element (x) from an arr array containing arrSize elements. Suppose we have the array {42, 55, 39, 71, 20, 18, 6, 84}. Here are the cases we will find:

  • Worst case analysis is a calculation of the upper bound on the running time of the algorithm. In the Search() function, the upper bound can be an element that does not appear in the arr, for instance, 60, so it has to iterate through all elements of arr and still cannot find the element.
  • Average case analysis is a calculation of all possible inputs on the running time of algorithm, including an element that is not present in the arr array.
  • Best case analysis is a calculation of the lower bound on the running time of the algorithm. In the Search() function, the lower bound is the first element of the arr array, which is 42. When we search for element 42, the function will only iterate through the arr array once, so the arrSize doesn't matter.

Big Theta, Big-O, and Big Omega

After discussing asymptotic analysis and the three cases in algorithms, let's discuss asymptotic notation to represent the time complexity of an algorithm. There are three asymptotic notations that are mostly used in an algorithm; they are Big Theta, Big-O, and Big Omega.

The Big Theta notation (θ) is a notation that bounds a function from above and below, like we saw previously in asymptotic analysis, which also omits a constant from a notation. 

Suppose we have a function with time complexity 4n + 1. Since it's a linear function, we can notate it like in the following code:

f(n) = 4n + 1

Now, suppose we have a function, g(n), where f(n) is the Big-Theta of g(n) if the value, f(n), is always between c1*g(n) (lower bound) and c2*g(n) (upper bound). Since f(n) has a constant of 4 in the n variable, we will take a random lower bound which is lower than 4, that is 3, and an upper bound which is greater than 4, that is 5. Please see the following curve for reference:

From the f(n) time complexity, we can get the asymptotic complexity n, so then we have g(n) = n, which is based on the asymptotic complexity of 4n + 1. Now, we can decide the upper bound and lower bound for g(n) = n. Let's pick 3 for the lower bound and 5 for the upper bound. Now, we can manipulate the g(n) = n function as follows:

3.g(n) = 3.n
5.g(n) = 5.n

Big-O notation (O) is a notation that bounds a function from above only using the upper bound of an algorithm. From the previous notation, f(n) = 4n + 1, we can say that the time complexity of the f(n) is O(n). If we are going to use Big Theta notation, we can say that the worst case time complexity of f(n) is θ(n) and the best case time complexity of f(n) is θ(1).

Big Omega notation is contrary to Big-O notation. It uses the lower bound to analyze time complexity. In other words, it uses the best case in Big Theta notation. For the f(n) notation, we can say that the time complexity is Ω(1)

Recursive method

In the previous code sample, we calculated the complexity of the iterative method. Now, let's do this with the recursive method. Suppose we need to calculate the factorial of a certain number, for instance 6, which will produce 6 * 5 * 4 * 3 * 2 * 1 = 720. For this purpose, we can use the recursive method, which is shown in the following code snippet:

int Factorial(int n)
{
    if(n == 1)
        return 1;

    return n * Factorial(n - 1);
}

For the preceding function, we can calculate the complexity similarly to how we did for the iterative method, which is f(n) = n since it depends on how much data is being processed (which is n). We can use a constant, for instance c, to calculate the lower bound and the upper bound.

Amortized analysis

In the previous section, we just discussed the single input, n, for calculating the complexity. However, sometimes we will deal with more than just one input. Please look at the following SumOfDivision() function implementation:

int SumOfDivision(
    int nArr[], int n, int mArr[], int m)
{
    int total = 0;

    for(int i = 0; i < n; ++i)
    {
        for(int j = 0; j < m; ++j)
        {
            total += (nArr[i] * mArr[j]);
        }
    }

    return total;
}

And that's where amortized analysis comes in. Amortized analysis calculates the complexity of performing the operation for varying inputs, for instance, when we insert some elements into several arrays. Now, the complexity doesn't only depend on the n input only, but also the m input. The complexity can be as follows:

Time Complexity(n, m) = n · m

We are going to discuss these analysis methods in more detail in the following chapters.