Book Image

C++ Data Structures and Algorithms

By : Wisnu Anggoro
5 (1)
Book Image

C++ Data Structures and Algorithms

5 (1)
By: Wisnu Anggoro

Overview of this book

C++ is a general-purpose programming language which has evolved over the years and is used to develop software for many different sectors. This book will be your companion as it takes you through implementing classic data structures and algorithms to help you get up and running as a confident C++ programmer. We begin with an introduction to C++ data structures and algorithms while also covering essential language constructs. Next, we will see how to store data using linked lists, arrays, stacks, and queues. Then, we will learn how to implement different sorting algorithms, such as quick sort and heap sort. Along with these, we will dive into searching algorithms such as linear search, binary search and more. Our next mission will be to attain high performance by implementing algorithms to string datatypes and implementing hash structures in algorithm design. We'll also analyze Brute Force algorithms, Greedy algorithms, and more. By the end of the book, you'll know how to build components that are easy to understand, debug, and use in different applications.
Table of Contents (16 chapters)
Title Page
Copyright and Credits
Packt Upsell
Contributors
Preface
Index

Introduction to basic C++


Before we go through the data structures and algorithms in C++, we need to have a strong, fundamental understanding of the language itself. In this section, we are going to build a simple program, build it, and then run it. We are also going to discuss the fundamental and advanced data types, and before we move on to algorithm analysis, we are going to discuss control flow in this section.

Creating your first code in C++

In C++, the code is executed from the main() function first. The function itself is a collection of statements to perform a specific task. As a result of this, a program in C++ has to contain at least one function named main(). The following code is the simplest program in C++ that will be successfully compiled and run:

    int main()
    {
      return 0;
    }

Suppose the preceding code is saved as a simple.cpp file. We can compile the code using the g++ compiler by running the following compiling command on the console from the active directory where the simple.cpp file is placed:

g++ simple.cpp

If no error message appears, the output file will be generated automatically. If we run the preceding compiling command on a Windows console, we will get a file named a.exe. However, if we run the command on Bash shells, such as Linux or macOS, we will get a file named a.out.

We can specify the output file name using the -o option followed by the desired filename. For instance, the following compiling command will produce the output file named simple.out:

g++ simple.cpp -o simple.out

Indeed, when we run the output file (by typing a and then pressing Enter on a Windows console or by typing ./a.out and then pressing Enter on Bash shell), we won't see anything on the console window. This is because we don't print anything to the console yet. To make our simple.cpp file meaningful, let's refactor the code so that it can receive the input data from the user and then print the data back to the user. The refactored code should look as follows:

    // in_out.cpp
    #include <iostream>

    int main ()
    {
      int i;
      std::cout << "Please enter an integer value: ";
      std::cin >> i;
      std::cout << "The value you entered is " << i;
      std::cout << "\n";
      return 0;
   }

As we can see in the preceding code, we appended several lines of code so that the program can print to the console and the user can give an input to the program. Initially, the program displays text that asks the user to input an integer number. After the user types the desired number and presses Enter, the program will print that number. We also defined a new variable named i of the int data type. This variable is used to store data in an integer data format (we will talk about variables and data types in the upcoming section).

Suppose we save the preceding code as in_out.cpp; we can compile it using the following command:

g++ in_out.cpp

If we then run the program, we will get the following output on the console (I chose the number 3 in this example):

Now, we know that to print text to the console, we use the std::cout command, and to give some inputs to the program, we use the std::cin command. In the in_out.cpp file, we also see #include <iostream> at the beginning of the file. It's used to tell the compiler where to find the implementation of the std::cout and std::cin commands since their implementation is stated in the iostream header file.

And at the very beginning of the file, we can see that the line begins with double slashes (//). This means that the line won't be considered as code, so the compiler will ignore it. It's used to comment or mark an action in the code so that other programmers can understand our code.

Enhancing code development experience with IDE

So far, we have been able to create a C++ code, compile the code, and run the code. However, it will be boring if we have to compile the code using the Command Prompt and then execute the code afterwards. To ease our development process, we will use an integrated development environment (IDE) so that we can compile and run the code with just a click. You can use any C++ IDE available on the market, either paid or free. However, I personally chose Code::Blocks IDE since it's free, open source, and cross-platform so it can run on Windows, Linux, and macOS machines. You can find the information about this IDE, such as how to download, install, and use it on its official website at http://www.codeblocks.org/.

Actually, we can automate the compiling process using a toolchain such as Make or CMake. However, this needs further explanation, and since this book is intended to discuss data structures and algorithms, the toolchain explanation will increase the total pages of the book, and so we will not discuss this here. In this case, the use of IDE is the best solution to automate the compiling process since it actually accesses the toolchain as well.

After installing Code::Blocks IDE, we can create a new project by clicking on the File menu, then clicking New, and then selecting Project. A new window will appear and we can select the desired project type. For most examples in this book, we will use the Console Application as the project type. Press the Go button to continue.

On the upcoming window, we can specify the language, which is C++, and then define the project name and destination location (I named the project FirstApp). After finishing the wizard, we will have a new project with a main.cpp file containing the following code:

    #include <iostream>

    using namespace std;

    int main()
    {
      cout << "Hello world!" << endl;
      return 0;
    }

Now, we can compile and run the preceding code by just clicking the Build and run option under the Build menu. The following console window will appear:

In the preceding screenshot, we see the console using namespace std in the line after the #include <iostream> line. The use of this line of code is to tell the compiler that the code uses a namespace named std. As a result, we don't need to specify the std:: in every invocation of the cin and cout commands. The code should be simpler than before.

Defining the variables using fundamental data types

In the previous sample codes, we dealt with the variable (a placeholder is used to store a data element) so that we can manipulate the data in the variable for various operations. In C++, we have to define a variable to be of a specific data type so it can only store the specific type of variable that was defined previously. Here is a list of the fundamental data types in C++. We used some of these data types in the previous example:

  • Boolean data type (bool), which is used to store two pieces of conditional data only—true or false
  • Character data type (charwchar_tchar16_t, and char32_t), which is used to store a single ASCII character
  • Floating point data type (floatdouble, and long double), which is used to store a number with a decimal
  • Integer data type (shortintlong, and long long), which is used to store a whole number
  • No data (void), which is basically a keyword to use as a placeholder where you would normally put a data type to represent no data

There are two ways to create a variable—by defining it or by initializing it. Defining a variable will create a variable without deciding upon its initial value. The initializing variable will create a variable and store an initial value in it. Here is the code snippet for how we can define variables:

    int iVar;
    char32_t cVar;
    long long llVar;
    bool boVar;

And here is the sample code snippet of how initializing variables works:

    int iVar = 100;
    char32_t cVar = 'a';
    long long llVar = 9223372036854775805;
    bool boVar = true;

The preceding code snippet is the way we initialize the variables using the copy initialization technique. In this technique, we assign a value to the variable using an equals sign symbol (=). Another technique we can use to initialize a variable is the direct initialization technique. In this technique, we use parenthesis to assign a value to the variable. The following code snippet uses this technique to initialize the variables:

    int iVar(100);
    char32_t cVar('a');
    long long llVar(9223372036854775805);
    bool boVar(true);

Besides copy initialization and direct initialization techniques, we can use uniform initialization by utilizing curly braces. The following code snippet uses the so-called brace-initialization technique to initialize the variables:

    int iVar{100};
    char32_t cVar{'a'};
    long long llVar{9223372036854775805};
    bool boVar{true};

We cannot define a variable with a void data type such as void vVar because when we define a variable, we have to decide what data type we are choosing so that we can store the data in the variable. If we define a variable as void, it means that we don't plan to store anything in the variable.

Controlling the flow of the code

As we discussed earlier, the C++ program is run from the main() function by executing each statement one by one from the beginning to the end. However, we can change this path using flow control statements. There are several flow control statements in C++, but we are only going to discuss some of them, since these are the ones that are going to be used often in algorithm design.

Conditional statement

One of the things that can make the flow of a program change is a conditional statement. By using this statement, only the line in the true condition will be run. We can use the if and else keywords to apply this statement.

Let's modify our previous in_out.cpp code so that it uses the conditional statement. The program will only decide whether the input number is greater than 100 or not. The code should be as follows:

    // If.cbp
    #include <iostream>

    using namespace std;

    int main ()
    {
      int i;
      cout << "Please enter an integer value: ";
      cin >> i;
      cout << "The value you entered is ";

      if(i > 100)
         cout << "greater than 100.";
      else
         cout << "equals or less than 100.";

      cout << endl;
      return 0;
    }

As we can see, we have a pair of the if and else keywords that will decide whether the input number is greater than 100 or not. By examining the preceding code, only one statement will be executed inside the conditional statement, either the statement under the if keyword or the statement under the else keyword.

If we build and run the preceding code, we will see the following console window:

From the preceding console window, we can see that the line std::cout << "equals or less than 100."; is not executed at all since we have input a number that is greater than 100.

Also, in the if...else condition, we can have more than two conditional statements. We can refactor the preceding code so that it has more than two conditional statements, as follows:

    // If_ElseIf.cbp
    #include <iostream>

    using namespace std;

   int main ()
   {
      int i;
      cout << "Please enter an integer value: ";
      cin >> i;
      cout << "The value you entered is ";

      if(i > 100)
          cout << "greater than 100.";
      else if(i < 100)
          cout << "less than 100.";
      else
          cout << "equals to 100";

      cout << endl;
      return 0;
   }

Another conditional statement keyword is switch. Before we discuss this keyword, let's create a simple calculator program that can add two numbers. It should also be capable of performing the subtract, multiply, and divide operations. We will use the if...else keyword first. The code should be as follows:

    // If_ElseIf_2.cbp
    #include <iostream>

    using namespace std;

    int main ()
    {
      int i, a, b;

      cout << "Operation Mode: " << endl;
      cout << "1. Addition" << endl;
      cout << "2. Subtraction" << endl;
      cout << "3. Multiplication" << endl;
      cout << "4. Division" << endl;
      cout << "Please enter the operation mode: ";
      cin >> i;

      cout << "Please enter the first number: ";
      cin >> a;
      cout << "Please enter the second number: ";
      cin >> b;

      cout << "The result of ";

      if(i == 1)
          cout << a << " + " << b << " is " << a + b;
      else if(i == 2)
          cout << a << " - " << b << " is " << a - b;
      else if(i == 3)
          cout << a << " * " << b << " is " << a * b;
      else if(i == 4)
          cout << a << " / " << b << " is " << a / b;

      cout << endl;
      return 0;
    }

As we can see in the preceding code, we have four options that we have to choose from. We use the if...else conditional statement for this purpose. The output of the preceding code should be as follows:

However, we can use the switch keyword as well. The code should be as follows after being refactored:

    // Switch.cbp
    #include <iostream>

    using namespace std;

    int main ()
    {
       int i, a, b;

       cout << "Operation Mode: " << endl;
       cout << "1. Addition" << endl;
       cout << "2. Subtraction" << endl;
       cout << "3. Multiplication" << endl;
       cout << "4. Division" << endl;
       cout << "Please enter the operation mode: ";
       cin >> i;

       cout << "Please enter the first number: ";
       cin >> a;
       cout << "Please enter the second number: ";
       cin >> b;

       cout << "The result of ";

       switch(i)
       {
         case 1:
              cout << a << " + " << b << " is " << a + b;
              break;
         case 2:
              cout << a << " - " << b << " is " << a - b;
              break;
         case 3:
              cout << a << " * " << b << " is " << a * b;
              break;
         case 4:
              cout << a << " / " << b << " is " << a / b;
              break;
        }

     cout << endl;
     return 0;
    }

And if we run the preceding code, we will get the same output compared with the If_ElseIf_2.cbp code.

Loop statement

There are several loop statements in C++, and they are forwhile, and do...while. The for loop is usually used when we know how many iterations we want, whereas while and do...while will iterate until the desired condition is met.

Suppose we are going to generate ten random numbers between 0 to 100; the for loop is the best solution for it since we know how many numbers we need to generate. For this purpose, we can create the following code:

// For.cbp
#include <iostream>
#include <cstdlib>
#include <ctime>

using namespace std;

int GenerateRandomNumber(int min, int max)
{
    // static used for efficiency,
    // so we only calculate this value once
    static const double fraction =
        1.0 / (static_cast<double>(RAND_MAX) + 1.0);

    // evenly distribute the random number
    // across our range
    return min + static_cast<int>(
        (max - min + 1) * (rand() * fraction));
}

int main()
{
    // set initial seed value to system clock
    srand(static_cast<unsigned int>(time(0)));

    // loop ten times
    for (int i=0; i < 10; ++i)
    {

        cout << GenerateRandomNumber(0, 100) << " ";
    }
    cout << "\n";

    return 0;
}

From the preceding code, we create another function besides main(), that is, GenerateRandomNumber(). The code will invoke the function ten times using the for loop, as we can see in the preceding code. The output we will get should be as follows:

Back to the others loop statements which we discussed earlier, which are while and do...while loop. They are quite similar based on their behavior. The difference is when we use the while loop, there is a chance the statement inside the loop scope is not run, whereas the statement in the loop scope must be run at least once in the do...while loop.

Now, let's create a simple game using those loop statements. The computer will generate a number between 1 to 100, and then the user has to guess what number has been generated by the computer. The program will give a hint to the user just after she or he inputs the guessed number. It will tell the user whether the number is greater than the computer's number or lower than it. If the guessed number matches with the computer's number, the game is over. The code should be as follows:

// While.cbp
#include <iostream>
#include <cstdlib>
#include <ctime>

using namespace std;

int GenerateRandomNumber(int min, int max)
{
    // static used for efficiency,
    // so we only calculate this value once
    static const double fraction =
        1.0 / (static_cast<double>(RAND_MAX) + 1.0);

    // evenly distribute the random number
    // across our range
    return min + static_cast<int>(
        (max - min + 1) * (rand() * fraction));
}

int main()
{
    // set initial seed value to system clock
    srand(static_cast<unsigned int>(time(0)));

    // Computer generate random number
    // between 1 to 100
    int computerNumber = GenerateRandomNumber(1, 100);

    // User inputs a guessed number
    int userNumber;
    cout << "Please enter a number between 1 to 100: ";
    cin >> userNumber;

    // Run the WHILE loop
    while(userNumber != computerNumber)
    {
        cout << userNumber << " is ";
        if(userNumber > computerNumber)
            cout << "greater";
        else
            cout << "lower";
        cout << " than computer's number" << endl;
        cout << "Choose another number: ";
        cin >> userNumber;
    }

    cout << "Yeeaayy.. You've got the number." << endl;
    return 0;
}

From the preceding code, we can see that we have two variables, computerNumber and userNumber, handling the number that will be compared. There will be a probability that computerNumber is equal to userNumber. If this happens, the statement inside the while loop scope won't be executed at all. The flow of the preceding program can be seen in the following output console screenshot:

We have successfully implemented the while loop in the preceding code. Although the while loop is similar to the do...while loop, as we discussed earlier, we cannot refactor the preceding code by just replacing the while loop with the do...while loop. However, we can create another game as our example in implementing the do...while loop. However, now the user has to choose a number and then the program will guess it. The code should be as follows:

// Do-While.cbp
#include <iostream>
#include <cstdlib>
#include <ctime>

using namespace std;

int GenerateRandomNumber(int min, int max)
{
    // static used for efficiency,
    // so we only calculate this value once
    static const double fraction =
        1.0 / (static_cast<double>(RAND_MAX) + 1.0);

    // evenly distribute the random number
    // across our range
    return min + static_cast<int>(
        (max - min + 1) * (rand() * fraction));
}

int main()
{
    // set initial seed value to system clock
    srand(static_cast<unsigned int>(time(0)));

    char userChar;

    int iMin = 1;
    int iMax = 100;
    int iGuess;

    // Menu display
    cout << "Choose a number between 1 to 100, ";
    cout << "and I'll guess your number." << endl;
    cout << "Press L and ENTER if my guessed number is lower than 
     yours";
    cout << endl;
    cout << "Press G and ENTER if my guessed number is greater than 
     yours";
    cout << endl;
    cout << "Press Y and ENTER if I've successfully guessed your 
     number!";
    cout << endl << endl;

    // Run the DO-WHILE loop
    do
    {
        iGuess = GenerateRandomNumber(iMin, iMax);
        cout << "I guess your number is " << iGuess << endl;
        cout << "What do you think? ";
        cin >> userChar;
        if(userChar == 'L' || userChar == 'l')
            iMin = iGuess + 1;
        else if(userChar == 'G' || userChar == 'g')
            iMax = iGuess - 1;

    }
    while(userChar != 'Y' && userChar != 'y');

    cout << "Yeeaayy.. I've got your number." << endl;
    return 0;
}

As we can see in the preceding code, the program has to guess the user's number at least once, and if it's lucky, the guessed number matches with the user's number, so that we use the do...while loop here. When we build and run the code, we will have an output similar to the following screenshot:

In the preceding screenshot, I chose number 56. The program then guessed 81. Since the number is greater than my number, the program guessed another number, which is 28. It then guessed another number based on the hint from me until it found the correct number. The program will leave the do...while loop if the user presses y, as we can see in the preceding code.

Leveraging the variable capability using advanced data types

We have discussed the fundamental data type in the previous section. This data type is used in defining or initializing a variable to ensure that the variable can store the selected data type. However, there are other data types that can be used to define a variable. They are enum (enumeration) and struct.

Enumeration is a data type that has several possible values and they are defined as the constant which is called enumerators. It is used to create a collection of constants. Suppose we want to develop a card game using C++. As we know, a deck of playing cards contains 52 cards, which consists of four suits (Clubs, Diamonds, Hearts, and Spades) with 13 elements in each suit. We can notate the card deck as follows:

enum CardSuits
{
    Club,
    Diamond,
    Heart,
    Spade
};

enum CardElements
{
    Ace,
    Two,
    Three,
    Four,
    Five,
    Six,
    Seven,
    Eight,
    Nine,
    Ten,
    Jack,
    Queen,
    King
};

If we want to apply the preceding enum data types (CardSuits and CardElements), we can use the following variable initialization:

CardSuits suit = Club;
CardElements element = Ace;

Actually, enums always contain integer constants. The string we put in the enum element is the constant name only. The first element holds a value of 0, except we define another value explicitly. The next elements are in an incremental number from the first element. So, for our preceding CardSuits enum, Club is equal to 0, and the Diamond, Heart, and Spade are 1, 2, and 3, respectively. 

Now, let's create a program that will generate a random card. We can borrow the GenerateRandomNumber() function from our previous code. The following is the complete code for this purpose:

// Enum.cbp
#include <iostream>
#include <cstdlib>
#include <ctime>

using namespace std;

enum CardSuits
{
    Club,
    Diamond,
    Heart,
    Spade
};

enum CardElements
{
    Ace,
    Two,
    Three,
    Four,
    Five,
    Six,
    Seven,
    Eight,
    Nine,
    Ten,
    Jack,
    Queen,
    King
};

string GetSuitString(CardSuits suit)
{
    string s;

    switch(suit)
    {
        case Club:
            s = "Club";
            break;
        case Diamond:
            s = "Diamond";
            break;
        case Heart:
            s = "Heart";
            break;
        case Spade:
            s = "Spade";
            break;
    }

    return s;
}

string GetElementString(CardElements element)
{
    string e;

    switch(element)
    {
        case Ace:
            e = "Ace";
            break;
        case Two:
            e = "Two";
            break;
        case Three:
            e = "Three";
            break;
        case Four:
            e = "Four";
            break;
        case Five:
            e = "Five";
            break;
        case Six:
            e = "Six";
            break;
        case Seven:
            e = "Seven";
            break;
        case Eight:
            e = "Eight";
            break;
        case Nine:
            e = "Nine";
            break;
        case Ten:
            e = "Ten";
            break;
        case Jack:
            e = "Jack";
            break;
        case Queen:
            e = "Queen";
            break;
        case King:
            e = "King";
            break;
    }

    return e;
}

int GenerateRandomNumber(int min, int max)
{
    // static used for efficiency,
    // so we only calculate this value once
    static const double fraction =
        1.0 / (static_cast<double>(RAND_MAX) + 1.0);

    // evenly distribute the random number
    // across our range
    return min + static_cast<int>(
        (max - min + 1) * (rand() * fraction));
}

int main()
{
    // set initial seed value to system clock
    srand(static_cast<unsigned int>(time(0)));

    // generate random suit and element card
    int iSuit = GenerateRandomNumber(0, 3);
    int iElement = GenerateRandomNumber(0, 12);

    CardSuits suit = static_cast<CardSuits>(
        iSuit);
    CardElements element = static_cast<CardElements>(
        iElement);

    cout << "Your card is ";
    cout << GetElementString(element);
    cout << " of " << GetSuitString(suit) << endl;

    return 0;
}

From the preceding code, we can see that we can access the enum data by using an integer value. However, we have to cast the int value so that it can fit the enum data by using static_cast<>, which is shown as follows:

int iSuit = GenerateRandomNumber(0, 3);
int iElement = GenerateRandomNumber(0, 12);

CardSuits suit = static_cast<CardSuits>(iSuit);
CardElements element = static_cast<CardElements>(iElement);

If we build and run the code, we will get the following console output:

Another advanced data type we have in C++ is structs. It is an aggregate data type which groups multiple individual variables together. From the preceding code, we have the suit and element variables that can be grouped as follows:

struct Cards
{
    CardSuits suit;
    CardElements element;
};

If we add the preceding struct to our preceding Enum.cbp code, we just need to refactor the main() function as follows:

int main()
{
    // set initial seed value to system clock
    srand(static_cast<unsigned int>(time(0)));

    Cards card;
    card.suit = static_cast<CardSuits>(
        GenerateRandomNumber(0, 3));
    card.element = static_cast<CardElements>(
        GenerateRandomNumber(0, 12));

    cout << "Your card is ";
    cout << GetElementString(card.element);
    cout << " of " << GetSuitString(card.suit) << endl;

    return 0;
}

If we run the preceding code (you can find the code as Struct.cbp in the repository), we will get the same output as Enum.cbp.