C++ Data Structures and Algorithm Design Principles

C++ Data Structures and Algorithm Design Principles

By : John Carey, Anil Achary, Shreyans Doshi, Payas Rajan

Buy this Book

C++ Data Structures and Algorithm Design Principles

By: John Carey, Anil Achary, Shreyans Doshi, Payas Rajan

Buy this Book

Overview of this book

C++ is a mature multi-paradigm programming language that enables you to write high-level code with a high degree of control over the hardware. Today, significant parts of software infrastructure, including databases, browsers, multimedia frameworks, and GUI toolkits, are written in C++. This book starts by introducing C++ data structures and how to store data using linked lists, arrays, stacks, and queues. In later chapters, the book explains the basic algorithm design paradigms, such as the greedy approach and the divide-and-conquer approach, which are used to solve a large variety of computational problems. Finally, you will learn the advanced technique of dynamic programming to develop optimized implementations of several algorithms discussed in the book. By the end of this book, you will have learned how to implement standard data structures and algorithms in efficient and scalable C++ 14 code.

About the Book

Hardware Requirements

Software Requirements

Installation and Setup

Installing the Code Bundle

Additional Resources

Free Chapter

1. Lists, Stacks, and Queues

Introduction

Contiguous Versus Linked Data Structures

std::deque – Special Version of std::vector

Container Adaptors

Benchmarking

Summary

2. Trees, Heaps, and Graphs

Introduction

Non-Linear Problems

Tree – It's Upside Down!

Variants of Trees

Heaps

Graphs

Summary

3. Hash Tables and Bloom Filters

Introduction

Hash Tables

Collisions in Hash Tables

C++ Hash Tables

Bloom Filters

Summary

4. Divide and Conquer

Introduction

Binary Search

Understanding the Divide-and-Conquer Approach

Sorting Using Divide and Conquer

C++ Standard Library Tools for Divide and Conquer

Dividing and Conquering at a Higher Abstraction Level – MapReduce

Summary

5. Greedy Algorithms

Introduction

Basic Greedy Algorithms

The Knapsack Problem(s)

The Vertex Coloring Problem

Summary

6. Graph Algorithms I

Introduction

The Graph Traversal Problem

Prim's MST Algorithm

Dijkstra's Shortest Path Algorithm

Summary

7. Graph Algorithms II

Introduction

Revisiting the Shortest Path Problem

The Bellman-Ford Algorithm

The Bellman-Ford Algorithm (Part II) – Negative Weight Cycles

Johnson's Algorithm

Strongly Connected Components

Kosaraju's Algorithm

Choosing the Right Approach

Summary

8. Dynamic Programming I

Introduction

What Is Dynamic Programming?

Memoization – The Top-Down Approach

Tabulation – the Bottom-Up Approach

Subset Sum Problem

Dynamic Programming on Strings and Sequences

Activity 21: Melodic Permutations

Summary

9. Dynamic Programming II

Introduction

An Overview of P versus NP

Reconsidering the Subset Sum Problem

The Knapsack Problem

Graphs and Dynamic Programming

Summary

Appendix

Chapter 1: Lists, Stacks, and Queues

Chapter 2: Trees, Heaps, and Graphs

Chapter 3: Hash Tables and Bloom Filters

Chapter 4: Divide and Conquer

Chapter 5: Greedy Algorithms

Chapter 6: Graph Algorithms I

Chapter 7: Graph Algorithms II

Chapter 8: Dynamic Programming I

Chapter 9: Dynamic Programming II

Customer Reviews

5 star

4 star

3 star

2 star

1 star

Chapter 8: Dynamic Programming I

Activity 18: Travel Itinerary

Let's begin by considering the base case and recurrence relation for this problem. Unlike some of the other examples we have discussed in this chapter, this particular problem has just one base case – the point at which the destination has been reached. The intermediate states are also quite simple: given a location at index i that has a distance limit of x, we can travel to any location between indices i + 1 and i + x (inclusive). For example, let's consider the following two cities:

City 1: distance[1] = 2
City 2: distance[2] = 1

Let's say we wanted to calculate the number of ways to reach the city at index 3. Because we can reach city 3 from both city 1 and city 2, the number of ways to reach city 3 is equivalent to the sum of the number of ways to reach city 1 and the number of ways to reach city 2. This recurrence is quite similar to the Fibonacci series, except that the number of previous states from which the current state's substructure is formed is variable according to the values of distance.

So, let's say we have the following four cities:

[1]: distance = 5

[2]: distance = 3

[3]: distance = 1

[4]: distance = 2

From this, we want to calculate the number of ways to travel to city 5. To do this, we can formulate the substructure as follows:

Cities reachable from index [1] -> { 2 3 4 5 6 }

Cities reachable from index [2] -> { 3 4 5 }

Cities reachable from index [3] -> { 4 }

Cities reachable from index [4] -> { 5 6 }

We can now invert this logic to find the cities from which we can travel through to reach a given location:

Cities that connect to index [1] -> START

Cities that connect to index [2] -> { 1 }

Cities that connect to index [3] -> { 1 2 }

Cities that connect to index [4] -> { 1 2 3 }

Cities that connect to index [5] -> { 1 2 }

Taking this a step further, we can now devise an outline of the state logic:

Ways to reach City 1 = 1 (START)

Ways to reach City 2 = 1

1 " 2

Ways to reach City 3 = 2

1 " 2 " 3

1 " 3

Ways to reach City 4 = 4

1 " 2 " 3 " 4

1 " 2 " 4

1 " 3 " 4

1 " 4

Ways to reach City 5 = 6

1 " 2 " 3 " 4 " 5

1 " 2 " 4 " 5

1 " 2 " 5

1 " 3 " 4 " 5

1 " 4 " 5

1 " 5

Thus, we can define the recurrence as follows:

Base case:
F(1) = 1 (We have reached the destination)
Recurrence:

Figure 8.22: Formula for defining recurrence

In other words, the number of ways to reach a given location is equal to the sum of the number of ways to reach each location that connects to it. Using this logic, a recursive function for solving this problem might look like this:

F(n) -> number of ways to reach n'th location

F(i) =

if i = N:

return 1

Otherwise:

result = 0

for j = 1 to distance[i]:

result = result + F(i + j)

return result

Now that we have a functional definition of the problem's states, let's begin implementing it in code.

For this problem, we will include the following headers and the std namespace:
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
Because the outputs of this problem require the computation of numbers that exceed 32 bits, we will use long long int for the result. To avoid having to write this repeatedly, we will use a typedef statement to abbreviate it:
typedef long long LL;
Finally, we will define the modulus value for outputting the results:
const LL MOD = 1000000007;
Handling the input and output in this problem can be implemented very simply:
int main()
{
    int n;
    cin >> n;

vector<int> distance(n);
    for(int i = 0; i < n; i++)
    {
        cin >> distance[i];
    }
    LL result = TravelItinerary(n, distance);
    cout << result << endl;
    return 0;
}
We will now define a function called TravelItinerary() that takes n and distance as arguments and returns a long integer:
LL TravelItinerary(int n, vector<int> distance)
{
...
}
Now, we must convert the recursive algorithm we presented earlier into a bottom-up approach. In pseudocode, this might appear as follows:
DP -> Array of size N + 1
DP[0] = 1 (There is one way to reach the starting location)
for i = 0 to N-1:
    for j = 1 to distance[i]:

        DP[i + j] += DP[i]
return DP[N]
To code this in C++, we will first declare a one-dimensional DP table of size n + 1 and initialize all of its elements to 0. Then, we will set its first element to 1 to represent the base case:
vector<LL> DP(n + 1, 0);
DP[0] = 1;
To implement the recurrence we described previously, we will first reverse the distance array so that we are essentially beginning our calculations from the destination index. There are several reasons for this, but the primary reason is so that our algorithm processes the current state by combining the results of earlier states, as opposed to calculating future states from the results of the current state. Though the logic described in the pseudocode will produce the correct result, it is generally preferable to formulate bottom-up logic in terms of how the solutions of the previous states form the result of the immediate state:
reverse(distance.begin(), distance.end());
DP[0] = 1;
for(int i = 1; i <= n; i++)
{
    int dist = distance[i-1];
    for(int j = 1; j <= dist; j++)
    {
        DP[i] = (DP[i] + DP[i – j]) % MOD;
    }
}
return DP[n];
This is certainly a viable solution to the problem that will be completely satisfactory in the vast majority of cases. However, since dynamic programming is first and foremost an optimization technique, we should still ask ourselves if a better approach exists.
Handling the Extra Credit Test Case
As both n and the maximum distance value increase, even the preceding algorithm will eventually prove to be rather inefficient. If n = 10000000 and the distance values can vary between 1 and 10000, then the inner for loop would have to perform nearly 100000000000 iterations in the worst case. Thankfully, there is a very simple technique that will allow us to completely remove the inner loop, which means we will have to perform exactly n iterations for any input.
To handle this reduction, we will create a prefix sum array, which will allow us to calculate the range sums we previously handled by the inner loop in constant time. If you are unfamiliar with this technique, the basic concept is as follows:
- Create an array called sums that has a length equal to the total number of values to sum plus one, with all the elements initialized to 0.
- For each index i from 0 to n, use sum[i + 1] = sum[i] + distance[i].
- After the sums have been calculated, the sum of all elements in any range [L, R] will be equal to sum[R+1] – sum[L].

Take a look at the following example:

0 1 2 3 4

A = { 3 1 10 2 5 }

0 1 2 3 4 5

sums = { 0 3 4 14 16 21 }

range(1, 3) = A[1] + A[2] + A[3]

= 1 + 10 + 2

= 13

sums[4] – sums[1] = 13

range(3, 4) = A[3] + A[4]

= 2 + 5

= 7

sums[5] – sums[3] = 7

We can implement this approach in our function as follows:
LL TravelItinerary(int n, vector<int> distance)
{
    vector<LL> DP(n + 1, 0);
    vector<LL> sums(n + 2, 0);
    DP[0] = sums[1] = 1;
    reverse(distance.begin(), distance.end());
    for(int i = 1; i <= n; i++)
    {
        int dist = distance[i-1];
        LL sum = sums[i] – sums[i – dist];
        DP[i] = (DP[i] + sum) % MOD;
        sums[i + 1] = (sums[i] + DP[i]) % MOD;
    }
    return DP[n];
}
Now, there is still one more problem that you are likely to encounter, and that is that the result returned by the preceding function will be negative. This is due to the fact that the modulo operations are causing higher-indexed values in sums to be less than lower-indexed values, which leads to a negative result when subtracting. This sort of issue can be very common in problems requiring frequent modulo operations on very large numbers, but can be easily fixed by modifying the return statement slightly:
return (DP[n] < 0) ? DP[n] + MOD : DP[n];

With these slight modifications, we now have an elegant and efficient solution to the problem that can handle massive input arrays in a fraction of a second!

Activity 19: Finding the Longest Common Subsequence by Using Memoization

As we did with the subset sum problem, we will include each new approach within the same code file so that we can compare their relative performance. To that end, let's define our GetTime() function in the same way as before:
vector<string> types =
{
    "BRUTE FORCE",
    "MEMOIZATION",
    "TABULATION"
};
const int UNKNOWN = INT_MAX;
void GetTime(clock_t &timer, string type)
{
    timer = clock() - timer;
    cout << "TIME TAKEN USING " << type << ": " << fixed << setprecision(5) << (float)timer / CLOCKS_PER_SEC << " SECONDS" << endl;
    timer = clock();
}
Now, let's define our new function, LCS_Memoization(), which will take the same arguments as LCS_BruteForce(), except that subsequence will instead be replaced by a reference to a two-dimensional integer vector, memo:
int LCS_Memoization(string A, string B, int i, int j, vector<vector<int>> &memo)
{
……
}
Our code for this function will also be quite similar to LCS_BruteForce(), except we will invert the logic by recursively traversing the prefixes of the two strings (beginning with the complete strings) and storing the results in our memo table at each step:
// Base case — LCS is always zero for empty strings
if(i == 0 || j == 0)
{
    return 0;
}
// Have we found a result for the prefixes of the two strings?
if(memo[i - 1][j - 1] != UNKNOWN)
{
    // If so, return it
    return memo[i - 1][j - 1];
}
// Are the last characters of A's prefix and B's prefix equal?
if(A[i-1] == B[j-1])
{
    // LCS for this state is equal to 1 plus the LCS of the prefixes of A and B, both reduced by one character
    memo[i-1][j-1] = 1 + LCS_Memoization(A, B, i-1, j-1, memo);
    // Return the cached result
    return memo[i-1][j-1];
}
// If the last characters are not equal, LCS for this state is equal to the maximum LCS of A's prefix reduced by one character and B's prefix, and B's prefix reduced by one character and A's prefix
memo[i-1][j-1] = max(LCS_Memoization(A, B, i-1, j, memo),
                 LCS_Memoization(A, B, i, j-1, memo));
return memo[i-1][j-1];
Now, let's redefine our main() function to perform both approaches and display the time taken by each:
int main()
{
    string A, B;
    cin >> A >> B;
    int tests = 2;
    clock_t timer = clock();
    for(int i = 0; i < tests; i++)
    {
        int LCS;
        switch(i)
        {
            case 0:
            {
                LCS = LCS_BruteForce(A, B, 0, 0, {});
            #if DEBUG
                PrintSubsequences(A, B);
            #endif
                break;
            }
            case 1:
            {
                vector<vector<int>> memo(A.size(), vector<int>(B.size(), UNKNOWN));
                LCS = LCS_Memoization(A, B, A.size(), B.size(), memo);
                break;
            }
        }
        cout << "Length of the longest common subsequence of " << A << " and " << B << " is: " << LCS << ends;
        GetTime(timer, types[i]);
        cout << endl;
    }
    return 0;
}
Now, let's try performing our two algorithms on two new strings, ABCABDBEFBA and ABCBEFBEAB. Your program's output should be similar to the following:
SIZE = 3
    ABC________ ABC_______
SIZE = 4
    ABC_B______ ABCB______
    ABC_B______ ABC___B___
    ABC_B______ ABC______B
    ABC___B____ ABC______B
    ABC____E___ ABC____E__
    ABC______B_ ABC___B___
    ABC______B_ ABC______B
    ABC_______A ABC_____A_
SIZE = 5
    ABCAB______ ABC_____AB
    ABC_B_B____ ABCB_____B
    ABC_B__E___ ABCB___E__
    ABC_B____B_ ABCB__B___
    ABC_B____B_ ABCB_____B
    ABC_B_____A ABCB____A_
    ABC_B_B____ ABC___B__B
    ABC_B__E___ ABC___BE__
    ABC_B____B_ ABC___B__B
    ABC_B_____A ABC___B_A_
    ABC___BE___ ABC___BE__
    ABC____E_B_ ABC____E_B
    ABC____E__A ABC____EA_
    ABC_____FB_ ABC__FB___
    ABC______BA ABC___B_A_
SIZE = 6
    ABC_B_BE___ ABCB__BE__
    ABC_B__E_B_ ABCB___E_B
    ABC_B__E__A ABCB___EA_
    ABC_B___FB_ ABCB_FB___
    ABC_B____BA ABCB__B_A_
    ABC_B__E_B_ ABC___BE_B
    ABC_B__E__A ABC___BEA_
    ABC___BE_B_ ABC___BE_B
    ABC___BE__A ABC___BEA_
    ABC____EFB_ ABC_EFB___
    ABC_____FBA ABC__FB_A_
SIZE = 7
    ABC_B_BE_B_ ABCB__BE_B
    ABC_B_BE__A ABCB__BEA_
    ABC_B__EFB_ ABCBEFB___
    ABC_B___FBA ABCB_FB_A_
    ABC____EFBA ABC_EFB_A_
SIZE = 8
    ABC_B__EFBA ABCBEFB_A_
Length of the longest common subsequence of ABCABDBEFBA and ABCBEFBEAB is: 8
TIME TAKEN USING BRUTE FORCE: 0.00242 SECONDS
Length of the longest common subsequence of ABCABDBEFBA and ABCBEFBEAB is: 8
TIME TAKEN USING MEMOIZATION: 0.00003 SECONDS
Of course, the time taken by the brute-force approach is going to be affected by the additional step of printing out the subsequences. By running our code again after setting the DEBUG constant to 0, the output is now as follows:
Length of the longest common subsequence of ABCABDBEFBA and ABCBEFBEAB is: 8
TIME TAKEN USING BRUTE FORCE: 0.00055 SECONDS
Length of the longest common subsequence of ABCABDBEFBA and ABCBEFBEAB is: 8
TIME TAKEN USING MEMOIZATION: 0.00002 SECONDS
Now, let's try pushing the limits of our algorithm using two much larger strings, ABZCYDABAZADAEA and YABAZADBBEAAECYACAZ. You should get an output something like this:
Length of the longest common subsequence of ABZCYDABAZADAEA and YABAZADBBEAAECYACAZ is: 10
TIME TAKEN USING BRUTE FORCE: 8.47842 SECONDS
Length of the longest common subsequence of ABZCYDABAZADAEA and YABAZADBBEAAECYACAZ is: 10
TIME TAKEN USING MEMOIZATION: 0.00008 SECONDS
Note
The actual values for the time taken will vary depending on your system. Please note the difference in the values.

As we can clearly see, the gains in performance provided by memoization are quite significant!

Activity 20: Finding the Longest Common Subsequence Using Tabulation

As we did previously, we will add a new function, LCS_Tabulation(), to the same code file that contains our brute-force and memoized solutions.

Our LCS_Tabulation() function receives two arguments— strings A and B — and returns a string:
string LCS_Tabulation(string A, string B)
{
……
}
Our first step is to define our DP table, which we will represent as a two-dimensional vector of integers, with the first dimension's size equal to one greater than the size of string A, and the second dimension's size equal to one greater than the size of string B:
vector<vector<int>> DP(A.size() + 1, vector<int>(B.size() + 1));
Like the subset sum problem, all of our algorithm's logic can be contained within two nested loops, with the first one iterating from 0 to the size of A, and the second iterating from 0 to the size of B:
for(int i = 0; i <= A.size(); i++)
{
    for(int j = 0; j <= B.size(); j++)
    {
        ……
    }
}
Unlike the subset sum problem, our base case will not be handled prior to the execution of the loops, but rather at the beginning of each loop. This is because our base case will occur any time the prefix of A or B is empty (that is, i = 0 or j = 0). This is represented in our code as follows:
if(i == 0 || j == 0)
{
DP[i][j] = 0;
}
Now, we must handle the case where the characters at the end of A's prefix and B's prefix are equal. Remember that the LCS value for this state is always equal to 1, plus the LCS value of the state where both prefixes are one character smaller than they are currently. This can be represented as follows:
else if(A[i-1] == B[j-1])
{
DP[i][j] = DP[i-1][j-1] + 1;
}
For the final case, the end characters are not equal. For this state, we know that the LCS is equal to the maximum of the LCS of A's previous prefix and B's current prefix, and the LCS of B's previous prefix and A's current prefix. In terms of our table's structure, this is equivalent to saying that the LCS is equal to the maximum of the value contained in the same column and previous row of the table, and the value contained in the same row and previous column:
else
{
DP[i][j] = max(DP[i-1][j], DP[i][j-1]);
}
When we are done, the length of the longest common subsequence will be contained in DP[A.size()][B.size()] – the value of the LCS when the prefixes of both A and B are equal to the entire strings. Therefore, our complete DP logic is written as follows:
string LCS_Tabulation(string A, string B)
{
    vector<vector<int>> DP(A.size() + 1, vector<int>(B.size() + 1));
    for(int i = 0; i <= A.size(); i++)
    {
        for(int j = 0; j <= B.size(); j++)
        {
            if(i == 0 || j == 0)
            {
                DP[i][j] = 0;
            }
            else if(A[i-1] == B[j-1])
            {
                DP[i][j] = DP[i-1][j-1] + 1;
            }
            else
            {
                DP[i][j] = max(DP[i-1][j], DP[i][j-1]);
            }
        }
    }
    int length = DP[A.size()][B.size()];
    ……
}
At this point, we have discussed several ways to find the length of the longest common subsequence, but what if we also want to output its actual characters? Of course, our brute-force solution does this, but very inefficiently; however, using the results contained in the preceding DP table, we can use backtracking to reconstruct the LCS quite easily. Let's highlight the path we would need to follow in the table to accomplish this:
Figure 8.23: Activity 20 DP table
By collecting the characters associated with each column in the path where the value increases, we get the LCS ABCBEFBA.
Let's define a function called ReconstructLCS() that takes A, B, i, j, and DP as arguments. Our backtracking logic can be defined as follows:
if i = 0 or j = 0:
    Return an empty string
If the characters at the end of A's prefix and B's prefix are equal:
    Return the LCS of the next smaller prefix of both A and B, plus the equal character
Otherwise:
    If the value of DP(i - 1, j) is greater than the value of DP(i, j - 1):
      – Return the LCS of A's next smaller prefix with B's current prefix
      – Otherwise:
          Return the LCS of B's next smaller prefix with A's current prefix
In C++, this can be coded as follows:
string ReconstructLCS(vector<vector<int>> &DP, string &A, string &B, int i, int j)
{
    if(i == 0 || j == 0)
    {
        return "";
    }
    if(A[i-1] == B[j-1])
    {
        return ReconstructLCS(DP, A, B, i-1, j-1) + A[i-1];
    }
    else if(DP[i-1][j] > DP[i][j-1])
    {
        return ReconstructLCS(DP, A, B, i-1, j);
    }
    else
    {
        return ReconstructLCS(DP, A, B, i, j-1);
    }
}
Now, we can return the result of ReconstructLCS() in the final line of LCS_Tabulation():
string LCS_Tabulation(string A, string B)
{
    ……
    string lcs = ReconstructLCS(DP, A, B, A.size(), B.size());
    return lcs;
}
Our code in main() should now be modified to accommodate the addition of LCS_Tabulation():
int main()
{
    string A, B;
    cin >> A >> B;
    int tests = 3;
    clock_t timer = clock();
    for(int i = 0; i < tests; i++)
    {
        int LCS;
        switch(i)
        {
            ……
            case 2:
            {
                string lcs = LCS_Tabulation(A, B);
                LCS = lcs.size();
                cout << "The longest common subsequence of " << A << " and " << B << " is: " << lcs << endl;
                break;
            }
        }
        cout << "Length of the longest common subsequence of " << A << " and " << B << " is: " << LCS << endl;
        GetTime(timer, types[i]);
    }
    return 0;
}
Using the strings ABCABDBEFBA and ABCBEFBEAB, your program's output should be similar to this:
Length of the longest common subsequence of ABCABDBEFBA and ABCBEFBEAB is: 8
TIME TAKEN USING BRUTE FORCE: 0.00060 SECONDS
Length of the longest common subsequence of ABCABDBEFBA and ABCBEFBEAB is: 8
TIME TAKEN USING MEMOIZATION: 0.00005 SECONDS
The longest common subsequence of ABCABDBEFBA and ABCBEFBEAB is: ABCBEFBA
Length of the longest common subsequence of ABCABDBEFBA and ABCBEFBEAB is: 8
TIME TAKEN USING TABULATION: 0.00009 SECONDS
Note
The actual values for the time taken will vary depending on your system. Please note the difference in the values.

Now, we have looked at another detailed example of how the same logic can be applied to the same problem using different techniques and the corresponding effect this has on the execution time of the algorithm.

Activity 21: Melodic Permutations

The first question to ask ourselves is: what constitutes a single state in this problem?

Base case --> Empty set:

Consider each note in the melody.
For each subset of notes that was previously encountered, either append the current note or do nothing.
If the subset matches the target, add it to the solutions.

Given that our options are to either append a note to a previous subset or leave it as-is, we could restate the logic as follows:

For a given note in the melody, the count of subsets of size | n | containing the note is equal to the total count of all subsets of size | n - 1 | that did not contain the note.

So, each state can be expressed in two dimensions:

Dimension 1: The length of the melody considered so far.
Dimension 2: The resulting subset formed by taking a previously found subset and either appending the note located at index [length - 1] of the melody to it or doing nothing.

In pseudocode, the logic could be expressed as follows:

for i = 1 to length of melody (inclusive):

for each subset previously found:

DP(i, subset) = DP(i, subset) + DP(i - 1, subset)

DP(i, subset ∪ melody[i - 1]) = DP(i, subset ∪ melody[i - 1]) + DP(i - 1, subset)

So, the primary question now is, how can we represent these states?

Remember that for an n-element collection, there are a total of 2n subsets comprising it — for example, a set of 4 elements can be divided into a total of 24 (or 16) subsets:

S = { A, B, C, D }

{ } —> { _ _ _ _ }

{ A } —> { # _ _ _ }

{ B } —> { _ # _ _ }

{ C } —> { _ _ #_ }

{ D } —> { _ _ _ # }

{ A, B } —> { # # _ _ }

{ A, C } —> { # _ #_ }

{ A, D } —> { # _ _ # }

{ B, C } —> { _ # #_ }

{ B, D } —> { _ # _ # }

{ C, D } —> { _ _ # # }

{ A, B, C } —> { # # # _ }

{ A, B, D } —> { # # _ # }

{ A, C, D } —> { # _ # # }

{ B, C, D } —> { _ # # # }

{ A, B, C, D } —> { # # # # }

If we iterate from 0 to (24 - 1) inclusive in binary, we get the following numbers:

0 —> 0000 —> { _ _ _ _ }

1 —> 0001 —> { # _ _ _ }

2 —> 0010 —> { _ # _ _ }

3 —> 0011 —> { # # _ _ }

4 —> 0100 —> { _ _ # _ }

5 —> 0101 —> { # _ # _ }

6 —> 0110 —> { _ # # _ }

7 —> 0111 —> { # # # _ }

8 —> 1000 —> { _ _ _ # }

9 —> 1001 —> { # _ _ # }

10 —> 1010 —> { _ # _ # }

11 —> 1011 —> { # # _ # }

12 —> 1100 —> { _ _ # # }

13 —> 1101 —> { # _ # # }

14 —> 1110 —> { _ # # # }

15 —> 1111 —> { # # # # }

As we can see, the digits of each binary number from 0 to 2n correspond exactly to the indices of one possible subset of n elements. Since there are 12 notes in the scale, this means there is a total of 212 (or 4,096) possible subsets of notes. By mapping each note in the scale to a power of 2, we can use bitwise arithmetic to represent the subsets encountered across each state.

The following are the steps to solve this activity:

Moving on to the code, we should begin by including the following headers:
#include <iostream>
#include <vector>
#include <string>
#include <map>
using namespace std;
Let's start by handling the input in our main() function:
int main()
{
    int melodyLength;
    int setLength;
    cin >> melodyLength;
    vector<string> melody(melodyLength);
    for(int i = 0; i < melodyLength; i++)
    {
        cin >> melody[i];
    }
    cin >> setLength;
    vector<string> set(setLength);
    for(int i = 0; i < setLength; i++)
    {
        cin >> set[i];
    }
    ……
}
Now, let's write a function called ConvertNotes(),that receives a vector of note strings as input and returns a vector of their corresponding integer values. Each of the 12 total notes in the scale will need to be mapped to a particular bit (beginning with A), with enharmonically equivalent notes assigned to identical values. We will use std::map to handle the conversions:
vector<int> ConvertNotes(vector<string> notes)
{
    map<string, int> M =
    {
        { "A",  0 },
        { "A#", 1 },
        { "Bb", 1 },
        { "B",  2 },
        { "Cb", 2 },
        { "B#", 3 },
        { "C",  3 },
        { "C#", 4 },
        { "Db", 4 },
        { "D",  5 },
        { "D#", 6 },
        { "Eb", 6 },
        { "E",  7 },
        { "Fb", 7 },
        { "E#", 8 },
        { "F",  8 },
        { "F#", 9 },
        { "Gb", 9 },
        { "G",  10 },
        { "G#", 11 },
        { "Ab", 11 }
    };
    vector<int> converted;
    for(auto note : notes)
    {
        // Map to powers of 2
        converted.push_back(1 << M[note]);
    }
    return converted;
}
Now, we will define a function called CountMelodicPermutations() that takes two integer vectors, melody and set, as arguments and returns an integer:
int CountMelodicPermutations(vector<int> melody, vector<int> set)
{
……
}
Our first step is to define our target subset. We will do this using the bitwise or operator:
unsigned int target = 0;
for(auto note : set)
{
target |= note;
}
As an example, if our target set is { C, F#, A }, the mapping would look like this:
C  = 3
F# = 9
A  = 0
converted = { 23, 29, 20 } = { 8, 512, 1 }
target = (8 | 512 | 1) = 521
    0000001000
  + 0000000001
  + 1000000000
  = 1000001001
We will now define a two-dimensional DP table, with the first dimension initialized to melodyLength + 1, and the second dimension initialized to one greater than the maximum subset value (that is, 111111111111 = 212 - 1, so the second dimension will contain 212, or 4,096, elements):
vector<vector<int>> DP(melody.size() + 1, vector<int>(4096, 0));
Our DP formula can be defined as follows:
Base case:
    DP(0, 0) —> 1
Recurrence:
    DP(i, subset) —> DP(i, subset) + DP(i - 1, subset)
    DP(i, subset ∪ note[i-1]) —> DP(i, subset ∪ note[i]) + DP(i - 1, subset)
Here, i ranges from 1 to the length of the melody. We can write the preceding logic in C++ like this:
// Base case —> empty set
DP[0][0] = 1;
for(int i = 1; i <= melody.size(); i++)
{
    for(unsigned int subset = 0; subset < 4096; subset++)
    {
        // Keep results for previous values of i
        DP[i][subset] += DP[i-1][subset];
        // Add results for union of subset with melody[i-1]
        DP[i][subset | melody[i-1]] += DP[i-1][subset];
    }
}
// Solution
return DP[melody.size()][target];
Now, we can finish our main() function by calling CountMelodicPermutations and outputting the result:
int count = CountMelodicPermutations(ConvertNotes(melody), ConvertNotes(set));
cout << count << endl;

C++ Data Structures and Algorithm Design Principles

By : John Carey, Anil Achary, Shreyans Doshi, Payas Rajan

C++ Data Structures and Algorithm Design Principles

By: John Carey, Anil Achary, Shreyans Doshi, Payas Rajan

Overview of this book

Related Content you might be interested in

Current Title:

C++ Data Structures and Algorithm Design Principles

Beginning Java Data Structures and Algorithms

PHP 7 Data Structures and Algorithms

Hands-On Data Structures and Algorithms with Python &ndash; Third Edition

Chapter 8: Dynamic Programming I

Activity 18: Travel Itinerary

Figure 8.22: Formula for defining recurrence

Activity 19: Finding the Longest Common Subsequence by Using Memoization

Note

Activity 20: Finding the Longest Common Subsequence Using Tabulation

Figure 8.23: Activity 20 DP table

Note

Activity 21: Melodic Permutations

Hands-On Data Structures and Algorithms with Python – Third Edition