C++ Data Structures and Algorithm Design Principles

C++ Data Structures and Algorithm Design Principles

By : John Carey, Anil Achary, Shreyans Doshi, Payas Rajan

Buy this Book

C++ Data Structures and Algorithm Design Principles

By: John Carey, Anil Achary, Shreyans Doshi, Payas Rajan

Buy this Book

Overview of this book

C++ is a mature multi-paradigm programming language that enables you to write high-level code with a high degree of control over the hardware. Today, significant parts of software infrastructure, including databases, browsers, multimedia frameworks, and GUI toolkits, are written in C++. This book starts by introducing C++ data structures and how to store data using linked lists, arrays, stacks, and queues. In later chapters, the book explains the basic algorithm design paradigms, such as the greedy approach and the divide-and-conquer approach, which are used to solve a large variety of computational problems. Finally, you will learn the advanced technique of dynamic programming to develop optimized implementations of several algorithms discussed in the book. By the end of this book, you will have learned how to implement standard data structures and algorithms in efficient and scalable C++ 14 code.

About the Book

Hardware Requirements

Software Requirements

Installation and Setup

Installing the Code Bundle

Additional Resources

Free Chapter

1. Lists, Stacks, and Queues

Introduction

Contiguous Versus Linked Data Structures

std::deque – Special Version of std::vector

Container Adaptors

Benchmarking

Summary

2. Trees, Heaps, and Graphs

Introduction

Non-Linear Problems

Tree – It's Upside Down!

Variants of Trees

Heaps

Graphs

Summary

3. Hash Tables and Bloom Filters

Introduction

Hash Tables

Collisions in Hash Tables

C++ Hash Tables

Bloom Filters

Summary

4. Divide and Conquer

Introduction

Binary Search

Understanding the Divide-and-Conquer Approach

Sorting Using Divide and Conquer

C++ Standard Library Tools for Divide and Conquer

Dividing and Conquering at a Higher Abstraction Level – MapReduce

Summary

5. Greedy Algorithms

Introduction

Basic Greedy Algorithms

The Knapsack Problem(s)

The Vertex Coloring Problem

Summary

6. Graph Algorithms I

Introduction

The Graph Traversal Problem

Prim's MST Algorithm

Dijkstra's Shortest Path Algorithm

Summary

7. Graph Algorithms II

Introduction

Revisiting the Shortest Path Problem

The Bellman-Ford Algorithm

The Bellman-Ford Algorithm (Part II) – Negative Weight Cycles

Johnson's Algorithm

Strongly Connected Components

Kosaraju's Algorithm

Choosing the Right Approach

Summary

8. Dynamic Programming I

Introduction

What Is Dynamic Programming?

Memoization – The Top-Down Approach

Tabulation – the Bottom-Up Approach

Subset Sum Problem

Dynamic Programming on Strings and Sequences

Activity 21: Melodic Permutations

Summary

9. Dynamic Programming II

Introduction

An Overview of P versus NP

Reconsidering the Subset Sum Problem

The Knapsack Problem

Graphs and Dynamic Programming

Summary

Appendix

Chapter 1: Lists, Stacks, and Queues

Chapter 2: Trees, Heaps, and Graphs

Chapter 3: Hash Tables and Bloom Filters

Chapter 4: Divide and Conquer

Chapter 5: Greedy Algorithms

Chapter 6: Graph Algorithms I

Chapter 7: Graph Algorithms II

Chapter 8: Dynamic Programming I

Chapter 9: Dynamic Programming II

Customer Reviews

5 star

4 star

3 star

2 star

1 star

Chapter 9: Dynamic Programming II

Activity 22: Maximizing Profit

In this activity, we will optimize our inventory for sale to maximize our profits. Follow these steps to complete the activity:

Let's begin by including the following headers:
#include <iostream>
#include <vector>
using namespace std;
First, we will define a structure, Product, that encapsulates the data associated with each item:
struct Product
{
    int quantity;
    int price;
    int value;
    Product(int q, int p, int v)
        : quantity(q), price(p), value(v) {}
};
Next, we will handle the input in the main() function and populate an array of the Product type:
int main()
{
    int N, budget, capacity;
    cin >> N >> budget >> capacity;
    vector<Product> products;
    for(int i = 0; i < N; i++)
    {
        int quantity, cost, value;
        cin >> quantity >> cost >> value;
        products.push_back(Product(quantity, cost, value));
    }
...
return 0;
}
As with any DP algorithm, we must now define the states and base cases. We know that the subset of items that form the final result must match the following criteria:
    –  The sum of the cost of all the products in the subset must not exceed budget.
    –  The sum of the quantity of all the products in the subset must not exceed capacity.
    –  The sum of the value of all the products in the subset must be maximized.
Given these criteria, we can see that each state can be defined by the following parameters:
    –  The current item being considered
    –  The number of units previously purchased
    –  The total cost of the purchased items
    –  The total profit gained after selling the products at retail value
We can also conclude that a search will terminate when:
    –  All the items have been considered
    –  The total cost exceeds the budget
    –  The total number of units exceeds the capacity
Like the traditional 0-1 knapsack problem, we will consider each item from 0 to N-1 linearly. For each item at index i, our states can transition in one of two ways: by either including the current item or leaving it. Writing the recursive logic in pseudocode may look like this:
F(i, count, cost, total):
I        –> The index of the current item
Cost     –> The total money spent
count    –> The number of units purchased
total    –> The total profit value of the chosen items
Base cases:
    if i = N: return total
    if cost > budget: return 0
    if count > capacity: return 0
Recurrence:
F(i, count, cost, total) = maximum of:
F(i + 1, count + quantity[i], cost + price[i],
      total + value[i]) – Include the item
        AND
    F(i + 1, count, cost, total) – Leave as-is
As shown in the preceding code, the recurrence relation is defined according to the values of i, count, cost, and total. Converting this logic from top down to bottom up can be done like so:
Base case:
    DP(0, 0, 0) = 0 [Nothing has been chosen yet]
For i = 1 to N:
    Product -> quantity, price, value
    For cost = 0 to budget:
        For count = 0 to capacity:
            If price is greater than cost OR
           quantity is greater than count:
                DP(i, cost, count) = DP(i-1, cost, count)
            Otherwise:
                DP(i, cost, count) = maximum of:
                    DP(i-1, cost, count)
                        AND
                    DP(i-1, cost – price, count – quantity) + value
In other words, each state is described according to the current index, total cost, and total count. For each pair of valid cost and count values, the current result for an item at index i will be equal either to the maximum subset sum that was found for the same values of cost and count at index i – 1 (that is, DP[i – 1][cost][count]) or the sum of the current item's value with the maximum sum at index i – 1 with cost and count equal to what they would have been prior to including the item (that is, DP[i - 1][cost – price][count – quantity] + value).
We can code the preceding logic as follows:
vector<vector<vector<int>>> DP(N + 1, vector<vector<int>>(budget + 1, vector<int>(capacity + 1, 0)));
for(int i = 1; i <= N; i++)
{
    Product product = products[i-1];

for(int cost = 0; cost <= budget; cost++)
{
        for(int count = 0; count <= capacity; count++)
        {
            if(cost < product.price || count < product.quantity)
            {
                DP[i][cost][count] = DP[i-1][cost][count];
            }
            else
            {
                DP[i][cost][count] = max
                (
                    DP[i-1][cost][count],
                    DP[i-1][cost – product.price][count – product.quantity] + product.value
                );
            }
        }
}
cout << DP[N][budget][capacity] << endl;
}

As you can see, the implementation is equivalent to the 0-1 knapsack solution with an additional dimension.

Activity 23: Residential Roads

This activity has quite a few potential pitfalls if you do not approach it with some forethought. The most difficult aspect of it is the fact that it requires a number of distinct steps, and a careless mistake at any point can cause the entire program to fail. Therefore, it is recommended to approach the implementation step by step. The primary steps that are required are as follows:

Handling the input
Building the graph (finding adjacencies and weight values)
Finding the shortest distances between graph nodes
Reconstructing the edges in the shortest paths
Redrawing the input grid

Since this is considerably lengthier than the other activities in this chapter, let's attack each of these steps individually.

Step 0: Preliminary Setup

Before we write any code related to input, we should decide how we want to represent our data in advance. The input we will receive is as follows:

Two integers, H and W, representing the height and width of the grid.
An integer, N, representing the number of houses contained on the property.
H strings of width W representing the map of the property. We can store this data as an H-element vector of strings.
H rows of W integers representing the ruggedness of the terrain. We can store these values in an integer matrix.
N lines containing two integers, x and y, representing the coordinates of each house. For this, we can create a simple structure called Point containing two integers, x and y.

Now, let's look at the implementation:

Include the required headers and define some global constants and variables that we will need later in this problem. We will declare most of our data globally for the sake of convenience, but it is worth reiterating the point that this is generally considered bad practice within the context of a full-scale application:
#include <iostream>
#include <vector>
using namespace std;
const int UNKNOWN = 1e9;
const char EMPTY_SPACE = '.';
const string roads = "-|/\\";
struct Point
{
    int x;
    int y;
    Point(){}
    Point(int x, int y) : x(x), y(y) {}
};
int N;
int H, W;
vector<string> grid;
vector<vector<int>> terrain;
vector<vector<int>> cost;
vector<Point> houses;
Step 1: Handling the Input
Since there is a fair amount of input required for this problem, let's contain it all in its own function, Input(), which will return void:
void Input()
{
    cin >> H >> W;
    cin >> N;
    grid.resize(H);
    houses.resize(N);
    terrain.resize(H, vector<int>(W, UNKNOWN));    cost.resize(H, vector<int>(W, UNKNOWN));
    // Map of property
    for(auto &row : grid) cin >> row;
    // Terrain ruggedness
    for(int I = 0; i < H; i++)
    {
        for(int j = 0; j < W; j++)
        {
            cin >> terrain[i][j];
        }
    }
    // House coordinates
    for(int i = 0; i < N; i++)
    {
        cin >> houses[i].x >> house[i].y;
        // Set house labels in grid
        grid[houses[i].y][houses[i].x] = char(i + 'A');
    }
}
Step 2: Building the Graph
The problem description states the following:
- A road can be built between two houses if and only if there is a direct horizontal, vertical, or diagonal path between them.
- Roads may not be built across bodies of water, mountains, forests, and so on.
- The cost of building a road between two houses is equal to the sum of ruggedness values on the path between them.

To test the first condition, we simply need to compare the coordinates of two points and determine whether any of the following three conditions are true:

A.x = B.x (there is a horizontal line between them)
A.y = B.y (there is a vertical line between them)
| A.x – B.x | = | A.y – B.y | (there is a diagonal line between them)

Now, let's get back to our code.

To do this, let's write a function DirectLine(), that takes two points, a and b, as arguments and returns a Boolean:
bool DirectLine(Point a, Point b)
{
return a.x == b.x || a.y == b.y || abs(a.x – b.x) == abs(a.y – b.y);
}
To handle the second and third cases, we can simply perform a linear traversal from point a to point b in the grid. As we consider each point in the grid, we can accumulate the sum of values contained in the terrain matrix. As we do this, we can simultaneously check the character in grid[a.y][a.x], terminating it as soon as we encounter a character that is not equal to EMPTY_SPACE (that is, '.'). If at the end of the traversal point a is equal to point b, we will store the sum we acquired in the cost matrix; otherwise, we have determined that there is no adjacency between a and b, in which case we return UNKNOWN. We can do this using the GetCost() function, which takes two integers, start and end, as arguments. These represent the indices of a and b, respectively, and return an integer:
int GetCost(int start, int end)
{
    Point a = houses[start];
    Point b = houses[end];
    // The values by which the coordinates change on each iteration
    int x_dir = 0;
    int y_dir = 0;
    if(a.x != b.x)
    {
        x_dir = (a.x < b.x) ? 1 : -1;
    }
    if(a.y != b.y)
    {
        y_dir = (a.y < b.y) ? 1 : -1;
    }
    int cost = 0;

    do
    {
        a.x += x_dir;
        a.y += y_dir;
        cost += terrain[a.y][a.x];
    }
    while(grid[a.y][a.x] == '.');
    return (a != b) ? UNKNOWN : res;
}
The final line requires that we define operator != in our Point struct:
struct Point
{
......
bool operator !=(const Point &other) const { return x != other.x || y != other.y; }
}
Now, let's create the following GetAdjacencies() function:
void GetAdjacencies()
{
    for(int i = 0; i < N; i++)
    {
        for(int j = 0; j < N; j++)
        {
            if(DirectLine(houses[i], houses[j])
            {
                cost[i][j] = cost[j][i] = GetCost(i, j);
            }
        }
    }
}
Step 3: Finding the Shortest Distances between Nodes
The problem states that two houses should be connected by a road that is on the path that minimizes the cost of reaching the exit point. For this implementation, we will use the Floyd-Warshall algorithm. Let's get back to our code:
Let's define a function, GetShortestPaths(), that will handle both the implementation of Floyd-Warshall as well as the path's reconstruction. To handle the latter case, we will maintain a N x N integer matrix called next that will store the index of the next point on the shortest path from nodes a and b. Initially, its values will be set to the existing edges in the graph:
void GetShortestPaths()
{
    vector<vector<int>> dist(N, vector<int>(N, UNKNOWN));
    vector<vector<int>> next(N, vector<int>(N, UNKNOWN));
for(int i = 0; i < N; i++)
{
    for(int j = 0; j < N; j++)
    {
        dist[i][j] = cost[i][j]
        if(dist[i][j] != UNKNOWN)
        {
            next[i][j] = j;
        }
    }
    dist[i][j] = 0;
    next[i][i] = i;
}
...
}
We will then perform the standard implementation of Floyd-Warshall, with one additional line in the innermost loop setting next[start][end] to next[start][mid] every time we find a shorter distance between start and end:
for(int mid = 0; mid < N; mid++)
{
    for(int start = 0; start < N; start++)
    {
        for(int end = 0; end < N; end++)
        {
            if(dist[start][end] > dist[start][mid] + dist[mid][end])
            {
                dist[start][end] = dist[start][mid] + dist[mid][end];
                next[start][end] = next[start][mid];
            }
        }
    }
}
Step 4: Reconstructing the Path
With the data that we obtained in the next matrix, we can easily reconstruct the points on each path in a similar way to the reconstruction approaches for the LCS or 0-1 Knapsack problems. For this purpose, we will define another function, GetPath(), that has three parameters—two integers, start and end, and a reference to the next matrix — and returns an integer vector containing the node indices of the path:
vector<int> GetPath(int start, int end, vector<vector<int>> &next)
{
    vector<int> path = { start };
    do
    {
        start = next[start][end];
        path.push_back(start);
    }
    while(next[start][end] != end);
    return path;
}
Returning to GetShortestPaths(), we will now add a loop underneath our implementation of Floyd-Warshall that calls GetPath() and then draws lines in the grid corresponding to each pair of points in the path:
for(int i = 0; i < N; i++)
{
    auto path = GetPath(i, N – 1, next);

    int curr = i;
    for(auto neighbor : path)
    {
        DrawPath(curr, neighbor);
        curr = neighbor;
    }
}
Step 5: Redrawing the Grid
Now, we must draw the roads in the grid. We will do this in another function, DrawPath(), which has the start and end parameters:
void DrawPath(int start, int end)
{
    Point a = houses[start];
    Point b = houses[end];
    int x_dir = 0;
    int y_dir = 0;
    if(a.x != b.x)
    {
        x_dir = (a.x < b.x) 1 : -1;
    }
    if(a.y != b.y)
    {
        y_dir = (a.y < b.y) 1 : -1;
    }

    ……
}
We will need to choose the correct character corresponding to the orientation of each road. To do this, we will define a function, GetDirection(), that returns an integer corresponding to an index in the roads string we defined at the beginning ("-|/\"):
int GetDirection(int x_dir, int y_dir)
{
    if(y_dir == 0) return 0;
    if(x_dir == 0) return 1;
    if(x_dir == -1)
    {
        return (y_dir == 1) ? 2 : 3;
    }
    return (y_dir == 1) ? 3 : 2;
}
void DrawPath(int start, int end)
{
    ……
    int direction = GetDirection(x_dir, y_dir);
    char mark = roads[direction];
        ……
}
We can now perform a linear traversal from a to b, setting each cell in the grid to mark if its value is EMPTY_SPACE. Otherwise, we must check to see whether the character in the cell is a road character of a different orientation, in which case we set it to +:
do
{
    a.x += x_dir;
    a.y += y_dir;

    if(grid[a.y][a.x] == EMPTY_SPACE)
    {
        grid[a.y][a.x] = mark;
    }
    else if(!isalpha(grid[a.y][a.x]))
    {
            // If two roads of differing orientations intersect, replace symbol with '+'
            grid[a.y][a.x] = (mark != grid[a.y][a.x]) ? '+' : mark;
    }
}
while(a != b);
All that is left is to call our functions in main() and print the output:
int main()
{
        Input();
        BuildGraph();
        GetShortestPaths();

        for(auto it : grid)
        {
            cout << it << endl;
        }
        return 0;
}

C++ Data Structures and Algorithm Design Principles

By : John Carey, Anil Achary, Shreyans Doshi, Payas Rajan

C++ Data Structures and Algorithm Design Principles

By: John Carey, Anil Achary, Shreyans Doshi, Payas Rajan

Overview of this book

Related Content you might be interested in

Current Title:

C++ Data Structures and Algorithm Design Principles

Beginning Java Data Structures and Algorithms

PHP 7 Data Structures and Algorithms

Hands-On Data Structures and Algorithms with Python – Third Edition

Chapter 9: Dynamic Programming II

Activity 22: Maximizing Profit

Activity 23: Residential Roads