Book Image

Hands-On Data Structures and Algorithms with Rust

By : Claus Matzinger
Book Image

Hands-On Data Structures and Algorithms with Rust

By: Claus Matzinger

Overview of this book

Rust has come a long way and is now utilized in several contexts. Its key strengths are its software infrastructure and resource-constrained applications, including desktop applications, servers, and performance-critical applications, not forgetting its importance in systems' programming. This book will be your guide as it takes you through implementing classic data structures and algorithms in Rust, helping you to get up and running as a confident Rust programmer. The book begins with an introduction to Rust data structures and algorithms, while also covering essential language constructs. You will learn how to store data using linked lists, arrays, stacks, and queues. You will also learn how to implement sorting and searching algorithms. You will learn how to attain high performance by implementing algorithms to string data types and implement hash structures in algorithm design. The book will examine algorithm analysis, including Brute Force algorithms, Greedy algorithms, Divide and Conquer algorithms, Dynamic Programming, and Backtracking. By the end of the book, you will have learned how to build components that are easy to understand, debug, and use in different applications.
Table of Contents (15 chapters)

Rust in 2018

How old is Rust? It started off in 2006 as a side project of Graydon Hoare, an engineer at Mozilla, and was later (in 2009) adopted by the company. Fast forward to less than a decade later to May 15, 2015, and the Rust team announced a stable version 1.0!

During its journey, there have been many features that have been added and removed again (for example, a garbage collector, classes, and interfaces) to help it become the fast and safe language that it is today.

Before getting deeper into borrowing and ownership, mutability, concurrency, safety, and so on in Rust, we would like to recap some major concepts in Rust and why they change architectural patterns significantly.

The 2018 edition

Rust in the 2015 edition is essentially the 1.0 version with a few non-breaking additions. Between 2015 and 2018, however, features and Requests for Comments (RFCs), Rust's way of changing core features with the community, accumulated, and worries about backward compatibility arose.

With the goal of keeping this compatibility, editions were introduced and, with the first additional edition, many major changes made it into the language:

  • Changes to the module path system
  • dyn Trait and impl Trait syntax
  • async/await syntax
  • Simplifications to the lifetime syntax

With these additions, Rust will introduce asynchronous programming into its syntax (async/await keywords) and improve the language's usability. This book uses the Rust 2018, released on December 6, 2018 (https://blog.rust-lang.org/2018/12/06/Rust-1.31-and-rust-2018.html) edition by default, so all the following snippets will already include these new language features!

The Rust language

Many of the established programming languages today are multi-paradigm languages, but still remain focused on the principles of object orientation. This means that they have classes, methods, interfaces, inheritance, and so on, none of which can be found in Rust, giving it a steep learning curve for many established developers.

More experienced readers will miss many aspects of what makes Rust an excellent language, such as static versus dynamic method invocation, memory layouts, and so on. I recognize the importance of those things, yet for brevity and focus chose to leave it to you to explore these things further. Check the Further reading section for resources.

As a multi-paradigm language, Rust has many functional concepts and paradigms that guide it, but they make traditional object-oriented patterns more difficult to apply. Other than organizing code without classes and interfaces, there are various methods to handle errors, change the code itself, or even work with raw pointers.

In the following sections, we want to explore a few concepts that make Rust unique and have a major influence on the way we develop algorithms and data structures.

Objects and behavior

Organizing code in Rust is a bit different from regular object-oriented languages such as C#. There, an object is supposed to change its own state, interfaces are simple contract definitions, and specialization is often modeled using class inheritance:

class Door {
private bool is_open = false;

public void Open() {
this.is_open = true;
}
}

With Rust, this pattern would require constant mutability of any Door instance (thereby requiring explicit locking for thread safety), and without inheritance GlassDoor would have to duplicate code, making it harder to maintain.

Instead, it's recommended to create traits to implement (shared) behavior. Traits have a lot in common with abstract classes in traditional languages (such as default implementations of methods/functions), yet any struct in Rust can (and should) implement several of those traits:

struct Door {
is_open: bool
}

impl Door {
fn new(is_open: bool) -> Door {
Door { is_open: is_open }
}
}

trait Openable {
fn open(&mut self);
}

impl Openable for Door {
fn open(&mut self) {
self.is_open = true;
}
}

#[cfg(test)]
mod tests {
use super::*;

#[test]
fn open_door() {
let mut door = Door::new(false);
door.open();
assert!(door.is_open);
}
}

This pattern is very common in the standard library, and often third-party libraries will even add behavior to existing types by implementing traits in their code (also known as extension traits).

Other than a typical class, where data fields and methods are in a single construct, Rust emphasizes the separation between those by declaring a struct for data and an impl part for the methods/functions. Traits name and encapsulate behaviors so they can easily be imported, shared, and reused.

Going wrong

Other than classes, Rust comes without another well-known companion: null. In the absence of pointers and with a very different memory management model, there is no typical null pointer/reference.

Instead, the language works with Option and Result types that let developers model success or failure. In fact, there is no exception system either, so any failed execution of a function should be indicated in the return type. Only in rare cases when immediate termination is required does the language provide a macro for panicking: panic!().

Option<T> and Result<T, E> both encapsulate one (Option<T>) or two (Result<T, E>) values that can be returned to communicate an error or whether something was found or not. For example, a find() function could return Option<T>, whereas something like read_file() would typically have a Result<T, E> return type to communicate the content or errors:

fn find(needle: u16, haystack: Vec<u16>) -> Option<usize> {
// find the needle in the haystack
}

fn read_file(path: &str) -> Result<String, io::Error> {
// open the path as a file and read it
}

Handling those return values is often done with match or if let clauses in order to handle the cases of success or failure:

match find(2, vec![1,3,4,5]) {
Some(_) => println!("Found!"),
None => println!("Not found :(")
}

// another way
if let Some(result) = find(2, vec![1,2,3,4]) {
println!("Found!")
}

// similarly for results!
match read_file("/tmp/not/a/file") {
Ok(content) => println!(content),
Err(error) => println!("Oh no!")
}

This is due to Option<T> and Result<T, E> both being enumerations that have generic type parameters; they can assume any type in their variants. Matching on their variants provides access to their inner values and types to allow a branch of the code to be executed and handle the case accordingly. Not only does this eliminate the need for constructs such as try/catch with multiple—sometimes cast—exception arms, it makes failure part of the normal workflow that needs to be taken care of.

Macros

Another aspect of Rust is the ability to do metaprogramming—basically programming programming—using macros! Macros are expanded in Rust code before compilation, which gives them more power than a regular function. The generated code can, for instance, create functions on the fly or implement traits for a structure.

These pieces of code make everyday life a lot easier by reducing the need to create and then initialize vectors, deriving the ability to clone a structure, or simply printing stuff to the command line.

This is a simplified example for the declarative vec![] macro provided in the Rust Book (second edition, Appendix D):

#[macro_export] 
macro_rules! vec {
( $( $x:expr ),* ) => {
{
let mut temp_vec = Vec::new();
$( temp_vec.push($x); )*
temp_vec
}
};
}

Declarative macros work on patterns and run code if that pattern matches; the previous example matches 0 - n expressions (for example, a number, or a function that returns a number) and inserts temp_vec.push(...) n times, iterating over the provided expressions as a parameter.

The second type, procedural macros, operate differently and are often used to provide a default trait implementation. In many code bases, the #[derive(Clone, Debug)] statement can be found on top of structures to implement the Clone and Debug traits automatically.

Later in this chapter, we are going to use a structure, FileName, to illustrate reference counting, but for printing it to the command line using the debug literal "{:?}", we need to derive Debug, which recursively prints all members to the command line:

#[derive(Debug)]
struct FileName {
name: Rc<String>,
ext: Rc<String>
}

The Rust standard library provides several macros already, and by creating custom macros, you can minimize the boilerplate code you have to write.

Unsafe

Rust's code is "safe" because the compiler checks and enforces certain behavior when it comes to memory access and management. However, sometimes these rules have to be forgone, making the code unsafe. unsafe is a keyword in Rust and declares a section of code that can do most of the things the C programming language would let you do. For example, it lets the user do the following (from the Rust Book, chapter 19.1):

  • Dereference a raw pointer
  • Call an unsafe function or method
  • Access or modify a mutable static variable
  • Implement an unsafe trait

These four abilities can be used for things such as very low-level device access, language interoperability (the compiler can't know what native libraries do with their memory), and so on. In most cases, and certainly in this book, unsafe is not required. In fact, the Rustonomicon (https://doc.rust-lang.org/nomicon/what-unsafe-does.html) defines a list of issues the language is trying to prevent from happening by providing the safe part:

  • Dereferencing null, dangling, or unaligned pointers.
  • Reading uninitialized memory.
  • Breaking the pointer aliasing rules.
  • Producing invalid primitive values:
    • Dangling/null references
    • Null fn pointers
    • A bool that isn't 0 or 1
    • An undefined enum discriminant
    • A char outside the ranges [0x0, 0xD7FF] and [0xE000, 0x10FFFF]
    • A non-UTF8 string
  • Unwinding into another language.
  • Causing a data race.

The fact that these potential issues are prevented in safe Rust certainly makes the life of a developer easier, especially when designing algorithms or data structures. As a consequence, this book will always work with safe Rust.