Hands-On Functional Programming in Rust

By Andrew Johnson
Overview of this book

Functional programming allows developers to divide programs into smaller, reusable components that ease the creation, testing, and maintenance of software as a whole. Combined with the power of Rust, you can develop robust and scalable applications that fulfill modern-day software requirements. This book will help you discover all the Rust features that can be used to build software in a functional way. We begin with a brief comparison of the functional and object-oriented approaches to different problems and patterns. We then take a quick look at the patterns of control flow and data abstraction that are unique to functional programming. The next part covers how to create functional apps in Rust; mutability and ownership, which are exclusive to Rust, are also discussed. Pure functions are examined next, and you'll master closures, their various types, and currying. We also look at implementing concurrency through functional design principles and metaprogramming using macros. Finally, we look at best practices for debugging and optimization. By the end of the book, you will be familiar with the functional approach to programming and will be able to use these techniques on a daily basis.

Improving project architecture

Functional programs encourage good project architecture and principled design patterns. Using the building blocks of functional programming often reduces the number of design choices to be made in such a way that good options become obvious.

"There should be one - and preferably only one - obvious way to do it."
PEP 20

File hierarchy, modules, and namespace design

Rust programs are compiled primarily in one of two ways. The first is to use rustc to compile individual files. The second is to describe an entire package for compilation using cargo. We will assume here that projects are built using cargo, as follows:

  1. To start a package, you first create a Cargo.toml file in a directory. That directory will be your package directory from now on. This is a configuration file that tells the compiler what code, assets, and extra information should be included in the package:
[package]
name = "fp_rust"
version = "0.0.1"
  2. After this basic configuration, you can now use cargo build to compile the entire project. Where you decide to place your code files, and what to name them, determines how you will refer to them in the module namespace. Each file is given its own module (mod). You can also nest modules inside files:
pub mod inner_module
{
    pub fn f1()
    {
        println!("inner module function");
    }
}
  3. After these steps, projects can then be added as cargo dependencies, and namespaces can be used inside of modules to expose public symbols. Consider the following code snippet:
extern crate package;
use package::inner_module::f1;
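
For illustration, inside the dependency crate the file hierarchy might map onto the module namespace like this (the split into files shown here is a common convention, not the book's exact project):

// src/lib.rs -- the crate root declares the module,
// which pulls in src/inner_module.rs
pub mod inner_module;

// src/inner_module.rs -- the module body
pub fn f1()
{
    println!("inner module function");
}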

These are the basic building blocks of Rust modules, but what does this have to do with functional programming?

Architecting a project in a functional style is a process, and it lends itself to certain routines. Typically, the project architect will start by designing the core data structures and, in complex cases, also the physical structure (where code and services will operationally run). Once the data layout has been outlined in sufficient detail, the core functions and routines (how the program behaves) can be planned. Up to this point, if coding is happening during the architecting stage, some code may be left unimplemented. The final stage involves replacing this mock code with correct behavior.

Following this stage-by-stage development process, we can also see an archetypal file layout forming. It is common to see these stages written top to bottom in actual programs. Though it is unlikely that the authors went through planning in these explicit stages, it is still a common pattern for simplicity's sake. Consider the following example:

//trait definitions

//data structure and trait implementations

//functions

//main

Grouping definitions like this may be helpful to standardize file layout and improve readability. Searching back and forth through a long file for symbol definitions is a common but unpleasant part of programming. It is also a preventable problem.
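
Here is a minimal sketch of this layout; the trait, type, and function names are purely illustrative:

//trait definitions
trait Area
{
    fn area(&self) -> f64;
}

//data structure and trait implementations
struct Square { side: f64 }

impl Area for Square
{
    fn area(&self) -> f64 { self.side * self.side }
}

//functions
fn total_area(shapes: &[Square]) -> f64
{
    shapes.iter().map(|s| s.area()).sum()
}

//main
fn main()
{
    let shapes = vec![Square { side: 2.0 }, Square { side: 3.0 }];
    println!("total area = {}", total_area(&shapes));
}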

Functional design patterns

Aside from file layout, there are a number of functional design patterns that help reduce code weight and redundancy. When used properly, these principles can help clarify design decisions and also enable robust architecture. Most design patterns are variants of the single responsibility principle. This can take many forms depending on the context, but the intent is the same: write code that does one thing well, then reuse that code as needed. These patterns are explained as follows:

  • Pure functions: These are functions with no side effects or logical dependencies other than function arguments. A side effect is a change of state that affects anything outside of the function, aside from the return value. Pure functions are useful because they can be tossed around and combined and generally used carelessly without the risk of unintended effects.
The worst thing that can go wrong with a pure function is a bad return value or, in extreme cases, a stack overflow.

It is harder to cause bugs with pure functions, even when used recklessly. Consider the following example of pure functions, in intro_patterns.rs:

fn pure_function1(x: u32) -> u32
{
    x * x
}

fn impure_function(x: u32) -> u32
{
    println!("x = {}", x);
    x * x
}
  • Immutability: Immutability is a pattern that helps encourage pure functions. Rust variable bindings are immutable by default. This is Rust's not-so-subtle way of encouraging you to avoid mutable state. Don't do it. If you absolutely must, it is possible to tag variables with the mut keyword to allow reassignment. This is shown with the following example, in intro_patterns.rs:
let immutable_v1 = 1;
//immutable_v1 = 2; //invalid

let mut mutable_v2 = 1;
mutable_v2 = 2;
  • Functional composition: Functional composition is a pattern where the output of one function is connected to the input of another function. In this fashion, functions can be chained together to create complex effects from simple steps. This is shown with the following code snippet, in intro_patterns.rs:
let fsin = |x: f64| x.sin();
let fabs = |x: f64| x.abs();

//feed output of one into the other
let transform = |x: f64| fabs(fsin(x));
  • Higher-order functions: These have been mentioned before, but we haven't used the term yet. A HoF is a function that accepts a function as a parameter. Many iterator methods are HoFs. Consider the following example, in intro_patterns.rs:
fn filter<P>(self, predicate: P) -> Filter<Self, P>
    where P: FnMut(&Self::Item) -> bool
{ ... }
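
As a quick illustration (not taken from the book's code), the closure passed to filter is the function argument that makes filter a HoF:

let evens: Vec<u32> = (1..10).filter(|x| x % 2 == 0).collect();
//evens == vec![2, 4, 6, 8]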
  • Functors: If you can get past the name, these are a simple and effective design pattern. They are also very versatile. The concept is somewhat difficult to capture in its entirety, but you may think of functors as the inverse of functions. A function defines a transformation, accepts data, and returns the result of the transformation. A functor defines data, accepts a function, and returns the result of the transformation. A common example of a functor is the bound map method that frequently appears on containers, such as for a Vec. Here is an example, in intro_patterns.rs:
let mut c = 0;
for _ in vec!['a', 'b', 'c'].into_iter().map(|letter| {
    c += 1;
    (letter, c)
}) {}
"A monad is a monoid in the category of endofunctors, what's the problem?"
Philip Wadler
  • Monads: Monads are a common stumbling block for people learning FP. Monads and functors may be the first words you encounter on a journey that goes deep into theoretical mathematics. We won't go there. For our purposes, a monad is simply a trait with two methods. This is shown in the following code, in intro_patterns.rs:
trait Monad<A> {
    fn return_(t: A) -> Self;
    //:: A -> Monad<A>

    fn bind<MB, B, F>(m: Self, f: F) -> MB
        where MB: Monad<B>, F: Fn(A) -> MB;
    //:: Monad<A> -> (A -> Monad<B>) -> Monad<B>
}

If that doesn't help clarify things (and it probably doesn't), a monad has two methods. The first method is the constructor. The second method lets you bind an operation to create another monad. Many common traits have hidden semi-monads but, by making the concept explicit, it becomes a strong design pattern instead of a messy anti-pattern. Don't try to reinvent what you don't have to.
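
For intuition, Option is one such hidden semi-monad: Some plays the role of the constructor and and_then plays the role of bind. A minimal sketch, not the trait above:

let half = |x: u32| if x % 2 == 0 { Some(x / 2) } else { None };

//each and_then binds another operation onto the Option
let result = Some(8).and_then(half).and_then(half);
//result == Some(2)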

  • Function currying: Function currying is a technique that may seem strange for anyone coming from a background in object-oriented or imperative languages. The reason for this confusion is that in many functional languages, functions are curried by default, whereas this is not the case for other languages. Rust functions are not curried by default.

The difference between curried and non-curried functions is that curried functions take their parameters one by one, whereas non-curried functions take all their parameters at once. Looking at a normal Rust function definition, we can see that it is not curried. Consider the following code, in intro_patterns.rs:

fn not_curried(p1: u32, p2: u32) -> u32
{
    p1 + p2
}

fn main()
{
    //and calling it
    not_curried(1, 2);
}

A curried function takes each parameter one by one, as shown in the following, in intro_patterns.rs:

fn curried(p1: u32) -> Box<Fn(u32) -> u32>
{
    Box::new(move |p2: u32| {
        p1 + p2
    })
}

fn main()
{
    //and calling it
    curried(1)(2);
}

Curried functions can be used as function factories. The first few arguments configure how the final function should behave. The result is a pattern that allows shorthand configuration of complex operators. Currying complements all the other design patterns by splitting single functions into multiple composable components.
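
For example, the first argument can act as configuration, producing a family of specialized functions. A small sketch reusing the curried function above:

let add_five = curried(5); //configure once
add_five(1);               //6
add_five(10);              //15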

  • Lazy evaluation: Lazy evaluation is a pattern that is technically possible in other languages. However, it is uncommon to see it outside of FP, due to language barriers. The difference between a normal expression and a lazy expression is that a lazy expression will not be evaluated until it is accessed. Here is a simple implementation of laziness, hidden behind a closure, in intro_patterns.rs:
let x = { println!("side effect"); 1 + 2 };

let y = ||{ println!("side effect"); 1 + 2 };

The second expression will not be evaluated until the closure is called, at which point the code resolves. For lazy expressions, side effects happen at the time of resolution instead of at initialization. This is a poor implementation of laziness, so we will go into further detail in later chapters. The pattern is fairly common, and some operators and data structures require laziness in order to work. A simple example of necessary laziness is a lazy list that could not otherwise be created. The built-in Rust numerical range iterator (a lazy list) uses this well: (0..).
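
As a small illustration of this built-in laziness (the variable name is illustrative), no work happens until the iterator is pulled:

let mut squares = (0..).map(|x| { println!("computing {}", x); x * x });

//nothing has been printed or computed yet
squares.next(); //prints "computing 0", yields Some(0)
squares.next(); //prints "computing 1", yields Some(1)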

Memoization is the last pattern that we will introduce here. It may be considered more of an optimization than a design pattern, but it is common enough that we should mention it here. A memoized function only computes each unique result once. A simple implementation is a function guarded by a hash table: if the parameters and result are already in the hash table, skip the function call and return the result directly from the hash table; otherwise, compute the result, put it in the hash table, and return it. This process can be implemented manually in any language, but Rust macros allow us to write the memoization code once and reuse it by applying the macro. This is shown using the following code snippet, in intro_patterns.rs:

#[macro_use] extern crate cached;
#[macro_use] extern crate lazy_static;


cached! {
    FIB;
    fn fib(n: u64) -> u64 = {
        if n == 0 || n == 1 { return n }
        fib(n-1) + fib(n-2)
    }
}

fn main()
{
    fib(30);
}

This example makes use of two crates and many macros. We won't fully explain everything that is happening here until the very end of this book. There is much that is possible with macros and metaprogramming. Caching function results is just a start.
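
For comparison, the hash-table-guarded approach described above can also be written by hand. Here is a minimal sketch; fib_memo and its signature are illustrative rather than the book's code:

use std::collections::HashMap;

//a hand-rolled memoized fibonacci: each unique result is computed once
//and stored in the HashMap that is threaded through the calls
fn fib_memo(n: u64, cache: &mut HashMap<u64, u64>) -> u64
{
    if n == 0 || n == 1 { return n }
    if let Some(&v) = cache.get(&n) { return v }
    let v = fib_memo(n - 1, cache) + fib_memo(n - 2, cache);
    cache.insert(n, v);
    v
}

fn main()
{
    let mut cache = HashMap::new();
    println!("{}", fib_memo(30, &mut cache));
}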

Metaprogramming

The term metaprogramming in Rust often overlaps with the term macros. There are two primary types of macros available in Rust:

  • Recursive
  • Procedural

Both types of macros take an abstract syntax tree (AST) as input and produce one or more ASTs.

A commonly used macro is println!. A variable number of arguments and types are joined with the format string, through the use of a macro, to produce formatted output. To invoke a recursive macro like this, call it just like a function, but with a ! before the arguments. Macro invocations may alternatively be surrounded by [] or {}:

vec!["this is a macro", 1, 2];

Recursive macros are defined by macro_rules! statements. The inside of a macro_rules definition is very similar to that of a pattern-matching expression. The only difference is that macro_rules! matches syntax instead of data. We can use this format to define a reduced version of the vec macro. This is shown in the following code snippet, in intro_metaprogramming.rs:

macro_rules! my_vec_macro
{
    ( $( $x:expr ),* ) =>
    {
        {
            let mut temp_vec = Vec::new();
            $(
                temp_vec.push($x);
            )*
            temp_vec
        }
    }
}

This definition accepts and matches only one pattern: a comma-separated list of expressions. The syntax pattern ( $( $x:expr ),* ) matches against a comma-separated list of expressions and stores the results in the repeated variable $x. The body of the macro consists of a single block. The block defines a new Vec, then expands the repetition $( ... )* to push each $x into the Vec, and finally the block returns the Vec as its result. The macro and its expansion are as follows, in intro_metaprogramming.rs:

//this
my_vec_macro!(1, 2, 3);

//is the same as this
{
    let mut temp_vec = Vec::new();
    temp_vec.push(1);
    temp_vec.push(2);
    temp_vec.push(3);
    temp_vec
}

It is important to note that expressions are moved as code, not as values, so side effects will be moved to the evaluating context, not the defining context.
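
To illustrate, a block expression with a side effect is pasted, as code, into the expansion, so its println! runs when and where the expanded statement is evaluated:

my_vec_macro!({ println!("pushed"); 1 }, 2, 3);
//expands to three pushes; "pushed" is printed once,
//at the point where this statement executes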

Recursive macro patterns match against token strings. It is possible to execute separate branches depending on which tokens are matched. A simple case match looks like the following, in intro_metaprogramming.rs:

macro_rules! my_macro_branch
{
    (1 $e:expr) => (println!("mode 1: {}", $e));
    (2 $e:expr) => (println!("mode 2: {}", $e));
}

fn main()
{
    my_macro_branch!(1 "abc");
    my_macro_branch!(2 "def");
}

The name recursive macros comes from recursion within the macros; we can, of course, call into the macro that we are defining. Recursive macros can be a quick way to define a domain-specific language (DSL). Consider the following code snippet, in intro_metaprogramming.rs:

enum DSLTerm {
    TVar { symbol: String },
    TAbs { param: String, body: Box<DSLTerm> },
    TApp { f: Box<DSLTerm>, x: Box<DSLTerm> }
}

macro_rules! dsl
{
    ( ( $($e:tt)* ) ) => (dsl!( $($e)* ));
    ( $e:ident ) => (DSLTerm::TVar {
        symbol: stringify!($e).to_string()
    });
    ( fn $p:ident . $b:tt ) => (DSLTerm::TAbs {
        param: stringify!($p).to_string(),
        body: Box::new(dsl!($b))
    });
    ( $f:tt $x:tt ) => (DSLTerm::TApp {
        f: Box::new(dsl!($f)),
        x: Box::new(dsl!($x))
    });
}
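
A small usage sketch of this mini-DSL (the terms chosen are illustrative): parentheses group, a bare identifier becomes a TVar, fn introduces a TAbs, and juxtaposition becomes a TApp:

fn main()
{
    let _identity = dsl!( fn x . x );     //TAbs wrapping a TVar
    let _applied  = dsl!( (fn x . x) y ); //TApp of the TAbs to a TVar
}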

The second form of macro definitions is procedural macros. Recursive macros can be thought of as a nice syntax to help define procedural macros. Procedural macros, on the other hand, are the most general form. There are many things you can do with procedural macros that are simply impossible with the recursive form.

Here, we can grab the name of a struct and use it to automatically generate an implementation of a TypeName trait. Here is the macro definition, in intro_metaprogramming.rs:

#![crate_type = "proc-macro"]
extern crate proc_macro;
extern crate syn;
#[macro_use]
extern crate quote;
use proc_macro::TokenStream;
#[proc_macro_derive(TypeName)]

pub fn type_name(input: TokenStream) -> TokenStream
{
// Parse token stream into input AST
let ast = syn::parse(input).unwrap();
// Generate output AST
impl_typename(&ast).into()
}

fn impl_typename(ast: &syn::DeriveInput) -> quote::Tokens
{
    let name = &ast.ident;
    quote!
    {
        impl TypeName for #name
        {
            fn typename() -> String
            {
                stringify!(#name).to_string()
            }
        }
    }
}

The corresponding macro invocation looks like the following, in intro_metaprogramming.rs:

#[macro_use]
extern crate metaderive;

pub trait TypeName
{
    fn typename() -> String;
}

#[derive(TypeName)]
struct MyStructA
{
    a: u32,
    b: f32
}
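
As a small usage sketch, assuming the derive above has been applied to MyStructA:

fn main()
{
    //the generated impl simply stringifies the struct's name
    println!("{}", MyStructA::typename()); //prints "MyStructA"
}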

As you can see, procedural macros are a bit more complicated to set up. However, the benefit is that all processing is done directly with normal Rust code. These macros can use any of the supplied syntactic information to generate more code structures before compilation.

Procedural macros are handled as separate crates that are precompiled and executed during normal compilation. The information provided to each macro is localized, so whole-program analysis is not possible. However, the available local information is sufficient to achieve some fairly complicated effects.