Rust Standard Library Cookbook

By: Jan Hohenheim, Daniel Durante

Overview of this book

Mozilla’s Rust is gaining much attention with amazing features and a powerful library. This book will take you through varied recipes to teach you how to leverage the Standard library to implement efficient solutions. The book begins with a brief look at the basic modules of the Standard library and collections. From here, the recipes will cover packages that support file/directory handling and interaction through parsing. You will learn about packages related to advanced data structures, error handling, and networking. You will also learn to work with futures and experimental nightly features. The book also covers the most relevant external crates in Rust. By the end of the book, you will be proficient at using the Rust Standard library.

How it works...

You can construct a regex object by calling Regex::new() with a valid regex string [7]. Most of the time, you will want to pass a raw string in the form of r"...". Raw means that every symbol in the string is taken literally, with no escape processing. This matters because regex uses the backslash (\) character to represent a couple of important concepts, such as digits (\d) or whitespace (\s). However, Rust already uses the backslash in normal strings to escape special non-printable symbols, such as the newline (\n) or the tab (\t) [23]. If we wanted to use a backslash in a normal string, we would have to escape it by repeating it (\\). The regex on line [14] would then have to be rewritten as:

"(\\d{2}).(\\d{2}).(\\d{4})"

Worse yet, if we wanted to match the backslash itself, we would have to escape it a second time because the backslash is also special in regex. With normal strings, we would have to quadruple-escape it (\\\\)!
We can save ourselves the confusion and loss of readability by using raw strings and writing our regex normally. In fact, it is considered good style to use raw strings in every regex, even when it doesn't contain any backslashes [33]. This helps your future self if you notice down the line that you actually would like to use a feature that requires a backslash.
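
As a minimal sketch of the escaping rules described above, the following standard-library-only program compares the two spellings of the same pattern. The function names escaped_pattern and raw_pattern are made up for this illustration and are not part of the recipe.

```rust
/// The date pattern as a normal string literal: every regex backslash is doubled.
/// (Hypothetical helper, named only for this illustration.)
fn escaped_pattern() -> &'static str {
    "(\\d{2}).(\\d{2}).(\\d{4})"
}

/// The same pattern as a raw string literal: backslashes are written once.
fn raw_pattern() -> &'static str {
    r"(\d{2}).(\d{2}).(\d{4})"
}

fn main() {
    // Both literals produce exactly the same characters.
    assert_eq!(escaped_pattern(), raw_pattern());

    // A regex that matches one literal backslash consists of two
    // backslash characters. In a normal string that takes four of them.
    let normal = "\\\\";
    let raw = r"\\";
    assert_eq!(normal, raw);
    assert_eq!(raw.len(), 2);

    println!("pattern: {}", raw_pattern());
}
```

Both literals hand the regex engine the same characters; the raw form is simply the one you can read without counting backslashes.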

We can iterate over the results of our regex [18]. The object we get on every match is a collection of our capture groups. Keep in mind that the zeroth index is always the entire match [19]. The first index is then the string from our first capture group, the second index is the string of the second capture group, and so on [20]. Unfortunately, we do not get a compile-time check on our index, so if we accessed &cap[4], our program would compile but then panic at runtime.

When replacing, we follow the same concept: $0 is the entire match, $1 the result of the first capture group, and so on. To make our life easier, we can give the capture groups names by starting them with ?P<somename> [29] and then use these names when replacing [31].

There are many flags that you can specify, in the form of (?flag), for fine-tuning, such as i, which makes the match case insensitive [33], or x, which ignores whitespace in the regex string. If you want to read up on them, visit the regex crate's documentation (https://doc.rust-lang.org/regex/regex/index.html). Most of the time though, you can get the same result by using the RegexBuilder that is also in the regex crate [36]. The two rust_regex objects we generate in lines [33] and [36] are equivalent. While the second version is definitely more verbose, it is also much easier to understand at first glance.