Uncovering Rust: Errors and Methods

23 Apr 2023 at 1:00PM in Software
 | 
Photo by Nick Jio on Unsplash
 | 

Rust is a fairly new multi-paradigm systems programming language that claims to offer both high performance and strong safety guarantees, particularly around concurrency and memory allocation. As I play with the language a little, I’m using this series of blog posts to discuss some of its more unique features as I come across them. This one discusses error handling and how to associate methods with data types.

This is the 4th of the 7 articles that currently make up the “Uncovering Rust” series.


In this article we talk about a few areas to which the previous articles have referred but which they have not covered in detail. Firstly we’ll look at how errors are raised and handled in Rust, which is somewhat different from many other languages. Secondly, we’ll look at how Rust associates methods with data types, to provide basic object orientation features.

Dealing with Errors

Many modern languages use some variant of exception handling to deal with unexpected events. This is convenient for the developer raising the error, as they need put little thought into which errors the user of the code will care about — they just raise appropriate exceptions and let others handle them as necessary. In a number of recent languages this approach has fallen out of favour, however, and returning explicit error values, as in the old C days, has made something of a comeback. Rust is one of those languages, another being Go.

That said, it’s not always appropriate to return a value in the case of an error. For example, the semantics of assert() don’t require you to check a return type, and for good reason: simple runtime correctness checks, which are often only enabled in pre-production builds, shouldn’t incur that sort of effort.

As a result, Rust splits errors into unrecoverable and recoverable. Let’s look at both cases in turn.

Unrecoverable Errors: Panic Stations

Some errors are regarded as clear programming errors by Rust, and in these cases the panic!() macro can be used to halt execution immediately. Examples of this include going beyond the end of an array, or attempting to split a UTF-8 string in the middle of a code point as we saw in the previous article. It is the developer’s responsibility to ensure these issues never happen in production code.

By default, a panic will print an error message, unwind and clean up the stack, then quit. The error message is formatted in the same way as for format!() and println!(), which can be helpful.
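
As a small illustration of that formatting, here’s a minimal sketch in which the panic message is built with the same placeholders you’d use with println!(). The element_at() function and its message text are invented purely for this example.

```rust
/// Hypothetical helper: returns the element at `index`, or panics with a
/// formatted message if the index is beyond the end of the slice.
fn element_at(values: &[i32], index: usize) -> i32 {
    if index >= values.len() {
        // The arguments are interpolated just as they would be by format!().
        panic!(
            "index {} is out of range for a slice of length {}",
            index,
            values.len()
        );
    }
    values[index]
}

fn main() {
    let values = [10, 20, 30];
    println!("Second element: {}", element_at(&values, 1));
    // Uncommenting the next line would halt the program with the message above.
    // element_at(&values, 5);
}
```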

To see this in action, consider the following absurdly simple example:

fn some_other_function() {
    panic!("Something went wrong");
}

fn some_function() {
    some_other_function();
}

fn main() {
    some_function();
}

If we run this, we get the following simple output:

$ target/debug/error-handling
thread 'main' panicked at 'Something went wrong', src/main.rs:2:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

If we follow the advice and run with RUST_BACKTRACE=1 then we get a helpful backtrace as well:

$ RUST_BACKTRACE=1 target/debug/error-handling
thread 'main' panicked at 'Something went wrong', src/main.rs:2:5
stack backtrace:
   0: rust_begin_unwind
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library/core/src/panicking.rs:64:14
   2: error_handling::some_other_function
             at ./src/main.rs:2:5
   3: error_handling::some_function
             at ./src/main.rs:6:5
   4: error_handling::main
             at ./src/main.rs:10:5
   5: core::ops::function::FnOnce::call_once
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

This is what’s known as a “short” backtrace, as it omits some internal functions and some details such as the memory address of functions. In case you’re curious, specifying RUST_BACKTRACE=full, as suggested in that final message in the output above, will show you the full backtrace. I’ll leave that as an exercise to the reader, since it takes up over sixty very long lines. If you’re into deep language details, however, you might find it gives you some interesting insight into the actions of Rust’s _main()[1] and stack unwinding functionality.

Also, you need to have debug symbols included to get this level of detail in the backtrace. If we instead make a release build of the same code then we get a lot less:

$ RUST_BACKTRACE=1 target/release/error-handling
thread 'main' panicked at 'Something went wrong', src/main.rs:2:5
stack backtrace:
   0: rust_begin_unwind
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library/core/src/panicking.rs:64:14
   2: error_handling::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

As an aside, if you’re in a very resource-constrained environment, you might not want the overhead of walking back the stack, so you can instead specify panic = 'abort' in your Cargo.toml file to skip it and rely on the operating system to clean up the memory used by the process. It’s also possible to conditionally compile different code based on this setting, if you need that sort of thing for some reason.
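
Here’s a minimal sketch of that conditional compilation, assuming a toolchain recent enough to support the panic configuration predicate (stabilised in Rust 1.60, as far as I’m aware). The function names are purely illustrative, and the setting itself would sit under a profile section such as [profile.release] in Cargo.toml.

```rust
/// Illustrative helper reporting which panic strategy this build was compiled with.
fn panic_strategy() -> &'static str {
    // cfg!() is evaluated at compile time, so only one branch is ever taken.
    if cfg!(panic = "abort") {
        "abort"
    } else {
        "unwind"
    }
}

// Whole items can be included or excluded in the same way.
#[cfg(panic = "unwind")]
fn cleanup_note() -> &'static str {
    "destructors run as the stack unwinds"
}

#[cfg(panic = "abort")]
fn cleanup_note() -> &'static str {
    "the process exits immediately and the OS reclaims its memory"
}

fn main() {
    println!("Panic strategy: {}", panic_strategy());
    println!("On panic: {}", cleanup_note());
}
```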

As a further aside, there are mechanisms to catch panics, such as std::panic::catch_unwind(). However, the documentation cautions against using this as a general-purpose error handling mechanism — you don’t generally want to be working against the language in this sort of way. If you really want to use exceptions as your error-handling mechanism, Rust simply isn’t the language for you. Instead of drilling into that, let’s look at how to return and deal with recoverable errors.
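
Just to show what that looks like, and not as a recommendation, here’s a small sketch of catch_unwind() turning a panic into an ordinary value. Both functions are hypothetical and exist only for this example.

```rust
use std::panic;

/// Hypothetical function that panics when given zero.
fn reciprocal_percent(value: u32) -> u32 {
    if value == 0 {
        panic!("cannot take the reciprocal of zero");
    }
    100 / value
}

/// Runs the function above and converts a panic into `None`.
fn try_reciprocal_percent(value: u32) -> Option<u32> {
    // catch_unwind() returns Ok with the closure's value, or Err if it panicked.
    // It only helps when panics unwind; with panic = 'abort' the process still dies.
    panic::catch_unwind(|| reciprocal_percent(value)).ok()
}

fn main() {
    println!("{:?}", try_reciprocal_percent(4));
    println!("{:?}", try_reciprocal_percent(0));
}
```

Note that the usual panic message is still printed to standard error when the panic occurs, even though the program carries on afterwards.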

Recoverable Errors: What a Result

In production code, your use of panics will be rare and constrained to cases where restarting the entire application is genuinely the only way to safely proceed. So what’s the idiomatic way to indicate a recoverable error? Well, we already saw the Option enum in the second article, which represents a value or None for functions which can either return something or not. There’s another enum, Result, which can be used for functions which can fail. It’s defined as:

enum Result<T, E> {
    Ok(T),
    Err(E),
}

It has two possible values, Ok and Err which are templated on different types — T is the type of value which is to be returned in the successful case, and E is the type which indicates the type of error. Like Option, Result is brought into scope automatically, so you don’t need to specify Result::Ok and Result::Err.
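
To make that concrete, here’s a minimal sketch of a function that returns a Result, along with a caller that matches on it. The DivideError type and the divide() function are made up for this example.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum DivideError {
    DivisionByZero,
}

/// Divides two integers, reporting division by zero as an error value.
fn divide(numerator: i32, denominator: i32) -> Result<i32, DivideError> {
    if denominator == 0 {
        // No need to write Result::Err, as the variants are already in scope.
        Err(DivideError::DivisionByZero)
    } else {
        Ok(numerator / denominator)
    }
}

fn main() {
    for denominator in [2, 0] {
        match divide(10, denominator) {
            Ok(value) => println!("10 / {} = {}", denominator, value),
            Err(error) => println!("10 / {} failed: {:?}", denominator, error),
        }
    }
}
```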

An example of this is opening a file, which returns a std::fs::File on success or a std::io::Error on failure; calling kind() on that error yields a std::io::ErrorKind which classifies what went wrong. Let’s see a very simple example of it in use:

use std::fs::File;
use std::io::ErrorKind;

fn main() {
    let file_descriptor_result = File::open("filename.txt");

    let _file_descriptor = match file_descriptor_result {
        Ok(file) => file,
        Err(error) => match error.kind() {
            ErrorKind::NotFound =>
                panic!("File does not exist"),
            other_error =>
                panic!("Problem opening the file: {:?}", other_error),
        },
    };
}

In this example, file_descriptor_result is of type Result<File, std::io::Error> and so we unwrap it with the outer match and return the File from the match block to be assigned to _file_descriptor[2]. In the error case, however, we use an additional nested match on the ErrorKind returned by error.kind() and handle NotFound as a special case. In both error cases here we just panic!() for simplicity, but of course in reality you’d likely do something more sensible.
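
To make the types involved explicit, here’s a minimal sketch with the annotation written out: File::open() produces a std::io::Error on failure, and its kind() method gives the ErrorKind being matched. The describe_open() helper and the strings it returns are invented for illustration.

```rust
use std::fs::File;
use std::io::{self, ErrorKind};

/// Illustrative helper: describes the outcome of trying to open `path`.
fn describe_open(path: &str) -> String {
    // File::open() yields Result<File, io::Error>.
    let result: Result<File, io::Error> = File::open(path);
    match result {
        Ok(_) => String::from("opened"),
        // kind() gives us an ErrorKind for the io::Error, which we can match on.
        Err(error) => match error.kind() {
            ErrorKind::NotFound => String::from("not found"),
            other => format!("other error: {:?}", other),
        },
    }
}

fn main() {
    println!("{}", describe_open("filename.txt"));
}
```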

Return vs Exception

The problem with error returns has traditionally been their verbosity over exception-based mechanisms — this is because the return value of the function must often be overloaded to specify both a return value and an error code, such as using negative integers for error codes; or the result of the operation needs to be passed out by reference using a parameter, which is both confusing and irritating. The use of Result with match, however, means that both success and error cases can be easily stored in the same type, and handled comparatively compactly without disturbing the flow of the function too much.
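
As a rough sketch of that contrast, here are two versions of the same lookup: one overloading its return value in the C style, and one using Result. Both functions and the sample data are hypothetical.

```rust
/// C-style: the return value doubles as an error code, with -1 meaning "not found".
fn position_c_style(haystack: &[&str], needle: &str) -> i64 {
    for (index, item) in haystack.iter().enumerate() {
        if *item == needle {
            return index as i64;
        }
    }
    -1
}

/// Rust-style: success and failure are distinct variants of one return type.
fn position(haystack: &[&str], needle: &str) -> Result<usize, String> {
    haystack
        .iter()
        .position(|item| *item == needle)
        .ok_or_else(|| format!("{:?} is not in the list", needle))
}

fn main() {
    let shapes = ["circle", "square", "triangle"];

    // Nothing stops a caller from using -1 as if it were a real index.
    println!("C-style: {}", position_c_style(&shapes, "hexagon"));

    // Here the caller has to acknowledge both cases to get at the value.
    match position(&shapes, "square") {
        Ok(index) => println!("Found at index {}", index),
        Err(message) => println!("Error: {}", message),
    }
}
```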

Whether this is overall better or worse than exception-based systems depends, in my view, on the likelihood that you’re going to want to handle the errors. If your logic is that you want to execute a lot of statements, and only handle errors very generically outside the whole sequence of instructions, then exception-based code is hard to beat for readability: your whole function is simply wrapped in a single try/catch (or try/except in Python).

However, it’s quite common that you’d like to handle particular error cases differently, and in these cases your list of operations is often broken up into lots of more granular try/catch blocks. Here, Rust’s style of error handling has the edge in readability, in my view.

How common these cases are depends partly on what your code is doing, but also depends on your philosophy: do you believe code should be written to be internally resilient against error cases, retrying operations or creating missing files? Or fail fast and let some other system resolve the situation? This is a matter of opinion, although I will say from experience that exception-based mechanisms give you a gentle push towards handling lots of errors in the same way, and this isn’t always ideal. Sometimes it can mean systems aren’t as robust as they could have been with more intelligent self-correction mechanisms, and sometimes it just means the error messages in your log file are unhelpfully unspecific. So fine-grained error-handling definitely has its advantages, even if it’s a little more up-front effort sometimes.

Turning a Result into a Panic

Result also offers a couple of handy methods called unwrap() and expect(), which can simplify things further. The unwrap() method is a shortcut for “return the result in the success case, and panic!() in the error case”. This allows your code to trivially make a recoverable error into an unrecoverable one, if that makes sense in the context. The expect() method does the same, but it takes an error message as a parameter which will be included in the panic message, so it is almost certainly going to be the preferable option as it leads to better diagnostics.

let file_descriptor = File::open("filename.txt")
    .expect("filename.txt missing from install package");
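
Since the file example depends on what happens to be on disk, here’s a self-contained sketch of the same two methods using string parsing instead. The parse_port() helper and its message are assumptions made for this example.

```rust
/// Hypothetical helper: parses a port number that the program cannot run without.
fn parse_port(text: &str) -> u16 {
    // expect() returns the Ok value, or panics with this message plus the error.
    text.parse::<u16>()
        .expect("port must be a number between 0 and 65535")
}

fn main() {
    // unwrap() does the same, but with a generic message.
    let answer: i32 = "42".parse().unwrap();
    println!("answer = {}, port = {}", answer, parse_port("8080"));
}
```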

Error Handling with Closures

As well as the plain unwrap() method to panic in the error case, there’s also unwrap_or_else() to trigger your own code. To use this you’ll need to create a closure. We’ll talk more about these in a later article, but they’re essentially inline anonymous functions that some languages call “lambdas”, and I thought it might be nice to see an example of their use here.

let mut file_descriptor = File::open("filename.txt")
    .unwrap_or_else(|error| {
        if error.kind() == ErrorKind::NotFound {
            std::fs::OpenOptions::new()
                .create(true).write(true).read(true).open("filename.txt")
                .expect("Failed to create filename.txt")
        } else {
            panic!("Failed to open filename.txt: {:?}", error);
        }
    });

The |error| {...} syntax declares a closure taking error as a single parameter, and the semantics of unwrap_or_else() are that in the error case, the error value is passed to that closure. If the error is NotFound then we try creating the file. If that succeeds, the value is returned from the closure, and hence then returned from unwrap_or_else() in the same way as the success case. If the file creation fails then the expect() will panic for us[3].
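
For a version of the same pattern with no file system involved, here’s a minimal sketch where the closure supplies a fallback value. The retries_or_default() name and the default of 3 are arbitrary choices for illustration.

```rust
/// Hypothetical helper: parses a retry count, falling back to a default of 3.
fn retries_or_default(text: &str) -> u32 {
    text.trim().parse::<u32>().unwrap_or_else(|error| {
        // This closure only runs in the Err case, and receives the error value.
        eprintln!("Ignoring invalid retry count {:?}: {}", text, error);
        3
    })
}

fn main() {
    println!("{}", retries_or_default("5"));
    println!("{}", retries_or_default("lots"));
}
```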

I’m not sure what I think of this use of closures with regard to readability, but I’m going to reserve judgement on that until I’ve played with the language in real situations somewhat. It’s certainly a useful option to keep in mind for now.

Propagating Errors…?

When something goes wrong in an inner function, it’s quite a common pattern to want to pass the error back out again. This is easy enough to do by handling the error using one of the mechanisms above and using an explicit return statement to return it back out. This is such a common case, however, that there’s a shorthand for this — the ? operator.

Let’s start with an example of a function that does this without the ? operator.

use std::fs::File;
use std::io::{self, Read};

fn get_file_content(filename: &str) -> Result<String, io::Error> {
    let mut handle = match File::open(filename) {
        Ok(file) => file,
        Err(e) => return Err(e),
    };

    let mut content = String::new();
    match handle.read_to_string(&mut content) {
        Ok(_) => Ok(content),
        Err(e) => Err(e),
    }
}

fn main() {
    let content = match get_file_content("filename.txt") {
        Ok(content) => content,
        Err(_) => String::from("The file could not be read"),
    };
    println!("The content I got: {}", content);
}

Our function has two opportunities to return an io::Error: when opening the file, and when reading it. If you’re not used to the abbreviated return syntax yet, note that the second match in get_file_content() isn’t followed by a semicolon, so the value of this match becomes the return value of the function as a whole.

From what we’ve seen in the articles so far, hopefully this is fairly comprehensible. Now let’s see how we can make the function more concise using ?.

fn get_file_content(filename: &str) -> Result<String, io::Error> {
    let mut handle = File::open(filename)?;
    let mut content = String::new();
    handle.read_to_string(&mut content)?;
    Ok(content)
}

When you compare the two versions, it’s pretty straightforward what this is doing: either yield the successful result from the expression, or return the error result from the function as a whole. The operator is a little subtle, but it doesn’t take long to get used to spotting it, and because the semantics are so simple the saving in verbosity comes at essentially zero cost. The convenience of it also means it actively encourages errors to be propagated to be properly handled, rather than silently swallowed just because it makes life easier for the programmer in that moment. Come on, we’ve all been there, telling ourselves that we’ll go back and put proper error handling in later. Maybe sometimes you even did, but I’ll put money on the fact it wasn’t every time.
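
The same operator works with any Result, not just file operations. As a self-contained sketch, here’s a hypothetical add_strings() function where either parse can cause an early return.

```rust
use std::num::ParseIntError;

/// Illustrative function: adds two numbers supplied as text.
fn add_strings(left: &str, right: &str) -> Result<i32, ParseIntError> {
    // Each ? either yields the parsed value or returns the Err from add_strings().
    let left_value: i32 = left.parse()?;
    let right_value: i32 = right.parse()?;
    Ok(left_value + right_value)
}

fn main() {
    println!("{:?}", add_strings("2", "40"));
    println!("{:?}", add_strings("2", "forty"));
}
```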

For further brevity, it’s not much of a stretch to chain these calls together, avoiding the need for the filehandle to be stored in a variable.

fn get_file_content(filename: &str) -> Result<String, io::Error> {
    let mut content = String::new();
    File::open(filename)?.read_to_string(&mut content)?;
    Ok(content)
}

As it happens, in this particular case Rust also provides a function fs::read_to_string(<filename>) which removes the need for this entire function — you can’t get more concise than that.
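
For completeness, here’s a sketch of what that looks like if you keep the get_file_content() wrapper around it. The error message in main() is my own wording.

```rust
use std::fs;
use std::io;

/// The whole of get_file_content() reduces to a single standard library call.
fn get_file_content(filename: &str) -> Result<String, io::Error> {
    fs::read_to_string(filename)
}

fn main() {
    match get_file_content("filename.txt") {
        Ok(content) => println!("The content I got: {}", content),
        Err(error) => println!("Could not read the file: {}", error),
    }
}
```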

As you’d expect, you can only use ? in functions whose return type is compatible. You can also use it with Option<T> values, where the None is the early return as opposed to Err; once again, this can only be used in a function which itself returns an Option. One interesting wrinkle is that in Rust the main() function can be written to return Result<(), E> which allows the ? operator to be used there too. In this case, if main() returns Ok that will be translated to an exit status of 0, whereas Err will be translated to a non-zero exit status, in keeping with usual conventions[4].
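
Here’s a small sketch of both of those cases: ? applied to an Option, and ? used inside a main() that returns a Result. The second_word_initial() function is hypothetical, and I’ve used ParseIntError as the error type simply because it’s the only error this particular main() can produce.

```rust
use std::num::ParseIntError;

/// Illustrative function: returns the first character of the second word, if any.
fn second_word_initial(text: &str) -> Option<char> {
    // With Option, ? returns None from the function if there is no second word.
    let word = text.split_whitespace().nth(1)?;
    word.chars().next()
}

/// Because main() returns a Result, ? can be used directly inside it.
fn main() -> Result<(), ParseIntError> {
    println!("{:?}", second_word_initial("uncovering rust"));

    let count: u32 = "12".parse()?;
    println!("Parsed {}", count);

    // Returning Ok gives exit status 0; an Err would give a non-zero status.
    Ok(())
}
```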

Methods

The other aspect I’d like to touch on in this article is how Rust associates data types and methods together. This is a core tenet of object orientation, and whilst Rust may not be an “object oriented language” by everyone’s definition, it certainly offers some of these principles. We’ll look at traits in a future article, which you can loosely think of as a form of inheritance.

For now, however, let’s look at methods. Let’s take a classic object-orientation example of shapes: we’d like to define a series of objects representing different shapes, and for each one we’d like a constructor to create one, and methods to return the total perimeter and area of the shape, as well as a method to translate the entire shape by a fixed offset. Let’s take Rectangle as an example.

struct Point {
    x: f64,
    y: f64,
}

impl Point {
    fn translate(&mut self, x: f64, y: f64) {
        self.x += x;
        self.y += y;
    }
}

struct Rectangle {
    bottom_left: Point,
    top_right: Point
}

impl Rectangle {
    fn new(first_corner: &Point, second_corner: &Point) -> Rectangle {
        Rectangle { bottom_left: Point{
            x: first_corner.x.min(second_corner.x),
            y: first_corner.y.min(second_corner.y),
        }, top_right: Point {
            x: first_corner.x.max(second_corner.x),
            y: first_corner.y.max(second_corner.y),
        } }
    }
}

impl Rectangle {
    fn area(&self) -> f64 {
        (self.top_right.x - self.bottom_left.x) *
        (self.top_right.y - self.bottom_left.y)
    }

    fn perimeter(&self) -> f64 {
        (2.0 * (self.top_right.x - self.bottom_left.x)) +
        (2.0 * (self.top_right.y - self.bottom_left.y))
    }

    fn translate(&mut self, x: f64, y: f64) {
        self.bottom_left.translate(x, y);
        self.top_right.translate(x, y);
    }
}

fn main() {
    let mut rect = Rectangle::new(
        &Point{x: 10.0, y: 20.0},
        &Point{x: 50.0, y: 100.0});
    println!("Coordinates: ({}, {}) - ({}, {})",
             rect.bottom_left.x, rect.bottom_left.y,
             rect.top_right.x, rect.top_right.y);
    println!("Area: {}", rect.area());
    println!("Perimeter: {}", rect.perimeter());
    rect.translate(5.0, -15.0);
    println!("After translation: ({}, {}) - ({}, {})",
             rect.bottom_left.x, rect.bottom_left.y,
             rect.top_right.x, rect.top_right.y);
    println!("Area: {}", rect.area());
    println!("Perimeter: {}", rect.perimeter());
}

As you can see in this example, we use an impl block to define functions in the context of a structure. If you like, you can use multiple impl blocks for a single type, to build up methods from different parts of code — in this example that’s not particularly useful, but we’ll see where that becomes important in a later article on generics and traits.

Within an impl block we use standard syntax to declare functions, which are of two types:

Associated functions
These are functions associated with a type, but which don’t take an instance of one. In some other OOP languages these are known as static methods. The new() function above is one of these. In this case it’s used as a constructor, which normalises the two corner coordinates that are passed so they always refer to the bottom-left and top-right corners. The use of the name new in this way is a strong convention, but there’s nothing special about it to the language itself; it’s just another associated function.
Methods
Methods are functions which take an instance of the object as their first argument. The &self syntax here is just shorthand for self: &Self, where Self is always the type on which the method is being defined — the existence of that as the first parameter is what makes a function into a method, and this can only be done within the context of a type. The area() and perimeter() methods borrow an immutable reference to an instance of Rectangle, the same as const methods in C++, and translate() borrows a mutable reference because it needs to modify the instance.
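
To see the two kinds side by side, here’s a minimal sketch using a made-up Counter type. It writes one method with the longhand self: &Self form, and shows that method call syntax is equivalent to passing the instance explicitly as the first argument.

```rust
/// Hypothetical type for this sketch.
struct Counter {
    count: u32,
}

impl Counter {
    /// Associated function: no self parameter, so it is called as Counter::new().
    fn new() -> Self {
        Counter { count: 0 }
    }

    /// Method written with the longhand form of &self.
    fn value(self: &Self) -> u32 {
        self.count
    }

    /// Method that needs to modify the instance, so it borrows it mutably.
    fn increment(&mut self) {
        self.count += 1;
    }
}

fn main() {
    let mut counter = Counter::new();
    counter.increment();
    // Method call syntax is sugar for passing the instance as the first argument.
    Counter::increment(&mut counter);
    println!("{}", counter.value());
}
```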

That’s about it for methods at this stage. We’ll revisit them in a future article when we discuss traits, but the basics are pretty straightforward, as we’ve seen.

Conclusions

I must confess to having mixed feelings about the error handling in Rust. My C programming days are quite a long time ago now, and I’ve been using primarily C++ and Python since then, which are both fairly exception-oriented languages. I’m well aware of the risks and overheads associated with exception-based mechanisms, and I can understand why a system-level language like Rust wouldn’t want to impose the overhead of stack unwinding on the developer. In general, I’m going to reserve judgement until I’ve used this more widely in real code, when I’ll be in a much better position to judge how much additional effort it is to make sure the errors are handled appropriately. That said, even if it is more effort, I think I will be more confident that my application will handle exceptional cases more appropriately than some top-level generic error handler ever could.

In terms of methods, I don’t have any strong feelings, to be honest. The impl {} syntax neither inspires nor repulses me; it seems like a reasonable enough solution. The fact that methods can be added to a type away from the definition potentially makes it a little harder to consider the interface of a given object, but the fact that the language allows it doesn’t prevent developers from adopting conventions like defining everything for a type together, so I don’t regard this as a major flaw. Once again, however, I’ll reserve judgement until I’ve structured a larger codebase and seen how this works out in practice.

That’s it for this article, I hope it’s been useful and/or interesting. I’m hoping to have another one written up soon, as I’d like to finish the sweep of what I consider to be the important Rust features so I’m in a better position to start planning larger projects in it.


  1. Rust has the same convention as C and C++ (and very possibly others) in that _main() is the “real” main() function which is invoked to start the executable. It does some basic setup to prepare the language environment before invoking your own main() function. 

  2. The leading underscore just suppresses a warning about it being an unused variable. 

  3. If you’re wondering about all that OpenOptions::new() stuff, that was the only way I could find to create a new file which was already open in read/write mode — File::create() is available, but opens in write-only mode. There’s a new function File::create_new() which does it, but that’s not yet made it into a release version of the language at time of writing. 

  4. If you want more control over the specific main() exit status, you can instead have main() return std::process::ExitCode.

The next article in the “Uncovering Rust” series is Uncovering Rust: Traits and Generics
Tue 25 Apr, 2023