Rust is a fairly new multi-paradigm systems programming language that claims to offer both high performance and strong safety guarantees, particularly around concurrency and memory allocation. As I play with the language a little, I’m using this series of blog posts to discuss some of its more unique features as I come across them. This one discusses error handling and how to associate methods with data types.
This is the 4th of the 7 articles that currently make up the “Uncovering Rust” series.
In this article we talk about a few areas to which the previous articles have referred but not covered in detail. Firstly we’ll look at how errors are raised and handled in Rust, which is somewhat different from many other languages. Secondly, we’ll look at how Rust associates methods with data types, to provide basic object orientation features.
Many modern languages use some variant of exception handling to deal with unexpected events. This is convenient for the developer raising the error, as they need put little thought into which errors the user of the code will care about — they just raise appropriate exceptions and let others handle them as necessary. In a number of recent languages this approach has fallen out of favour, however, and returning explicit error values, as in the old C days, has made something of a comeback. Rust is one of those languages, another being Go.
That said, it’s not always appropriate to return a value in the case of an error. For example, the semantics of assert() don’t require you to check a return type, and for good reason: simple runtime correctness checks, which are often only enabled in pre-production builds, shouldn’t incur that sort of effort.
As a result, Rust splits errors into unrecoverable and recoverable. Let’s look at both cases in turn.
Some errors are regarded as clear programming errors by Rust, and in these cases the panic!() macro can be used to halt execution immediately. Examples of this include going beyond the end of an array, or attempting to split a UTF-8 string in the middle of a code point as we saw in the previous article. It is the developer’s responsibility to ensure these issues never happen in production code.
By default, a panic will print an error message, unwind and clean up the stack, then quit. The error message is formatted in the same way as for format!() and println!(), which can be helpful.
To see this in action, consider the following absurdly simple example:
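A minimal sketch of such a program, assuming three functions laid out so that the line numbers agree with the panic message and backtrace shown further down. The function names are taken from that backtrace; the exact layout is an assumption.

```rust
fn some_other_function() {
    panic!("Something went wrong"); // line 2, column 5: the location reported in the panic message
}

fn some_function() {
    some_other_function(); // line 6 in the backtrace
}

fn main() {
    some_function(); // line 10 in the backtrace
}
```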
If we run this, we get the following simple output:
$ target/debug/error-handling
thread 'main' panicked at 'Something went wrong', src/main.rs:2:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
If we follow the advice and run with RUST_BACKTRACE=1 then we get a helpful backtrace as well:
$ RUST_BACKTRACE=1 target/debug/error-handling
thread 'main' panicked at 'Something went wrong', src/main.rs:2:5
stack backtrace:
   0: rust_begin_unwind
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library/core/src/panicking.rs:64:14
   2: error_handling::some_other_function
             at ./src/main.rs:2:5
   3: error_handling::some_function
             at ./src/main.rs:6:5
   4: error_handling::main
             at ./src/main.rs:10:5
   5: core::ops::function::FnOnce::call_once
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
This is what’s known as a “short” backtrace, as it omits some internal functions and some details such as the memory address of functions. In case you’re curious, specifying RUST_BACKTRACE=full, as suggested in that final message in the output above, will show you the full backtrace. I’ll leave that as an exercise to the reader, since it takes up over sixty very long lines. However, if you’re into deep language details, you might find it gives you some interesting insight into the actions of Rust’s _main() [1] and stack unwinding functionality.
Also, you need to have debug symbols included to get this level of detail in the backtrace. If we instead make a release build of the same code then we get a lot less:
$ RUST_BACKTRACE=1 target/release/error-handling
thread 'main' panicked at 'Something went wrong', src/main.rs:2:5
stack backtrace:
   0: rust_begin_unwind
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/9eb3afe9ebe9c7d2b84b71002d44f4a0edac95e0/library/core/src/panicking.rs:64:14
   2: error_handling::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
As an aside, if you’re in a very resource-constrained environment, you might not want the overhead of walking back the stack, so you can instead specify panic = 'abort' in your Cargo.toml file to skip it and rely on the operating system to clean up the memory used by the process. It’s also possible to conditionally compile different code based on this setting, if you need that sort of thing for some reason.
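As a rough sketch of what that might look like: the Cargo.toml fragment is shown as a comment, assuming the setting is applied to the release profile, and the panic_strategy() helper is a hypothetical name used purely for illustration. The cfg!(panic = "...") check needs a reasonably recent compiler.

```rust
// Cargo.toml (not Rust code), assuming we only want this for release builds:
//
//   [profile.release]
//   panic = 'abort'

/// Hypothetical helper: reports which panic strategy this binary was compiled with.
fn panic_strategy() -> &'static str {
    // cfg!() is evaluated at compile time, so only one branch survives.
    if cfg!(panic = "abort") {
        "abort"
    } else {
        "unwind"
    }
}

fn main() {
    println!("panic strategy: {}", panic_strategy());
}
```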
As a further aside, there are mechanisms to catch panics, such as std::panic::catch_unwind(). However, the documentation cautions against using this as a general-purpose error handling mechanism, as you don’t generally want to be working against the language in this sort of way. If you really want to use exceptions as your error-handling mechanism, Rust simply isn’t the language for you. Instead of drilling into that, let’s look at how to return and deal with recoverable errors.
In production code, your use of panics will be rare and constrained to cases where restarting the entire application is genuinely the only way to safely proceed. So what’s the idiomatic way to indicate a recoverable error? Well, we already saw the Option enum in the second article, which represents a value or None for functions which can either return something or not. There’s another enum, Result, which can be used for functions which can fail. It’s defined as:
enum Result<T, E> {
    Ok(T),
    Err(E),
}
It has two variants, Ok and Err, which are generic over different types: T is the type of value which is to be returned in the successful case, and E is the type which indicates the type of error. Like Option, Result is brought into scope automatically along with its variants, so you don’t need to specify Result::Ok and Result::Err.
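To make that concrete, here is a small sketch of a function returning a Result and a caller matching on it. The divide() function and its String error type are hypothetical, chosen only to keep the example short.

```rust
/// Hypothetical helper: divides two integers, reporting division by zero as an Err.
fn divide(numerator: i32, denominator: i32) -> Result<i32, String> {
    if denominator == 0 {
        Err(String::from("division by zero"))
    } else {
        Ok(numerator / denominator)
    }
}

fn main() {
    for (a, b) in [(6, 3), (1, 0)] {
        // Ok and Err can be named directly, without the Result:: prefix.
        match divide(a, b) {
            Ok(value) => println!("{} / {} = {}", a, b, value),
            Err(message) => println!("{} / {} failed: {}", a, b, message),
        }
    }
}
```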
An example of this is opening a file, which returns a std::fs::File on success or a std::io::Error on failure. The error’s kind() method returns a std::io::ErrorKind describing what went wrong. Let’s see a very simple example of it in use:
use std::fs::File;
use std::io::ErrorKind;

fn main() {
    let file_descriptor_result = File::open("filename.txt");
    let _file_descriptor = match file_descriptor_result {
        Ok(file) => file,
        Err(error) => match error.kind() {
            ErrorKind::NotFound =>
                panic!("File does not exist"),
            other_error =>
                panic!("Problem opening the file: {:?}", other_error),
        },
    };
}
In this example, file_descriptor_result is of type Result<File, std::io::Error> and so we unwrap it with the outer match and return the File from the match block to be assigned to _file_descriptor [2]. In the error case, however, we use an additional nested match on the ErrorKind returned by error.kind() and handle NotFound as a special case. In both error cases here we just panic!() for simplicity, but of course in reality you’d likely do something more sensible.
The problem with error returns has traditionally been their verbosity over exception-based mechanisms. This is because the return value of the function must often be overloaded to specify both a return value and an error code, such as using negative integers for error codes; or the result of the operation needs to be passed out by reference using a parameter, which is both confusing and irritating. The use of Result with match, however, means that both success and error cases can be easily stored in the same type, and handled comparatively compactly without disturbing the flow of the function too much.
Whether this is overall better or worse than exception-based systems depends, in my view, on the likelihood that you’re going to want to handle the errors. If your logic is that you want to execute a lot of statements, and only handle errors very generically outside the whole sequence of instructions, then exception-based code is hard to beat for readability: your whole function is simply wrapped in a single try…catch (or try…except in Python).
However, it’s quite common that you’d like to handle particular error cases differently, and in these cases your list of operations is often broken up into lots of more granular try…catch blocks. In these cases, Rust’s style of error handling has the edge in readability, in my view.
How common these cases are depends partly on what your code is doing, but also depends on your philosophy: do you believe code should be written to be internally resilient against error cases, retrying operations or creating missing files? Or fail fast and let some other system resolve the situation? This is a matter of opinion, although I will say from experience that exception-based mechanisms give you a gentle push towards handling lots of errors in the same way, and this isn’t always ideal. Sometimes it can mean systems aren’t as robust as they could have been with more intelligent self-correction mechanisms, and sometimes it just means the error messages in your log file are unhelpfully unspecific. So fine-grained error-handling definitely has its advantages, even if it’s a little more up-front effort sometimes.
Result also offers a couple of handy methods called unwrap() and expect(), which can simplify things further. The unwrap() method is a shortcut for “return the result in the success case, and panic!() in the error case”. This allows your code to trivially make a recoverable error into an unrecoverable one, if that makes sense in the context. The expect() method does the same, but it takes an error message as a parameter which will be passed to panic!(), so this is almost certainly going to be the preferable option as it leads to better diagnostics.
let file_descriptor = File::open("filename.txt")
    .expect("filename.txt missing from install package");
As well as the plain unwrap() method to panic in the error case, there’s also unwrap_or_else() to trigger your own code. To use this you’ll need to create a closure. We’ll talk more about these in a later article, but they’re essentially inline anonymous functions that some languages call “lambdas”, and I thought it might be nice to see an example of their use here.
let mut file_descriptor = File::open("filename.txt")
    .unwrap_or_else(|error| {
        if error.kind() == ErrorKind::NotFound {
            std::fs::OpenOptions::new()
                .create(true).write(true).read(true).open("filename.txt")
                .expect("Failed to create filename.txt")
        } else {
            panic!("Failed to open filename.txt: {:?}", error);
        }
    });
The |error| {...} syntax declares a closure taking error as a single parameter, and the semantics of unwrap_or_else() are that in the error case, the error value is passed to that closure. If the error is NotFound then we try creating the file. If that succeeds, the value is returned from the closure, and hence returned from unwrap_or_else() in the same way as the success case. If the file creation fails then the expect() will panic for us [3].
I’m not sure what I think of this use of closures with regard to readability, but I’m going to reserve judgement on that until I’ve played with the language in real situations somewhat. It’s certainly a useful option to keep in mind for now.
When something goes wrong in an inner function, it’s quite a common pattern to want to pass the error back out again. This is easy enough to do by handling the error using one of the mechanisms above and using an explicit return statement to return it back out. This is such a common case, however, that there’s a shorthand for it: the ? operator.
Let’s start with an example of a function that does this without the ? operator.
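A sketch of how such a function might look, assuming get_file_content() takes a filename and returns Result<String, io::Error>. The main() wrapper and the filename.txt path are illustrative additions, and the details may well differ from the original listing.

```rust
use std::fs::File;
use std::io::{self, Read};

fn get_file_content(filename: &str) -> Result<String, io::Error> {
    // First opportunity to fail: opening the file.
    let mut file = match File::open(filename) {
        Ok(file) => file,
        Err(error) => return Err(error), // explicit early return of the error
    };

    // Second opportunity to fail: reading the content.
    let mut content = String::new();
    match file.read_to_string(&mut content) {
        Ok(_) => Ok(content),
        Err(error) => Err(error),
    } // no semicolon, so this match is the function's return value
}

fn main() {
    match get_file_content("filename.txt") {
        Ok(content) => println!("Read {} bytes", content.len()),
        Err(error) => println!("Failed to read file: {}", error),
    }
}
```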
Our function has two opportunities to return an io::Error: when opening the file, and when reading it. If you’re not used to the abbreviated return syntax yet, note that the second match in get_file_content() isn’t followed by a semicolon, so the return of this match becomes the return of the function as a whole.
From what we’ve seen in the articles so far, hopefully this is fairly comprehensible. Now let’s see how we can make the function more concise using ?.
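A sketch of get_file_content() written with the ? operator, again assuming it takes a filename and returns Result<String, io::Error>. The imports and main() are included only to make the example self-contained.

```rust
use std::fs::File;
use std::io::{self, Read};

fn get_file_content(filename: &str) -> Result<String, io::Error> {
    // On Err, `?` returns that error from get_file_content() immediately;
    // on Ok, it evaluates to the wrapped value.
    let mut file = File::open(filename)?;
    let mut content = String::new();
    file.read_to_string(&mut content)?;
    Ok(content)
}

fn main() {
    match get_file_content("filename.txt") {
        Ok(content) => println!("Read {} bytes", content.len()),
        Err(error) => println!("Failed to read file: {}", error),
    }
}
```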
When you compare the two versions, it’s pretty straightforward what this is doing — either return the successful result from the expression, or return the error result from the function as a whole. The operator is a little subtle, but it doesn’t take long to get used to spotting it, and because the semantics are so simple then the saving in verbosity comes at essentially zero cost. The convenience of it also means it actively encourages errors to be propagated to be properly handled, rather than silently swallowed just because it makes life easier for the programmer in that moment. Come on, we’ve all been there, telling ourselves that we’ll go back and put proper error handling in later — maybe sometimes you even did, but I’ll put money on the fact it wasn’t every time.
For further brevity, it’s not much of a stretch to chain these calls together, avoiding the need for the filehandle to be stored in a variable.
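A sketch of the chained form, under the same assumptions about the signature of get_file_content(). The first ? applies to the result of File::open() and the second to the result of read_to_string().

```rust
use std::fs::File;
use std::io::{self, Read};

fn get_file_content(filename: &str) -> Result<String, io::Error> {
    let mut content = String::new();
    // The File is a temporary here: it is opened, read from and then dropped
    // at the end of this statement without ever being named.
    File::open(filename)?.read_to_string(&mut content)?;
    Ok(content)
}

fn main() {
    match get_file_content("filename.txt") {
        Ok(content) => println!("Read {} bytes", content.len()),
        Err(error) => println!("Failed to read file: {}", error),
    }
}
```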
As it happens, in this particular case Rust also provides a function fs::read_to_string(<filename>) which removes the need for this entire function. You can’t get more concise than that.
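For completeness, a minimal sketch of calling it directly, where filename.txt is just a placeholder path:

```rust
use std::fs;

fn main() {
    // Opens the file, reads it all into a String and closes it again,
    // returning an io::Error if any step fails.
    match fs::read_to_string("filename.txt") {
        Ok(content) => println!("Read {} bytes", content.len()),
        Err(error) => println!("Failed to read file: {}", error),
    }
}
```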
As you’d expect, you can only use ? in functions whose return type is compatible. You can also use it with Option<T> values, where the None is the early return as opposed to Err. Once again, this can only be used in a function which itself returns Option<T>. One interesting wrinkle is that in Rust the main() function can be written to return Result<(), E>, which allows the ? operator to be used there too. In this case, if main() returns Ok that will be translated to an exit status of 0, whereas Err will be translated to a non-zero exit status, in keeping with usual conventions [4].
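A short sketch of both ideas. The first_char_upper() and add_strings() helpers are hypothetical names invented for this example, and ParseIntError is used as the error type simply because it is what str::parse() produces for integers.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: returns the first character of a string in upper case,
/// or None if the string is empty.
fn first_char_upper(text: &str) -> Option<char> {
    let first = text.chars().next()?; // early return of None on an empty string
    Some(first.to_ascii_uppercase())
}

/// Hypothetical helper: parses two integers and adds them.
fn add_strings(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x: i32 = a.parse()?;
    let y: i32 = b.parse()?;
    Ok(x + y)
}

fn main() -> Result<(), ParseIntError> {
    println!("{:?}", first_char_upper("rust"));
    println!("{:?}", first_char_upper(""));

    // `?` in main(): an Err here would end the program with a non-zero exit status.
    let total = add_strings("40", "2")?;
    println!("{}", total);
    Ok(())
}
```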
The other aspect I’d like to touch on in this article is how Rust associates data types and methods together. This is a core tenet of object orientation, and whilst Rust may not be an “object oriented language” by everyone’s definition, it certainly offers some of these principles. We’ll look at traits in a future article, which you can loosely think of as a form of inheritance.
For now, however, let’s look at methods. Let’s take a classic object-orientation example of shapes: we’d like to define a series of objects representing different shapes, and for each one we’d like a constructor to create one, and methods to return the total perimeter and area of the shape, as well as a method to translate the entire shape by a fixed offset. Let’s take Rectangle as an example.
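A compact sketch of what such a Rectangle might look like. The method names new(), area(), perimeter() and translate() come from the discussion in this article, but the field names, the use of (f64, f64) tuples for coordinates and the width() and height() helpers are assumptions made for illustration.

```rust
#[derive(Debug)]
struct Rectangle {
    bottom_left: (f64, f64),
    top_right: (f64, f64),
}

impl Rectangle {
    /// Constructor: the two corners may be passed in any order and are
    /// normalised so the fields always hold bottom-left and top-right.
    fn new(corner_a: (f64, f64), corner_b: (f64, f64)) -> Rectangle {
        Rectangle {
            bottom_left: (corner_a.0.min(corner_b.0), corner_a.1.min(corner_b.1)),
            top_right: (corner_a.0.max(corner_b.0), corner_a.1.max(corner_b.1)),
        }
    }

    fn width(&self) -> f64 {
        self.top_right.0 - self.bottom_left.0
    }

    fn height(&self) -> f64 {
        self.top_right.1 - self.bottom_left.1
    }

    /// Borrows the instance immutably.
    fn area(&self) -> f64 {
        self.width() * self.height()
    }

    /// Borrows the instance immutably.
    fn perimeter(&self) -> f64 {
        2.0 * (self.width() + self.height())
    }

    /// Borrows the instance mutably, because it moves both corners.
    fn translate(&mut self, dx: f64, dy: f64) {
        self.bottom_left.0 += dx;
        self.bottom_left.1 += dy;
        self.top_right.0 += dx;
        self.top_right.1 += dy;
    }
}

fn main() {
    // Corners deliberately passed in the "wrong" order.
    let mut rect = Rectangle::new((3.0, 4.0), (0.0, 0.0));
    println!("{:?}", rect);
    println!("area = {}, perimeter = {}", rect.area(), rect.perimeter());
    rect.translate(1.0, -1.0);
    println!("{:?}", rect);
}
```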
As you can see in this example, we use an impl block to define functions in the context of a structure. If you like, you can use multiple impl blocks for a single type, to build up methods from different parts of code. In this example that’s not particularly useful, but we’ll see where that becomes important in a later article on generics and traits.
Within an impl block we use standard syntax to declare functions, which are of two types:
Associated functions, which don’t take self as a parameter. The new() method above is one of these: in this case it’s used as a constructor, which normalises the two corner coordinates that are passed so they always refer to the bottom-left and top-right corners. The use of the name new in this way is a strong convention, but there’s nothing special about it to the language itself; it’s just another associated function.
Methods, which take self as their first parameter. The &self syntax here is just shorthand for self: &Self, where Self is always the type on which the method is being defined. The existence of that as the first parameter is what makes a function into a method, and this can only be done within the context of a type. The area() and perimeter() methods borrow an immutable reference to an instance of Rectangle, much the same as const methods in C++, and translate() borrows a mutable reference because it needs to modify the instance.
That’s about it for methods at this stage. We’ll revisit them in a future article when we discuss traits, but the basics are pretty straightforward, as we’ve seen.
I must confess to having mixed feelings about the error handling in Rust. My C programming days are quite a long time ago now, and I’ve been using primarily C++ and Python since then, which are both fairly exception-oriented languages. I’m well aware of the risks and overheads associated with exception-based mechanisms, and I can understand why a system-level language like Rust wouldn’t want to impose the overhead of stack unwinding on the developer. In general, I’m going to reserve judgement until I’ve used this more widely in real code, when I’ll be in a much better position to judge how much additional effort it is to make sure the errors are handled appropriately. That said, even if it is more effort, I think I will be more confident that my application will handle exceptional cases more appropriately than some top-level generic error handler ever could.
In terms of methods, I don’t have any strong feelings, to be honest. The impl { … } syntax neither inspires nor repulses me; it seems like a reasonable enough solution. The fact that methods can be added to a type away from the definition potentially makes it a little harder to consider the interface of a given object, but the fact that the language allows it doesn’t prevent developers from adopting conventions like defining everything for a type together, so I don’t regard this as a major flaw. Once again, however, I’ll reserve judgement until I’ve structured a larger codebase and seen how this works out in practice.
That’s it for this article, I hope it’s been useful and/or interesting. I’m hoping to have another one written up soon, as I’d like to finish the sweep of what I consider to be the important Rust features so I’m in a better position to start planning larger projects in it.
[1] Rust has the same convention as C and C++ (and very possibly others) in that _main() is the “real” main() function which is invoked to start the executable. It does some basic setup to prepare the language environment before invoking your own main() function.
[2] The leading underscore just suppresses a warning about it being an unused variable.
[3] If you’re wondering about all that OpenOptions::new() stuff, that was the only way I could find to create a new file which was already open in read/write mode. File::create() is available, but opens in write-only mode. There’s a new function File::create_new() which does it, but that’s not yet made it into a release version of the language at time of writing.
[4] If you want more control over the specific main() exit status, you can instead have main() return std::process::ExitCode.