Uncovering Rust: Build and Packaging

12 May 2023 at 9:26AM in Software

Rust is a fairly new multi-paradigm systems programming language that claims to offer both high performance and strong safety guarantees, particularly around concurrency and memory allocation. As I play with the language a little, I’m using this series of blog posts to discuss some of its more unique features as I come across them. In this one I’m looking at Cargo, Rust’s build and packaging system.

This is the 6th of the 7 articles that currently make up the “Uncovering Rust” series.

So far in this series of articles I’ve been focusing on the language features themselves, as I think this is probably the most interesting aspect to people considering whether to investigate a language further and try using it in their own projects. In this article, however, I wanted to step back and look at the tooling around Rust — specifically Cargo, which is the tool that you’ll likely use most often.

Cargo is Rust’s package manager, but it also acts as a frontend to rustc which is the actual compiler. There’s nothing to stop you using these tools directly, of course, but I’d strongly advise starting out with Cargo and only delving into the other tools as and when you need to.

Before we look at Cargo itself, though, first we’ll see how Rust projects are structured.

Packages, Crates and Modules

The unit at which you define your project is a package. There is a way that we can work with multiple packages at a time as well, using something called workspaces, but I’m going to cover that later in the article. For right now, let’s assume that a package represents a whole project.

A package consists of one or more crates, which are the units of code distribution. Each crate is either a binary crate, which provides executables and contains at least one main() function, or a library crate, which provides libraries that can be used by other code. A package may contain at most one library crate.

A crate corresponds to a target that the Rust compiler creates, but for the usual code organisation reasons it can be subdivided across multiple modules. Often a module corresponds to a source file, although unlike Python the simple existence of a source file doesn’t create a module — each crate has a single root module, and that must explicitly define or import the other modules that comprise the crate. Modules can be recursively defined down the tree as needed, and they can be declared inline within another source file instead of being split out separately.

I love a diagram, so here’s an illustration of the structure of a possible package.

Cargo Package Structure

Package Manifest

At the top level of a package there’s a file Cargo.toml which is the manifest — it specifies where to find the source files to create the package, and which dependencies are required to build it. Here’s a hypothetical example of such a file to give you an idea.

[package]
name = "andy-module"
version = "1.0.12"
edition = "2021"
authors = ["Andy Pearce <andy@andy-pearce.com>"]
categories = ["command-line-utilities"]
description = "Does something miraculous yet unknowable"
documentation = "https://www.andy-pearce.com/docs/andy-module/"
keywords = ["andy", "awesome"]
license = "MIT"
repository = "https://github.com/example/example"
rust-version = "1.68.2"

[dependencies]
log = { version = "0.4.4", optional = true }
regex = { git = "https://github.com/rust-lang/regex.git" }

[features]
default = ["application"]

I’m not going to run through what all the fields mean, but most of them are fairly self-explanatory. The important ones are name and version, which are mandatory — the rest you can look at in the Cargo manifest documentation if you want to.

Repository Structure

Compilation of a crate starts at a single source file known as the crate root — this is a source file which becomes the root module of the crate, and which contains references to the other modules that comprise it. Relative to the Cargo.toml file the default root module is src/main.rs for binary crates and src/lib.rs for library crates — if these are found, they’re assumed to be crates with the same name as the package. Additional binary crates can be added by creating entries in a src/bin directory, and any of this can be overridden by adding entries to Cargo.toml if you don’t like the defaults.

Compilation always starts from the crate root, and additional modules are included with the mod <name> directive. The contents of the module can be included inline with curly braces, or it can refer to another source file which defines the module.

// Inline module
mod foo {
    fn some_function() {
        // ...
    }
}

// External module -- expect a `bar.rs` file
mod bar;

Modules can be nested in both cases — where modules are defined externally, a nested module is expected to be in a subdirectory with the same name as the parent module.

Putting this all together, we could have a repository that looked like this:

vectordraw/
├── Cargo.lock
├── Cargo.toml
└── src/
    ├── lib.rs
    ├── main.rs
    ├── renderers.rs
    ├── renderers/
    │   ├── screen.rs
    │   └── pdf.rs
    ├── shapes.rs
    ├── shapes/
    │   ├── rectangle.rs
    │   ├── circle.rs
    │   └── triangle.rs
    └── bin/
        ├── convertpng/
        │   ├── main.rs
        │   └── pngcanvas.rs
        └── convertpdf.rs

Let’s assume the package is named vectordraw in Cargo.toml, and let’s also assume that no further crates are defined explicitly in there. Cargo’s target autodiscovery conventions will therefore create the following crates:

Type      Name         Crate root
Library   vectordraw   src/lib.rs
Binary    vectordraw   src/main.rs
Binary    convertpng   src/bin/convertpng/main.rs
Binary    convertpdf   src/bin/convertpdf.rs

The output files of the binary crates will be named as you’d expect, and the library crate will generate a file libvectordraw.rlib — this is Rust’s equivalent of a libfoo.a file in C/C++, used for static linking. It’s also possible to generate dynamic libraries [1] if you want to, even ones that can be used by other languages, but you can go read the documentation if you want to do that.

For completeness, here’s an equivalent of the earlier diagram updated to show the concrete names from this hypothetical vectordraw package.

vectordraw Package Structure

Compilation Process

At this point it’s easy to understand the process the compiler goes through to compile a crate:

  1. Start at the crate root module.
  2. Process the source file, adding any inline modules and making a list of external modules mentioned.
  3. For an external module abc the compiler looks for abc.rs in the same directory as the crate root [2].
  4. Process each found module source file, adding any inline modules and making a list of external modules mentioned.
  5. For a submodule xyz of module abc, look for abc/xyz.rs at the same level as abc.rs.
  6. Recursively continue from step 4 until all modules have been found.

Modules and Encapsulation

Now we’ve looked at the structure of a repository and how packages are comprised of crates, and crates of modules, we’ll focus in on modules and their role in providing encapsulation.

Privacy

By default, everything defined within a module is private — that means it’s available to other code within the same module, or any of its descendant submodules, but won’t be available outside it. To change this you can just add the pub qualifier to the start of any definition to make it public outside of the module. The path to the item must also be public — for example, if a parent module is private then even a pub member of a descendant module won’t be visible externally. In this way it works very much like filesystem permissions (at least on Unix).
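
To make this concrete, here’s a minimal sketch — the module and function names are my own invention:

mod outer {
    pub mod inner {
        pub fn visible() {}   // reachable from outside as outer::inner::visible()
        fn hidden() {}        // private to `inner` and its submodules
    }

    mod secret {
        pub fn trapped() {}   // pub, but the path through private `secret` blocks it
    }
}

fn main() {
    outer::inner::visible();      // OK -- every step of the path is public
    // outer::secret::trapped();  // error: module `secret` is private
}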

That’s about as simple as it gets, really, except there are a few wrinkles which it’s worth being aware of, which are listed below.

Structure Fields
If a struct is pub then the definition of the structure is made public, but not the fields within it — each of those needs its own pub specifier to be made visible outside of the module in which it’s defined. Note that it’s the entire module which has access to the private fields, not just any methods added with an impl block.
Methods
The impl block associated with a type always has the same visibility as the type itself, so you don’t write pub impl Type {}. However, each individual method is private by default unless explicitly marked pub.
Scope Specification
It’s possible to supply an argument to pub to specify the scope in which it should be public. For example, pub(crate) makes an item public throughout the current crate, and pub(in <path>) makes the item public in a specified ancestor of the current module — pub(super) is a shorthand for the parent module. Note that these arguments can only ever further restrict access to the item, not grant access that would otherwise be denied.
Enumerations
Making an enum public also makes its variants public. I expected this could be overridden by adding pub(in self) to the variants, but I got an error about unnecessary visibility qualifier when I tried this. I don’t think this is an issue — I think that enumerations with private variants are asking for trouble, because external code wouldn’t be able to use match and be confident of catching all cases. But the error is a little odd — it seems like it’s being caught in the same case as a bare pub (which is unnecessary, as per the opening sentence of this paragraph).
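
Here’s a brief sketch pulling a few of these wrinkles together — the types and names are hypothetical:

mod shapes {
    pub struct Rectangle {
        pub width: f64,   // public wherever Rectangle itself is visible
        height: f64,      // private to the `shapes` module tree
    }

    impl Rectangle {
        // pub(crate) -- callable anywhere in this crate, but not from other crates
        pub(crate) fn new(width: f64, height: f64) -> Rectangle {
            Rectangle { width, height }
        }

        pub fn area(&self) -> f64 {
            self.width * self.height   // private field is fine: same module
        }
    }
}

fn main() {
    let rect = shapes::Rectangle::new(3.0, 4.0);
    println!("width {} area {}", rect.width, rect.area());
    // println!("{}", rect.height);   // error: field `height` is private
}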

Paths and Use

Now that we’ve looked at controlling what’s visible outside a module, let’s consider how you refer to it.

Every module forms its own namespace which has an absolute path starting with crate. For example, if the crate root defines a module called firstmod, and this module defines a submodule called secondmod, and that defines a function myfunction() then the absolute path to the function would be crate::firstmod::secondmod::myfunction(). Starting the path with the literal crate always refers to an absolute path within the current crate. To refer to names in external crates, just replace crate with the name of that crate, and the rest of the path is interpreted in the same way.

It’s also possible to use a relative path which can start with:

  • self to refer to the current module.
  • super to refer to the immediate parent module.
  • The name of a module defined in the current module.

You can always use these full names to refer to items defined in other modules, but of course this can get tedious quite quickly. The use directive pulls items into the current namespace — specifying use one::two::three will pull three in. If module three defines a function myfunction(), then after that use directive it can be referred to as three::myfunction().
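
For example, here’s a minimal sketch of that in practice — the module names are invented for illustration:

mod one {
    pub mod two {
        pub mod three {
            pub fn myfunction() {
                println!("called through a shortened path");
            }
        }
    }
}

use crate::one::two::three;   // pull `three` into the current namespace

fn main() {
    three::myfunction();                    // thanks to the `use` above
    crate::one::two::three::myfunction();   // the absolute path still works too
}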

A handy detail that’s worth remembering is that there’s a shorthand for importing multiple things, using curly braces:

use std::collections::{HashMap, VecDeque, BinaryHeap};

When using use with functions, it’s idiomatic Rust to pull in the parent namespace, so you still have a qualified function name as a hint that it’s not defined locally. For types, however, it’s more normal to pull them in directly — an example would be use std::collections::HashMap. This raises the possibility of namespace conflicts, however — let’s say some third party module defines its own HashMap. Of course, you can pull in the parent module and disambiguate the references that way, but you can also use the as keyword to rename an item in the local context, such as use debugcollections::HashMap as DebugHashMap.

These semantics are similar to some other languages, such as Python. One detail that’s important to note is that the scope of the use is only the current module — any submodules will have to have their own use if they want the name, or pull it in from the parent with use super::name.

One last thing to address is that Rust’s privacy-by-default model applies to items pulled in with use as well — items added to the namespace of a particular module with use are private and can only be referred to by the same module or its submodules. However, it’s possible to write pub use, which makes the names public instead — this is known as re-exporting. This is typically useful to present a public interface from a module that differs from its private internal structure, by re-exporting items from some nested module tree at the crate root.
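
Here’s a minimal sketch of re-exporting — the module tree is invented for illustration:

mod internals {
    pub mod deeply {
        pub mod nested {
            pub fn handy() {
                println!("re-exported for a flatter public API");
            }
        }
    }
}

// Users of this crate can now call mycrate::handy() without ever
// knowing about the private `internals` module tree.
pub use crate::internals::deeply::nested::handy;

fn main() {
    handy();   // the re-exported name works locally too
}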

Cargo Tooling

When you’re working on a Rust project, you’re probably almost exclusively going to work through the cargo tool. This has over forty subcommands, but to give you a flavour of the main functionality the two sections below list the ones that I felt were the most useful.

Build and Test

The compiler for Rust is called, creatively enough, rustc. It handles both compilation and linking in a single tool. As with many modern languages, Rust also defines its own format for documentation strings within the code, and provides a tool called rustdoc to build documentation from them, similar to Doxygen, Javadoc and QDoc.

The cargo command provides a useful wrapper around these and other tools, however, and generally it’s the only interface you’ll need most of the time. The main commands related to the build and test cycle are below.

cargo build
Compiles the whole project, including fetching and building any dependencies, and creates any targets (libraries and executables). By default this produces debug targets, but you can add --release to build with different defaults suitable for release code — these are optimised for performance rather than debugging. The exact settings used for each profile can be changed in Cargo.toml with a [profile.X] section. As with most modern compilers, the targets are generated in a separate subdirectory, in this case called target by default.
cargo check
Confirms that code would compile, but doesn’t actually produce any targets. This is quicker than actually doing a build, and is useful as a quick sanity check.
cargo run
This is a convenience for cargo build followed by running the binary just produced. It’s slightly less convenient if your package produces multiple binaries, because then you need to specify which one to run with --bin.
cargo fix
For compilation warnings and errors where the compiler has itself suggested a fix, this command applies those fixes. This is really convenient for things like semicolons in the wrong place or a missing type annotation somewhere.
cargo clean
Remove target output directories.
cargo test and bench
These run any tests or benchmarks, respectively, which have been added. I’m going to look at writing unit tests a little later in this article.
cargo doc
Runs rustdoc to generate the HTML documentation. This looks for special /// and //! comments within the code, and can also build from Markdown source files as well.

Package Maintenance

As well as commands to build and run code, cargo also provides commands to update the manifest file, as well as build and distribute packages. I’ve just included a flavour of these below to give you an idea.

cargo new and init
The cargo new command creates a new Rust project directory structure, with a new Cargo.toml and a default crate root source file. By default this creates a binary crate structure, or you can pass --lib for a library. It also initialises source control in the directory, unless it’s already a subdirectory in source control. All of these details and more can be tweaked with command-line options, as you’d expect. The cargo init command is the same, except it does so in the current directory instead of creating a new one.
cargo add and remove
Adds and removes dependencies from the Cargo.toml file. This is a convenience over editing the file yourself, and validates that the requested crate exists, and handles minutiae such as consistent style and keeping the list of dependencies in sorted order.
cargo install and uninstall
Installs (or uninstalls) applications distributed as binary crates — this can be from a local repository or the remote registry.
cargo package
Build a distributable .crate file containing the source files for the current package, which will be stored in target/package.
cargo publish
This uploads the current package to the registry — this is https://crates.io by default, but companies and other groups can set up their own registries if they wish. We’ll talk about publishing towards the end of this article.

Release Profiles

The compiler settings used to build are specified by the profile in use. There are four builtin profiles:

dev
For development builds — no optimisations, to make debugging easier.
release
For production code — optimisations enabled, no debugging symbols included.
test
For running unit tests — by default it’s the same as dev.
bench
For running benchmarks — by default it’s the same as release.

As well as these profiles, you can also add your own custom named profiles if you like.

For each profile, there are a whole lot of settings you can tweak from the defaults if you want to. I’m not going to go through the full list, but here are some examples.

opt-level
Specifies the level of optimisation from 0 for no optimisation to 3 for full optimisation for time, or s and z for different levels of optimisation for size instead of time. A separate lto option can be used to further enable LLVM’s link time optimisations.
debug
Controls the level of debugging info, where 0 is none and 2 is full debug info.
debug-assertions
Whether to include uses of the debug_assert! macro.
overflow-checks
Whether to include checks for integer overflow — if checks are included, overflow will trigger a panic.
panic
Controls the strategy used to handle panics. The default in all builtin profiles is unwind, which unwinds the stack. However, setting this to abort will cause a panic to immediately abort the process.
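
To give a flavour, a sketch of overriding some of these in Cargo.toml might look like this — the values are illustrative, not recommendations:

[profile.release]
opt-level = 3        # full optimisation for speed
lto = true           # also enable LLVM link time optimisation
panic = "abort"      # abort the process on panic rather than unwinding

[profile.dev]
overflow-checks = true   # panic on integer overflow in debug builds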

I’d imagine the defaults are probably fine for a lot of projects, but it’s always nice to know that it’s easy to go back and tweak things later if you need to. The builtin support for running benchmarks also makes it quite convenient to do before and after comparisons of the performance under different compilation options, which also makes life easier.

Workspaces

You may recall near the start of this article I mentioned that you can only have a single library crate in a package. This might seem like a bit of an annoying limitation, but actually there’s a feature called workspaces which allows you to work on multiple such crates at a time.

A workspace is just a level of organisation above a package. It has its own Cargo.toml file, but instead of containing a [package] section it contains a [workspace] section which defines the packages which comprise the workspace. These are located within subdirectories, and each package has its own Cargo.toml file as normal.
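
As a minimal sketch, the top-level Cargo.toml of a workspace might look like this — the member names are invented:

[workspace]
members = [
    "vectordraw-core",   # package containing the library crate
    "vectordraw-cli",    # package containing a binary crate using the library
]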

The workspace is just a local grouping — you don’t publish an entire workspace, you just publish individual crates within it. This makes it easy to build, say, a large application with its libraries in packages all in the same repo, but still publish these for use in other projects if you want to. Alternatively, if you want to move those libraries out into their own repositories later (or merge in separate ones) then your code is already perfectly structured to make this easy.

I’ve always been a big proponent of developing an application as a series of libraries, as much as possible, keeping each one well encapsulated and designing its external API as carefully as if you were going to release that library as an artifact on its own. I believe this carries many benefits in terms of encouraging testable interfaces, minimising messy dependencies and minimising complexity. Hence, I’m pleased to see the Rust tooling has such explicitly good support for such a model.

Thoughts on Cargo

Compared with both Python and C++, I found Cargo a breath of fresh air. The consistent and clean interface makes building and publishing packages a breeze, and the fixed repository location conventions relative to the manifest file mean there’s very little that needs to be specified up front. The integration with the package registry, and the consistent use of a single registry by the whole community, means that dependencies are simple to deal with. Discoverability may be an issue, but I’m not going to be in a position to have an informed opinion on that until I’ve spent more time developing larger applications in Rust.

The use of profiles to group settings together is handy, and the builtin profiles for testing and benchmarking are one less thing to worry about when you want to extend your build in these areas.

From what little I know of Go’s tooling I suspect developers accustomed to that ecosystem will find Cargo more or less equivalent to the more modern versions of that language — as I understand it, dependencies were a bit of a pain there until about 4-5 years ago, but things seem to have settled down a bit. Also, the need to place things in a rigid hierarchy under $GOPATH was always rather irritating, but I gather that requirement disappeared a few years ago as well.

Automated Tests

Rust has some integral tooling and conventions to follow when writing automated tests, at which we’ll take a quick look in this section. Rust’s builtin testing facilities are quite basic, by design, lacking both fixtures and mocking. There are third party crates which add these features if you need them.

Writing Tests

To start with, let’s see what lib.rs source file we get when we create a new library crate using cargo new --lib.

pub fn add(left: usize, right: usize) -> usize {
    left + right
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn it_works() {
        let result = add(2, 2);
        assert_eq!(result, 4);
    }
}

This is useful because it includes an initial unit test function, so it shows us how they’re written. In this case you can see it’s included in a module tests and the it_works() function itself is annotated with #[test]. This annotation is how Rust’s test runners discover tests — any function annotated in this way is assumed to be an automated test, where the name of the function becomes the name of the test. This is important because unit test code may contain other functions which shouldn’t be run as tests.

As an aside, we’ll look at the purpose of #[cfg(test)] in a moment, and the use super::* just brings everything from the parent module into the namespace of the tests submodule, as described previously in this article.

Tests can indicate a failure by triggering a panic, and indicate a pass by not doing so — this is convenient as we can use assert!(), assert_eq!() and assert_ne!() to implement our test results. Admittedly this set is considerably less rich than some other languages’ unit testing frameworks, although there are several crates available which implement more fluent assertions if you’d like to use them. Also, all these assertions can have an error message passed to them, and they apply format!() to that error in the same way as println!() does, so it’s not too hard to write your own helpful error messages.

This is all very well, but what if the behaviour that you want to test is that your function does panic? Well, if you add an additional #[should_panic] annotation to your function then it inverts the normal behaviour — the test will fail if the code does not panic. To rule out accidental panics of a different nature, #[should_panic] can accept an expected argument which specifies that the supplied text must form a substring of the message passed to the panic.
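
Here’s a short sketch of how that looks — the function and messages are invented for illustration:

pub fn divide(num: u32, den: u32) -> u32 {
    if den == 0 {
        panic!("denominator must be non-zero");
    }
    num / den
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    #[should_panic(expected = "must be non-zero")]
    fn divide_by_zero_panics() {
        divide(1, 0);
    }
}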

As an alternative to a panic, a test can simply return a Result<T, E> to indicate the result of the test, where the Ok variant is a pass and the Err variant is a failure. Typically this might be Result<(), String> where the Err variant contains the string describing the failure condition.
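
A minimal sketch of that style, with an invented test case:

#[cfg(test)]
mod tests {
    #[test]
    fn parses_a_number() -> Result<(), String> {
        // The ? operator propagates a parse failure as a test failure
        let parsed: i32 = "42".parse().map_err(|e| format!("parse failed: {e}"))?;
        if parsed == 42 {
            Ok(())
        } else {
            Err(format!("expected 42, got {parsed}"))
        }
    }
}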

Organising Tests

Now we’ve seen how to write individual test functions, it’s interesting to take a quick look at how Rust allows them to be organised. Since the test detection system is flexible, where they’re located in the source tree is mostly convention, but as with all conventions it’s worth sticking to unless you have some sort of solid reason to deviate from it.

I’m going to assume you’re already somewhat familiar with the difference between unit and integration testing, and I’m not going to elaborate on these terms [3].

Unit Tests

For unit tests, the convention is to include a tests submodule within the module under test. You can see this in the default library example above. The #[cfg(test)] annotation is how Rust performs conditional compilation, in this case only including the module when compiling for testing. Another example of the use of this type of annotation would be something like #[cfg(target_os="linux")] for platform-specific logic.

One wrinkle that’s worth teasing out is that because tests is a submodule, it has access to even the private functions within the parent module. Whether it’s good practice to test private functions at all is a matter of opinion, and I’m sure it’s not hard to find people arguing both sides if you poke around online. If you follow this convention of making tests a submodule, you at least have the option to do so, and you can make your own decision whether that’s a good or bad idea for your own code.

I’ve mixed feelings about this idea of colocating tests and the functions that they exercise, as I’ve always followed the practice of keeping them separate. However, I’m going to run with it for a while and see how I find it — I suspect it’s one of those things that doesn’t matter a great deal one way or the other, as long as you know what to expect. The use of conditional compilation means there’s no actual impact on the built artifacts, so it’s very much just a matter of convenience and preference.

Integration Tests

In contrast, integration tests are generally stored in a top-level tests/ directory next to src/. Each source file creates its own crate, expected to be a suite of tests, so it must import the libraries explicitly and will only have access to the public members. There’s no need for conditional compilation with #[cfg(test)] because Cargo will only compile and run these files when cargo test is used.
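
As a minimal sketch, assuming a library crate named adder exposing the add() function from the earlier example, an integration test file might look like this:

// tests/addition.rs
use adder::add;   // import the library crate under test explicitly

#[test]
fn adds_two_and_two() {
    assert_eq!(add(2, 2), 4);
}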

Functions are still annotated with #[test] and success or failure can still be indicated in the same way. There isn’t really a great deal of difference between these test cases and the unit tests described above except for the scope of testing and the lack of access to private members.

It’s worth noting that because each source file is treated as a suite, creating submodules can be a little fiddly as you have to put these in a subdirectory. Instead of creating testlib.rs you can create testlib/mod.rs to still create a module called testlib — this is the older naming convention that Rust used to use, which has mostly been superseded but is still useful in this particular case.

It’s also important to note that binary crates can’t be imported, so cannot have integration tests. In general the main.rs for a binary crate should be kept pretty small and all the actual business logic pushed into a library, so it can be adequately tested.

Doc Tests

When you run cargo test, the unit and integration test results will be presented each in their own section. There’s also a third section which represents a type of test we haven’t talked about yet — these are doc tests.

When you write API documentation strings (i.e. comments starting /// or //!) for your functions, you might want to add example code blocks. If you do so, these are actually also tested when you run cargo test. I don’t think these are really intended for serious testing — they’re more to validate that the examples in the documentation haven’t become out of date and misleading, by checking them against the real code.
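
For example, the add() function from earlier might carry a documentation comment like this — assuming the crate is named adder, the example between the ``` markers is compiled and run by cargo test:

/// Adds two numbers together.
///
/// # Examples
///
/// ```
/// assert_eq!(adder::add(2, 2), 4);
/// ```
pub fn add(left: usize, right: usize) -> usize {
    left + right
}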

Executing Tests

Now we’ve seen how to write and organise tests, I guess we’d better see how to run them. When you execute tests using cargo test you have various options to control which tests are run and how they’re run, and we’ll take a quick look at these options here.

The first thing to note is that by default the tests are executed in parallel across multiple threads of execution. This means even a large test suite can be run quickly, but it does mean that if your tests have dependencies on shared resources, such as databases or files on disk, you have to be very careful that these don’t conflict when run in parallel.

If you have no choice but to use such conflicting resources, you can disable parallelism with cargo test -- --test-threads=1. You can also provide other values to have more fine-grained control over the parallelism.

The next issue is whether you want to see output from passing tests. By default output is captured and only shown if a test fails — you can override this with cargo test -- --show-output.

Finally, you’ll often want to run only a subset of tests. You can run cargo test <name> to apply a filter on the test name. You can only specify one such filter, but it uses a substring match on the test names, so the filter can resolve to multiple tests.

If you have particular tests that you usually want to skip, such as tests which take a long time or require specific manual setup, then you can annotate them with #[ignore] as well as #[test] and these will be ignored unless specifically included. You can run just the ignored tests with --ignored, or run them along with the others using --include-ignored.
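
A brief sketch of an ignored test, with an invented name:

#[test]
#[ignore]   // skipped by default; run with `cargo test -- --ignored`
fn expensive_end_to_end() {
    // ...long-running logic requiring manual setup...
}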

Community Testing Crates

Building on this basic support, there are additional crates which add more advanced facilities which developers keen on TDD may regard as essential. Here are a few salient examples:

Fixtures
The rstest crate provides facilities for convenient fixtures and dependency injection. You can create a function annotated with #[fixture], and then pass this into the tests to inject a dependency. You can also use a #[case()] annotation to parameterise a test — a new test case is created for each supplied combination of parameters. This is helpful for using a single test function to test code in the same way against a number of different inputs.
Mocking
Several options, the most popular seems to be mockall which allows mocking of almost any trait or structure. After creating a mock, you can add expectations of the calls that will be made on it and add behaviours such as returning particular values. It defines a rich library of predicates for placing constraints on argument values, and provides an #[automock] annotation for structures which can automatically generate a mock for it.
Serialising Tests
The serial_test crate adds a #[serial] annotation which forces all such tests to be executed sequentially even if multiple testing threads are used.
Property-Based Testing
If you’re a fan of property testing, sometimes known as generative testing and as originally implemented by the QuickCheck library for Haskell, then you may want to take a look at the Rust port quickcheck. For those who aren’t familiar, this type of testing involves declaring properties of the function you wish to remain true (e.g. it doesn’t crash, the return value is always lower than the parameter) and then confirming that these properties continue to hold across a large number of randomly generated inputs. The inputs are also “shrunk” to a minimal test case which causes a failure, to make debugging easier. You may also want to take a look at proptest, which is inspired by the Hypothesis library for Python, as it provides richer strategies for shrinking to a minimal test case, at the expense of slower run time for tests.

Thoughts on Testing

It’s good to see that Rust has got the basics of detecting and running tests built in, and the fact that tests are run in parallel by default is a really nice touch — needing to do something special to run tests in sequence is a useful nudge away from relying on external resources, which is often a sign of not only poor testability but also poor design [4].

I’m sure some people will think that the testing support is too minimal for a modern language. I would point to the fact that Rust is still comparatively young — it’s been eight years since its first appearance — and there are already mature facilities available in the community to add the features that people want. By way of comparison, mocking wasn’t pulled into the Python standard library until Python 3.3, which was released in 2012, over two decades after Python first appeared. So there’s still plenty of time for those community libraries to migrate into the standard library.

I must confess, I’m not a total adherent to TDD — at least not from a purist standpoint. What I mean by a “purist standpoint” is the suggestion that the developer should write a single test case, then write the minimal code for it to pass, and then add another test case and continue in this cycle until they have a full application. In my opinion this is inefficient and can push architectures into blind alleys, thus requiring large-scale refactors in the future when additional requirements are added. Often these requirements could easily have been factored in at the start to avoid the need for such sweeping revisions.

On this basis the Rust builtin support looks sufficient for the way that I tend to test code, at least, and there’s plenty of popular options to choose from in the community if you want to go further. If you find yourself needing very advanced approaches to testing in order to validate your application, I would say it’s worth considering whether these are just compensating for poor design or architectural decisions somewhere along the line, such as a badly chosen abstraction or encapsulation.

Of course, I may change my opinion after using Rust for more advanced tasks — that remains to be seen!

Publishing Crates

So, you’ve written some amazing library, you’ve tested it to within an inch of its life, and you’ve riddled it with helpful documentation. How do you release it to the world?

The answer, for most people, will be through the standard Rust registry, crates.io. Before you upload it, though, you’ll want to make sure that the [package] section in your Cargo.toml file has the right metadata. I’ve previously mentioned that the only mandatory fields in this file are name and version, but to publish a package you also need description and license specified — the license should be one of the identifiers from the SPDX list maintained by the Linux Foundation [5]. You can check out the Cargo documentation to see what other metadata you can specify. If you’re going to publish then you should probably at least consider specifying categories, to make your crates more discoverable.

Metadata all entered, you’re ready to upload. You’ll need a GitHub account, and then you’ll need to log in to crates.io and grab an API key which you pass to the cargo login command. This key will be stored in ~/.cargo/credentials for future use.

Because all the metadata is already there in Cargo.toml, publishing a new version of a crate is as simple as running cargo publish. You should be careful, however, because each version on the registry is permanent and immutable once it’s published. This is by design, to stop malicious actors from changing or removing versions of crates on which projects may have come to depend. You can use the cargo yank command to mark a version as deprecated, and this prevents new projects from using it, but doesn’t stop old projects already using it from downloading it or building with it.

To publish another version, just update the version field and repeat the process — Rust packages follow the usual semantic versioning rules with a major.minor.patch format.

Conclusion

That wraps it up for this article — hopefully it’s been a useful whirlwind tour of Cargo and how it’s used for the build process. I deliberately haven’t gone into the weeds of the underlying tools like rustc or rustdoc, and there are other peripherally useful tools like rustfmt for auto-formatting code — if you’re interested in these you can find lots of details in the Rust documentation.

Overall I like Cargo a lot — I believe it strikes an excellent balance between convenience and structure. Overly restrictive tooling is just irritating, such as early versions of Go requiring you to put all your source code under a single root directory — thankfully that was fixed in later versions. Visual Studio always used to be the prime example of this, where it made it unreasonably painful to stray outside of its walled garden IDE and compiler — thankfully it too is more flexible than it used to be (and I’ve also almost never had to use it in my career anyway).

At the other extreme, however, cobbling together an SDLC from the Wild West mishmash of tooling available to C++ programmers is equally annoying — I want to spend my time writing code, not configuring build and test systems. Python has historically also slipped a bit into this category, although there are now enough de facto standard tools that at least building, testing and formatting are solved problems. Distribution of Python apps is still a bit of a mess in my view, but at least it has a standard registry for libraries and tooling to fetch them, as Rust does.

That’s all I have for you in this article — in the next in the series, I’ll take a look at some of the more functional aspects of the language, such as iterators and closures. We’ll also take a look at Rust’s smart pointers and building structures that own memory.


  1. These will be *.so files on Linux, *.dll files on Windows and *.dylib files on macOS. 

  2. Instead of abc.rs it’s also possible to define a module in abc/mod.rs — this is an older naming convention that’s still supported. I wouldn’t recommend it because, as the documentation says, you end up with a lot of files called mod.rs, which makes navigating your editor tabs really annoying. 

  3. I’m well aware these terms are a little fuzzy, and some people may disagree on the specifics, but I think the contrast between them is clear enough for the purposes of the discussion here. As long as you understand the way things work, you can always choose to use them however you like. 

  4. If you’re not sure about this, I strongly suggest you watch an excellent talk by Michael Feathers from 2010 called The Deep Synergy Between Testability and Good Design. He explains why he thinks that code that’s hard to test often has architectural flaws which are nothing to do with the difficulty in testing it. 

  5. If you want to use a license which isn’t covered here, you can also put the text of it in a file (e.g. LICENSE) and specify that filename with license-file in Cargo.toml instead of using license.

The next article in the “Uncovering Rust” series is Uncovering Rust: Closures, Iterators and Smart Pointers