Nine Rules to Formally Validate Rust Algorithms with Dafny by Carl Kadie & Divyanshu Ranjan

The `internal_add` function tries to efficiently insert a new range of integers into an existing list of sorted and disjoint integer ranges. For example, if we started with `[101..=102, 400..=402, 404..=405]` and added `402..=404`, we expect a result of `[101..=102, 400..=405]`.

Ideally, I’d formally verify this algorithm with Rust-specific tools [1,2]. Those tools, however, seem hard to use. Instead, I chose Dafny. Dafny is a language and verification system. It is taught to undergraduates at universities around the world. It is used in industry. I find it to be addictively interactive and programmer friendly.

Aside: Dafny creator, Dr. Rustan Leino has a connection to Rust beyond the coincidence of his first name. He helped create Spec#, the first language to use a type system to avoid null pointers. Rust, of course, adopted this idea to great success.

This article covers rules 1 to 6. Part 2 will cover rules 7 to 9.

Before trying to prove the mathematical correctness of your algorithm, decide if the effort is worth the benefit.

Dafny is not Rust. Using Dafny requires porting algorithms of interest from Rust to Dafny. This port can miss details and introduce errors. Given this risk, should you use Dafny to verify Rust algorithms? I boldly claim that “it depends”.

• How important is your algorithm’s correctness? If you are printing a report and it looks right, it probably is right. The `internal_add` algorithm relates to a data structure that I’d like others to use with confidence, giving me extra motivation to verify it.
• Maybe all formal verification, with current tools, is too hard. I believe, however, that Dafny makes formal verification as easy as currently possible. You will find formally verifying code easier if you are already familiar with types (for example, from Rust) and recursion/induction (used infrequently in Rust). You can read this article and decide for yourself if/when formal verification is easy enough to be valuable to you.
• Maybe fuzzing (such as `cargo-fuzz`) and property-based testing (such as `QuickCheck`) are good enough. Although these methods do not provide mathematical certainty, they are clever, useful, and easy to use. (The `range-set-blaze` crate already uses `QuickCheck. See `Rule 9.5 in a previous article for details).
• Maybe formal verification is and will always be doomed because writing a specification is as hard as writing code. I disagree. Think about refactoring. I often start coding by writing something simple. I then refactor this simple code for efficiency. For `internal_add`, I found the specification to be simpler than any code. (You can judge this for yourself in Rule 4.)

Aside: Verification then becomes a computer-checked refactoring from a simple specification to a final, efficient algorithm.

• Maybe formal verification is and will always be doomed because the halting problem tells us formally that formality isn’t generally possible. The halting problem does not doom us. While we can’t always understand arbitrary code, we don’t need to. We only need to understand our own code, which we (hopefully) wrote to be understandable. Starting in Rule 2, we’ll see how Dafny easily verifies that specific loops and recursions halt.
• Maybe porting to Dafny is too hard. This has not been my experience. Like Rust, Dafny mixes and matches imperative and functional programming. I found porting my algorithm to be straightforward.

Assuming you still want to verify your algorithm with Dafny, your next step is to learn Dafny.

Dafny is both a programming language and an interactive verification system. I recommend you install it as a VS Code extension.

To learn it, start at https://dafny.org/. Of special interest is the Online Tutorial and the Reference Manual. I also found the Verification Corner videos on YouTube helpful. (Of possible interest is the college textbook, Program Proofs, \$49 for the Kindle Edition). I found the programming language part of Dafny easier to learn than Rust, perhaps similar in difficulty to C#.

Dafny, like Rust, is fully typed. Dafny, like Python, is garbage collected. Here is a “Hello World”:

`// hello.dfymethod Main(){var s := "Hello World";print s, "\n";}`

Dafny, also like Python, offers integers of arbitrary size. Here is a program that provably adds two natural numbers by repeatedly incrementing.

Some points of interest:

• Dafny coding guidelines follow C#, not Rust. So, we name the function `SlowAdd` not `slow_add` (although either will run).
• Dafny supports subtypes. For example, any `int` that can be shown to be non-negative is also a `nat`.
• Assignment is `:=` and equality is `==` . (There is no `=` .)
• Function parameters, for example, `x` and `y` above, are immutable.
• Dafny uses `ensures` and `invariant` statements to verify the code at compile-type. It then removes these statements to finish compiling.
• The green check mark shows that this code verifies. Dafny’s VS Code extension will, by default, continuously try to validate each method. This adds an almost gambling-like excitement to working with Dafny. In the example above, if I make `y` an `int` rather than a `nat`, then validation should and will fail. (Can you figure out why?) Dafny will mark my function with a red X and tell me “`This postcondition might not hold: r == x + y`”.
• Dafny knows some of the mathematics of integers, arrays, sets, maps, sequences, etc. This often allows it to finish the last details of validation by itself.

The `range-set-blaze` crate represents sets of integers as sorted, disjoint ranges. For example, this list of three ranges:

`100..=2_393,20_303..=30_239_000,501_000_013..=501_000_016`

represents a set of 30,220,996 integers.

In Rust, the `RangeSetBlaze` struct represents this data structure internally with a standard `BTreeMap.` Recall that a `BTreeMap` represents a list of key/value pairs, sorted by key. Here, our keys are the ranges’ starts (for example, `100`, `20_303`, `501_000_013`) and the values are the ranges’ inclusive ends (for example, `2_393`, `30_239_000, 501_000_016`. `RangeSetBlaze` stores the list with a `BTreeMap` rather than a `vec` to make key look up more cache friendly.

`RangeSetBlaze` depends on `BTreeMap`, so must we implement `BTreeMap` in Dafny? Happily, no. We can, instead, use Dafny’s `vec`-like `seq` data type. This substitution works because `BTreeMap`, `vec`, and `seq` can all represent sorted lists — just with different efficiencies. For the purpose of formal verification, we only care about correctness and can ignore efficiency.

`RangeSetBlaze` requires the list of ranges be sorted and disjoint. How do we say “sorted and disjoint” in Dafny? We can say it via this ghost predicate (and related code):

`ghost predicate ValidSeq(sequence: seq<NeIntRange>) {(forall i:nat, j:nat | i < j < |sequence| :: sequence[i].0 < sequence[j].0)&& (forall i:nat, j:nat | i < j < |sequence| :: !Touch(sequence[i], sequence[j]))}type IntRange = (int, int)type NeIntRange = x: IntRange | !IsEmpty(x) witness (0,0)function IsEmpty(r: IntRange): bool{r.0 > r.1}`

A predicate is another name for a method that returns `bool`. A ghost method (or predicate) is one that can only be used for validation, not for running the final code.

At a high level, the `ValidSeq` predicate takes as input a sequence of non-empty integer ranges. It then tests that the start values are sorted and that the ranges don’t touch. Specifically,

• An `IntRange` is a tuple of two `int` values.
• An `IntRange` `IsEmpty` exactly when its start is greater than its end. (This follows Rust’s convention.)
• A `NeIntRange` (non-empty integer range) is an `IntRange` that is not empty, for example, `(0,0)`. [All our ranges are end inclusive.]
• This expression tests that the start values are sorted:
`forall i:nat, j:nat | i < j < |sequence| :: sequence[i].0 < sequence[j].0`

It can be read as “for all natural numbers i and j — such that i is less than j and j is less than the length of the sequence — test that the start value at index i is less than the start value as index j”.

Aside: Note that a Rust `BTreeMap` doesn’t support (random-access) indexing but here we are using such indexing. This is OK because `ValidSeq` is a ghost predicate and so will only be used for validation.

• This expression tests that the ranges are disjoint:
`forall i:nat, j:nat | i < j < |sequence| :: !Touch(sequence[i], sequence[j])`

It can be read as “for all natural numbers i and j — such that i is less than j and j is less than the length of the sequence — test that the range at index i does not touch the range at index j. But what is `Touch`?

We’ll define `Touch` on two-levels. On a mathematical level, a range i is said to touch a range j if there exists an integer i0 in range i and an integer j0 in range j such that i0 and j0 are within a distance of one of each other. On an efficient programming level, we want to avoid definitions depending on “there exists”. Here is a Dafny predicate that is both mathematical and efficient:

`predicate Touch(i: NeIntRange, j: NeIntRange)ensures Touch(i, j) == exists i0, j0 ::Contains(i, i0) && Contains(j, j0) && -1 <= i0 - j0 <= 1{assert Contains(i, i.0) && Contains(i, i.1) && Contains(j, j.0) && Contains(j, j.1);if i.1 < j.0 thenassert  (-1 <= i.1 - j.0 <= 1) == (i.1+1 == j.0);i.1+1 == j.0else if j.1 < i.0 thenassert (-1 <= j.1 - i.0 <= 1) == (j.1+1 == i.0);j.1+1 == i.0elsevar k0 := Max(i.0, j.0);assert Contains(i, k0) && Contains(j, k0);true}function Contains(r: IntRange, i: int): bool{r.0 <= i && i <= r.1}function Max(a: int, b: int): int{if a < b then b else a}`

Some points of interest:

• `Touch` is not a ghost. In other words, we can use it in both regular code and validation code.
• The `assert` statements help Dafny prove that the regular code meets the mathematical `ensures` statement.
• For efficiency, the Dafny prover validates the inside of a `method `separately from its outside. Only the `ensures` (and the yet-to-be-seen, `requires`) statements cross this border. In contrast to a `method`, a Dafny `function` is transparent to the validator. (I think of it as inlining code with respect to validation.)

With concepts such as `ValidSeq` and `Touch` defined, we next move onto specifying what our algorithm is supposed to do.

Ultimately, I want to prove that my specific Rust algorithm for inserting a new range into a `RangeSetBlaze` is correct. Before we do that, however, let’s define what “correct” range insertion is.

`method InternalAdd(xs: seq<NeIntRange>, a: IntRange) returns (rs: seq<NeIntRange>)requires ValidSeq(xs)ensures ValidSeq(rs)ensures SeqToSet(rs) == SeqToSet(xs) + RangeToSet(a){if IsEmpty(a){rs := xs;}else{assume false; // cheat for now}}`

This says that `InternalAdd` is a method that takes `xs`, a sequence of non-empty integer ranges, and `a`, an integer range (that could be empty). The method outputs `rs`, a new sequence of non-empty integer ranges.

We need to say that `xs` and `rs` must be sorted and disjoint. That is easily done with the `ValidSeq`’s in the `requires` and first `ensures`.

We also need to say that `rs` contains the right stuff. Is this hard? It is not. We just say that the set of integers in `rs` must equal the set of integers in `xs` unioned with the integers in `a`.

Aside: In Dafny, “+” when applied to sets is “union”.

The set of integers in a range is:

`ghost function RangeToSet(pair: IntRange): set<int>{set i {:autotriggers false} | pair.0 <= i <= pair.1 :: i}`

And the set of integers in a sequence of non-empty ranges can be define inductively (that is, recursively):

`ghost function SeqToSet(sequence: seq<NeIntRange>): set<int>decreases |sequence|requires ValidSeq(sequence){if |sequence| == 0 then {}else if |sequence| == 1 then RangeToSet(sequence[0])else RangeToSet(sequence[0]) + SeqToSet(sequence[1..])}`

Some points of interest:

• The line: `assume false; // cheat for now` makes validation work even if it really shouldn’t. We use it as a temporary placeholder.
• We make `RangeToSet` and `SeqToSet` ghosts to stop us from using them in regular code. We make them functions (instead of methods) to inline them with respect to validation.
• Because Dafny knows a lot about creating and manipulating sets and sequences, we often profit by using sets and sequences in our specification.
• Even if our regular code uses loops instead of recursion, our validation code will often use recursive-like induction.
• The `{:autotriggers false}` relates to avoiding a warning message. For more information see this Stack Overflow answer by Prof. James Wilcox.

We now have a formal specification of `InternalAdd`. I find this specification short and intuitive. But what if you need help figuring out a specification or other Dafny code?

The main forum for Dafny questions is Stack Overflow. To my surprise, I actually received much useful help there.

I recommend starting your question’s title with “Dafny:”. Also, be sure to tag your question with `dafny` and, perhaps, `formal-verification`.

Aside: On the site, you can see my 11 questions and Divyanshu Ranjan’s 48 Dafny-related answers.

As an open-source project on GitHub, Dafny also hosts GitHub Discussions and Issues.

The Dafny community is small but seems enthusiastic about helping users and improving the project.

With help at hand, we must next find an algorithm that meets the specification.

As a novice to formal verification, I decided to postpone work on the real `internal_add` used in my Rust code. Instead, I started work on an `InternalAdd` algorithm that I hoped would be easier to validate. I ended up with this:

`method InternalAdd(xs: seq<NeIntRange>, a: IntRange) returns (rs: seq<NeIntRange>)requires ValidSeq(xs)ensures ValidSeq(rs)ensures SeqToSet(rs) == SeqToSet(xs) + RangeToSet(a){if IsEmpty(a){rs := xs;}else{var notTouching, merged := PartitionAndMerge(xs, a);var indexAfter := NoTouchIndexAfter(notTouching, merged);rs := InsertAt(notTouching, [merged], indexAfter);}}`

The idea is that if range `a` is empty, we return the input sequence unchanged. Otherwise, we divide the work into three steps, which we can validate independently. The first step, `PartitionAndMerge,` returns:

• `notTouching`, a sequence of ranges that don’t touch range `a`, and
• `merged`, a single range created from `a` and everything it touches.

Here is an example input and output:

`InternalAdd` next finds where to insert `merged` and, finally, inserts it.

Here is the code for `PartitionAndMerge:`

`method PartitionAndMerge(xs: seq<NeIntRange>, a: NeIntRange) returns (notTouching: seq<NeIntRange>, merged: NeIntRange)requires ValidSeq(xs)ensures ValidSeq(notTouching)ensures RangeToSet(merged) >= RangeToSet(a)ensures forall range | range in notTouching :: !Touch(range, merged)ensures SeqToSet(xs) + RangeToSet(a) == SeqToSet(notTouching) + RangeToSet(merged){// Split into touching and not touching seqsvar touching: seq<NeIntRange>;touching, notTouching := Partition(a, xs);// Merge the touching seq into one range with our original rangemerged := UnionSeq(a, touching);}`

This says that `PartitionAndMerge` `requires` that `xs` be a valid sequence of non-empty integer ranges and that `a` be a non-empty integer range. It ensures that `nonTouching` is another valid sequence of non-empty integer ranges. It ensures that the integers in range `merged` are a superset of those in range `a`. It ensures that no range in `notTouching` touches range `merged`. And finally, it ensures that the integers in `xs` and `a` are exactly the same as the integers in `notTouching` and `merged`.

`PartitionAndMerge` also divides the work, this time into two steps (`Partition `and `UnionSeq`) that can be validated independently. Those steps continue to subdivide their work. Where does it end? Let’s look at one example.

The method `UnionSeq` calls `UnionRange` which merges two ranges:

`function UnionRange(x: IntRange, y: IntRange): IntRangerequires IsEmpty(x) || IsEmpty(y) || Touch(x, y)ensures RangeToSet(x) + RangeToSet(y) == RangeToSet(UnionRange(x,y)){if IsEmpty(x) then yelse if IsEmpty(y) then xelse (Min(x.0, y.0), Max(x.1, y.1))}`

The `UnionRange` code handles the empty cases and then returns the minimum bounding range. (The minimum bounding range is the range from the smaller of the two starts to the larger of the two ends.) But how can this be correct? In general, a minimum bounding range of two ranges might include extra integers. We might get something bigger than the union of the inputs, like so:

The code is correct because it `requires` that the two input ranges touch or are empty. This `ensures` that the union of the integers in range `x` with the integers in range `y` are exactly the integers in the output range.

At compile time, Dafny proves this function correct. Beyond that, it proves that everything that calls this function provides inputs that are empty or touching.

I think of this as a generalization of Rust’s borrow checker. At compile-time Rust checks that we are safe from many memory errors. At compile time, verification systems, such as Dafny, can prove almost arbitrary properties. Of course, as we are seeing, this ability comes at the cost of complexity.

The full code for this verified algorithm is about 200 lines, organized into about a dozen methods and functions.

This rule shows that we can verify an algorithm for `InternalAdd`, but it is not the algorithm I used in Rust. We will turn to that next.

This post originally appeared on TechToday.