Years ago, Andrew Kennedy published a foundational paper about a type checker for units of measure, and later implemented it for F#. To this day, F# is the only mainstream programming language which provides first class support to make sure that you will not accidentally confuse meters and feet, euros and dollars, but that you can still convert between watts·hours and joules.
I decided to see whether this could be implemented in and for Rust. The answer is not only yes, but it was fun :)
Units of Measure as newtypes
Units of Measure are, well, labels attacked to things that you can measure: seconds, hours, miles per hour, dollars per galleon, watts multipled per hours, etc. A speed (that’s a measure) can be in miles per hours, kilometers per hours, multiples of lightspeed (that’s units).
If you confuse your units, you can get a disaster – setting the temperature of your personal oven ors heater in Celcius instead of Fahrenheit is clumsy, and in an industrial setting, this can kill people. With similar errors, NASA lost a Mars Orbiter.
In Rust, the classical way to ensure that you do not confuse these measures is to newtype each measure:
extern crate derive_more;
/// A measure in kilometers.
/// - #[derive(Add)] because such numbers can be added to each other.
/// - #[derive(Sub)] because such numbers can be subtracted from each other.
/// - #[derive(Default)] as zero value.
/// - do NOT #[derive(Mul)] or #[derive(Div)] because a kilometer multiplied by a kilometer is a square kilometer, not a kilometer.
#[derive(Add, Sub, Default)]
struct Km(f64);
/// A kilometer divided by a kilometer is a number without unit.
impl Div for Km {
type Output = f64;
fn div(self, other: Self) -> f64 {
self.0 / other.0
}
}
/// A measure in seconds.
#[derive(Add, Sub, Default)]
struct S(f64);
/// A second divided by a second is a number without unit.
impl Div for S {
type Output = f64;
fn div(self, other: Self) -> f64 {
self.0 / other.0
}
}
fn main() {
let distance = Km(300_000.);
let duration = S(1.);
let delta = distance - duration; // ERROR: Doesn't type-check.
}
That’s good and it works well.
What if we want to add speed?
extern crate derive_more;
/*
Skip the above.
*/
/// A speed in km·s^-1 (aka kilometers per seconds).
#[derive(Add, Sub, Default)]
struct KmPerS(f64);
/// A value divided by a value of the same unit is a number without unit.
impl Div for KmPerS {
type Output = f64;
fn div(self, other: Self) -> f64 {
self.0 / other.0
}
}
/// Implement Km / S => KmPerS
impl Div<S> for Km {
type Output = KmPerS;
fn div(self, rhs: S) -> Self::Output {
KmPerS(self.0 / rhs.0)
}
}
/// Implement KmPerS * S => Km
impl Mul<S> for KmPerS {
type Output = Km;
fn div(self, rhs: S) -> Self::Output {
Km(self.0 * rhs.0)
}
}
/// Also implement S * KmPerS => Km
impl Mul<KmPerS> for S {
type Output = Km;
fn div(self, rhs: S) -> Self::Output {
Km(self.0 * rhs.0)
}
}
fn main() {
let distance = Km(300_000.);
let duration = S(1.);
let speed = distance / duration;
let distance_2 = speed * duration;
let _ = distance - distance_2; // PASS
let distance_3 = duration * speed;
let _ = distance - distance_3;
}
Ok, that works, too. It’s starting to be verbose but it works.
But what if we wish to add acceleration in kilometers per second squared? Well, we need to:
- create another
struct KmPerSSquared
; - derive addition, multiplication, zero;
- implement
Div<KmPerSSquared> for KmPerSSquared
; - implement
Div<S> for KmPerS
; - implement
Mul<S> for KmPerSSquared
.
Oh, wait, there is another way to obtain a kilometer per second squared. It’s to divide kilometers per (second squared). So we may need to implement seconds squared, with all their operations. Of course, seconds squared divided by seconds should also be seconds, seconds multiplied by seconds should be second squared, …
That’s yet another structure with its derived addition, multiplication, zero, four more implementations, etc.
Hopefully, by now, I have convinced you that this technique doesn’t scale!
Units of Measure, generalized
Now that we have seen that the manual method works but doesn’t scale, let’s see if something more generic would do the trick.
extern crate yaiouom;// Yet Another Implementation Of Units Of measure.
use yaiouom::Measure;
/// Kilometers. This type has nothing special.
struct Km;
/// Seconds. This type has also nothing special.
struct S;
fn main() {
// Define our distance and duration, once again.
// Note that the `f64` is optional. Rust will infer it nicely.
let distance = Measure::<f64, Km>::new(300_000.);
let duration = Measure::<f64, S>::new(1.);
// let error = distance + duration; // ERROR: Caught by type-checking.
let speed = distance / duration; // PASS
let acceleration = speed / duration; // PASS
}
The type of speed
is Measure::<f64, yaiouom::Div<Km, S>>
– aka “speed
is a f64
measure in Km/S
”.
Similarly, the type of acceleration
is Measure::<f64, yaiouom::Div<uom::Div<Km, S>> , S>
– aka “acceleration
” is a f64
measure in Km/S/S
, all nicely infered.
Well, that’s already a nice improvement!
Unfortunately, this approach also quickly runs into its own limitations:
fn main() {
let distance = Measure::<f64, Km>::new(300_000.);
let duration = Measure::<f64, S>::new(1.);
let speed = distance / duration; // PASS
let acceleration = speed / duration; // PASS
let acceleration_2 = distance / (duration * duration); // PASS
assert_eq!(acceleration, acceleration_2); // ERROR: Type-checking fails.
}
What has happened here? Well, acceleration
is a measure in (Km/S)/S
, while acceleration_2
is a measure in Km/(S*S)
. In terms of units of measure, these are exactly the same things, but the type system doesn’t know about it.
In fact, what’s missing here is a set of type-level equational laws. If we call Unitless
the pseudo-unit of unitless values
and if we rewrite A/B
as A*B^-1
, we just to define ≡
as the smallest equivalence relation such that
- for all units
A
,B
,A*B ≡ B*A
(Commutativity) - for all unit
A
,A*A^-1 ≡ Unitless
(Inverse) - for all unit
A
A * Unitless ≡ A
(Neutral element) - for all units
A
,B
,C
,A*(B*C) ≡ (A*B)*C
(Associativity) - for all units
A
,B
,C
, ifA ≡ B
thenA*C ≡ B*C
(Congruence of Multiplicatives) - for all units
A
,B
, ifA ≡ B
thenA^-1 ≡ B^-1
(Congruence of Inverses)
Is this sufficient? Well, for the case of acceleration
and acceleration_2
, we only need Associativity:
- Proving that
(Km/S)/S ≡ Km/(S*S)
is the same thing as proving that(Km * S^-1) * S^1 ≡ Km * (S^-1 * S^-1)
(we’re just changing notations). - Thanks to Associativity, we know that
(Km * S^-1) * S^1 ≡ Km * (S^-1 * S^-1)
. - So, we have a proof, our code is safe (for some reasonable definition of safe)!
Well, that’s all nice and dandy, but how do we teach this to the Rust compiler?
We’re going to introduce a refinement type.
Before giving more details, let me show you the final result:
fn main() {
let distance = Measure::<f64, Km>::new(300_000.);
let duration = Measure::<f64, S>::new(1.);
let speed = distance / duration; // PASS
let acceleration = speed / duration; // PASS
let acceleration_2 = (distance / (duration * duration)).unify(); // <-- This line has changed
assert_eq!(acceleration, acceleration_2); // PASS
}
Ohhh… a mysterious method unify()
that magically solves our issue. Let’s look at its source code:
impl<T, U> Measure {
pub fn unify<V>(self) -> Measure<T, V> {
Measure::new(self.0)
}
}
This method is… simply discarding the unit of measure. Surely, that can’t be safe (for some reasonable definition of safety)?
Let’s confirm this:
fn main() {
let distance = Measure::<f64, Km>::new(300_000.);
let duration = Measure::<f64, S>::new(1.);
let _ = distance.unify() - duration; // PASS
}
Indeed, we have just managed to subtract a duration from a distance, that’s clearly worthless.
Let’s see how we can make it safe!
A refinement type-system for Units of Measure
The Rust toolchain has many tricks up its sleeves. One of them is the ability to use the compiler as a library and build static analysis tools on top of it.
Now, instead of rustc
, let’s run yaiouom-checker
.
First with our good sample:
fn main() {
let distance = Measure::<f64, Km>::new(300_000.);
let duration = Measure::<f64, S>::new(1.);
let speed = distance / duration; // PASS
let acceleration = speed / duration; // PASS
let acceleration_2 = (distance / (duration * duration)).unify();
assert_eq!(acceleration, acceleration_2); // PASS
}
ok, the good sample still passes. So far, so good.
Now with the bad sample:
fn main() {
let distance = Measure::<f64, Km>::new(300_000.);
let duration = Measure::<f64, S>::new(1.);
let _ = distance.unify() - duration; // ERROR: Doesn't type check anymore!
// Error:
// ----------------- in this unification
// = note: expected unit of measure: `S`
// found unit of measure: `Km`
}
…and the error was caught!
Let’s look at the steps involved in a run of yaiouom-checker
:
- run the usual
rustc
type and region inference; - using the result of type inference, determine all the callsites of our method
unify
; - using the result of type inference, at each of these callsites
- determine the type of
self
(let’s call itMeasure::<_, U>
); - determine the type of result of the the call to
unify
(let’s call itMeasure::<_, V>
); - using the equational laws defined earlier, check that
U ≡ V
; - if we cannot prove that
U ≡ V
, minimize the error message and display it; - if all instances were proven, accept the code, otherwise reject it.
Indeed, yaiouom-checker
is an implementation of a refinement type system for Rust: a type system that takes as input the result of a previous type system and introduces new verifications/guarantees. The unit of measure type system fits in roughly 200 lines of code, connected to rustc
by about 200 more lines of code.
Status of the project
This project is very much a proof of concept. If I find time, I’d like to eventually make it more robust, perhaps submit it as an extension of clippy
.
In the meantime, if you’re interested, the code is here!