I appreciate how “lint everything and have zero warnings” is more or less the default position on projects after decades of ignoring compiler warnings.
Just that alone + “unused binding” warnings prevent typos from being an issue in a major sense. And on the other side of that, being able to refer to constants in pattern matching sounds useful! In theory at least. In practice I’m often finding myself only doing pattern matching as a “last resort” because there’s often some helper function that more directly gets me where I need to be.
Rust 0.6 used to require a . suffix on enum variant names: https://github.com/rust-lang/rust/commit/04a2887f879
And here’s a 2013 bug about the ambiguity: https://github.com/rust-lang/rust/issues/10402
This is a tricky corner of any language that has pattern matching. In a pattern, you naturally want a concise syntax for:
1. Binding a new variable to the matched value.
2. Matching if the value is equal to the given named constant.
The ideal syntax for both is simply a bare identifier, but then you’ve got ambiguity. Every language with pattern matching has to resolve that somehow. Rust’s approach is to make the language non-context-free: if an identifier in a pattern resolves to a constant, then it’s 2, otherwise it’s 1. That’s the footgun the article is about.
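To make the ambiguity concrete, here is a minimal sketch of the two readings of a bare identifier in a Rust pattern. The names LIMIT, LIMTI, with_constant and with_typo are invented for illustration and do not come from the article. The allow attribute is only there to keep the example quiet; without it, rustc should report the naming, unused-variable and unreachable-pattern warnings discussed elsewhere in this thread.

```rust
const LIMIT: u8 = 1;

// `LIMIT` resolves to the constant above, so this arm is an equality check.
fn with_constant(input: u8) -> &'static str {
    match input {
        LIMIT => "equal to LIMIT",
        _ => "something else",
    }
}

// `LIMTI` is a hypothetical typo. It resolves to no constant, so the compiler
// treats it as a brand-new binding that matches every value.
#[allow(non_snake_case, unused_variables, unreachable_patterns)]
fn with_typo(input: u8) -> &'static str {
    match input {
        LIMTI => "equal to LIMIT",
        _ => "something else", // never reached
    }
}

fn main() {
    println!("{}", with_constant(7));
    println!("{}", with_typo(7));
}
```

The two match expressions differ by a single transposed letter, yet the first one compares and the second one binds.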
Some other ML-derived languages make identifier case significant in the language. Variables must be lowercase and constants must be capitalized.
I think there are a few that simply don’t support (2) at all. They only support literal constant patterns like 3 and type constructor patterns like Some(...).
Swift does a clever but somewhat complex thing. It has two different syntaxes for patterns in refutable and irrefutable contexts. In a refutable context, an identifier is always a constant (2). If you want to bind a variable you have to do so explicitly using let or var. In an irrefutable context, you can’t compare to constants at all (since that’s refutable) so (2) is meaningless and an identifier always binds a new variable.
The result is that patterns are nice and terse for their two uses: When declaring multiple variables using a pattern, there’s just one leading let or var like:
let (a, b) = (1, 2)
And in a switch statement where checking against constants is more common than destructuring, you just use the name of the constant and it does what you want.
The downside is that users have to understand that the way to write a pattern depends on where it appears. It’s fairly complex and subtle and trips users up.
When we added pattern matching to Dart, we borrowed Swift’s approach. We already had switch statements where a bare identifier behaves like a constant. And, of course, we already had variable declarations where an identifier names the new variable. So to add patterns in both of those contexts without breaking all the world’s existing Dart code, we felt the best approach was to have a contextual syntax like Swift.
It seems to work out pretty well, though it’s more subtle than I’d prefer.
I lost it at the GOD => println!("input was 1"), step. Why is there no error since GOD does not exist? (No pun intended)
GOD is bound as a variable, just like the match arm below it. Rust does not enforce any casing rules on variables vs. constants. It will warn, though, that the match arm below it is entirely unreachable.
It’s because the thing on the left hand side is a “pattern”. If the pattern is a constant (for example, “12”), then this is a comparison: the pattern matches if the thing on the right hand side is equal to 12.
But if the pattern is the name of some variable (for example “foo”) then this becomes an assignment instead: the pattern always matches, and the variable is assigned the matching value. So the semantics depend crucially on whether the pattern is the name of a constant or of a variable, and that’s not necessarily obvious at the site of the match expression.
Now I got it. Thanks!
That raises the question: Is otherwise a keyword or just a convention? Seems to be just convention:
This prints “Behold: 5”.
I don’t think it’s even convention, otherwise is just a variable name the author chose to use. If anything I’d say it’s convention to use a semantically meaningful name.
If you have a catchall case where you don’t need to bind the value to a variable, you use _, which will communicate that to both the reader of the code and the compiler (and affects borrow checking and drop order).
otherwise makes me think the author likes Haskell (see guards, guards!)
@gpm is correct, otherwise is definitely not a convention. In fact, if you don’t use the otherwise variable, you get an unused variable warning!
Out of curiosity, can Rust editions change certain warnings to errors? I wonder if unreachable code could ever become a hard error. I read a couple of statements that editions are mostly about syntax changes.
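Going back to the point a few comments up about _ versus a named catch-all: the sketch below shows one visible consequence for ownership. It is illustrative only; the function names are made up, and it covers just the move/no-move difference, not every borrow-checking or drop-order effect.

```rust
// `_` binds nothing, so the matched String is not moved and can still be
// returned afterwards.
fn ignore_then_reuse(label: String) -> String {
    match label {
        _ => {}
    }
    label
}

// A named catch-all (here `otherwise`, as in the article) is an ordinary
// binding: the String is moved into it.
fn bind_catch_all(label: String) -> String {
    match label {
        otherwise => otherwise,
    }
}

fn main() {
    println!("{}", ignore_then_reuse(String::from("kept")));
    println!("{}", bind_catch_all(String::from("moved")));
}
```

If the first function used a named binding instead of _, the final use of label would be rejected as a use after move.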
Yeah, here’s an example from the 2021 edition
I’m not sure they’d make unreachable code a hard error though since it triggers when you have a todo!() in the middle of a function, which is useful when developing.
Yeah. You can even make it an error yourself with #[forbid(unreachable_code)] or put #![forbid(unreachable_code)] on the top of your lib.rs/main.rs, or under [workspace.lints.rust] in your Cargo.toml, put the line unreachable_code = "forbid".
There are a lot of ways to configure the linter; you can change its behavior pretty much anywhere. Only thing it’s missing is some sort of “reset” pragma like C#’s nullable reference types.
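As a rough sketch of the item-level form of that configuration (function names are invented for illustration): one function opts in to a hard error, while another relaxes the same lint so that a todo!() placeholder does not produce noise during development.

```rust
// With `forbid`, any unreachable code inside this item would be a compile
// error rather than a warning. The body below has none, so it compiles.
#[forbid(unreachable_code)]
fn strict_increment(x: i32) -> i32 {
    x + 1
}

// The `42` after `todo!()` can never run. The lint is allowed here so the
// placeholder stays quiet; `dead_code` is allowed because nothing calls it.
#[allow(unreachable_code, dead_code)]
fn work_in_progress() -> i32 {
    todo!();
    42
}

fn main() {
    println!("{}", strict_increment(1));
}
```

Moving the todo!() line into strict_increment should turn the warning into a build failure, which is the trade-off the comments above are weighing.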
Do people commonly set lint rules to warnings locally and errors in CI?
Yes. We have pretty default warning configs. But in CI we run clippy with warnings failing CI.
Yes, but only if the toolchain version is pinned, since new lints are added often.
I am starting to look more favorably towards Zig and their goal of having as few surprises as possible.
I’ve written more Rust than I care to divulge, but I’ve only hit a bug related to this issue once: and it was entirely because it was a hobby project in which I was ignoring warnings. If I’d checked the warnings, it would have been obvious when I wrote it.
Replace “Rust” with almost any other language and post it in a Rust forum and I promise you that this argument will be hammered down quicker than you can say “borrow checker”.
Hey, at least Rust’s footguns have lints. https://github.com/ziglang/zig/issues/5973
Note that the issue you linked has an accepted proposal to completely eliminate the footgun from the language. No lint, no naming-convention-based workaround. According to the milestone assigned to the issue, that’s expected to land in 0.15.0, or earlier if a contributor steps up and does the work before the core team gets around to it.
I can’t help but notice that it was at one time or another also part of the 0.7, 0.8, 0.9, 0.10, 0.11, 0.12, 0.13, and 0.14 milestones… has something changed with the project management to make the milestone fields a more reliable indicator?
Either way, I don’t want to take away from the main point that it’s a known issue in a pre-1.0 language with a planned and principled fix.
No, not really. The number of issues solved per release is fairly consistent, however. I use those milestones to make sure I visit each issue before tagging the respective release, even if the issue ends up being postponed. That said, I have been more aggressively using the “unplanned” milestone lately.
The load-bearing part is not the milestone, but “accepted proposal”. The issue didn’t have an accepted solution before!
At least this isn’t a memory safety surprise, and is extremely rare to encounter (you have to ignore numerous warnings).
I’ve been writing Rust for ~5 years, and today I learned about let pattern = value else { ... } thanks to this story. I’m a little ashamed of myself…
let else is relatively new, only a couple of years old.
I keep getting told to add a semicolon after the else block!
Ah! That makes sense now :). This was released exactly two years and one day ago. Thank you for clarifying this, I thought I was crazy.
It was new to me as well.
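For anyone else meeting let-else for the first time, a small sketch follows. The function parse_port and its fallback value of 0 are made up for the example. The else block has to diverge (return, break, continue, panic and so on), and the statement ends with a semicolon after the closing brace, which is the semicolon mentioned above.

```rust
// Parse a port number, falling back to 0 when the text is not a valid u16.
fn parse_port(text: &str) -> u16 {
    // The pattern `Ok(port)` is refutable; when it does not match, the else
    // block runs and must leave the function (here via `return`).
    let Ok(port) = text.trim().parse::<u16>() else {
        return 0;
    }; // <- the semicolon that is easy to forget
    port
}

fn main() {
    println!("{}", parse_port("8080"));
    println!("{}", parse_port("not a port"));
}
```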
Good find. Every language has footguns; Rust is no exception.
I can say I never actively thought about this up until when I read this.
This is one of those language features that ends up being a black hole for developer time.
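To be clear, if you ever run into this, there will be several explicit warnings or errors telling you things like:
Pattern matches are conventionally snake_case, consider fixing this
Constants are conventionally SCREAMING_SNAKE_CASE (I forget what Rust calls it), consider fixing this as well
Some cases aren’t covered by the match expression/pattern, if you use a constant in a match where you thought you were using a pattern
Some cases are unreachable, if you use a pattern where you thought you were using a constant.
The result is that it’s very difficult to run into this accidentally, and if you do, the compiler will do a good job of pointing out where the problem is and what might need to be fixed.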
The real issue seems like matches binding to variables if they’re not types. I’ve hit that many times.
Congrats on being approved by the orange conversion bot
This is just a repost bot that posts everything from lobste.rs. The guy is running some automation SaaS and is using it to promote his services on HN.
I think it’s worth banning their crawling IPs from lobste.rs simply out of spite. :P
i take advantage of that guy’s bot to give myself an extra upvote on things i would otherwise post to hn..
I’m pretty sure the repost service uses the RSS feed.
And what, the RSS feed is not retrieved by the reposter’s service? Not sure what you’re trying to say.
The more I see how things are posted on orange site, the more I start to think jwz had the right idea there…
I had to look it up to find the context on this. TIL: https://news.ycombinator.com/item?id=14840839
I find it funny (this reminds me of the old “for the lulz” internet I liked as a teenager), and sad at the same time (the web was designed to link documents; referrer hijacking is usually used by jerks, such as newspapers or science journals putting up paywalls when a page is linked from the wrong place).
Yep, that’s the one. When I first saw it I thought it was kind of childish and crass, but then again, so is much of the discourse on the orange site, so perhaps it’s not entirely inappropriate.