1. 46
    Re: The Case for Rust (in the base system) freebsd rust lists.freebsd.org
    1. 24

      Assuming this would focus more on Rust, I did not expect the ending discussion on Lua.

      Finally, one of the key things that we found was that a lot of projects used C/C++ out of inertia. They don’t have peak memory or sub-millisecond-latency constraints and could easily be written in a managed language, often even in an interpreted one. We have Lua in the base system. I’d love to see a richer set of things exposed to Lua …

      I’d love to see a default that anything intended to run with elevated privilege is written in Lua.

      Maybe it’s because I’m fond of Lua, but I really love the idea of using a managed/interpreted language in tandem with a more traditional systems language at the OS level. I did not realize that FreeBSD already does this in some places. I was curious and did a quick search. Turns out this is referred to as flua and has been discussed on lobste.rs in the past. I’m definitely looking more into this.

      Whether or not Rust is useful for FreeBSD now or in the future, I appreciate alternative or supplementary ideas like using Lua.

      1. 11

        Facebook has effectively forked Rust because their (huge) Rust codebase doesn’t build with newer compilers.

        I’m surprised they did it. Stable Rust is IMHO fully functional now, and nightly features are mostly nice-to-haves.

        1. 34

          Facebook has effectively forked Rust because their (huge) Rust codebase doesn’t build with newer compilers.

          I’m surprised they did it. Stable Rust is IMHO fully functional now, and nightly features are mostly nice-to-haves.

          (Hi, I’m an engineer who supports Rust at Facebook, but I don’t speak for the company or the rest of the team.)

          I don’t think that the original statement is entirely accurate, except in the most narrow sense of Facebook’s rustc 1.75 not being bit-for-bit identical with rustup-distributed rustc 1.75. Facebook builds rustc from source and will carry a few patches that range from setting jemalloc as the default allocator on Linux to supporting older versions of LLVM for C++ LTO reasons. While “forked” isn’t wrong on a technical basis, I think connotations of that phrasing can be a bit misleading.

          1. 12

            This is where I find the phrase “downstream” actually meaningful. Downstreams may carry patches for various reasons and for varying lengths of time, but they are never meant as a substantial divergence from upstream.

          2. 13

            dtolnay disputes this on reddit:

            It is not accurate. The Facebook monorepo’s Rust compiler has been updated promptly every 6 weeks for more than 7 years and 54 Rust releases, usually within 2 weeks of the upstream release. Source: many of those updates were done by me.

            https://old.reddit.com/r/rust/comments/19dtz5b/freebsd_discusses_use_of_rust_in_the_base_system/kj8qfso/?context=1

            1. 6

              Does anyone have more information about this? A cursory search did not reveal any details about a Rust fork by facebook. I found an article about Rust at facebook from 2021, which is very interesting. They do mention a team that is responsible for toolchain rollouts and the Rust ecosystem at facebook in general, but they do not mention a significant fork that is being maintained without being upstreamed.

                1. 2

                  I hadn’t heard anything about a fork of the compiler per se, but I have heard from Meta engineers that they are somewhat separated in an ecosystem sense internally due to eschewing Cargo entirely for Buck2 (in this case, it was a recent lightning talk at Rust NYC about using Buck2 with rust-analyzer).

                  1. 8

                    hi, that was me giving that talk/demo! while it’s less separated than you might otherwise think due to the relative ease of using reindeer to import libraries, you are not wrong that using buck is a departure from most of the Rust community.

                    1. 3

                      Thanks for the note - figured I’d name drop the event just in case any misrepresentations were present.

                2. 4

                  Was FB a super early adopter of Rust? Forking Rust seems like a huge cost to pay compared to, say, spending all that time on the porting work instead.

                  1. 3

                    That surprises me too. Does anybody have a good source backing that claim? What version of Rust is Facebook stuck at? This fb engineering post doesn’t hint at a forked Rust toolchain.

                    1. 2

                      I suspect the author had personal conversations with people who would know, and did not rely on public blog posts to get this information.

                      1. 2

                        And/or he misunderstood what he read/heard. If FB was stuck on an old Rust release with private patches, you’d expect them to mention it when they talk about their Rust experience. FB’s public Rust projects, like diem or sapling, don’t require an old or patched toolchain.

                        1. 9

                          I am too tired / ill to find it now, but it came up in the discussion on the Rust list about CHERI. Facebook initially objected to the strong provenance model for Rust because it was incompatible with their internal codebase. They then admitted that so were existing provenance-based analyses and they’d been carrying a bunch of patches to support terrible Rust code.

                          1. 3

                            Thanks for these pointers. Rust support for CHERI is still a work in progress and not mainlined, and provenance analysis remains a hot topic in LLVM. So naturally anybody exploring these is using a patched toolchain. That’s very different from the “FB’s Rust codebase doesn’t build with recent compilers and they have to maintain a fork” claim, which looks like dramatization to me.

                            PS: Get well soon :)

                            1. 4

                              You misunderstand. The proposal was for provenance in Rust to be part of the guarantees for unsafe (i.e. unsafe code is not allowed to materialise pointers from thin air). This is enforced by CHERI, but is orthogonal to it and improves the compiler’s alias analysis: pointers that have never been reachable from unsafe blocks can be assumed never to alias memory that an unsafe block touches. Facebook pushed back on that because it would break their code. There were then comments that the compiler already made some similar assumptions and this would break code that did the things the Facebook people were worried about breaking. They then admitted that their internal fork of the compiler had had to diverge because the language already broke the (deeply terrible) things that they were doing. Maybe they’ve stopped doing those things now (this was a couple of years ago), but at the time they suggested that they had millions of lines of code that all did deeply suspicious things with unsafe.

                              1. 14

                                I’ve been part of that discussion and that was not what I read out of that.

                                The problem is that provenance is in the territory of “we kind of punted on pointer semantics to get the language out of the door”.

                                As you say, Facebook’s pushback to these changes was that they would make undefined a bunch of things that were previously informally legal (in the sense that they were never discussed and the compiler did not rely on them), which might lead the compiler to exploit them differently. The concern was that they might need to carry patches to restore the old semantics if the compiler started exploiting the new ones.

                                However, they did not admit to anything there. Facebook has a compiler and tooling group like any major FAANG company, and their internal compiler may diverge at some points for a few weeks. That compiler team was back then led by one of the people who was actually part of the language stabilisation, so they had considerable knowledge about pointer semantics.

                                I can also share that the threat of forking the compiler as a discussion item was considered a bit… not par for the course.

                                I’ll add my piece of spice though: My main complaint in any of those stability discussions about Rust backward compatibility is that the language semver RFC actually has a little backdoor called “Underspecified language semantics”. https://rust-lang.github.io/rfcs/1122-language-semver.html#underspecified-language-semantics

                                Particularly, it says:

                                Over time, we expect to be fully defining the semantics of all of these areas. This may cause some existing code – and in particular existing unsafe code – to break or become invalid.

                                So, if something is underspecified… all bets are off :). This happens supremely rarely, but in the first few releases of rustc there were quite a few low-impact changes that were strictly compat-breaking for that reason.

                                My last piece of spice is that everyone talks about pointers as if we settled on what their exact semantics are :). Like, across all languages.

                                1. 4

                                  It is worth noting that the change Facebook was concerned about was a change nobody actually wanted to do. There was just a whole lot of miscommunication in that thread. The summary is that we’re not touching the status of int2ptr casts – they remain an underspecified construct that is needed for some work but not captured by the models and tools that let us reason about Rust with high confidence. (This is the same status that they also have in C and C++.) We will be adding more ways to avoid using them, and encouraging people to try those new ways, but the old casts are not going anywhere.
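
                                  To make “more ways to avoid using them” concrete, here is a hedged sketch using the strict-provenance pointer methods (`addr`/`with_addr`; assumes a toolchain where these are stable, and the tagging scheme is invented for illustration):

                                  ```rust
                                  // Sketch, not from the thread: pointer tagging with no int2ptr cast.
                                  // `with_addr` derives the new pointer from an existing one, so the
                                  // pointer's provenance is preserved.
                                  fn tag_low_bit(p: *const u16) -> *const u16 {
                                      p.with_addr(p.addr() | 1) // u16 is 2-aligned, so bit 0 is free
                                  }

                                  fn untag(p: *const u16) -> *const u16 {
                                      p.with_addr(p.addr() & !1)
                                  }

                                  fn main() {
                                      let x = 7u16;
                                      let p: *const u16 = &x;
                                      let tagged = tag_low_bit(p);
                                      assert_eq!(untag(tagged), p);
                                      // the untagged pointer still has valid provenance for reading x
                                      assert_eq!(unsafe { *untag(tagged) }, 7);
                                  }
                                  ```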

                                  It is unfortunate to see that old miscommunication lead to more miscommunication to this day :( . I hope this can be clarified so people don’t go around thinking Facebook is stuck with some old Rust version or anything like that.

                                  There were then comments that the compiler already made some similar assumptions and this would break code that did the things the Facebook people were worried about breaking.

                                  This is correct but misleading. The issues that the comments referred to are long-standing bugs in GCC and LLVM, affecting C, C++, and Rust alike. Sticking to old Rust, or C or C++, won’t help avoid these problems.

                                  My last piece of spice is that everyone talks about pointers as if we settled on what their exact semantics are :). Like, across all languages.

                                  We are getting there :)

                                  1. 1

                                    Hi @ralfj and welcome to lobste.rs! Thanks for the additional notes and views, it’s such a complex thread to summarise that every viewpoint helps!

                                    My last piece of spice is that everyone talks about pointers as if we settled on what their exact semantics are :). Like, across all languages.

                                    We are getting there :)

                                    I love seeing that and I’m happy about the work! Just to be clear: I had no intention of criticising the work being done here - which is awesome - finally people are caring about that corner.

                                    1. 2

                                      Thanks. :)

                                      I love seeing that and I’m happy about the work! Just to be clear: I had no intention of criticising the work being done here - which is awesome - finally people are caring about that corner.

                                      No worries, I took your statement more as advertisement for our work since it pointed out that work is much needed. :)

                  2. 7

                    Some of these findings are pretty damning for Rust, given how high and mighty everybody likes to sound about Rust’s safety guarantees. If reasonably well-written C++ is safer than Rust in many cases, I doubt it’s a good idea to “experiment” with Rust for FreeBSD. It would be a high cost to pay for this particular project if it turns out in the long run that it doesn’t help much. They have manpower problems enough as it is.

                    1. 24

                      If reasonably well-written C++ is safer than Rust in many cases

                      I don’t think it is, in most cases. There are corner cases around interfacing with things outside of the abstract machine where the C++ code will be safer than naive Rust, but in most cases Rust code will be as safe or safer than well-written modern C++.

                      They have manpower problems enough as it is.

                      I think this is where the economics gets interesting. There are a lot of enthusiastic Rust programmers. There are very few people under 40 that want to write C code. Probably more that want to write C++ code, but few that are as enthusiastic as the ones that really want to write Rust. It may be that the pool of Rust programmers that want to contribute to an open-source OS is larger than the total number of current FreeBSD developers.

                      1. 5

                        in most cases Rust code will be as safe or safer than well-written modern C++.

                        Ah, then I sort of misunderstood your take-away point in the mail. Thanks for clarifying!

                        I think this is where the economics gets interesting. There are a lot of enthusiastic Rust programmers. There are very few people under 40 that want to write C code. Probably more that want to write C++ code, but few that are as enthusiastic as the ones that really want to write Rust. It may be that the pool of Rust programmers that want to contribute to an open-source OS is larger than the total number of current FreeBSD developers.

                        But it would be risky to switch to Rust just to attract new potential contributors if it means alienating the existing contributor base (most of whom are probably over 40 and well-versed in C). Especially if it means increased churn and a higher maintenance burden; and if it all turns out to be a flash in the pan, it’s wasted effort.

                        1. 9

                          But it would be risky to switch to Rust just to attract new potential contributors if it means alienating the existing contributor base (most of whom are probably over 40 and well-versed in C).

                          Many are over 40, and that’s part of the problem. The project has an aging contributor base. At the moment, I think we’re still attracting new ones at replacement rate, but the number that are dying or leaving due to burnout is high. I’ve not contributed much for quite a long time (I am over 40, for what it’s worth), in part because most of the things I’d want to improve are C. At this point, C is basically the subset of C++ that you should actively avoid when writing C++. I write code with a much lower error rate and get a lot more done per line of code in C++ than C and so it’s difficult for me to motivate myself to write C.

                          1. 5

                            But it would be risky to switch to Rust just to attract new potential contributors if it means alienating the existing contributor base

                            This is a tricky one. Because of course you’re right, losing all existing contributors would be a disaster. And if no one new comes anyway, it’s even worse.

                            But, given some time, getting no new contributors is an even more permanent disaster.

                            So, some kind of balance of priorities is needed (I’m not saying rust in this project is definitely on one side or other of the balance, I don’t know enough of the contributors to have a stake there).

                          2. 4

                            Can I just say that it’s an amazing part of this community that a third party can repost a technical analysis and there’s decent odds the writer will see it and be available in the comments to clarify points? It’s really something special and one of the things that the Internet really excels at.

                          3. 18

                            I don’t see anything here that’s ‘damning’ for Rust. The criticisms seem to boil down to a few key parts, several of which are misunderstandings by the author:

                            First, it’s quite hard to find competent Rust developers.

                            It’s a new language. Not surprising and not fundamental.

                            Neither Rust nor C++ guarantee safety

                            A lot of what’s being said here seems to be based on several key misunderstandings of what Rust is trying to achieve and why. No language can ‘guarantee safety’ by the overly strict definition used here: even superficially safe languages like JavaScript are still relying on the runtime that executes them, written using unsafe code, to not do something nasty accidentally. The only difference with a language like Rust is that it’s made the decision to move that unsafety out of a complicated and difficult-to-audit runtime and into an easier-to-audit standard library. This is a safety win. What’s important here is that unlike C++, Rust has a well-defined and easy-to-discriminate safe subset (i.e. it’s easy to tell what does and does not need more careful attention during a code review).

                            There was a paper a couple of years ago that found a lot of vulnerabilities from this composition.

                            I’m not sure what’s being referenced here, but if two “safe” abstractions do not compose safely, then at least one of them is, in actual fact, unsafe. Rust has pretty rigorously ironed out a lot of the kinks in its safety model now, and it’s even been formally proven that composition retains safe properties.

                            Moving the check into the unsafe block fixed it, but ran counter to the generic ‘put as little in unsafe blocks as humanly possible’ advice that is given for Rust programmers.

                            This, too, is a fundamental misunderstanding. Rust’s safety guarantees are semantic, and when you take it upon yourself to read raw bytes and interpret them as well-formed values in the Rust type system, checking that they are well-formed is the unsafe code that needs careful audit. What would be the point of an unsafe block if it didn’t actually demarcate the condition that enforces correct behaviour? It seems like the author believes that an unsafe block can be read simply as a ‘hold my beer’.
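
                            A hypothetical sketch of that point (names invented, not from any real codebase): the range check is the safety argument for the conversion, so it belongs with the unsafe operation it justifies:

                            ```rust
                            #[derive(Debug, PartialEq, Clone, Copy)]
                            #[repr(u8)]
                            enum Mode { Off = 0, Idle = 1, Run = 2 }

                            fn mode_from_raw(raw: u8) -> Option<Mode> {
                                if raw <= Mode::Run as u8 {
                                    // SAFETY: raw is in 0..=2, exactly the declared
                                    // discriminants of Mode, so the transmute yields
                                    // a valid value.
                                    Some(unsafe { std::mem::transmute::<u8, Mode>(raw) })
                                } else {
                                    None // omitting this arm would be the 'hold my beer' version
                                }
                            }

                            fn main() {
                                assert_eq!(mode_from_raw(2), Some(Mode::Run));
                                assert_eq!(mode_from_raw(200), None);
                            }
                            ```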

                            When I looked at a couple of hobbyist kernels written in Rust, they had trivial security vulnerabilities due to not sanitising system call arguments.

                            I think this is a reasonable criticism of the kernels in question, but not of Rust: to even invoke (or receive) a syscall to begin with, you’re moving outside the domain of Rust’s safety model (and this must be explicitly annotated as such with an unsafe block). Ergo, it is your responsibility to ensure that whatever invariants that are required to maintain safety are maintained. If you want a language that can automatically understand what a syscall is and what invariants it has without first telling the language what they are… well, you’re going to be waiting for a very long time. Even theorem provers still need you to state your priors.

                            If there is a valid piece of information to pull out of this, it might be “the promises that Rust makes do not necessarily align with the most common safety issues present in kernel code”. I think that’s a much more reasonable claim to make, although it still does fly in the face of a lot of existing research about memory safety.

                            1. 3

                              if two “safe” abstractions do not compose safely, then at least one of them is, in actual fact, unsafe

                              Strictly speaking, this should be “unsound” rather than “unsafe”. In the context of Rust, safe/unsafe is an API-level guarantee (or lack thereof), if a safe API does not uphold its requirements, it’s unsound.

                              1. 2

                                There’s a crowd within the Rust community that considers unsafe to be poorly named and would prefer something like unchecked. I’m not sure I’ve decided either way.

                                1. 2

                                  I mean sure but that’s not really relevant.

                                  A safe rust API which does not uphold the requirements of its unsafe callees is unsound, not unsafe, even if you decide to call unsafe something else.
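
                                  A toy illustration of the distinction, as a sketch with invented names:

                                  ```rust
                                  // This function needs no `unsafe` at the call site, so it is
                                  // "safe" in the API sense, but it is UNSOUND, because it fails
                                  // to uphold the in-bounds requirement of its unsafe callee:
                                  fn first_byte(v: &[u8]) -> u8 {
                                      unsafe { *v.get_unchecked(0) } // UB if v is empty
                                  }

                                  // A sound version discharges the obligation with a check:
                                  fn first_byte_sound(v: &[u8]) -> Option<u8> {
                                      v.first().copied()
                                  }

                                  fn main() {
                                      assert_eq!(first_byte(b"hi"), b'h');
                                      assert_eq!(first_byte_sound(&[]), None);
                                  }
                                  ```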

                            2. 7

                              People in that thread are not shy about pointing out Rust’s imperfections, but I don’t see any of them claiming that “well-written C++ is safer than Rust in many cases”? At best, the linked post gives a fairly niche (RTOS on special hardware) example where C++ was as safe as Rust.

                              The point of experimenting is that you don’t put all your resources in it, and the consensus on that thread seems to be that they’ll introduce Rust at the edges, without committing fully yet. The hope is that Rust will actually help their manpower issues, by being a more productive language and by attracting new contributors. Worth a try.

                            3. 7

                              In terms of urgency I never understood the push to port coreutils-esque cli utilities to Rust (e.g. cat, less, ls, rm, etc.) Those have a pretty low attack surface and the threat is sort of remote (manipulating a malicious file or filename on your file system).

                              On the other hand network servers seem relatively more urgent. More urgent than network servers is the kernel itself. I think that’s where the conversation should trend.

                              The inertia of C/C++ is incredible. For some reason Rust still seems too experimental to strongly suggest that people in FreeBSD commit to start rewriting their kernel in Rust. We should compare this situation to the maturity of C when the UNIX kernel was rewritten in C: C was much less mature than Rust is now when that happened. There are a lot of reasons for the difference in perception but I think the major one is that the author of the UNIX kernel was the same person as the author of the original C compiler. Using the new language felt less risky because anything that would have been needed from the language would be relatively quick to add.

                              Maybe once Rust adoption in the Linux kernel goes exponential, the perception of risk will evaporate and FreeBSD will follow soon after. It will be interesting to see what NetBSD and OpenBSD do after that. Very interesting times in the world of software.

                              1. 12

                                When looking at benefit/cost ratio, small cli utils have a low cost, many of them started as small side-projects. The benefit is also much more than just safety, for example some people seem more interested in productivity, or unit-testing techniques that are a PITA in C. Projects like Ripgrep show that even rewriting mature and ubiquitous tools can end up being successful and much better than the original.

                                1. 2

                                  Also if you want to make a small change to something… GNU coreutils are not amazing for that. I’d honestly rather implement a small utility that does just the new thing than touch any of their sources.

                                  I can understand/change the new rust implementations just fine though.

                                2. 7

                                  In terms of urgency I never understood the push to port coreutils-esque cli utilities to Rust (e.g. cat, less, ls, rm, etc.) Those have a pretty low attack surface and the threat is sort of remote (manipulating a malicious file or filename on your file system).

                                  When learning a new language, I will sometimes rewrite common utilities. Doing this allows me to learn how to work with the filesystem and the OS. Writing something as simple as ls(1) can teach me about iterators, recursion, safe filesystem access, string formatting, etc.

                                  Sometimes writing code is less about urgency of safety and more about curiosity.

                                  1. 5

                                    Testability and the ability to manage multiple targets are the characteristics of Rust that are attractive in the utilities case, not memory safety. C is… highly problematic for these.

                                  2. 6

                                    One project that I worked with, for example, … read a value from an MMIO register into a variable typed as an enumeration. Outside of the unsafe block, it then checked that the value was in range. Rust enumerations are type safe and so the compiler helpfully elided this check.

                                    This is an anecdote I remember david_chisnall relating on Lobste.rs. I imagine the unsound MMIO code was using something like ptr::read::<SomeEnum>. While I almost entirely avoid writing unsafe Rust, I know one must be careful about upholding the validity invariants of types if one does write unsafe Rust, so I would read data from outside the program (where they’re not really typed) as just byte(s) and then convert them into a richer type with a normal constructor/conversion function in safe code.¹ Although there doesn’t appear to be one yet, I imagine it shouldn’t be too difficult (for people who know how) to write a lint in the Clippy linter to disallow ptr::reading into types (such as enums) that have nontrivial validity invariants.

                                    ¹ I would expect this to be a zero-cost abstraction if the enum variants are defined with the corresponding integer values, as they presumably would be if one had been ptr::reading them.

                                    1. 3

                                      Yes, Rust and C enums do not work the same way, so there’s a potential footgun here.

                                      The text goes on to say that the issue disappeared when the check was moved into the unsafe block. That shows a misunderstanding of how Undefined Behavior works – moving the check inside the unsafe block changes absolutely nothing, except that the compiler may be optimizing things a bit differently. (Or maybe the change was larger than just literally moving the check into the block; it’s not clear from the text.)

                                      The fix is to do the load at integer type, check, and only then convert things to an enum. (Or likely, do the checking and conversion as one operation.)
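
                                      A minimal sketch of that fix (enum and values invented): the checking and the conversion happen as one safe operation, and only the volatile load itself would need unsafe:

                                      ```rust
                                      #[derive(Debug, PartialEq, Clone, Copy)]
                                      enum DeviceState { Idle, Busy, Error }

                                      // Checking and conversion as one operation; an out-of-range
                                      // hardware value is an error, never an invalid enum value.
                                      fn decode(raw: u32) -> Option<DeviceState> {
                                          match raw {
                                              0 => Some(DeviceState::Idle),
                                              1 => Some(DeviceState::Busy),
                                              2 => Some(DeviceState::Error),
                                              _ => None,
                                          }
                                      }

                                      fn main() {
                                          // In real MMIO code only the load is unsafe, e.g.
                                          // `unsafe { core::ptr::read_volatile(reg as *const u32) }`.
                                          let raw = 1u32; // stand-in for the register read
                                          assert_eq!(decode(raw), Some(DeviceState::Busy));
                                          assert_eq!(decode(0xdead), None);
                                      }
                                      ```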

                                    2. 5

                                      Neither Rust nor C++ guarantee safety.

                                      While this is true in the strictest sense of the word “guarantee”, it is comparable to saying that neither freeclimbing nor proper belaying guarantee that you will survive the climb.

                                      Rust has a huge lead on C++ when it comes to safety: Rust has safe abstractions. For instance: The implementations of Vec in Rust and of std::vector in C++ are very comparable, and they are equally unsafe – someone has to get that code right. But using Vec in Rust is entirely safe; there is no way to accidentally push while iterating, or any of the other ways that the std::vector API can be misused. So you have to carefully audit a few thousand lines of vector implementation, but in Rust you do not have to audit the millions of lines of code that use the vector type. Even the latest C++ and state-of-the-art static analyses are a far cry from that kind of safety guarantee. If you disagree, please provide a reference to a static analysis that (a) accepts enough code that it passes on real-world code bases, and (b) can be formally proven to not miss any bugs in std::vector use.
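
                                      As a small sketch of the Vec point (invented helper): the push-while-iterating bug is a compile error in Rust, which forces a pattern that cannot invalidate the iterator:

                                      ```rust
                                      fn extend_with_doubles(v: &mut Vec<i32>) {
                                          // The C++-style version does not compile:
                                          //
                                          //     for x in &*v { v.push(*x * 2); }
                                          //     // error[E0502]: cannot borrow `*v` as mutable
                                          //     // because it is also borrowed as immutable
                                          //
                                          // so we are forced to finish reading before mutating:
                                          let doubled: Vec<i32> = v.iter().map(|x| x * 2).collect();
                                          v.extend(doubled);
                                      }

                                      fn main() {
                                          let mut v = vec![1, 2, 3];
                                          extend_with_doubles(&mut v);
                                          assert_eq!(v, vec![1, 2, 3, 2, 4, 6]);
                                      }
                                      ```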

                                      A different way to put it is that Rust and its standard library have been designed from the start to be statically analyzable, and C++ has not, making static analysis of C++ orders of magnitude harder.

                                      Safe Rust does guarantee safety. There is no “safe C++” with a similar guarantee.

                                      1. 2

                                        Neither Rust nor C++ guarantee safety.

                                        While this is true in the strictest sense of the word “guarantee”, it is comparable to saying that neither freeclimbing nor proper belaying guarantee that you will survive the climb.

                                        From your username and your having been invited by a verified Rust team member, I take it that you are Dr Ralf Jung, Rust operational semantics expert and rock climbing enthusiast? :-) Welcome to Lobste.rs!

                                        1. 2

                                          Yes I am. :) Thanks!