The value of crates.io is that there’s no curation, so anyone can upload their base64 decoder or x509 certificate parser and have them hosted at no charge on reliable infrastructure, with auto-generated documentation.

The problem with crates.io is that there’s no curation and also no namespacing, so https://crates.io/crates/base64 and https://crates.io/crates/x509 are version v0.x and uploaded by some blokes in sheds.

https://crates.io/crates/sha256 and https://crates.io/crates/sha2 both exist; which one do you think is the “official” crate for SHA-256? If either?
Putting a random grab-bag of libraries into a big source control repository and calling it stdx isn’t something that’s going to help, because conglomerated mega-libraries have existed for longer than TCP/IP and people still can’t agree on which regex syntax libc should implement. I don’t want to go back to the idioms of C programming when libraries like glib provide implementations of base64 and XML and command-line flag parsing and thread pools. There’s nothing wrong with small libraries, it’s the organization (/ categorization) that’s the problem.
If the author wants something like a Rust equivalent of Go’s standard library then it seems unlikely to ever happen, because the Rust core devs don’t want to be in the business of writing a tar codec. If they want something like the golang.org/x/ packages, then that’s a problem with crates.io rather than Rust stdlib – there is no technical reason we can’t have an officially-maintained SHA-256 implementation in a rust-lang.org/x/crypto/sha256 package.
Rust developers are stuck in an endless hamster wheel where every week or month there is a new best way to do something and the previous way is now deprecated, kind of like the frontend development world with its weekly hottest JavaScript framework.
The time has come for the author of this post to learn that chasing the newest, most Discord-approved library is a futile and burnout-inducing endeavor.
Don’t worry about which libraries are newest. libpng has been compressing lossless cat pictures for almost 30 years and it’s still a reasonable default choice if you want to work with PNGs.
Don’t worry about social proof or popularity when choosing a library. The only vote that matters is the computer’s. If you’re worried that a library you depend on might vanish from the internet in 10 years, then make a copy of it. If you’re worried it will become unpopular, then stop worrying about that.
If you care about being “best” then you’re always chasing a moving target. Frontend is full of moving targets because many people publishing to NPM care about how many downloads their packages get, so they advocate for people to use their libraries by claiming to be the “best” at manipulating the DOM or whatever. Do you think people writing firmware spare any time wondering if their i2c library is “the best”?
Try to be the Rust developer who comes over from C and still writes documentation in nroff because they don’t see any reason not to. Don’t be the Rust developer who comes over from JavaScript and tries to “follow the latest news” about crates.io libraries as if it’s some sort of sports league.
I’ll second the namespacing issue. I really don’t understand cargo’s reasoning for going with a flat namespace here. At least you can use your own repositories and git repos as crates if you want, though the experience is strictly worse than how Go handles it.
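For what it’s worth, the git-repo escape hatch is a one-liner in Cargo.toml. A sketch (the repo name and rev here are placeholders, not a real crate):

```toml
[dependencies]
# Bypass crates.io entirely: fetch straight from a git repository,
# optionally pinned to a specific commit.
some-lib = { git = "https://github.com/example/some-lib", rev = "a1b2c3d" }
```

The pain compared to Go is that this is per-dependency configuration rather than the naming scheme itself, so transitive dependencies still resolve through the flat crates.io namespace.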
Maybe it’s time for an alternative cargo ecosystem.
That being said, Go also has a massive company backing it and is rather precise in what they include or change in their stdlib and extended stdlib.
FWIW, a namespace RFC was proposed and accepted two years ago: https://github.com/rust-lang/rfcs/blob/c8a517188452e80b85b192f39de4ec0351afd264/text/3243-packages-as-optional-namespaces.md. It just needs to be implemented. Very few people care enough, and the cargo team is stretched thin. A second ecosystem will not solve this. It’s not that cargo doesn’t accept contributions, it’s that no one is bringing the contributions.
That’s not an RFC for namespaces, it’s an RFC for package name prefixes.
Why should someone who wants namespaces contribute by implementing a feature that they don’t want, and is a poor substitute for the feature they do want?
Namespaces won’t solve it, because they tell you nothing about security or quality of the packages.
You might think they do, because you expect to be able to infer that from the owner, but this has lots of pitfalls:
Projects can change ownership without changing the namespace! This happens on GitHub and npm all the time. “JiaTan”s are given access to the existing project, and it isn’t moved to another namespace, because that’s disruptive to users.
Reputable official namespaces can contain obsolete garbage (batteries included always end up leaky). This is already the case with some rust-lang-owned crates.
The official namespace does not give the same security guarantees to everything under it. I’ve explicitly asked Rust project members about this – development practices vary from project to project, and some are just pet projects someone happens to own under rust-lang.
I’ve also been told off by Debian maintainers for insinuating that packages included in Debian had their code reviewed for supply-chain-security issues. They check licenses, and ensure the code works, but they can’t promise checking for backdoors (that’s entirely reasonable, because security-focused code review is painfully laborious and difficult).
In a large ecosystem you will end up having to pull deps from many namespaces, and you will not know how reputable they are. There are top authors with silly, creatively spelled nicknames, and you’re at risk of not noticing typosquatting in the silly names. There are namespaces that sound like they’re owned by a well-known trillion-dollar company but happen to be owned by an independent developer who just grabbed the name. There are namespaces that look like they belong to a serious company with headshots of a board of directors on its site, but are actually run by an “entrepreneur” coding from mom’s basement.
In the end it all needs to be reviewed and vetted. You don’t have better chances eyeballing namespace names alone.
Namespaces won’t solve it, because they tell you nothing about security or quality of the packages.
Namespaces will solve it, because they are a pre-requisite for writing policies about security or quality of packages, which is a non-technical process done by end-users. I don’t need the crates.io website to tell me if a library is secure or well-written.
Projects can change ownership without changing the namespace! This happens on GitHub and npm all the time.
Doesn’t matter what happens on GitHub or NPM. Whether a given name can change ownership has nothing to do with whether that name is allowed to co-exist with other names.
I can trust that kernel.org serves official Linux kernel source archives, I can trust that gnu.org is a reliable source for GNU Make source archives, I can trust that rust-lang.org is a reliable source for Rust toolchain binaries.
Even today, on crates.io, a crate is allowed to be transferred to other people – this is in fact one of the arguments that people have used against namespaces (because transferring crate ownership would no longer automatically bring along existing users).
Reputable official namespaces can contain obsolete garbage (batteries included always end up leaky).
Doesn’t matter. I’m not using the namespace to determine the quality of code, that’s what dependency code review is for.
The official namespace does not give the same security guarantees to everything under it.
Doesn’t matter – see above. Namespaces aren’t a trust mechanism, they’re a publishing mechanism.
I’ve also been told off by Debian maintainers for insinuating that packages included in Debian had their code reviewed for supply-chain-security issues.
I’m not expecting any packages on crates.io to have been reviewed for malicious code. It’s a package registry, its job is to host source code and provide name=>tarball lookup.
In a large ecosystem you will end up having to pull deps from many namespaces, and you will not know how reputable they are.
This is a matter of user policy. It’s not up to the crates.io team to determine whether any given package is trustworthy, that’s up to dependency review.
Namespaces will solve it, because they are a pre-requisite for writing policies about security or quality of packages, which is a non-technical process done by end-users. I don’t need the crates.io website to tell me if a library is secure or well-written.
i really don’t understand what you’re trying to say here, because on the surface this statement seems blatantly false? if you want to write a policy that you only trust crates published by certain organizations, you can use the crates.io ownership fields to enforce that.
i’m also not sure what you mean by “non-technical”, security audits of dependencies seem pretty technical to me?
if you want to write a policy that you only trust crates published by certain organizations, you can use the crates.io ownership fields to enforce that.
The crates.io ownership field is used for configuring permissions of crates.io packages; it doesn’t actually say anything about who the owners (/ maintainers, developers, etc.) of a library are.
I can add other people (including the Rust official teams) to my crates. That doesn’t mean my crates are maintained by the Rust core developers.
I can upload rust-lang/libc under my own account under a false name, that doesn’t make me the owner of https://github.com/rust-lang/libc
Remember, crates.io is just a package registry. It doesn’t have anything to do with security or code quality.
i’m also not sure what you mean by “non-technical”, security audits of dependencies seem pretty technical to me?
Say you work at a small company with ~500 engineers or so. The CTO (been a manager for 20 years) might write a policy describing which third-party dependencies are allowed, the General Counsel will add a section about license compliance, and that policy will be provided as evidence to a SOC2 auditor (usually an accountant).
The policy will have rules about which level of review is required for different cases, how the results of that review are recorded, how much of the code will be mirrored internally and in what format, etc.
The policy might have an impact on how software engineers do their jobs, but the policy itself is not a technical document. It’s unlikely to have anything Rust-specific in it other than rules about which hosting services are OK (“GitHub and crates.io are OK, no personal homepages”), and it might have been written by people who have never written (or read!) any Rust code.
If you’re allowed to say, does Cloudflare review and vet their dependencies’ code? If so, do you think they would be willing to join
Google, Mozilla, et al. in publishing those reviews for cargo-vet (or similar), so that the great labour of reviewing all dependencies can be shared by more well-resourced parties?
(Of course some would say that one shouldn’t trust even Google or Cloudflare to review one’s dependencies for one, but I imagine cargo-vet could grow an option to require some N > 1 third-party reviews to accept a crate that lacks one’s own review.)
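For context, importing other organizations’ audits is already how cargo-vet is meant to be used. Roughly, in a project’s supply-chain/config.toml (the URLs below are illustrative; check each publisher’s documentation for the real location of their audits file):

```toml
# supply-chain/config.toml (sketch)
# Pull in audit records published by third parties, so crates they have
# already reviewed don't need a fresh in-house review.
[imports.google]
url = "https://raw.githubusercontent.com/google/supply-chain/main/audits.toml"

[imports.mozilla]
url = "https://hg.mozilla.org/mozilla-central/raw-file/tip/supply-chain/audits.toml"
```

An N-of-M requirement (accepting a crate only when several independent importers have audited it) isn’t a built-in cargo-vet feature as far as I know; it would have to be layered on top.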
I agree, but this mindset only works to a point unless your dependencies are independent and self-contained, which is rare in Rust. Most often each library carries its own little (or big) tree of transitive dependencies and those trees overlap each other. You might have picked two dependencies and wound up shipping three “error” crates, only one of which is maintained any more because your deps and their deps are out there chasing the ideal and this churn gets caught up in updates for security reasons. For programs of any serious size it becomes your problem unfortunately.
While I broadly agree - the constant churn of best practices is exhausting and IMO an extended standard library would be wonderful - I’m not sure this article is bringing a lot to the discussion, particularly the technical aspects of bringing so many libraries into alignment or the politics of choosing. The reference to Adam Harvey’s investigation is misleading. From the referenced report:
Only 8 crate versions straight up don’t match their upstream repositories. None of these were malicious: seven were updates from vendored upstreams (such as wrapped C libraries) that weren’t represented in their repository at the point the crate version was published, and the last was the inadvertent inclusion of .github files that hadn’t yet been pushed to the GitHub repository.
I think it’s just part of Rust’s culture at this point that libraries will be written and rewritten and people will pull whatever they like from crates.io’s immense immutable history. Give it another 10 years and I’m optimistic things will settle down though. (I’m not being flippant, I think this will be a wonderful period for Rust.) In the meantime one has to be nimble.
P.S. Curiously this loads fine on mobile but returns Forbidden on desktop?
Fully agree. I feel like the churn of best practice in Rust is a symptom, not the problem. The “problem” being that designing a cohesive, stable, batteries-included standard library is hard: off the top of my head, scanf in C and Go’s net.IP are some good examples where a standard library can under-deliver, leaving a permanent wart. And in Rust specifically, I think passing by &T/&mut T/T, object safety, Send/Sync, and async also add complexity when trying to figure out how to design an interface. “Churn” is a sign that the interface wasn’t as obvious after all!
As someone who writes Rust both for fun and for $dayjob, I like having small, easily swappable components. It’s definitely annoying when there’s ecosystem churn, but it’s not like my existing dependencies will break (oh, and Cargo even lets you import multiple versions of the same dep, so you don’t even need to upgrade everything at once anyway)
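The multiple-versions trick is worth spelling out: Cargo resolves semver-incompatible versions side by side automatically, and if you want to use both directly from one crate, dependency renaming via the `package` key does it. A sketch (the crate and versions are just an example):

```toml
[dependencies]
# Two major versions of the same crate, imported under different names.
rand = "0.9"
rand_legacy = { package = "rand", version = "0.8" }
```

Your code then refers to `rand` and `rand_legacy` as if they were unrelated crates, which lets you migrate call sites incrementally.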
I don’t know Zig well, but calling out its standard lib as more extended than Rust’s seems disingenuous.
The whole article reeks of “if only someone else would solve these problems for me!”. Go ahead, join the libs team. Go write some proposals for the Rust Foundation to fund it. Looking at this blog’s previous articles, nope, they aren’t contributing anything or doing any experiments to solve it, they’ve just been complaining about the same problems for about a year.
… is that an option? I’ve given up on writing ACPs and RFCs because of the extremely long delay before they get reviewed, and I figured libs-team is way undersized (6 people, compared to 15 on compiler-team) because that’s how the existing members prefer it.
It was an option like 6 years ago when I last considered it? Dunno how much the practical mechanics have changed since then. Their zulip seems open to all, and their weekly meetings appear to be as well. My experience with the Rust project in general is that if you participate regularly, ask useful questions, and come up with interesting ideas, people will start to recognize you and pay attention.
yeah, rustc meetings are a lot like city council meetings in that you can just show up to them. in my experience the only things you get from membership in a team are the ability to review PRs and vote on certain things.
This drives home the point I wanted to make: the Rust ecosystem was and still is changing by a ton. And whatever you decide to integrate into std now will be the wrong decision in 2 years, either because the language will allow for easier ways to express the same functionality (async/await, NLL, const generics) or because the best solution for the problem changed (see the talk’s first few minutes about error crates; I experienced this myself, including error-chain).
Now some projects might settle down and ecosystems might slow down, to the point they are stable. Something I would say about tokio (beware of the in-flight io_uring..). Which is good. But at that point you might as well use those stable libraries - why even bother putting them in the std. Especially since my example of tokio might be totally wrong - we already had people talk about different runtime requirements than what tokio offers - and we have definitely other runtimes available, which some people do prefer (smol, monoio, async-std to name a few).
And if you say “tokio is the default, put it into the std”: You can always pin tokio 1.0 in your project (or even a specific hash) and stop worrying. Yes - you can.
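Pinning really is that simple, if you want it. A Cargo.toml sketch (the rev hash below is a placeholder):

```toml
[dependencies]
# Pin to an exact published version; Cargo will never auto-upgrade this.
tokio = { version = "=1.0.1", features = ["full"] }

# Or pin to a specific commit of the repository instead:
# tokio = { git = "https://github.com/tokio-rs/tokio", rev = "a1b2c3d" }
```

Committing Cargo.lock gets you most of this for applications anyway; the `=` requirement is for when you want the pin enforced even through `cargo update`.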
I think Sylvain has a point: With each crate in your dependency graph, you need to either trust the author or audit the code. However, this also applies to the standard library, so moving code there doesn’t magically fix the supply chain problem. Same with namespacing, which is also touted as a solution every time this comes up, as if adding an org identifier and allowing for more potential typosquatting would somehow improve supply chain safety.
The solution is doing the work, not hoping that the understaffed and overworked libs team volunteers do it. Besides, the foundation is already working with JFrog to audit crates.io for malicious code. Yet every time the topic comes up, we get the same rallying cries for either crate namespacing, a bigger standard library or both.
I have thought about this reply for way too long. I do understand the higher trust level that the rust org has, as well as the idea of keeping the amount of organizations you need to trust low. (rust org + tokio org makes two already, going with tokio as example). Nevertheless I think it wouldn’t make much of a difference if the folks maintaining tokio would do the same under the rust org, especially when keeping the current development pace.
Thus my feeling is that this is more about the feeling of trust and that such a move wouldn’t really gain any higher security. And if you would lock down the amount of people that can contribute, you would probably starve the project and elevate the burden on the rust org even more.
There is also the aspect of whether the tokio project would be slowed down massively by being just one part of the overall rust org - which obviously has its main priority around the rust compiler and language, overshadowing the tokio project.
The problem is that my argument only holds true for tokio as example. If you count the 300+ crates a typical webserver has, then yeah. You probably would gain something by putting this all under - say - the tokio org, which already maintains axum. Then again, huge parts of those 300 crates are probably from the same area of contributors (axum, http, hyper, tower, tokio, futures)
These are top crates ranked by popularity * quality (where quality is a mix of lots of ranking signals computed from each crate’s source, usage history, and pagerank-like author trust).
I think a lot of this fear and confusion is caused by search on crates.io being… to put it nicely, unopinionated. It doesn’t detect obsolete crates. It doesn’t rank by author reputation. It will give you results by exact name match, even if that’s a v0.0.1 hello world package published 7 years ago.
The problem with opinionation is, of course, the opinions. I wish I could recommend lib.rs for more things, but trying to find crates on lib.rs related to my day job involves occasionally seeing a banner with an inaccurate and political screed against my choice of technology, and incremental improvements to crates.io’s search algo seems better than endorsing that.
This is kind of an unfair comparison. Go has a group of engineers that decide what’s in stdlib who are full-time, long-tenured, and fearlessly assertive, as well as (IMO) smart, experienced, and tasteful. That is an extremely expensive and challenging resource to create and maintain (as OP points out). It also has a fairly narrow intended “sweet spot” of network server infrastructure that makes it easier to decide what belongs in the stdlib. Even if Rust somehow got the first thing, it wouldn’t have the second.
The trouble is you can’t really do this halfway, or the library crates won’t interoperate. I’m reminded of the nightmarish time when you couldn’t load multiple C# libraries in the same process because they would disagree on what Object is. At some point you end up having to draw boundaries (process or otherwise) to be able to use multiple different library “cultures” in one project. (Hm, maybe a “culture” of interoperating libraries could be an explicit thing in crates.)
All other major programming languages and ecosystems have internalized this fact and provide solid and secure foundations to their developers. Go is paving the way, with Deno and Zig following closely.
The extended standard library, let’s call it stdx, should be independent of the language so it can evolve at its own pace and make breaking changes when required.
The real stdx was a project by Brian Anderson (brson), who was the (de facto?) leader of the Rust Project for some years after Graydon Hoare left.
And yet I’m pretty confident that this codebase will continue to work correctly with any database/sql version which will be released in the next 10 years.
This is not something I would be willing to say about any of the Rust db access libraries. At least this has been my experience working professionally in Rust for the last 4 years.
I’m curious what kind of experience folks who want this have with Java or C++.
Java has a broad standard library, but arguably you aren’t supposed to use it but instead you are supposed to find an Apache Commons library (or, in the case of i18n, ICU4J) for what you need.
C++ has a less broad standard library, but large serious code bases like browser engines have their own replacements anyway that, among other things, reject the standard committee notion that C++ has exceptions.
I can already see stuff going into the next version of the C++ standard library where my advice as a domain expert has to be “don’t use it”. (C++ is going to put the IANA charset registry into the standard library.)
It’s sad to have a large standard library that you need to advise users of the language not to use.
So IMO C++ will never be a language for Python-like problems. Extremely basic things have the “wrong defaults”
(and IMO this not trivial – I remember when Go came out, it lost to Python on some benchmarks because the hash table wasn’t tuned)
So C++ is more like a language that’s good for writing libraries and infrastructure, including your own non-slow hash tables … probably the best language for writing hash tables! But ironically the default one is bad.
C++ has boost (www.boost.org)
It is a very high quality library, with a very high bar to add contributions into.
Header-only libraries are what I preferred to use where possible (boost was also the reason why I started using CMake back then, to simplify inclusion of specific boost libs).
I think apache-commons was Java’s extended standard library for a while … but I do not think anything in java ecosystem had the breadth and quality of boost.
I think this ties into my comment/theory from the other day. The bar is high, so you don’t have a http client in the standard lib. You contrast this with Python, where they’ve got an ok-ish library in the stdlib, it works for a lot of projects, especially ones that need independence. You have extremely popular third party libraries that improve either convenience (requests) or functionality (urllib3). And that’s not even getting into async - the language ships with async in its interpreter and its stdlib and it lacks an async http library in the stdlib entirely! The bar wasn’t as high there, they shipped something not great but workable and people still get a lot of use out of it.
I’m reminded of how, as a Python 2.x developer, I’d see things like the standard library having urllib and urllib2 and being told “Don’t use those. Use Requests… based on urllib3, which adamantly refuses to ever be stdlib’d”. There’s a reason the Python ecosystem has the sentiment, “the standard library is where packages go to die”.
This is a no-brainer, imho, but the cost of making this happen is pretty huge. You can’t just make it, you’ll have to maintain it (including backward compat) for years to come – that’s an expensive endeavour, and it only highlights why Go is the only language with an exhaustive standard library: Google can cover the cost of development and maintenance. Not to mention BDFL leadership, which in the Rust world is borderline impossible unless you want a new drama every other day.
What library do I use for time? time. If my app needed to deal with time zones, I’d use jiff. But since it doesn’t need to deal with time zones, it does not need to suffer the app-size penalty of having a time zone database statically linked. libstd would be lesser if it required tzdata to be bundled, and it would be lesser if, when I do need time zones, it took up space with a time library I couldn’t use.
The decision paralysis the author is hinting at, just plain doesn’t exist. The comment about web frameworks in particular is just silly. Different people have different preferences, different tasks have different needs, and libstd endorsing one blessed way doesn’t make life any easier for most people, just means libraries are less likely to support other ways.
You can, at any time, create your own ‘extended std’ crate which reexports a bunch of other crates. There are no actual problems that this crate actually solves, even if you name it stdx.
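Concretely, such a facade is trivial: the whole crate is a Cargo.toml listing the dependencies plus a lib.rs of re-exports. The crate choices below are just examples, which is exactly the problem — agreeing on the list is the hard part:

```
// lib.rs of a hypothetical `stdx` facade crate; each `pub use`
// re-exports a dependency declared in its Cargo.toml.
pub use anyhow;    // error handling
pub use itertools; // iterator adapters
pub use rand;      // randomness
pub use regex;     // regular expressions
```

Downstream users would write `stdx::regex::Regex`, but they gain nothing over depending on the crates directly, and they lose the ability to pick versions.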
If someone goes down this extended standard library, I’d really like to see it apply what I really like from the node library: support levels. Make it easy to have things get pulled in at a certain stability level, so it can be experimental, get things fixed, and then finally reach a better level of stability (or remove or whatever).
As a rust user, I do often wish for a standard library of “correct, even if not performant” libraries (I really can’t handle regexes without pulling in a third party lib?). I really appreciate Python being so useful out of the box, and often just work on stuff with Python and Dash open, diving into the docs of the standard library to see how far I can go.
I really am not a fan of nightly/stable as the primary distinction for a lot of feature-flag stuff in Rust. I understand it’s helpful, as a consumer of a library, to know that a specific version relies on nightly features, and abstractly I get the value, but as a person who loves to upgrade stuff, it doesn’t bug me that much.
I really can’t handle regexes without pulling in a third party lib?
The regex crate is a first-party lib: https://crates.io/crates/regex (look at the owners and repo). It was also created by burntsushi who is a member of the library API team.
Yeah you’re right, “third party” is not the right way of describing regex in particular. But why isn’t it in the standard library?
From the outside looking in, my impression is the standard lib is really trying to stick to doing things that have predictable performance characteristics and where there are not… I guess contentious decisions? Like for strings even (where you could argue not having short string optimizations makes them unsuitable for lots of use cases), at least there’s the consistent Vec memory layout. Am I missing something?
Yeah you’re right, “third party” is not the right way of describing regex in particular. But why isn’t it in the standard library?
API stability promises. Same as with the rand crate. With a first-party crate, you can continue to run 1.x on a new compiler or, assuming it doesn’t use shiny new features, use 2.x on an old one.
With the standard library, they update in lockstep and must not break backwards compatibility. That’s how you get things like C++’s standard library regex support being slower than shelling out to PHP. (It’s also why we’ll forever have two deprecated methods on the Error trait.)
The general rule for the Rust standard library is that it’s for:
Interfaces and implementations referenced by language constructs or other standard library code (eg. Result, Option, the machinery that async/await depend on under the hood, etc.)
Stuff that’s considered so universal and felt to be such a solved problem that it’s merited. (See the recent completion of getting once_cell merged into the standard library because every project of non-trivial size contains either once_cell or the older lazy_static or, more likely… both.)
A few things that they didn’t get around to evicting before the v1.0 feature freeze, like std::collections::LinkedList.
That’s also why http isn’t part of the standard library. It’s not necessary for it to be there to provide a standard set of types for the ecosystem to interoperate via.
This is why I really like Go, it comes with batteries included. I can go a very long way with Go’s standard library before I need/want to use third-party dependencies. From a security perspective, this is a great win!
The value of crates.io is that there’s no curation, so anyone can upload their base64 decoder or x509 certificate parser and have them hosted at no charge on reliable infrastructure, with auto-generated documentation.
The problem with crates.io is that there’s no curation and also no namespacing, so https://crates.io/crates/base64 and https://crates.io/crates/x509 are version v0.x and uploaded by some blokes in sheds.
https://crates.io/crates/sha256 and https://crates.io/crates/sha2 both exist; which one do you think is the “official” crate for SHA-256? If either?
Putting a random grab-bag of libraries into a big source control repository and calling it
stdx
isn’t something that’s going to help, because conglomerated mega-libraries have existed for longer than TCP/IP and people still can’t agree on which regex syntaxlibc
should implement. I don’t want to go back to the idioms of C programming when libraries likeglib
provide implementations of base64 and XML and command-line flag parsing and thread pools. There’s nothing wrong with small libraries, it’s the organization (/ categorization) that’s the problem.If the author wants something like a Rust equivalent of Go’s standard library then it seems unlikely to ever happen, because the Rust core devs don’t want to be in the business of writing a
tar
codec. If they want something like thegolang.org/x/
packages, then that’s a problem with crates.io rather than Rust stdlib – there is no technical reason we can’t have an officially-maintained SHA-256 implementation in arust-lang.org/x/crypto/sha256
package.The time has come for the author of this post to learn that chasing the newest, most Discord-approved library is a futile and burnout-inducing endeavor.
Don’t worry about which libraries are newest. libpng has been compressing lossless cat pictures for almost 30 years and it’s still a reasonable default choice if you want to work with PNGs.
Don’t worry about social proof or popularity when choosing a library. The only vote that matters is the computer’s. If you’re worried that a library you depend on might vanish from the internet in 10 years, then make a copy of it. If you’re worried it will become unpopular, then stop worrying about that.
If you care about being “best” then you’re always chasing a moving target. Frontend is full of moving targets because many people publishing to NPM care about how many downloads their packages get, so they advocate for people to use their libraries by claiming to be the “best” at manipulating the DOM or whatever. Do you think people writing firmware spare any time to wondering if their i2c library is “the best”?
Try to be the Rust developer who comes over from C and still writes documentation in nroff because they don’t see any reason not to. Don’t be the Rust developer who comes over from JavaScript and tries to “follow the latest news” about crates.io libraries as if it’s some sort of sports league.
I’ll second the namespacing issue. I really don’t understand cargo’s reasoning for going with a flat namespace here. At least you can use your own repositories and git repos as crates if you want, though the experience is strictly worse than how Go handles it.
Maybe it’s time for an alternative cargo ecosystem.
That being said, Go also has a massive company backing it and is rather precise in what they include or change in their stdlib and extended stdlib.
FWIW: A namespace RFC has been proposed and passed for 2 years: https://github.com/rust-lang/rfcs/blob/c8a517188452e80b85b192f39de4ec0351afd264/text/3243-packages-as-optional-namespaces.md.
It just needs to be implemented. There are very few people who care enough, and the cargo team is stretched thin. A second ecosystem will not solve this. It’s not like cargo doesn’t accept contributions; it’s that no one is bringing the contributions.
That’s not an RFC for namespaces, it’s an RFC for package name prefixes.
Why should someone who wants namespaces contribute by implementing a feature that they don’t want, and is a poor substitute for the feature they do want?
Namespaces won’t solve it, because they tell you nothing about security or quality of the packages.
You might think they do, because you expect to be able to infer that from the owner, but this has lots of pitfalls:
Projects can change ownership without changing the namespace! This happens on GitHub and npm all the time. “JiaTan”s are given access to the existing project, and it isn’t moved to another namespace, because that’s disruptive to users.
Reputable official namespaces can contain obsolete garbage (batteries included always end up leaky). This is already the case with some rust-lang-owned crates.
The official namespace does not give the same security guarantees to everything under it. I’ve explicitly asked Rust project members about this – development practices vary from project to project, and some are just pet projects someone happens to own under rust-lang.
I’ve also been told off by Debian maintainers for insinuating that packages included in Debian had their code reviewed for supply-chain-security issues. They check licenses, and ensure the code works, but they can’t promise checking for backdoors (that’s entirely reasonable, because security-focused code review is painfully laborious and difficult).
In a large ecosystem you will end up having to pull deps from many namespaces, and you will not know how reputable they are. There are top authors with silly, creatively spelled nicknames, and you’re at risk of not noticing typosquatting in the silly names. There are namespaces that sound like they’re owned by a well-known trillion-dollar company but happen to be owned by an independent developer who just grabbed the name. There are namespaces that look like they belong to a serious company with headshots of a board of directors on their site, but are run by an “entrepreneur” coding from mom’s basement.
In the end it all needs to be reviewed and vetted. You don’t have better chances eyeballing namespace names alone.
Namespaces will solve it, because they are a pre-requisite for writing policies about security or quality of packages, which is a non-technical process done by end-users. I don’t need the crates.io website to tell me if a library is secure or well-written.
Doesn’t matter what happens on GitHub or NPM. Whether a given name can change ownership has nothing to do with whether that name is allowed to co-exist with other names.
I can trust that kernel.org serves official Linux kernel source archives, I can trust that gnu.org is a reliable source for GNU Make source archives, I can trust that rust-lang.org is a reliable source for Rust toolchain binaries.
Even today, on crates.io, a crate is allowed to be transferred to other people – this is in fact one of the arguments that people have used against namespaces (because transferring crate ownership would no longer automatically bring along existing users).
Doesn’t matter. I’m not using the namespace to determine the quality of code, that’s what dependency code review is for.
Doesn’t matter – see above. Namespaces aren’t a trust mechanism, they’re a publishing mechanism.
I’m not expecting any packages on crates.io to have been reviewed for malicious code. It’s a package registry, its job is to host source code and provide name=>tarball lookup.
This is a matter of user policy. It’s not up to the crates.io team to determine whether any given package is trustworthy, that’s up to dependency review.
i really don’t understand what you’re trying to say here, because on the surface this statement seems blatantly false? if you want to write a policy that you only trust crates published by certain organizations, you can use the crates.io ownership fields to enforce that.
i’m also not sure what you mean by “non-technical”, security audits of dependencies seem pretty technical to me?
The crates.io ownership field is used for configuring permissions of crates.io packages; it doesn’t actually have anything to say about who the owners (/ maintainers, developers, etc.) of a library are. If I published rust-lang/libc under my own account under a false name, that doesn’t make me the owner of https://github.com/rust-lang/libc.

Remember, crates.io is just a package registry. It doesn’t have anything to do with security or code quality.
Say you work at a small company with ~500 engineers or so. The CTO (been a manager for 20 years) might write a policy describing which third-party dependencies are allowed, the General Counsel will add a section about license compliance, and that policy will be provided as evidence to a SOC2 auditor (usually an accountant).
The policy will have rules about which level of review is required for different cases, how the results of that review are recorded, how much of the code will be mirrored internally and in what format, etc.
The policy might have an impact on how software engineers do their jobs, but the policy itself is not a technical document. It’s unlikely to have anything Rust-specific in it other than rules about which hosting services are OK (“GitHub and crates.io are OK, no personal homepages”), and it might have been written by people who have never written (or read!) any Rust code.
You’ve mentioned working for Cloudflare, and that you “have people actively trying to exploit us”.
If you’re allowed to say, does Cloudflare review and vet their dependencies’ code? If so, do you think they would be willing to join Google, Mozilla, et al. in publishing those reviews for cargo-vet (or similar), so that the great labour of reviewing all dependencies can be shared by more well-resourced parties?

(Of course some would say that one shouldn’t trust even Google or Cloudflare to review one’s dependencies for one, but I imagine cargo-vet could grow an option to require some N > 1 third-party reviews to accept a crate that lacks one’s own review.)

We do review. We’re working on improving our tooling around this, and we’ll probably share the results and the tools when we have that ready.
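For context, cargo-vet records third-party audits in a supply-chain/audits.toml file that other projects can import and build policy on; a sketch of the format (crate name, reviewer, and version are made up):

```toml
# supply-chain/audits.toml (cargo-vet): each entry vouches for a specific version.
[[audits.some-crate]]
who = "Jane Reviewer <jane@example.com>"
criteria = "safe-to-deploy"
version = "1.2.3"
```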
I agree, but this mindset only works to a point unless your dependencies are independent and self-contained, which is rare in Rust. Most often each library carries its own little (or big) tree of transitive dependencies and those trees overlap each other. You might have picked two dependencies and wound up shipping three “error” crates, only one of which is maintained any more because your deps and their deps are out there chasing the ideal and this churn gets caught up in updates for security reasons. For programs of any serious size it becomes your problem unfortunately.
While I broadly agree - the constant churn of best practices is exhausting and IMO an extended standard library would be wonderful - I’m not sure this article is bringing a lot to the discussion, particularly the technical aspects of bringing so many libraries into alignment or the politics of choosing. The reference to Adam Harvey’s investigation is misleading. From the referenced report:
I think it’s just part of Rust’s culture at this point that libraries will be written and rewritten and people will pull whatever they like from crates.io’s immense immutable history. Give it another 10 years and I’m optimistic things will settle down though. (I’m not being flippant, I think this will be a wonderful period for Rust.) In the meantime one has to be nimble.
P.S. Curiously this loads fine on mobile but returns Forbidden on desktop?
Fully agree. I feel like the churn of best practice in Rust is a symptom, not the problem. The “problem” being that designing a cohesive, stable, batteries-included standard library is hard: off the top of my head, scanf in C and Go’s net.IP are some good examples where a standard library can under-deliver, leaving a permanent wart. And in Rust specifically, I think passing by &T/&mut T/T, object safety, Send/Sync, and async also add complexity when trying to figure out how to design an interface. “Churn” is a sign that the interface wasn’t as obvious after all!

As someone who writes Rust both for fun and for $dayjob, I like having small, easily swappable components. It’s definitely annoying when there’s ecosystem churn, but it’s not like my existing dependencies will break (oh, and Cargo even lets you import multiple versions of the same dep, so you don’t even need to upgrade everything at once anyway).
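The multiple-versions point is standard Cargo behavior; transitive deps at different major versions coexist automatically, and the dependency-renaming syntax even lets you depend on two major versions directly. A sketch (crate choice arbitrary):

```toml
[dependencies]
rand = "0.8"
# The same crate at an older major version, under a different local name.
rand07 = { package = "rand", version = "0.7" }
```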
notably, Go made huge strides with the introduction of net/netip
I don’t know Zig well, but calling out its standard lib as more extended than Rust’s seems disingenuous.
The whole article reeks of “if only someone else would solve these problems for me!”. Go ahead, join the libs team. Go write some proposals for the Rust Foundation to fund it. Looking at this blog’s previous articles, nope, they aren’t contributing anything or doing any experiments to solve it, they’ve just been complaining about the same problems for about a year.
… is that an option? I’ve given up on writing ACPs and RFCs because of the extremely long delay before they get reviewed, and I figured libs-team is way undersized (6 people, compared to 15 on compiler-team) because that’s how the existing members prefer it.
It was an option like 6 years ago when I last considered it? Dunno how much the practical mechanics have changed since then. Their zulip seems open to all, and their weekly meetings appear to be as well. My experience with the Rust project in general is that if you participate regularly, ask useful questions, and come up with interesting ideas, people will start to recognize you and pay attention.
yeah, rustc meetings are a lot like city council meetings in that you can just show up to them. in my experience the only things you get from a membership on a team are the ability to review PRs and vote on certain things.
What does the zig stdlib documentation link mean here? Seems to support more utilities than rust’s stdlib at face value (check the namespaces).
it’s also worth noting zig doesn’t have a standard package manager
Zig has support for fetching and managing packages. Look into zon files and zig fetch.
It increasingly does!
Nope, you’re right, my bad. I didn’t look closely enough.
I’ll just leave this RustConf talk video by Josh Triplett here
This drives home the point I wanted to make: the Rust ecosystem was, and still is, changing by a ton. And whatever you decide to integrate into std now will be the wrong decision in 2 years. Either because the language will allow for easier ways to express the same functionality (async/await, NLLs, const generics) or because the best solution for the problem changed (see the talk’s first few minutes about error crates – I experienced this myself, including error-chain).
Now some projects might settle down and ecosystems might slow down, to the point they are stable. Something I would say about tokio (beware of the in-flight io_uring work..). Which is good. But at that point you might as well use those stable libraries – why even bother putting them in the std. Especially since my example of tokio might be totally wrong – we already had people talk about different runtime requirements than what tokio offers – and we definitely have other runtimes available, which some people do prefer (smol, monoio, async-std to name a few).

And if you say “tokio is the default, put it into the std”: you can always pin tokio 1.0 in your project (or even a specific hash) and stop worrying. Yes – you can.
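The pinning the comment describes is plain Cargo version-requirement syntax; the commit hash below is a placeholder, not a real revision:

```toml
[dependencies]
# Exact-version pin: `cargo update` will never move past 1.0.0.
tokio = "=1.0.0"

# Or pin to a specific commit (placeholder hash):
# tokio = { git = "https://github.com/tokio-rs/tokio", rev = "<commit-hash>" }
```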
I think Sylvain has a point: With each crate in your dependency graph, you need to either trust the author or audit the code. However, this also applies to the standard library, so moving code there doesn’t magically fix the supply chain problem. Same with namespacing, which is also touted as a solution every time this comes up, as if adding an org identifier and allowing for more potential typosquatting would somehow improve supply chain safety.
The solution is doing the work, not hoping that the understaffed and overworked libs team volunteers do it. Besides, the foundation is already working with JFrog to audit crates.io for malicious code. Yet every time the topic comes up, we get the same rallying cries for either crate namespacing, a bigger standard library or both.
I have thought about this reply for way too long. I do understand the higher trust level that the rust org has, as well as the idea of keeping the amount of organizations you need to trust low. (rust org + tokio org makes two already, going with tokio as example). Nevertheless I think it wouldn’t make much of a difference if the folks maintaining tokio would do the same under the rust org, especially when keeping the current development pace.
Thus my feeling is that this is more about the feeling of trust, and that such a move wouldn’t really bring any real security gain. And if you locked down the number of people who can contribute, you would probably starve the project and add even more burden on the rust org.
There is also the aspect of whether the tokio project would be slowed down massively by being just one part of the overall rust org - which obviously has its main priority around the rust compiler and language, overshadowing the tokio project.
The problem is that my argument only holds true for tokio as example. If you count the 300+ crates a typical webserver has, then yeah. You probably would gain something by putting this all under - say - the tokio org, which already maintains axum. Then again, huge parts of those 300 crates are probably from the same area of contributors (axum, http, hyper, tower, tokio, futures)
I’ve computed one: https://lib.rs/std

These are top crates ranked by popularity * quality (where quality is a mix of lots of ranking signals computed from crates’ source, usage history, and pagerank-like author trust).

I think a lot of this fear and confusion is caused by search on crates.io being… to put it nicely, unopinionated. It doesn’t detect obsolete crates. It doesn’t rank by author reputation. It will give you results by exact name match, even if that’s a v0.0.1 hello world package published 7 years ago.
Compare:
https://lib.rs/?http+client vs https://crates.io/search?q=http+client
or
https://lib.rs/?web+framework vs https://crates.io/search?q=web+framework
The problem with opinionation is, of course, the opinions. I wish I could recommend lib.rs for more things, but trying to find crates on lib.rs related to my day job involves occasionally seeing a banner with an inaccurate and political screed against my choice of technology, and incremental improvements to crates.io’s search algo seems better than endorsing that.
Yes, that’s entirely on purpose. I want cryptobros to think my site sucks and not use it.
This is kind of an unfair comparison. Go has a group of engineers that decide what’s in stdlib who are full-time, long-tenured, and fearlessly assertive, as well as (IMO) smart, experienced, and tasteful. That is an extremely expensive and challenging resource to create and maintain (as OP points out). It also has a fairly narrow intended “sweet spot” of network server infrastructure that makes it easier to decide what belongs in the stdlib. Even if Rust somehow got the first thing, it wouldn’t have the second.
The trouble is you can’t really do this halfway, or the library crates won’t interoperate. I’m reminded of the nightmarish time when you couldn’t load multiple C# libraries in the same process because they would disagree on what Object is. At some point you end up having to draw boundaries (process or otherwise) to be able to use multiple different library “cultures” in one project. (Hm, maybe a “culture” of interoperating libraries could be an explicit thing in crates.)

stdx exists.

As does https://blessed.rs/crates which is a hand-curated list of libraries for common tasks.
Blessed looks great! I’ve used Rust for a long while but hadn’t come across that before
Man oh man, that would have saved a lot of time.
Sweet sweet abandonware. Truly stable
Yep. You also have near-stdx, awesome-rust, and probably a bunch more that I couldn’t find in various states of abandonment.

Turns out that curating a stdlib is a lot of work. Who knew!
For more context:
This is responding to:
The real stdx was a project by Brian Anderson (brson), who was the (de facto?) leader of the Rust Project for some years after Graydon Hoare left.

https://lib.rs/std is an up-to-date set.
Why? It’s just a blog post
Completely agree. Recently I wrote a small web service in Go, and the only dependency I have is the sql driver.
And database/sql isn’t an ideal abstraction. So that’s wrong
And yet I’m pretty confident that this codebase will continue to work correctly with any database/sql version which will be released in the next 10 years.

This is not something I would be willing to say about any of the Rust db access libraries. At least this has been my experience working professionally in Rust for the last 4 years.
Most things in the Go stdlib are not the ideal abstraction but they work fine for the 90% use case.
I’m curious what kind of experience folks who want this have with Java or C++.
Java has a broad standard library, but arguably you aren’t supposed to use it but instead you are supposed to find an Apache Commons library (or, in the case of i18n, ICU4J) for what you need.
C++ has a less broad standard library, but large serious code bases like browser engines have their own replacements anyway that, among other things, reject the standard committee notion that C++ has exceptions.
I can already see stuff going into the next version of the C++ standard library where my advice as a domain expert has to be “don’t use it”. (C++ is going to put the IANA charset registry into the standard library.)
It’s sad to have a large standard library that you need to advise users of the language not to use.
The tragedy of C++ …
there was no hash table in C++98, only the ordered map<> aka red-black tree. I remember people lamenting this in the mid-2000s.

The hash table we got in C++11 is atrociously slow because the API is overly general – it does a ton of allocations – https://www.oilshell.org/blog/2022/10/garbage-collector.html#unordered_setvoidinsert-slower-than-malloc1
So IMO C++ will never be a language for Python-like problems. Extremely basic things have the “wrong defaults”
(and IMO this not trivial – I remember when Go came out, it lost to Python on some benchmarks because the hash table wasn’t tuned)
So C++ is more like a language that’s good for writing libraries and infrastructure, including your own non-slow hash tables … probably the best language for writing hash tables! But ironically the default one is bad.
C++ has boost (www.boost.org). It is a very high quality library, with a very high bar for adding contributions.
Header-only libraries are what I preferred to use where possible (boost was also the reason why I started using CMake back then, to simplify inclusion of specific boost libs).
I think apache-commons was Java’s extended standard library for a while … but I do not think anything in java ecosystem had the breadth and quality of boost.
I think this ties into my comment/theory from the other day. The bar is high, so you don’t have a http client in the standard lib. You contrast this with Python, where they’ve got an ok-ish library in the stdlib, it works for a lot of projects, especially ones that need independence. You have extremely popular third party libraries that improve either convenience (requests) or functionality (urllib3). And that’s not even getting into async - the language ships with async in its interpreter and its stdlib and it lacks an async http library in the stdlib entirely! The bar wasn’t as high there, they shipped something not great but workable and people still get a lot of use out of it.
Rust’s design was directly inspired by languages like Python and sentiments like Batteries Included, But They’re Leaking.
I’m reminded of how, as a Python 2.x developer, I’d see things like the standard library having urllib and urllib2 and being told “Don’t use those. Use Requests… based on urllib3, which adamantly refuses to ever be stdlib’d”. There’s a reason the Python ecosystem has the sentiment, “the standard library is where packages go to die”.

This is a no-brainer, imho, but the cost of making this happen is pretty huge. You can’t just make it, you’ll have to maintain it (including b/w compat) for years to come – that’s an expensive endeavour, and it only highlights why Go is the only language with an exhaustive standard library: Google can cover the cost of development and maintenance. Not to mention BDFL leadership, which in the Rust world is borderline impossible unless you want a new drama every other day.
What library do I use for time? time. If my app needed to deal with time zones, I’d use jiff. But since it doesn’t need to deal with time zones, it does not need to suffer the app size penalty of having a time zone database statically linked. libstd would be lesser if it required tzdata to be bundled, and it would be lesser if, when I do need time zones, it took up space by including a time library I couldn’t use.

The decision paralysis the author is hinting at just plain doesn’t exist. The comment about web frameworks in particular is just silly. Different people have different preferences, different tasks have different needs, and libstd endorsing one blessed way doesn’t make life any easier for most people; it just means libraries are less likely to support other ways.
You can, at any time, create your own ‘extended std’ crate which reexports a bunch of other crates. There are no actual problems that this crate actually solves, even if you name it stdx.
If someone goes down this extended standard library, I’d really like to see it apply what I really like from the node library: support levels. Make it easy to have things get pulled in at a certain stability level, so it can be experimental, get things fixed, and then finally reach a better level of stability (or remove or whatever).
As a rust user, I do often wish for a standard library of “correct, even if not performant” libraries (I really can’t handle regexes without pulling in a third party lib?). I really appreciate Python being so useful out of the box, and often just work on stuff with Python and Dash open, diving into the docs of the standard library to see how far I can go.
I really am not a fan of nightly/stable as the primary distinction for a lot of feature flag stuff in Rust. I understand it’s helpful as a consumer of a library to know that your specific version is relying on nightly stuff, and abstractly I get the value, but as a person who loves to upgrade stuff… it doesn’t bug me that much.

The regex crate is a first-party lib: https://crates.io/crates/regex (look at the owners and repo). It was also created by burntsushi, who is a member of the library API team.
Yeah you’re right, “third party” is not the right way of describing regex in particular. But why isn’t it in the standard library?

From the outside looking in, my impression is the standard lib is really trying to stick to doing things that have predictable performance characteristics and where there are not… I guess contentious decisions? Like for strings even (where you could argue not having short string optimizations makes them unsuitable for lots of use cases), at least there’s the consistent Vec memory layout. Am I missing something?

API stability promises. Same as with the rand crate. With a first-party crate, you can continue to run 1.x on a new compiler or, assuming it doesn’t use shiny new features, use 2.x on an old one.

With the standard library, they update in lockstep and must not break backwards compatibility. That’s how you get things like C++’s standard library regex support being slower than shelling out to PHP. (It’s also why we’ll forever have two deprecated methods on the Error trait.)

The general rule for the Rust standard library is that it’s for:
- core vocabulary and language machinery (Result, Option, the machinery that async/await depend on under the hood, etc.)
- things that proved ubiquitous (once_cell merged into the standard library because every project of non-trivial size contains either once_cell or the older lazy_static or, more likely… both.)
- and the occasional cautionary tale, like std::collections::LinkedList.

That’s also why http isn’t part of the standard library. It’s not necessary for it to be there to provide a standard set of types for the ecosystem to interoperate via.

I get a 403 Forbidden and the archives seem to not work because of the cloudflare captcha.
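As an aside on the once_cell point above: that consolidation landed in the standard library as std::sync::OnceLock (stable since Rust 1.70), with LazyLock following later. A minimal sketch:

```rust
use std::sync::OnceLock;

// One-time lazy initialization without the once_cell or lazy_static crates.
fn config() -> &'static String {
    static CONFIG: OnceLock<String> = OnceLock::new();
    // The closure runs only on the first call; later calls reuse the value.
    CONFIG.get_or_init(|| "initialized once".to_string())
}

fn main() {
    assert_eq!(config(), "initialized once");
    assert_eq!(config(), "initialized once");
}
```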
This is why I really like Go: it comes with batteries included. I can go a very long way with Go’s standard library before I need/want to use third-party dependencies. From a security perspective, this is a great win!