This is a very confusing proposal and problem. When I write Go, I'm more concerned that I have to rely on checked casts just to know whether the err I have is what I think it is, and that I can never guarantee I have an exhaustive set of branches for every possible (possibly even nested and wrapped) error. I don't see how sugaring over:
err := something()
if err != nil {
...
}
really solves the biggest problem I see, and I can't imagine that if err != nil is a genuine struggle for many.
Even ignoring the public Internet, privately I still see people picking random subnets and being surprised that, all of a sudden, a whole work center can't access the service. Being able to just pick a random ULA is basically a superpower in and of itself, as is just having everyone pick from the /64 subnet they're on.
CryptoJS, weird "word size" measurements and more? This sounds about right. A lot of the nonfree software I encounter at work uses a combination of these sorts of tactics, and they're all very trivially defeated with some knowledge of encryption, the browser's JS debugger, and a will to power through.
This one is particularly nasty, but ironically, just as Amtrak left comments, I've found some of these products leak other, unintended data that helps in understanding the code. I wonder if they leak it intentionally.
after programming Go for a long time I write all my code in Rust now, and now that I have Result types and the ? operator, my code is consistently shorter and nicer to look at and harder to debug when things go wrong. I dunno. I get that people think that typing fewer characters matters. I just don't think that. When I program Rust I miss fmt.Errorf and errors.Is quite a lot. Yeah, sure, you can use anyhow and thiserror, but still, it's not the same. You start out with some functionality in an app and so you use anyhow and anyhow::Context liberally and do some nice error message wrapping, but inspecting a value to see if it matches some condition is annoying. Now you keep going and your project matures and you put that code into a crate. Whoops, anyhow inside of a crate is rude, so you switch the error handling strategy up to thiserror to make it more easily consumed by other people. Now you've got to redo a bunch of error-propagation plumbing, maybe you've got some boilerplate enum stuff to write, whatever; you can't just move the functionality into a library and call it a day.
Take a random error in Rust and ask "does any error in this sequence match some condition" and it's a lot more work than doing the equivalent in Go. There are things that make me happy to be programming Rust now, but the degree to which people dislike Go's error system has always seemed very odd to me; I've always really liked it. Invariably, people who complain about Go's error system sprinkle this everywhere:
if err != nil {
return nil, err
}
Which is … bad. This form:
if err != nil {
return nil, fmt.Errorf("some extra info: %w", err)
}
goes a long way, and errors.Is and errors.As compose with it so nicely that I don’t think I’ve seen an error system that I like as much as Go’s. Sure, I’ve seen lots of error systems that have fewer characters and take less ceremony, I just don’t think that’s particularly important. And I’m sure someone is going to come along and tell me I’m doing it wrong or I’m stupid or whatever, and I just don’t care any more. I quite like the ergonomics of error handling in Go and miss them when I’m using other languages.
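For anyone who hasn't watched that pattern pay off, here is a minimal, self-contained sketch of what the wrapping buys you; loadUser and ErrNotFound are made-up names for illustration:

package main

import (
    "errors"
    "fmt"
)

// ErrNotFound is a sentinel error that callers can test for.
var ErrNotFound = errors.New("not found")

func loadUser(id string) error {
    // Pretend the lookup failed at the lowest layer.
    return fmt.Errorf("loading user %q: %w", id, ErrNotFound)
}

func handler() error {
    if err := loadUser("42"); err != nil {
        // Add context at this layer too; %w keeps the chain intact.
        return fmt.Errorf("handling request: %w", err)
    }
    return nil
}

func main() {
    err := handler()
    fmt.Println(err)                         // handling request: loading user "42": not found
    fmt.Println(errors.Is(err, ErrNotFound)) // true, despite two layers of wrapping
}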
Isn’t wrapping the errors with contextual information at each function that returns them just equivalent to manually constructing a traceback?
The problem of errors in Rust and Haskell lacking information comes from syntax that makes it easy to return them without adding that contextual information.
Seems to me all three chafe from not having (or preferring) exceptions.
Isn’t wrapping the errors with contextual information at each function that returns them just equivalent to manually constructing a traceback?
no, because a traceback is a map of how to navigate the code, and it requires having the code to be able to interpret it, whereas an error chain is constructed from the vantage point that the error chain alone should contain all of the information needed to understand the fault. An error chain can be understood by an operator who lacks access to the source code or lacks familiarity with the language the program is written in; you can’t say the same thing about a stack trace.
The problem of errors in rust and Haskell lacking information comes from syntax that makes it easy to return them without adding that contextual information.
not sure about Haskell but Rust doesn’t really have an error type, it has Result. Result is not an error type, it is an enum with an error variant, and the value in the error variant may be of any type. There’s the std::error::Error trait, but it’s not used directly; a Result<T, E> may be defined by some E that implements std::error::Error or not, but it’s inconsistent. It may or may not have some kind of ErrorKind situation like std::io does to help you classify different types of related errors, it might not.
Go's error type is only comparable to Rust's Result type if you are confused about one language or the other. Go's error type is less like Result and more like Vec<Box<dyn std::error::Error>>, because the error type in Go is an interface and it defines a set of unwrap semantics that lets you unwrap an error to the error that caused it. (It's actually more like a node in a tree of errors these days, so even that is a simplification.)
Unlike a trait, a Go interface value is sized, it’s essentially a two-tuple with a pointer to the memory of the value and another pointer to a vtable. Every error in the Go standard library and third party ecosystem is always a thing with a fixed memory size and at least one layer of indirection. Go’s error system is more like if the entirety of Rust’s ecosystem used Result<T> to mean Result<T, Box<dyn std::error::Error>>. Since that’s not universal, even if you want to do it in your own projects, you run into friction when combining with projects that don’t follow that convention. Even that is an oversimplification, because downcasting a trait object in Rust is much more fiddly than type switching on an interface value in Go.
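To make the unwrap semantics and the type-switching point concrete, here is a small sketch; StatusError is an invented type, not anything from the standard library:

package main

import (
    "errors"
    "fmt"
)

// StatusError is a fixed-size value that satisfies the error interface and
// exposes the error that caused it via Unwrap.
type StatusError struct {
    Code int
    Err  error
}

func (e *StatusError) Error() string { return fmt.Sprintf("status %d: %v", e.Code, e.Err) }
func (e *StatusError) Unwrap() error { return e.Err }

func main() {
    err := fmt.Errorf("fetching profile: %w", &StatusError{Code: 503, Err: errors.New("upstream down")})

    // errors.As walks the chain and fills in the first matching concrete type.
    var se *StatusError
    if errors.As(err, &se) {
        fmt.Println("got status code:", se.Code) // 503
    }

    // A plain type switch also works on an interface value, but only sees the
    // value it is given, which is why errors.As (which unwraps) is usually preferable.
    switch e := errors.Unwrap(err).(type) {
    case *StatusError:
        fmt.Println("unwrapped to:", e.Code)
    default:
        fmt.Println("something else:", e)
    }
}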
There are a lot of subtle differences between the two systems that turn out to cause enormous differences in the quality of error handling across the ecosystems. Getting a whole team of developers to have clean error handling practices in Go is straightforward; you could explain the practices to an intermediate programmer in a single sitting and they’d basically never forget them and do it correctly forever. It’s significantly more challenging to get that level of consistency and quality with error propagation, reporting, and handling in Rust.
a traceback is a map of how to navigate the code, and it requires having the code to be able to interpret it, whereas an error chain is constructed from the vantage point that the error chain alone should contain all of the information needed to understand the fault
Extremely interesting. You’re both right: it is manually constructing (a projection of) a continuation, but the nifty point scraps makes is that the error chain is a domain-specific representation of the continuation while the traceback is implementation-specific. Super cool. This connects to a line of thinking rooted in [1] that I’ve been mulling over for ages and would love to follow up on.
Here’s a “fantasy abstract” I wrote years ago on this topic:
Felleisen et al[1] showed that an abstract algebraic treatment of evaluation contexts was possible, interesting and useful. However, algebras capturing implementation-level control structure are unsuitable for reflective use, precisely because they capture an implementation- rather than a specification-level description of the evaluation context. We introduce a language for arbitrary specification-level algebras describing high-level perspectives on program tasks, and a mechanism for ensuring that implementation-level control context refines specification-level control context. We then allow programs to not only capture and reinstate but to analyse and synthesise these structures at runtime. Tuning of the specification algebra allows precise control over the level of detail exposed in reflective access, and has practical benefits for orthogonal persistence, for code upgrade, for network mobility, and for program execution visualisation and debugging, as well as offering the promise of new approaches to program verification.
[1] Felleisen, Matthias, et al. “Abstract Continuations: A Mathematical Semantics for Handling Full Functional Jumps.” ACM Conf. on LISP and Functional Programming, 1988, pp. 52–62.
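Or just give error types a protocol that propagation operators can call to append a call-stack level.

That's still only convenient syntax for manually constructing an error traceback, no?

Sure, but it gives feature parity with exceptions when using the simple mechanism.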
creating a stack trace isn’t free. It’s fair to assume creating a stack trace will require a heap allocation. Performing a series of possibly unbounded heap allocations is a meaningful tax; whether that cost is affordable or not is highly context-dependent. This is one of the reasons why there is a whole category of people who use C++ that consciously avoid using C++ exceptions. One of the benefits of Rust’s system and the Result type is that the Result values themselves may be stack allocated. E.g., if you were inclined to do so, you could define a function that returns Result<(), u8>, and that’s just a single stack-allocated value, not an unbounded series of heap allocations. In Go, you could write a function that returns an error value, which is an interface, and that error value could be some type whose memory representation is a uint8. That would be either a single heap allocation or it would be stack-allocated; I’m not sure offhand, but regardless, it’s not an unbounded series of heap allocations. The performance implications are pretty far-reaching. Those costs are likely irrelevant in the case of something like an i/o bounded http server, but in something like a game engine or when doing embedded programming, they matter.
The value proposition of stack traces is also different when you have many concurrently running stacks, as is typical in a Go program. Do you want just the current goroutine’s stack, or do you want the stacks of all running goroutines? goroutines are also not hierarchical in the scheduler, so “I want the stack of this goroutine and all of its parents” is not a viable option, because the scheduler doesn’t track which goroutines spawned which other goroutine (plus you could have goroutine A spawn goroutine B that spawns goroutine C and goroutine B is long gone and then C hits an error, which stacks do you want a trace of? B is gone, how are you gonna stack trace a stack that no longer exists?). If you want all the stacks, now you’re doing … what, pausing all goroutines every time you create an error value? That would be a very large tax.
Overall, the original post is about the ergonomics of error handling; I think the ergonomics of Go’s system are indeed verbose, but my experience has been that the design of Go’s error handling system makes it reasonably straightforward to produce systems that are relatively easy to debug when things break, so generally speaking I think the tradeoff is worthwhile. Sure, you can just yeet errors up the chain, but Go doesn’t make that significantly more ergonomic than wrapping the error. I think where the original post misses the point by a wide margin is that it just yeets errors up the chain, which makes them hard to debug. Making the code shorter but the errors less-useful for debugging is not a worthwhile tradeoff, and is a misunderstanding of the design of Go’s error system.
Creating a stack trace means calling both runtime.Callers (to get the initial slice of program counter uintptrs) and runtime.CallersFrames (to transform those uintptrs to meaningful frame data). Those calls are not cheap, and (I’m not 100% confident but I’m pretty sure that) at least CallersFrames will allocate.
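It's definitely way way cheaper to fmt.Errorf.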
To create a stack trace up from your current location, sure. But the unwind code you're emitting knows what its actual location is, so as the error is handed off through each function, each site can just push its pc or even just a location string onto the backtrace list in the error value. And you can gather the other half of the stack trace when you actually go to print it (if you even want it), which means until then it's basically free: one compare, one load, two stores or thereabouts. And hence, errors that aren't logged don't cost you anything.
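For what it's worth, here is one way that lazy approach could look in Go; traced and wrap are invented names, runtime.Caller records only a program counter at wrap time, and the file:line resolution (which inlining can make approximate) happens only if the error is ever formatted:

package main

import (
    "fmt"
    "runtime"
)

// traced wraps an error with a list of raw program counters. Appending a pc is
// cheap; turning pcs into file:line strings is deferred until Error is called.
type traced struct {
    err error
    pcs []uintptr
}

func (t *traced) Unwrap() error { return t.err }

func (t *traced) Error() string {
    s := t.err.Error()
    for _, pc := range t.pcs {
        if fn := runtime.FuncForPC(pc); fn != nil {
            file, line := fn.FileLine(pc)
            s += fmt.Sprintf("\n\tat %s:%d", file, line)
        }
    }
    return s
}

// wrap appends the immediate caller's program counter to the error's trace.
func wrap(err error) error {
    pc, _, _, _ := runtime.Caller(1)
    if t, ok := err.(*traced); ok {
        t.pcs = append(t.pcs, pc)
        return t
    }
    return &traced{err: err, pcs: []uintptr{pc}}
}

func inner() error  { return wrap(fmt.Errorf("disk on fire")) }
func middle() error { return wrap(inner()) }

func main() {
    fmt.Println(middle()) // the message plus two lazily resolved call sites
}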
I think this is the point you may have missed further up asking about the analogy to exceptions and tracebacks. The usual argument I see in favor of something like Go’s error “handling” is that it avoids the allocations associated with exceptions and tracebacks. But then the standard advice for writing Go is that basically every level of the stack ought to be using error-wrapping tools that likely allocate anyway. So it’s a sort of false economy – in the end, anyone who’s programming Go the “right”/recommended way is paying a resource and performance price comparable to exceptions and tracebacks.
The thing that bit me recently was that Go uses the pattern of returning an error and a value so much that it is now encoded in tooling. One of the linters throws an error if you check that an error is not nil but then return a valid result and no error. Unfortunately, there is nothing in the type system to say 'I have handled this error', so this raises a false positive, which can be suppressed only with an annotation, whenever you have code of the form 'try this; if it raises an error, do this instead'. I don't know how you're supposed to do that in idiomatic Go.
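yeah but that sounds like a problem with the linter, not a problem with the language. That's not reasonable linting logic at all.

That whole part of the Rust error handling experience is such a mess. But then again nobody seems to think it's an issue, really?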
basically every person I work with thinks it’s an issue, because we write production Rust code in a team setting and we’re on call for the stuff we write. A lot of the dialogue about Rust here and elsewhere on the web is driven by people that aren’t using Rust in a production/team setting, they’re using it primarily for side projects or solo projects or research or are looking at it from a PL theory standpoint; these problems don’t come up that much in those settings. And that’s not to dismiss those settings; I personally do my side projects in Rust instead of Go right now because I find writing Rust really fun and satisfying, and this problem doesn’t bother me much on my personal projects. Where it really comes up is when you’re on call and you get paged and you’re trying to understand the nature of an error in a system that you didn’t write, while it is on fire.
Yeah, okay, but in the meantime people are writing dozens of blog posts about how async Rust is broken (which I really don't think it is) and this issue barely gets any attention.
How exactly are thiserror and pattern matching on an error variant not strictly superior to, or at worst exactly the same as, fmt.Errorf and errors.Is? The whole point of thiserror is allowing you to write format strings for the error variants you return or are wrapping.
errors.Is is not simply an equality check, it recursively unwraps the error, finding if any error in the chain (or tree, if an unwrap returns a slice) matches, so you can always safely add a layer of wrapping without affecting any errors.Is checks on that error. https://pkg.go.dev/errors#Is
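A tiny illustration of that tree behaviour, using errors.Join (the standard-library way to get an error whose Unwrap returns a slice); the error values here are made up:

package main

import (
    "errors"
    "fmt"
)

var ErrTimeout = errors.New("timeout")

func main() {
    // Two independent failures joined into one error tree (Unwrap() []error).
    joined := errors.Join(errors.New("dns lookup failed"), ErrTimeout)

    // Wrapping the tree again hides nothing from errors.Is, which searches
    // the whole tree rather than just a linear chain.
    wrapped := fmt.Errorf("fetching feed: %w", joined)

    fmt.Println(errors.Is(wrapped, ErrTimeout)) // true
}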
I wonder whether Rust’s not having this is due to Rust’s choice to have a more minimal standard library or simply to no one’s having proposed it yet. I think something like this would be very simple to implement in the Rust standard library (and only slightly more verbose to implement outside it).
The tradeoff here is that it breaks the abstraction of a function. For example, changing the underlying HTTP library may change the error type. Your options are to break the error chain (and the implementation-specific diagnostics along with it) or to expose that detail to callers.
Rust cares a lot about being explicit about API breakage, so it doesn't surprise me that this isn't part of the standard library.
It's definitely convenient within your own code, but across stable API boundaries it's a much worse tradeoff.
Looks like, if serde-rs/serde#2597 doesn’t get merged quickly and another suitable compromise can’t be reached, I’ll be pinning serde = "<=1.0.171" in my Cargo.toml (like time has) until rust-lang/compiler-team#475 is implemented.
Crazy that time pinned it, that’s a hugely widespread library. I wonder what percentage of Rust code will inherit the pin from time, definitely double digits.
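Looks like matklad also pinned the version used in rust-analyzer about an hour ago: https://github.com/rust-lang/rust-analyzer/pull/15482

I will probably add it to the next notify release too.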
*nod* Which reminds me. Given that not all my projects use Serde, but dependencies might, I should also add a [bans] entry to my cargo-deny deny.toml files.
[[bans.features]] isn’t perfect though. If serde-rs/serde#2597 gets merged, I’ll have to see if they’d be up for adding a features.require key so I don’t have to babysit a features.exact.
… and now the use of precompiled proc macros has been reverted half-way through my efforts to push out said change to all my repos… luckily, I got bored at the half-way point and was working on a “script” (i.e. really a sloppily-written Rust program) which I can just complete in a different direction instead to automate the bumping of Cargo.toml to serde = "1.0.184" and the change of the deny.toml ban to >=1.0.172, <1.0.184.
…and probably lean into being nerd-sniped into cleaning it up as a reusable tool. I’ve come this far, so I might as well finish its transformation into something I can use next time I need to easily batch-bump Cargo.toml and deny.toml across all my repos for reasons Dependabot doesn’t cover.
I don’t want to use this as tech support, but how exactly is Cargo.toml and the lockfile supposed to work then? I did exactly what you specified, but cargo tree still shows [email protected], and cargo update --precise [email protected] doesn’t let me specify .
The syntax time uses in Cargo.toml is version = ">= 1.0.126, <= 1.0.171" and Cargo's resolver won't behave the way you want if you don't also specify a lower bound. (In my case, specifying only an upper bound "upgraded" the project in question from one unified version in the 1.0.1xx range to 0.9.15 and 1.0.183… probably because it settled on 0.9.15 for some reason and that's too low a version to satisfy one of my other dependencies at the version I specified)
That's why I'm also going to use cargo-deny to warn me when I need to replace or pin another dependency, to avoid having it transparently start pulling in a second copy of a newer Serde because a dependency bumped its minimum.
As for cargo update, --precise augments --package. It doesn’t replace it.
cargo update --package serde --precise 1.0.171
…or cargo update -p serde --precise 1.0.171 for short.
Ironically, the times I actually needed (or would have been greatly helped by) kermit, I never had it on the remote. If you're exceedingly crafty, you can usually do a lot of good (or evil) with Tcl's expect module. I'd write shell snippets to set echo off, set raw mode, spawn base64 -d > rx, and other such things, and it generally worked OK. Sometimes you can manage to write to the serial port fast enough to overrun the buffer on the receiving end, so I usually set the baud rate at the lower end. That probably explains why some of these serial-port tools had some form of CRC or error correction.
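EDIT: I used to use this a lot when I dealt with debug serial consoles on various devices… https://github.com/adedomin/dot-files/blob/master/.config/zsh/util-bin/termview The idea is you'd just throw Tcl code into your config as a key bind.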
I don't think the price will be high enough for companies to move all that soon.
I do however think that, with IPv6 being old, being required by various institutions (the US military, if I remember correctly), and IPv4 these days being held together by a ton of hacks, it might be worthwhile to consider "shaming" everything that calls itself either internet-related (ISPs, cloud providers) or modern, cutting-edge, etc. (from Apple to GitHub).
In my humble opinion it's also a question of trust. Do I really want to trust software or a service that hasn't managed to get proper IPv6 support working in decades? Even if it wasn't their first priority. There were events like IPv6 Launch Day. Everyone with at least some pride in how they run their network should at least enable it. It's not rocket science, and many individuals and organizations support it. About half of all clients connect via IPv6.
There are even technical reasons! ;)
Software-wise, I think IPv6 support tends to work excellently as a first indicator of quality and of how seriously I can consider something production-ready. If there is no proper IPv6 support, that tends to be a good indicator that nobody has been using it for anything too serious yet. Of course nothing stands or falls with that alone. It's just an indicator. But at least in my opinion it happens to work pretty well.
I imagine that, internally, how long it takes to properly support IPv6 can be a good indicator of how well your infrastructure and teams work.
The problem is that it is never a "requirement" for anyone. When you compare two products or two vendors, you never have "does it support IPv6?" as a hard requirement (except if you are in the WAN business and buying hardware).
Even if you offer that feature to your users, it's very unlikely they'll use it if they can use IPv4. At best, they'll configure both…
Well, Apple requires, in app review for both iOS and macOS, that the app's networking works in an IPv6-only environment (I've been hit by the problem, so at least sometimes they check for it). I remember someone told me that the requirement came about because some of the networking that iDevices do between themselves (Internet Sharing, iirc) is IPv6-only under the hood (but that was many years ago).
Not disagreeing on any of that. It's basically what I meant by having pride, showing you know what you are doing, and seeing it more as indicative of quality.
It is similar with portability of software. Usually that's an indicator of good software quality, even if it's something I will never use on another OS or platform.
If something only works on one system, even if it's the one I'm using, it makes a worse impression on me than something that is used across a wide array of platforms.
Of course these are just indicators, but when you have a whole slew of options I tend towards the ones that work everywhere and support IPv6. Good changelogs and good official docs are other indicators to me.
Having 'pride' and 'quality' is all well and good, but since IPv6 provides me little or nothing over IPv4, and would be a huge amount of work to implement and support, I appreciate options that focus on things that matter. At best, IPv6 support is a "nice to have" in my world.
Given the rise of kubernetes and docker swarm and the like, and people still managing to conflict with internally used 172.16/12 networks, I can't fathom how anyone can claim IPv6 provides little utility. Just being able to yeet anything into the big ULA space, or even a /64 block for a whole subnet, has to be massive utility. I know my employer's network team fights me when I ask for a /22 prefix (v4), so they must feel the pain too.
still managing to conflict with internally used 172.16/12 networks
There are nearly 18m RFC1918 v4 addresses (10/8, 172.16/12, 192.168/16). Either your networking folks haven't figured out how to manage hiding one RFC1918 space from another, or you are running kubernetes and docker swarm environments where you need more than that and they need to route to each other individually?
Yes, I’ve run into networking teams that jealously guard ‘their’ IP spaces (and I’ve seen them do it with v6 as well, which is unfathomable). But that is a social/political issue, not a technical issue.
If you’re interacting with addresses in a more detailed fashion than struct sockaddr_storage, it becomes a bunch of extra codepaths you have to build and test. This isn’t just in C-languages either, it’s kind of annoying in Crystal, Ruby, Elixir, and presumably others too.
In C, things like getaddrinfo let you totally abstract away the details of the network protocol and just specify service names and host names. The OS and libc will transparently switch between IPv4, IPv6, or things like IPX and DECNET without you caring. The socket API was designed specifically to avoid embedding protocol knowledge in the application and to allow transport agility (when it was created, IP was not the dominant protocol and it needed to support quite a large number).
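The same transport-agnostic idea is what Go's net package gives you; a minimal sketch (the host and port are placeholders), written in Go only to match the other examples in this thread:

package main

import (
    "fmt"
    "net"
)

func main() {
    // The address family is never mentioned: name resolution (getaddrinfo or
    // the pure-Go resolver) decides whether this connects over IPv4 or IPv6.
    conn, err := net.Dial("tcp", "example.com:80")
    if err != nil {
        fmt.Println("dial failed:", err)
        return
    }
    defer conn.Close()
    // The printed remote address reveals which family was actually used.
    fmt.Println("connected to", conn.RemoteAddr())
}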
Moving from TLS to QUIC or HTTP/1.1 to HTTP/2 are far more work in the application.
It’s been a very long time, but back in the win16 days I wrote WinSock code that worked with IP and IPX. The only thing that made it hard was that there was no getaddrinfo back then so I had to have different code paths for name lookup.
Years ago I thought the lack of restrictions were a sign of simple and clean design to be held up as a badge of honor compared to more limited operating systems. Now that I am responsible for production shell scripts I am a firm supporter of your view that filenames should be UTF-8 with no control characters.
All of UTF-8 seems way too much to me. Call me an authoritarian, but my stance is that files should have been required to start with an alphanum ASCII character, and include only those along with dot, hyphen, and underscore.
I mean… there are entire countries of people who don’t use the latin script. The proposal seems strict but perhaps workable for English speakers, but it’s neither fair nor workable in a global context.
Just like with DNS, I don't think anyone is stopping people from using the IDN rules and punycoding generally for a different display of filenames. Honestly, one could just use gibberish and set an xattr of display_name= for all of your non-ASCII or Latin-1 needs.
But in all honesty, I'd just rather keep the status quo of near-fully-8-bit-clean (NUL and / being the only restrictions) file names. Working restrictions just don't seem worth the effort.
What’s the motivation for allowing control characters as legal in Unix filenames? I struggle to see the inclusion of newline in a filename as anything other than a nuisance or the result of user error.
I mean, Microsoft’s old standard of 8.3 filenames was technically simple. I don’t understand why it’s not the most prevalent file naming convention today!
Eight characters isn’t long enough to create a meaningful title. 255 alphanumeric characters is.
I have softened a bit on this since yesterday. French speakers can pretty much always manage using the Latin alphabet sans diacritics, but more distant languages might actually need their local character set.
Still tho, the complexity difference between ASCII-only and UTF-8 is énorme, as /u/david_chisnall eloquently expressed in another comment. For a “Westerner” it would be a high price to pay for features that I don’t need.
more distant languages might actually need their local character set
How extraordinarily generous of you.
For a “Westerner” it would be a high price to pay for features that I don’t need.
Your own blog uses UTF-8, so this sounds a bit hypocritical to me.
UTF-8/Unicode is complex, true, but the purpose of software development is taming complexity. Denying billions of people the ability to create filenames in the script of their choosing in order to make a few software engineers' lives a bit easier is abdicating that responsibility.
The problem is the cases where the correct handling is unclear. 7-bit ASCII is unambiguous but also unhelpful. 8-bit character encodings are simple, but the meaning of the top half of the character set depends on some out-of-band metadata. If you use them and switch between, say, a French and a Greek locale, they will be displayed (and sorted) differently.
Once we get to Unicode, things get far worse. First, you need to define some serialisation. UTF-32 avoids some of the problems but has two big drawbacks: It requires 4 bytes per character (other encodings average 1-3) and it is not backwards compatible with anything. On *NIX, the APIs were all defined in terms of ASCII, on Windows they were defined for UCS-2, so these platforms want an encoding that allows the same width code unit to be used. This is where the problems start. Not every sequence of 8-bit values is valid UTF-8 and not every sequence of 2-byte values is a valid UTF-16 string. This means, at least, the kernel must do some validation. This isn’t too painful but it does cause some extra effort in the kernel. It’s worse when you consider removable media: what should you do if you encounter a filesystem written on a system that treats file names as 8-bit code page names and contains files with invalid UTF-8?
Beyond that, you discover that Unicode has multiple ways of representing the same human-readable text. For example, accented characters can often be represented as a single code point or as a base character and an accent combining diacritic. If I create a file in one form and try to open it with the other form, what should happen? For extra fun, input methods in different apps may give different forms and if you type the text with one form into an app that does canonicalisation then copy it you will get the other form, so copy and paste of file name might be broken if you treat filenames as a bag of bytes.
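A tiny Go demonstration of the two-forms problem, with an arbitrary file name; on a filesystem that stores names as raw bytes, the two spellings are genuinely different files:

package main

import (
    "fmt"
    "os"
)

func main() {
    nfc := "caf\u00e9.txt"  // 'é' as a single precomposed code point
    nfd := "cafe\u0301.txt" // 'e' followed by a combining acute accent

    fmt.Println(nfc == nfd)          // false: different byte sequences
    fmt.Printf("%x\n%x\n", nfc, nfd) // 636166c3a92e747874 vs 63616665cc812e747874

    // On a filesystem that treats names as bags of bytes (ext4, for example),
    // creating one form and opening the other fails, even though both render as café.txt.
    if f, err := os.Create(nfc); err == nil {
        f.Close()
        defer os.Remove(nfc)
    }
    _, err := os.Open(nfd)
    fmt.Println(err) // likely "no such file or directory" unless the filesystem normalises names
}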
If you want to handle these non-canonical cases then you need to handle canonicalisation of any path string coming into the kernel. The Unicode canonicalisation algorithm is non-trivial, so that’s a bunch more ring 0 code. Worse, newer versions of the standard add new rules, so you need to keep it up to date and you may encounter problems where a filesystem is created with a newer kernel and opened on an older one.
It's not totally clear to me how much of this complexity belongs in the filesystem and how much in the VFS layer. Things like canonicalisation of paths feel like they should be filesystem-independent. In particular, in something with a UNIX filesystem model, a single path can cross many different filesystems (junctions on NTFS make this possible on Windows too), and so there needs to be a canonical concept of a path that filesystem drivers share.
Oh, sure, it’s going to be more complex. But that to me is pretty much the paradigmatic example of “Worse is Better”, and I tend to live on the “better” side of that divide.
Yeah it’s funny to discover things like “git allows whatever the shell does, except even more so” (AFAIK the only limitation Git itself puts on tree entry names is that they don’t contain a NUL byte, they can even contain a / though I’ve not checked what that does yet).
I feel like I'm missing the most important piece of demonstration: content which includes the delimiters. Namely, how do I deal with content that includes a colon? I didn't look exhaustively, but I tried to look through the full readme, skimmed the spec and some of the tests, and it's not clear to me how I escape those delimiters, or when I have to… Maybe, at minimum, this could use a better example at the top of the readme. ☺
[Edit:] Also, I didn’t wish to be mean about the submission, but ASV seems like it solved this problem a really long time ago, and the biggest challenge is to make typing/viewing those characters more accessible—which, honestly, just shouldn’t be that hard to accomplish.
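For reference, a rough sketch of what using the ASCII separator characters looks like in practice, in Go only to match the other examples here (the field values are made up); the one thing a field cannot contain is the separators themselves:

package main

import (
    "fmt"
    "strings"
)

const (
    unitSep   = "\x1f" // ASCII US: separates fields within a record
    recordSep = "\x1e" // ASCII RS: separates records
)

func main() {
    // Field values may freely contain commas, colons, tabs, newlines, etc.
    records := [][]string{
        {"name: with a colon", "value, with a comma"},
        {"second record", "line\nbreak inside"},
    }

    var b strings.Builder
    for _, rec := range records {
        b.WriteString(strings.Join(rec, unitSep))
        b.WriteString(recordSep)
    }
    encoded := b.String()

    // Decoding is two splits; there are no escaping rules to remember.
    for _, rec := range strings.Split(strings.TrimSuffix(encoded, recordSep), recordSep) {
        fmt.Printf("%q\n", strings.Split(rec, unitSep))
    }
}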
The single-quote character is used in very specific contexts as an escape, which addresses a couple of your points. But yes, it's not clear from the specification how to embed a colon in the middle of a key, for example.
The linked Wikipedia page says that the collision problem can be solved by escaping, so I fail to see what makes Deco better than any other format where the delimiters can be escaped.
biggest challenge is to make typing/viewing those characters more accessible
If they're accessible I feel like they lose their value. If the reason they haven't taken off is that people can't see them/type them, then once you fix that and they take off you're back to the "how do I embed HTML content in my XML" problem, just with different characters than <.
I agree with you. When I tried playing around with Go, I actually implemented this for myself and grokked the C++ source to understand a lot of the unspecified behavior, like the x: y style name/value pairs.
Poke might also be interesting. I have been experimenting with yq, since it allows (horror) editing YAML comments, to make a sort of parameterized-YAML experience, so I'm glad more people are taking my extensive jq experience and making more tools for it.
One cool thing about this is it's another alternative for avoiding any bundling or build steps. https://github.com/nestarz/heritage seems like it downloads packages from unpkg and will also automatically generate your import map. I might have to investigate this, as I personally find build steps in JS leave a bad taste in my mouth.
Not to be too nitpicky, but I'd like to defend the whole full-JS-stack crowd.
But requiring JavaScript as the only server language because you’ve built up a monster of a framework/build system/hosting infra/module ecosystem that’s JavaScript + JavaScript + JavaScript + JavaScript (and apparently now TypeScript + TypeScript + TypeScript + TypeScript) is not only a huge burden to place on the world-wide web developer industry, it’s highly exclusionary
This example is only simple constants. I don’t even know what I’d have to do if I wanted to share, say, the code that actually renders a bitmap/png. Over this weekend I was going to explore wasm-bindgen style webdev just to see how far I can take it.
sure, and it will probably be a configuration file that will also require a special defined value so it doesn’t put it somewhere like $HOME/go
It’s just another nightmare of nightmares.
To make it more clear, it just sounds like my Go block will get bigger to force these various anti-user changes.
Example of how I tame npm below:
# BEGIN js #
if hash npm 2>/dev/null; then
export NPM_CONFIG_USERCONFIG="$XDG_CONFIG_HOME/npm/config"
if [[ ! -f $XDG_CONFIG_HOME/npm/config ]]; then
mkdir -p $XDG_CONFIG_HOME/npm $XDG_DATA_HOME/npm
{
print -- "prefix=$XDG_DATA_HOME/npm"
print -- "cache=$XDG_CACHE_HOME/npm"
} > $NPM_CONFIG_USERCONFIG
fi
path=($XDG_DATA_HOME/npm/bin $path)
fi
GOPATH still defaults to ~/go, but all the other Go settings live in https://pkg.go.dev/os#UserConfigDir which is $XDG_CONFIG_HOME on Unix machines. The distro specific go.env would be saved in GOROOT along with the executables, so probably /usr/bin/go or something.
Sorry, but that's a really bad take these days. WebAuthn/FIDO 2FA protects you from putting the password into the wrong page in the first place. Given enough time, you will get successfully phished. But FIDO will not sign the auth response for the wrong domain.
Then there are also all the ways your password security doesn’t matter: bugs in software that generated it, recovery from captured traffic, database dumps, hacked desktop, etc. A strong and unique password helps. It’s not enough if you get personally targeted for access.
Using an auto-filling password manager is honestly enough for a technical person to reduce their chance of being phished to near zero. 99% of the time I get domain-checked password filling, and that is enough to reduce the fatigue, so I can be super careful and suspicious the remaining 1% of the time.
Don’t get me wrong, U2F is better (and I use it for a few extra sensitive services) but I’m not sure if I agree with “Given enough time, you will get successfully phished”
I have nothing against 2FA, but since so many 2FA systems are bad or broken it underscores that one should not give up and make a crap password because “2FA will save you”.
Two factor? Push notifications are the passwordless future! Have some crusty app that needs a password and a second factor? No problem! Just get a generated password by confirming a push notification, and for the second factor… just confirm another push notification request. This is literally how my workplace does authentication now. It's insane. I wouldn't even be surprised if the Uber employee was bitten by the same issue.
Honestly, I'd rather just use a KDF keyed with the hash of a master password and the WebPKI-authenticated domain name of the service I'm authenticating to (hashed, then concatenated). It still sucks for cases where I can't control the accurate presentation of domain names or whether they're authentic, but I feel it's better than any of the 2FAs I've been forced to use. I'll feel better about hardware stuff when I can feel safe knowing I'm not permanently screwed if all of my hardware keys magically got destroyed simultaneously in some unfortunate boating accident.
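A rough sketch of that scheme, with invented function names, using HMAC-SHA-256 as the keyed step; a real tool would want a slow, memory-hard KDF (scrypt or argon2id) and per-site output rules rather than a bare truncated hash:

package main

import (
    "crypto/hmac"
    "crypto/sha256"
    "encoding/base64"
    "fmt"
)

// sitePassword derives a deterministic per-site secret from a master password
// and the (WebPKI-authenticated) domain being logged in to.
func sitePassword(master, domain string) string {
    key := sha256.Sum256([]byte(master))        // hash of the master password
    domainHash := sha256.Sum256([]byte(domain)) // hash of the authenticated domain
    mac := hmac.New(sha256.New, key[:])
    mac.Write(domainHash[:])
    return base64.RawURLEncoding.EncodeToString(mac.Sum(nil))[:24]
}

func main() {
    fmt.Println(sitePassword("correct horse battery staple", "example.com"))
    fmt.Println(sitePassword("correct horse battery staple", "examp1e.com")) // different domain, unrelated output
}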
Snippet straight from my .zshrc for those who like xdgisms
# BEGIN js #
if hash npm 2>/dev/null; then
export NPM_CONFIG_USERCONFIG="$XDG_CONFIG_HOME/npm/config"
if [[ ! -f $XDG_CONFIG_HOME/npm/config ]]; then
mkdir -p $XDG_CONFIG_HOME/npm $XDG_DATA_HOME/npm
{
print -- "prefix=$XDG_DATA_HOME/npm"
print -- "cache=$XDG_CACHE_HOME/npm"
} > $NPM_CONFIG_USERCONFIG
fi
path=($XDG_DATA_HOME/npm/bin $path)
fi
if hash yarn 2>/dev/null; then
alias yarn='yarn --use-yarnrc $XDG_CONFIG_HOME/yarn/config '
# force some other XDG stuff
if [[ ! -f $XDG_CONFIG_HOME/yarn/config ]]; then
mkdir -p $XDG_CONFIG_HOME/yarn $XDG_DATA_HOME/yarn
{
print -- "prefix \"$HOME/.local\""
print -- "yarn-offline-mirror \"$XDG_DATA_HOME/yarn/local-mirror\""
} > $XDG_CONFIG_HOME/yarn/config
fi
fi
# END js #
on the package.json limitations of npm:
yarn (v1) actually does create a package.json for things installed globally at “$prefix/global/package.json”. Shame devs apparently hate global installs? I think neo-yarn is all about the new zipfile modules and stuff now?
But it’s worth looking into yarn v1 if you want this behavior or just want to globally install stuff.
I ran into my share of Flatpak woes around Steam, the pressure-vessel runtime, and trying to use Mesa 22.x just this weekend. It's good people are posting what they're doing to fix these issues, since I was going in almost completely blind for my own as well.
In terms of spotify specifically, you can just avoid the app altogether if the web experience is good enough:
This is like a cribsheet to me, as I'm in the same boat. I didn't know about Scoop and PersistentWindows and stuff like that. Hell, I just found out today that Windows ships with bsdtar and OpenSSH out of the box.
When I wrote it, it was much slower (~10 MiB/s, I think?) and the use case was basically writing a hard drive from start to end with random junk. Around that time I just thought writing random patterns was slower than zeros or ones, and I dug into it and basically found that using openssl's AES in CTR mode as a CSPRNG was much faster than the plain dumb case of just read() from /dev/urandom => write() to /dev/my-disk.
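Roughly what that trick amounts to, sketched in Go rather than as an openssl pipeline; the chunk sizes and the output path are placeholders, not the original tool:

package main

import (
    "crypto/aes"
    "crypto/cipher"
    "crypto/rand"
    "io"
    "log"
    "os"
)

func main() {
    // Seed a key and IV once from the kernel CSPRNG...
    key := make([]byte, 32)
    iv := make([]byte, aes.BlockSize)
    if _, err := io.ReadFull(rand.Reader, key); err != nil {
        log.Fatal(err)
    }
    if _, err := io.ReadFull(rand.Reader, iv); err != nil {
        log.Fatal(err)
    }

    block, err := aes.NewCipher(key)
    if err != nil {
        log.Fatal(err)
    }
    stream := cipher.NewCTR(block, iv)

    // ...then expand it: AES-CTR over zeros is just the keystream, which is far
    // faster than read()ing /dev/urandom for bulk output. Writing to a regular
    // file here; pointing it at a block device is left to the reader.
    out, err := os.Create("junk.bin")
    if err != nil {
        log.Fatal(err)
    }
    defer out.Close()

    zeros := make([]byte, 1<<20) // 1 MiB of zeros per chunk
    buf := make([]byte, len(zeros))
    for i := 0; i < 1024; i++ { // ~1 GiB total
        stream.XORKeyStream(buf, zeros)
        if _, err := out.Write(buf); err != nil {
            log.Fatal(err)
        }
    }
}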
Curious. With 5.16 on my laptop (Debian Sid), I easily surpass 200 MiB/s, but with 5.16 on my workstation (also Debian Sid), I can’t get up above 40 MiB/s. The laptop is an Intel Core i7 8650 while my workstation is an AMD Ryzen 5 3400G. Could there be a bug with AMD Ryzen processors and the CSPRNG performance? I don’t mean to use you as technical support, so feel free to kick me in the right direction.
On 5.16, the RNG mixed the output of RDRAND into ChaCha’s nonce parameter, instead of more sanely putting it into the input pool with a hash function. That means every 64 bytes, there’s a call to RDRAND. RDRAND is extremely slow, especially on AMD, where a 64-bit read actually decomposes into two 32-bit ones.
Thanks. I already have random.trust_cpu=0 passed as a kernel boot argument with that performance. Not sure if there’s anything else I can do to not use RDRAND.
That only means that RDRAND isn’t used at boot to credit its input as entropy. It is still used, however. And until 5.18, it’s just xor’d in, in some places, which isn’t great. You can see a bunch of cleanup commits for this in my tree.
In 5.18, all RDRAND/RDSEED input goes through the pool mixer (BLAKE2s), so there'll be very little tinfoil-hat reasoning left for disabling it altogether with nordrand.
This is a very confusing proposal and problem. When I write go, I’m more concerned I have to rely on checked casts just to know if the err I have is what I think it is and I can never guarantee I have an exhaustive set of branches for every possible (possibly even nested and wrapped) error. I don’t see how sugaring over:
really solves the biggest problem I see and I can’t imagine
if err != nil
is a struggle for many.Even ignoring the public Internet, privately, I still see people picking random subnets and being surprised that all of a sudden, a whole work center can’t access the service. Being able to just pick a random ULA is basically a superpower, in it of itself or just having everyone pick from their /64 subnet they’re on.
CeyptoJS, weird “word size” measurements and more? This sounds about right. A lot of the nonfree software I encounter at work use a combination of these sorts of tactics and they’re all very trivially defeated with some knowledge of encryption, browser js debugger and a will to power through. This one is particularly nasty, but ironically, just as Amtrak left comments, I’ve found some of these products leak other, unintended data that help understand code. I wonder if they leak it intentionally.
after programming Go for a long time I write all my code in Rust now, and now that I have
Result
types and the?
operator, my code is consistently shorter and nicer to look at and harder to debug when things go wrong. I dunno. I get that people think that typing fewer characters matters. I just don’t think that. When I program Rust I missfmt.Errorf
anderrors.Is
quite a lot. Yeah, sure, you can useanyhow
andthiserror
but still, it’s not the same. You start out with some functionality in an app and so you useanyhow
andanyhow::Context
liberally and do some nice error message wrapping, but inspecting a value to see if it matches some condition is annoying. Now you keep going and your project matures and you put that code into a crate. Whoops,anyhow
inside of a crate is rude, so you switch the error handling strategy up tothiserror
to make it more easily consumed by other people. Now you’ve got to redo a bunch of error propagating things maybe you’ve got some boilerplate enum stuff to write, whatever; you can’t just move the functionality into a library and call it a day.Take a random error in Rust and ask “does any error in this sequence match some condition” and it’s a lot more work than doing the equivalent in Go. There are things that make me happy to be programming Rust now, but the degree to which people dislike Go’s error system has always seemed very odd to me, I’ve always really liked it. Invariably, people who complain about Go’s error system consistently sprinkle this everywhere:
Which is … bad. This form:
goes a long way, and
errors.Is
anderrors.As
compose with it so nicely that I don’t think I’ve seen an error system that I like as much as Go’s. Sure, I’ve seen lots of error systems that have fewer characters and take less ceremony, I just don’t think that’s particularly important. And I’m sure someone is going to come along and tell me I’m doing it wrong or I’m stupid or whatever, and I just don’t care any more. I quite like the ergonomics of error handling in Go and miss them when I’m using other languages.Isn’t wrapping the errors with contextual information at each function that returns them just equivalent to manually constructing a traceback?
The problem of errors in rust and Haskell lacking information comes from syntax that makes it easy to return them without adding that contextual information.
Seems to me all three chafe from not having (or preferring) exceptions.
no, because a traceback is a map of how to navigate the code, and it requires having the code to be able to interpret it, whereas an error chain is constructed from the vantage point that the error chain alone should contain all of the information needed to understand the fault. An error chain can be understood by an operator who lacks access to the source code or lacks familiarity with the language the program is written in; you can’t say the same thing about a stack trace.
not sure about Haskell but Rust doesn’t really have an error type, it has
Result
. Result is not an error type, it is an enum with an error variant, and the value in the error variant may be of any type. There’s thestd::error::Error
trait, but it’s not used directly; aResult<T, E>
may be defined by someE
that implementsstd::error::Error
or not, but it’s inconsistent. It may or may not have some kind ofErrorKind
situation likestd::io
does to help you classify different types of related errors, it might not.Go’s
error
type is only comparable to Rust’sResult
type if you are confused about one language or the other. Go’serror
type is less likeResult
and more likeVec<Box<dyn std::error::Error>>
, because theerror
type in Go is an Interface and it defines a set of unwrap semantics that lets you unwrap an error to the error that caused it. (it’s actually more like a node in a tree of errors these days so even that is a simplification).Unlike a trait, a Go interface value is sized, it’s essentially a two-tuple with a pointer to the memory of the value and another pointer to a vtable. Every error in the Go standard library and third party ecosystem is always a thing with a fixed memory size and at least one layer of indirection. Go’s error system is more like if the entirety of Rust’s ecosystem used
Result<T>
to meanResult<T, Box<dyn std::error::Error>>
. Since that’s not universal, even if you want to do it in your own projects, you run into friction when combining with projects that don’t follow that convention. Even that is an oversimplification, because downcasting a trait object in Rust is much more fiddly than type switching on an interface value in Go.There are a lot of subtle differences between the two systems that turn out to cause enormous differences in the quality of error handling across the ecosystems. Getting a whole team of developers to have clean error handling practices in Go is straightforward; you could explain the practices to an intermediate programmer in a single sitting and they’d basically never forget them and do it correctly forever. It’s significantly more challenging to get that level of consistency and quality with error propagation, reporting, and handling in Rust.
Extremely interesting. You’re both right: it is manually constructing (a projection of) a continuation, but the nifty point scraps makes is that the error chain is a domain-specific representation of the continuation while the traceback is implementation-specific. Super cool. This connects to a line of thinking rooted in [1] that I’ve been mulling over for ages and would love to follow up on.
Here’s a “fantasy abstract” I wrote years ago on this topic:
[1] Felleisen, Matthias, et al. “Abstract Continuations: A Mathematical Semantics for Handling Full Functional Jumps.” ACM Conf. on LISP and Functional Programming, 1988, pp. 52–62.
Or just give error types a protocol that propagation operators can call to append a call-stack level.
That’s still only convenient syntax for manually constructing an error traceback, no?
Sure, but it gives feature parity with exceptions when using the simple mechanism.
creating a stack trace isn’t free. It’s fair to assume creating a stack trace will require a heap allocation. Performing a series of possibly unbounded heap allocations is a meaningful tax; whether that cost is affordable or not is highly context-dependent. This is one of the reasons why there is a whole category of people who use C++ that consciously avoid using C++ exceptions. One of the benefits of Rust’s system and the
Result
type is that theResult
values themselves may be stack allocated. E.g., if you were inclined to do so, you could define a function that returnsResult<(), u8>
, and that’s just a single stack-allocated value, not an unbounded series of heap allocations. In Go, you could write a function that returns anerror
value, which is an interface, and that error value could be some type whose memory representation is a uint8. That would be either a single heap allocation or it would be stack-allocated; I’m not sure offhand, but regardless, it’s not an unbounded series of heap allocations. The performance implications are pretty far-reaching. Those costs are likely irrelevant in the case of something like an i/o bounded http server, but in something like a game engine or when doing embedded programming, they matter.The value proposition of stack traces is also different when you have many concurrently running stacks, as is typical in a Go program. Do you want just the current goroutine’s stack, or do you want the stacks of all running goroutines? goroutines are also not hierarchical in the scheduler, so “I want the stack of this goroutine and all of its parents” is not a viable option, because the scheduler doesn’t track which goroutines spawned which other goroutine (plus you could have goroutine A spawn goroutine B that spawns goroutine C and goroutine B is long gone and then C hits an error, which stacks do you want a trace of? B is gone, how are you gonna stack trace a stack that no longer exists?). If you want all the stacks, now you’re doing … what, pausing all goroutines every time you create an error value? That would be a very large tax.
Overall, the original post is about the ergonomics of error handling; I think the ergonomics of Go’s system are indeed verbose, but my experience has been that the design of Go’s error handling system makes it reasonably straightforward to produce systems that are relatively easy to debug when things break, so generally speaking I think the tradeoff is worthwhile. Sure, you can just yeet errors up the chain, but Go doesn’t make that significantly more ergonomic than wrapping the error. I think where the original post misses the point by a wide margin is that it just yeets errors up the chain, which makes them hard to debug. Making the code shorter but the errors less-useful for debugging is not a worthwhile tradeoff, and is a misunderstanding of the design of Go’s error system.
Not necessarily! This is pretty much the optimal case for per-thread freelists. Most errors are freed quickly and don’t overlap.
Creating a stack trace means calling both runtime.Callers (to get the initial slice of program counter uintptrs) and runtime.CallersFrames (to transform those uintptrs to meaningful frame data). Those calls are not cheap, and (I’m not 100% confident but I’m pretty sure that) at least CallersFrames will allocate.
It’s definitely way way cheaper to fmt.Errorf.
To create a stacktrace up from your current location, sure. But the unwind code you’re emitting knows what its actual location is, so as the error is handed off through each function, each site can just push its pc or even just a location string onto the backtrace list in the error value. And you can gather the other half of the stacktrace when you actually go to print it (if you even want it), which means until then it’s basically free: one compare, one load, two store or thereabouts. And hence, errors that aren’t logged don’t cost you anything.
I think this is the point you may have missed further up asking about the analogy to exceptions and tracebacks. The usual argument I see in favor of something like Go’s error “handling” is that it avoids the allocations associated with exceptions and tracebacks. But then the standard advice for writing Go is that basically every level of the stack ought to be using error-wrapping tools that likely allocate anyway. So it’s a sort of false economy – in the end, anyone who’s programming Go the “right”/recommended way is paying a resource and performance price comparable to exceptions and tracebacks.
I see your point
The thing that bit me recently was that Go uses the pattern of returning an error and a value so much that it is now encoded in tooling. One of the linters throws an error if you check that an error is not nil, but then return a valid result and no error. Unfortunately, there is nothing in the type system to say ‘I have handled this error’ and so this raised a false positive that can be addressed only with an annotation where you have code of the form ‘try this, if it raises an error then do this instead’. I don’t know how you’re supposed to do that in idiomatic Go.
yeah but that sounds like a problem with the linter, not a problem with the language. That’s not reasonable linting logic at all.
That whole part of the Rust error handling experience is such a mess. But then again nobody seems to think it’s an issue, really?
basically every person I work with thinks it’s an issue, because we write production Rust code in a team setting and we’re on call for the stuff we write. A lot of the dialogue about Rust here and elsewhere on the web is driven by people that aren’t using Rust in a production/team setting, they’re using it primarily for side projects or solo projects or research or are looking at it from a PL theory standpoint; these problems don’t come up that much in those settings. And that’s not to dismiss those settings; I personally do my side projects in Rust instead of Go right now because I find writing Rust really fun and satisfying, and this problem doesn’t bother me much on my personal projects. Where it really comes up is when you’re on call and you get paged and you’re trying to understand the nature of an error in a system that you didn’t write, while it is on fire.
Yeah okay but mean time people are writing dozens of blog posts about how async rust is broken (which really I don’t think it is) and this issue barely gets any attention.
How exactly is thiserror and pattern matching on an error variant not strictly superior, if not, at worst, the exact same as fmt.Errorf and errors.Is? The whole point of thiserror is allowing you to write format strings for the error variants you return or are wrapping.
errors.Is
is not simply an equality check, it recursively unwraps the error, finding if any error in the chain (or tree, if an unwrap returns a slice) matches, so you can always safely add a layer of wrapping without affecting anyerrors.Is
checks on that error. https://pkg.go.dev/errors#IsI wonder whether Rust’s not having this is due to Rust’s choice to have a more minimal standard library or simply to no one’s having proposed it yet. I think something like this would be very simple to implement in the Rust standard library (and only slightly more verbose to implement outside it).
The tradeoff here is that it is breaking the abstraction of a function. For example changing underlying http library may change the error type. Your options are to break the error chain and the implementation-specific diagnostics or expose that to callers.
Rust cares a lot about being explicit a out API breakage so it doesn’t surprise me that this isn’t a part of the standard library.
It definitely is convenient for within your own code but across stable API boundaries it is a much worse tradeoff.
Thanks for the warning.
Looks like, if serde-rs/serde#2597 doesn’t get merged quickly and another suitable compromise can’t be reached, I’ll be pinning
serde = "<=1.0.171"
in myCargo.toml
(liketime
has) until rust-lang/compiler-team#475 is implemented.Crazy that
time
pinned it, that’s a hugely widespread library. I wonder what percentage of Rust code will inherit the pin fromtime
, definitely double digits.Looks like matklad also pinned the version used in rust-analyzer about an hour ago:
https://github.com/rust-lang/rust-analyzer/pull/15482
I will probably add it to the next notify release too
*nod* Which reminds me. Given that not all my projects use Serde, but dependencies might, I should also add a [bans] entry to my cargo-deny deny.toml files.

[[bans.features]] isn’t perfect though. If serde-rs/serde#2597 gets merged, I’ll have to see if they’d be up for adding a features.require key so I don’t have to babysit a features.exact.

… and now the use of precompiled proc macros has been reverted half-way through my efforts to push out said change to all my repos… luckily, I got bored at the half-way point and was working on a “script” (i.e. really a sloppily-written Rust program) which I can just complete in a different direction instead to automate the bumping of Cargo.toml to serde = "1.0.184" and the change of the deny.toml ban to >=1.0.172, <1.0.184.

…and probably lean into being nerd-sniped into cleaning it up as a reusable tool. I’ve come this far, so I might as well finish its transformation into something I can use next time I need to easily batch-bump Cargo.toml and deny.toml across all my repos for reasons Dependabot doesn’t cover.

I don’t want to use this as tech support, but how exactly is Cargo.toml and the lockfile supposed to work then? I did exactly what you specified, but cargo tree still shows [email protected], and cargo update --precise [email protected] doesn’t let me specify .

I was just quoting what they said. My bad.
The syntax time uses in Cargo.toml is version = ">= 1.0.126, <= 1.0.171", and Cargo’s resolver won’t behave the way you want if you don’t also specify a lower bound. (In my case, specifying only an upper bound “upgraded” the project in question from one unified version in the 1.0.1xx range to 0.9.15 and 1.0.183… probably because it settled on 0.9.15 for some reason and that’s too low a version to satisfy one of my other dependencies at the version I specified.)

That’s why I’m also going to use cargo-deny to warn me when I need to replace or pin another dependency, to avoid having it transparently start pulling in a second copy of a newer Serde because a dependency bumped its minimum.

As for cargo update, --precise augments --package. It doesn’t replace it:
cargo update --package serde --precise 1.0.171
…or
cargo update -p serde --precise 1.0.171
for short.

Ironically, the times I actually needed (or would have been greatly helped by) kermit, I never had it on the remote. If you’re exceedingly crafty, you can usually do a lot of good (or evil) with Tcl’s Expect module. I’d write shell snippets to set echo off, switch to raw mode, and spawn base64 -d > rx and other such things, and it generally worked OK. Sometimes you can manage to write to the serial port fast enough to overrun the buffer on the receiving end, so I usually set the baud rate at the lower end. That probably explains why some of these serial-port tools had some form of CRC or error correction.

EDIT: I used to use this a lot when I dealt with debug serial consoles on various devices… https://github.com/adedomin/dot-files/blob/master/.config/zsh/util-bin/termview The idea is you’d just throw Tcl code into your config as a key bind.
I don’t think the price will be high enough for companies to move all that soon.
I do, however, think that with IPv6 being old, being required by various institutions (the US military, if I remember correctly), and IPv4 these days being held together by a ton of hacks, it might be worthwhile to consider “shaming” everything that either calls itself internet-related (ISPs, cloud providers) or modern, cutting-edge, etc. (from Apple to GitHub).
In my humble opinion it’s also a question of trust. Do I really want to trust software or a service that hasn’t managed to get proper IPv6 support for decades, even if it wasn’t their first priority? There were events like IPv6 Launch Day. Everyone with at least some pride in how they run their network should at least enable it. It’s not rocket science, and many individuals and organizations support it. About half of clients connect via IPv6.
There are even technical reasons! ;)
Software-wise, I think IPv6 support tends to work excellently as a first indicator of quality and of how seriously I can consider something production-ready. If there is no proper IPv6 support, that tends to be a good sign that nobody has used it for anything too serious yet. Of course nothing stands or falls on that alone; it’s just an indicator, but at least in my opinion it happens to work pretty well.
Internally, I imagine how long it takes to properly support IPv6 can be a good indicator of how well your infrastructure and teams work.
The problem is that it is never a “requirement” for anyone. When you compare two products or two vendors, you never have “does it use ipv6?” as a hard requirement (except if you are in the WAN business and buying hardware).
Even if you offer that feature to your users, it’s very unlikely they’ll use it if they can use IPv4. At best, they’ll configure both…
Well, Apple requires in app review, for both iOS and macOS, that the app’s networking works in an IPv6-only environment (I’ve been hit by the problem, so at least sometimes they check for it). I remember someone told me that the requirement came about because some of the networking that iDevices do between themselves (Internet Sharing, iirc) is IPv6-only under the hood (but that was many years ago).
cfr. Apple Dev Forums
Not disagreeing with any of that. It’s basically what I meant by having pride, showing you know what you are doing, and seeing it more as indicative of quality.
It is similar with portability of software. Usually that’s an indicator for good software quality even if it’s something I will never use on another OS or platform.
If something only works on one system even if it’s the one I’m using to me it makes a worse impression than something that is used over a wide array of platforms.
Of course they’re just indicators, but when you have a whole slew of options I tend to favor the ones that work everywhere and support IPv6. Good changelogs and good official docs are other indicators for me.
Having ‘pride’ and ‘quality’ are all well and good, but since IPv6 provides me little or nothing over IPv4, and would be a huge amount of work to implement and support, I appreciate options that focus on things that matter. At best, IPv6 support is a “nice to have” in my world.
Given the rise of kubernetes and docker swarm and the like and people still managing to conflict with internally used 172.16/12 networks, I can’t fathom how anyone can claim IPv6 provides little utility. Just being able to yeet anything in the big ULA space or even a /64 block for a whole subnet has to be massive utility. I know my employer’s network team fights me when I ask for /22 prefix (v4) so they must feel the pain too.
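For anyone who hasn’t done it: a rough sketch (not any standard tool) of how cheap it is to claim your own space the way RFC 4193 suggests. Take the fd00::/8 ULA block, append 40 random bits, and you have a /48 all to yourself with 65,536 /64 subnets under it, with a vanishingly small chance of colliding with anyone you might later merge or VPN with.

use std::fs::File;
use std::io::Read;

fn main() -> std::io::Result<()> {
    // 40-bit random Global ID, per RFC 4193 (the RFC suggests a time+MAC
    // hash; random bytes from the OS are the common shortcut).
    let mut global_id = [0u8; 5];
    File::open("/dev/urandom")?.read_exact(&mut global_id)?;
    println!(
        "fd{:02x}:{:02x}{:02x}:{:02x}{:02x}::/48",
        global_id[0], global_id[1], global_id[2], global_id[3], global_id[4]
    );
    Ok(())
}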
still managing to conflict with internally used 172.16/12 networks
There are nearly 18M RFC 1918 v4 addresses (10/8, 172.16/12, 192.168/16). Either your networking folks haven’t figured out how to manage hiding one RFC 1918 space from another, or you are running kubernetes and docker swarm environments where you need more than that and they need to route to each other individually?
Yes, I’ve run into networking teams that jealously guard ‘their’ IP spaces (and I’ve seen them do it with v6 as well, which is unfathomable). But that is a social/political issue, not a technical issue.
How come?
If you’re interacting with addresses in a more detailed fashion than struct sockaddr_storage, it becomes a bunch of extra codepaths you have to build and test. This isn’t just in C-languages either, it’s kind of annoying in Crystal, Ruby, Elixir, and presumably others too.

In C, things like getaddrinfo let you totally abstract away the details of the network protocol and just specify service names and host names. The OS and libc will transparently switch between IPv4, IPv6, or things like IPX and DECNET without you caring. The socket API was designed specifically to avoid embedding protocol knowledge in the application and to allow transport agility (when it was created, IP was not the dominant protocol and it needed to support quite a large number).
Moving from TLS to QUIC or from HTTP/1.1 to HTTP/2 is far more work in the application.
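To make the address-family-agnostic style concrete, here is a small sketch of the same idea in Rust rather than the C API described above: std::net::ToSocketAddrs hands back whatever address families the resolver returns, so the application never special-cases IPv4 vs IPv6.

use std::net::{TcpStream, ToSocketAddrs};

// Resolve "host:port" and try every returned address (A and AAAA alike),
// connecting to the first one that works.
fn connect(host_and_port: &str) -> std::io::Result<TcpStream> {
    let mut last_err = None;
    for addr in host_and_port.to_socket_addrs()? {
        match TcpStream::connect(addr) {
            Ok(stream) => return Ok(stream),
            Err(e) => last_err = Some(e),
        }
    }
    Err(last_err.unwrap_or_else(|| {
        std::io::Error::new(std::io::ErrorKind::NotFound, "no addresses resolved")
    }))
}

fn main() -> std::io::Result<()> {
    let stream = connect("example.com:80")?;
    println!("connected via {}", stream.peer_addr()?);
    Ok(())
}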
The OS and libc will transparently switch between IPv4, IPv6, or things like IPX and DECNET without you caring.
Have you ever actually tried to do that? I mean beyond lowest-common-denominator, best effort packet delivery. For IPX or DECNet, it wasn’t much fun.
edit: clarity
It’s been a very long time, but back in the win16 days I wrote WinSock code that worked with IP and IPX. The only thing that made it hard was that there was no getaddrinfo back then so I had to have different code paths for name lookup.
It’s not just the shell and it’s not just whitespace; Unix/POSIX file paths are full of dangers.
Thanks for that! Near the top there’s this quote:
All of UTF-8 seems way too much to me. Call me an authoritarian, but my stance is that files should have been required to start with an alphanum ASCII character, and include only those along with dot, hyphen, and underscore.
And I’m not sure about the hyphen.
I mean… there are entire countries of people who don’t use the latin script. The proposal seems strict but perhaps workable for English speakers, but it’s neither fair nor workable in a global context.
Just like with DNS, I don’t think anyone is stopping people from using the IDN rules and Punycode generally for a different display of filenames. Honestly, one could just use gibberish names and set an xattr like display_name= for all of your non-ASCII or non-Latin-1 needs.
But in all honesty, I’d just rather keep the status quo of near full 8-bit clean (NUL and / being the only restrictions) file names. Working restrictions just don’t seem worth the effort.
What’s the motivation for allowing control characters as legal in Unix filenames? I struggle to see the inclusion of newline in a filename as anything other than a nuisance or the result of user error.
“Worse is better”. It’s less that they’re allowed, and more that only the nul byte and the ascii solidus (0x2F) are forbidden.
And frankly I would not be surprised if some directory-level API let you create an entry with a / in it.
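To illustrate how permissive the rule actually is, here is a tiny sketch (assuming a Unix filesystem): any byte except NUL and '/' is accepted in a name, newline included.

use std::fs;

fn main() -> std::io::Result<()> {
    // A newline and a tab in the name are perfectly legal directory entries.
    fs::write("totally\nnormal name\t(really)", b"hi")?;
    // A NUL byte in the name is rejected outright, and '/' would be treated
    // as a path separator rather than part of the name.
    for entry in fs::read_dir(".")? {
        println!("{:?}", entry?.file_name()); // Debug output shows the escapes
    }
    Ok(())
}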
That would seem to fall down for any use case such as “I’ve got three files beginning with this native character I want to match with a glob”
We can deal with it just fine in exchange for the technical simplicity.
I mean, Microsoft’s old standard of 8.3 filenames was technically simple. I don’t understand why it’s not the most prevalent file naming convention today!
Eight characters isn’t long enough to create a meaningful title. 255 alphanumeric characters is.
I have softened a bit on this since yesterday. French speakers can pretty much always manage using the Latin alphabet sans diacritics, but more distant languages might actually need their local character set.
Still tho, the complexity difference between ASCII-only and UTF-8 is énorme, as /u/david_chisnall eloquently expressed in another comment. For a “Westerner” it would be a high price to pay for features that I don’t need.
Because “Westerners” are the only ones that matter in computing?
More like: I’m the only one who matters when I’m not being paid to write code.
How extraordinarily generous of you.
Your own blog uses UTF-8, so this sounds a bit hypocritical to me.
UTF-8/Unicode is complex, true, but the purpose of software development is taming complexity. Denying billions of people the ability to create filenames in the script of their choosing in order to make a few software engineer’s life a bit easier is abdicating that responsibility.
Of course it does. Text needs to work in every language.
But here we’re talking about filesystems and filenames; the contents of the files are arbitrary sequences of bytes and also not the point?
ASCII-only is a non-starter. If you tell people “the system forbids you using your own preferred language”, they will go use some other system.
File names belong to the user. If the system can’t handle them, that’s a defect, full stop.
The problem is the cases where the correct handling is unclear. 7-bit ASCII is unambiguous but also unhelpful. 8-bit character encodings are simple, but the meaning of the top half of the character set depends on some out-of-band metadata. If you use them and switch between, say, a French and a Greek locale, file names will be displayed (and sorted) differently.
Once we get to Unicode, things get far worse. First, you need to define some serialisation. UTF-32 avoids some of the problems but has two big drawbacks: It requires 4 bytes per character (other encodings average 1-3) and it is not backwards compatible with anything. On *NIX, the APIs were all defined in terms of ASCII, on Windows they were defined for UCS-2, so these platforms want an encoding that allows the same width code unit to be used. This is where the problems start. Not every sequence of 8-bit values is valid UTF-8 and not every sequence of 2-byte values is a valid UTF-16 string. This means, at least, the kernel must do some validation. This isn’t too painful but it does cause some extra effort in the kernel. It’s worse when you consider removable media: what should you do if you encounter a filesystem written on a system that treats file names as 8-bit code page names and contains files with invalid UTF-8?
Beyond that, you discover that Unicode has multiple ways of representing the same human-readable text. For example, accented characters can often be represented either as a single code point or as a base character plus a combining diacritic. If I create a file in one form and try to open it with the other form, what should happen? For extra fun, input methods in different apps may give different forms, and if you type the text in one form into an app that does canonicalisation and then copy it, you will get the other form, so copy and paste of a file name might be broken if you treat filenames as a bag of bytes.
If you want to handle these non-canonical cases then you need to handle canonicalisation of any path string coming into the kernel. The Unicode canonicalisation algorithm is non-trivial, so that’s a bunch more ring 0 code. Worse, newer versions of the standard add new rules, so you need to keep it up to date and you may encounter problems where a filesystem is created with a newer kernel and opened on an older one.
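A small illustration of the normalisation problem (in Rust, though the issue is language-agnostic): the same visible name in NFC and NFD form is a different byte string, so a bag-of-bytes filesystem will happily treat them as two distinct files.

fn main() {
    let nfc = "caf\u{e9}";   // "café" with precomposed U+00E9
    let nfd = "cafe\u{301}"; // "café" as 'e' plus combining acute U+0301
    assert_ne!(nfc.as_bytes(), nfd.as_bytes());
    println!("{} -> {:x?}", nfc, nfc.as_bytes()); // [63, 61, 66, c3, a9]
    println!("{} -> {:x?}", nfd, nfd.as_bytes()); // [63, 61, 66, 65, cc, 81]
    // std::fs::File::create(nfc) and std::fs::File::create(nfd) would create
    // two separate directory entries on a filesystem that stores raw bytes.
}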
All I will say is that this seems more to point out that it is filesystem dependent and that filesystems are too complex to run in the kernel at ring 0.
Something we already knew but it adds one more reason…
It’s not totally clear to me how much of this complexity belongs in the filesystem and how much in the VFS layer. Things like canonicalisation of paths feel like they should be filesystem-independent. In particular, in something with a UNIX filesystem model, a single path can cross many different filesystems (junctions on NTFS make this possible on Windows too), so there needs to be a canonical concept of a path that individual filesystem drivers plug into.
Oh, sure, it’s going to be more complex. But that to me is pretty much the paradigmatic example of “Worse is Better”, and I tend to live on the “better” side of that divide.
Yeah it’s funny to discover things like “git allows whatever the shell does, except even more so” (AFAIK the only limitation Git itself puts on tree entry names is that they don’t contain a NUL byte; they can even contain a /, though I’ve not checked what that does yet).

I feel like I’m missing the most important piece of demonstration: content which includes the delimiters. Namely, how do I deal with contents including a :? I didn’t look exhaustively, but I tried to look through the full readme, skimmed the spec, and some of the tests, and it’s not clear to me how I escape those delimiters, or when I have to… Maybe, at minimum, this could use a better example at the top of the readme. ☺

[Edit:] Also, I didn’t wish to be mean about the submission, but ASV seems like it solved this problem a really long time ago, and the biggest challenge is to make typing/viewing those characters more accessible—which, honestly, just shouldn’t be that hard to accomplish.
All the best,
-HG
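For anyone unfamiliar with the ASV idea mentioned above, here is a tiny sketch: ASCII already reserves control characters for exactly this job (0x1F as a unit separator between fields, 0x1E as a record separator), so field content can contain ':', ',' or even newlines without any escaping. The values below are made up for illustration.

const US: char = '\u{1F}'; // unit separator: between fields
const RS: char = '\u{1E}'; // record separator: between records

fn main() {
    // Build two records of two fields each; the field text freely contains
    // a comma and a colon, and nothing needs to be escaped.
    let data = format!("name{0}Smith, John{1}note{0}contains: a colon", US, RS);
    for record in data.split(RS) {
        let fields: Vec<&str> = record.split(US).collect();
        println!("{:?}", fields);
    }
}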
The singlequote character is used in very specific contexts as an escape, which addresses a couple of your points. But yes, it’s not clear from the specification how to embed a colon in the middle of a key, e.g.:
This may have just been an oversight in the spec?
The linked Wikipedia page says that the collision problem can be solved by escaping, so I fail to see what makes Deco better than any other format where the delimiters can be escaped.
If they’re accessible I feel like they lose their value. If the reason they haven’t taken off is because people can’t see them/type them, if you fix that and they take off you’re back to the “how do I embed HTML content in my XML problems,” just with different characters than <.
I agree with you. When I tried playing around with golang, I actually implemented this for myself and grokked the c++ source to understand a lot of the unspecified behavior like the x: y style name value pairs.

https://github.com/adedomin/indenttext
Maybe my examples and test cases better describe how I interpreted it?
http://www.jemarch.net/poke
Poke might also be interesting. I have been experimenting with yq since it allows (horror) editing yaml comments to make a sort of parameterized yaml experience so I’m glad more people are taking my extensive jq experience and making more tools for it.
One cool thing about this is it’s another alternative to avoid any bundling or build steps. https://github.com/nestarz/heritage seems like it downloads packages from unpkg and will also automatically generate your import map as well. I might have to investigate this as I personally find build steps in js to leave a bad taste in my mouth.
Not to be too nitpicky, but I’d like to defend the whole full-JS-stack crowd.
Perhaps an extreme example because I wanted this all to be a constexpr, but the level of bending over I did to mimic what I could easily do in a JS server made me kind of realize why full-stack JS is a thing: https://github.com/adedomin/moose2/blob/sqlite3/server/src/shared_data.rs
This example is only simple constants. I don’t even know what I’d have to do if I wanted to share, say, the code that actually renders a bitmap/png. Over this weekend I was going to explore wasm-bindgen style webdev just to see how far I can take it.
For those who want to opt-out:
export GOTELEMETRY=off
(I think this is the environment flag, not sure)
Recent versions of Go respect an env var if it is set but also let you set config with go env -w whatever.

Oh yay, another environment variable in my list of hundreds.
There are ways to set it besides using an ENV var. They’re going to add a thing so that Linux distros can disable it by default, for example.
Sure, and it will probably be a configuration file that will also require a special defined value so it doesn’t put it somewhere like $HOME/go. It’s just another nightmare of nightmares.
To make it more clear, it just sounds like my Go block will get bigger to force these various anti-user changes.
Example of how I tame npm below:
GOPATH still defaults to ~/go, but all the other Go settings live in https://pkg.go.dev/os#UserConfigDir which is $XDG_CONFIG_HOME on Unix machines. The distro specific go.env would be saved in GOROOT along with the executables, so probably /usr/bin/go or something.
Note that none of the 2FA attacks matter if your password is secure. So using secure and unique passwords is still very effective!
Sorry, but that’s a really bad take these days. WebAuthn/FIDO 2FA protects you from putting the password into the wrong page in the first place. Given enough time, you will get successfully phished. But FIDO will not sign the auth response for the wrong domain.
Then there are also all the ways your password security doesn’t matter: bugs in software that generated it, recovery from captured traffic, database dumps, hacked desktop, etc. A strong and unique password helps. It’s not enough if you get personally targeted for access.
Using an auto-filling password manager is honestly enough for a technical person to reduce their chance of being phished to near zero. 99% of the time I get domain-checked password filling, and that is enough to reduce the fatigue so that I can be super careful and suspicious the remaining 1% of the time.
Don’t get me wrong, U2F is better (and I use it for a few extra sensitive services) but I’m not sure if I agree with “Given enough time, you will get successfully phished”
I have nothing against 2FA, but since so many 2FA systems are bad or broken it underscores that one should not give up and make a crap password because “2FA will save you”.
A strong password is still necessary to prevent physical device theft being an instant win, yes.
Two factor? Push notifications are the passwordless future! Have some crusty app that needs a password and a second factor? No problem! Just get some generated password by confirming a push notification, and for the second factor… just confirm another push notification request…

This is literally how my workplace does authentication now. It’s insane. I wouldn’t even be surprised if the Uber employee was bit by the same issue.
Honestly I’d rather just use a KDF keyed with the hash of a master password and the WebPKI-authenticated domain name of the service I’m authenticating into, hashed then concatenated. It still sucks for cases where I can’t control the accurate presentation of domain names or their authenticity, but I feel it’s better than any of the 2FAs I’ve been forced to use. I’ll feel better about hardware stuff when I can feel safe knowing I’m not permanently screwed if all of my hardware keys magically got destroyed simultaneously in some unfortunate boating accident.
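A rough sketch of that scheme, using the sha2 crate (an assumed dependency; a real implementation should use a slow, salted KDF such as Argon2 or scrypt rather than bare SHA-256, plus a nicer output encoding):

use sha2::{Digest, Sha256};

fn site_password(master: &str, domain: &str) -> String {
    // Hash the master password, hash the (already WebPKI-authenticated)
    // domain name, then concatenate the two digests and hash again.
    let master_digest = Sha256::digest(master.as_bytes());
    let domain_digest = Sha256::digest(domain.as_bytes());

    let mut kdf = Sha256::new();
    kdf.update(master_digest);
    kdf.update(domain_digest);
    let out = kdf.finalize();

    // Hex-encode and truncate to something most password fields accept.
    out.iter().take(16).map(|b| format!("{:02x}", b)).collect()
}

fn main() {
    println!("{}", site_password("correct horse battery staple", "example.com"));
}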
I’d feel more comfortable with this if it were an old fashioned HTTP CONNECT proxy instead of how it is implemented.
Snippet straight from my .zshrc for those who like xdgisms
On the package.json limitations of npm: yarn (v1) actually does create a package.json for things installed globally, at “$prefix/global/package.json”. Shame devs apparently hate global installs? I think neo-yarn is all about the new zipfile modules and stuff now, but it’s worth looking into yarn v1 if you want this behavior or just want to globally install stuff.
I ran into my share of flatpak woes around steam, pressure vessel runtime and trying to use Mesa 22.x just this weekend. It’s good people are posting what they’re doing to fix these issues since I was going in almost completely blind for my own as well.
In terms of spotify specifically, you can just avoid the app altogether if the web experience is good enough:
This is like a crib sheet to me, as I’m in the same boat. I didn’t know about scoop and persistentwindows and stuff like that. Hell, I just found out today that Windows ships with bsdtar and openssh out of the box.
Does that mean I can retire this script I mockingly wrote eons ago?
Just curious, but what do you need 3.5 GiB/s for where 150 MiB/s isn’t sufficient?
When I wrote it, it was much slower (~10 MiB/s I think?) and the use-case was basically writing a hard drive from start to end with random junk. Around that time I had assumed writing random patterns was simply slower than zeros or ones, and when I dug into it I found that using openssl’s AES in CTR mode as a CSPRNG was much faster than the plain dumb case of just read() from /dev/urandom => write() to /dev/my-disk.
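Not the original openssl AES-CTR pipeline, but the same idea sketched with the rand crate (an assumed dependency): seed a userspace CSPRNG once from the OS and fill large buffers from it, instead of read()ing /dev/urandom for every block.

use rand::{rngs::StdRng, RngCore, SeedableRng};
use std::io::{self, Write};

fn main() -> io::Result<()> {
    let mut rng = StdRng::from_entropy(); // seeded once from the OS
    let mut buf = vec![0u8; 1 << 20];     // 1 MiB chunks
    let stdout = io::stdout();
    let mut out = stdout.lock();
    loop {
        rng.fill_bytes(&mut buf);
        out.write_all(&buf)?; // e.g. pipe this into dd of=/dev/my-disk
    }
}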
No, doing aes-ni into a buffered pipe is still going to be a lot faster:
Curious. With 5.16 on my laptop (Debian Sid), I easily surpass 200 MiB/s, but with 5.16 on my workstation (also Debian Sid), I can’t get up above 40 MiB/s. The laptop is an Intel Core i7 8650 while my workstation is an AMD Ryzen 5 3400G. Could there be a bug with AMD Ryzen processors and the CSPRNG performance? I don’t mean to use you as technical support, so feel free to kick me in the right direction.
On 5.16, the RNG mixed the output of RDRAND into ChaCha’s nonce parameter, instead of more sanely putting it into the input pool with a hash function. That means every 64 bytes, there’s a call to RDRAND. RDRAND is extremely slow, especially on AMD, where a 64-bit read actually decomposes into two 32-bit ones.
Thanks. I already have random.trust_cpu=0 passed as a kernel boot argument with that performance. Not sure if there’s anything else I can do to not use RDRAND.

That only means that RDRAND isn’t used at boot to credit its input as entropy. It is still used, however. And until 5.18, it’s just xor’d in, in some places, which isn’t great. You can see a bunch of cleanup commits for this in my tree.
Ah, it’s nordrand I need to pass. Sure enough, a 10x performance boost:

AMD needs to up their game. Heh.
In 5.18, all RDRAND/RDSEED input goes through the pool mixer (BLAKE2s), so there’ll be very little tinfoil-hat reasoning left for disabling it altogether with nordrand.