...And What To Do About It
My goal in this post is to skim my observations on the state of software design and development over the past year, and to try to find a meaningful way forward for myself for 2019. My perspective is limited by the fact that I have worked exclusively in client-side software security for the past 7.5 years. Still, I think there are broad trends visible even to me, and some clear signs about where we need to go as an industry.
I hope that this post is useful to a variety of security people: not just engineers, but also UX designers and researchers, project/product/program managers, people and business managers, and operations. In any case, all paths to success require the help of all those kinds of people. This post is even more of a link-fest than usual; I hope that’s useful.
The high-order bit in much of the below is complexity. Hardware, software, platforms, and ecosystems are often way too complex, and a whole lot of our security, privacy, and abuse problems stem from that.
Encrypting the web is going swimmingly! Also, marking non-secure web origins as non-secure, and marking secure origins as neutral, is moving right along. It’s amazing and wonderful that we’ve improved so much so quickly, and it gives me hope for other huge efforts (see below). Thanks as always to Let’s Encrypt, and to the other browsers who are moving in a similar direction!
Although memory corruption vulnerabilities remain prevalent, iOS, Chrome OS, and Chrome are existence proofs that, with good effort in design (privilege reduction) and unreasonably high effort in implementation (actually making privilege reduction work, bug hunting, bug fixing, and rapid deployment of bug fixes), it is just barely possible to significantly raise the cost of exploiting memory corruption vulnerabilities for projects implemented in unsafe languages. Against modern targets, exploiting memory corruption is nowhere near as easy as it was in the 1990s or the 2000s.
iOS continues to have excellent update adoption (see also), even though it’s voluntary — a sign that people perceive the value of the updates. It’s unlikely people are making their choice on the basis of security per se, of course. But security and privacy are key parts of iOS’ value proposition, and I do think at least some customers perceive them.
Memory-safe programming languages dominate the landscape. Additionally, the fastest-growing languages are memory-safe. Some popular languages are even type-safe. (Some might consider type safety a mere bonus, but to me, typefulness is a crucial building block for reliable and safe software.) There is even good news in systems software, previously the unchallenged and most undeserved domain of unsafety: Go is big there, and Rust is boopin’ right along (see e.g. Servo, CrOS VM, the Xi editor, parts of Fuchsia). Although we mourn Midori, it can still teach us broadly applicable, deep lessons. (See especially A Tale Of Three Safeties and The Error Model.)
Memory tagging, a new (and old) feature of hardware, can help with memory safety problems. People are working on making it happen on modern systems (paper). I don’t think it’s a replacement for fixing bugs in as systemic a way as possible (ideally, in the source language), but it has great potential to increase safety.
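To make the idea a bit more concrete, here is a toy software model of memory tagging. It is only a sketch: the 16-byte granule, the 4-bit tag range, and every name in it (`TaggedHeap`, `TaggedPtr`, `Fault`) are assumptions I made up for illustration, and real hardware does the check on every load and store without any of this bookkeeping being visible to the program.

```rust
/// Toy model of memory tagging. Sizes and names are illustrative assumptions,
/// not a description of any particular hardware.
const GRANULE: usize = 16; // bytes covered by one tag (assumed)
const MAX_TAG: u8 = 15; // tags 1..=15 are handed out (assumed 4-bit tags)
const FREE_TAG: u8 = 0; // reserved for memory that is not allocated

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct TaggedPtr {
    addr: usize,
    len: usize,
    tag: u8,
}

#[derive(Debug, PartialEq)]
pub enum Fault {
    TagMismatch,
    OutOfRange,
}

pub struct TaggedHeap {
    bytes: Vec<u8>,
    tags: Vec<u8>, // one tag per granule
    next: usize,   // bump-allocation cursor (freed memory is never reused here)
    next_tag: u8,
}

impl TaggedHeap {
    pub fn new(size: usize) -> Self {
        let granules = (size + GRANULE - 1) / GRANULE;
        TaggedHeap {
            bytes: vec![0; granules * GRANULE],
            tags: vec![FREE_TAG; granules],
            next: 0,
            next_tag: 1,
        }
    }

    /// Allocates `len` bytes and gives the granules and the pointer the same tag.
    pub fn alloc(&mut self, len: usize) -> Option<TaggedPtr> {
        let granules = (len.max(1) + GRANULE - 1) / GRANULE;
        let end = self.next + granules * GRANULE;
        if end > self.bytes.len() {
            return None;
        }
        let tag = self.next_tag;
        self.next_tag = if tag == MAX_TAG { 1 } else { tag + 1 };
        let first = self.next / GRANULE;
        for t in &mut self.tags[first..first + granules] {
            *t = tag;
        }
        let ptr = TaggedPtr { addr: self.next, len, tag };
        self.next = end;
        Some(ptr)
    }

    /// Retags the granules so that stale copies of `p` stop matching.
    pub fn free(&mut self, p: TaggedPtr) {
        let first = p.addr / GRANULE;
        let granules = (p.len.max(1) + GRANULE - 1) / GRANULE;
        for t in &mut self.tags[first..first + granules] {
            *t = FREE_TAG;
        }
    }

    /// The check that stands in for the hardware: the pointer's tag must
    /// equal the tag of the granule being touched.
    fn check(&self, p: TaggedPtr, offset: usize) -> Result<usize, Fault> {
        let addr = p.addr.checked_add(offset).ok_or(Fault::OutOfRange)?;
        if addr >= self.bytes.len() {
            return Err(Fault::OutOfRange);
        }
        if self.tags[addr / GRANULE] != p.tag {
            return Err(Fault::TagMismatch);
        }
        Ok(addr)
    }

    pub fn load(&self, p: TaggedPtr, offset: usize) -> Result<u8, Fault> {
        let addr = self.check(p, offset)?;
        Ok(self.bytes[addr])
    }

    pub fn store(&mut self, p: TaggedPtr, offset: usize, value: u8) -> Result<(), Fault> {
        let addr = self.check(p, offset)?;
        self.bytes[addr] = value;
        Ok(())
    }
}
```

In this model, reading one granule past an allocation into its neighbor, or reading through a pointer after `free`, comes back as `Fault::TagMismatch`. Note what it does not catch: an overflow that stays inside the same granule passes the check, which is one reason I see tagging as a safety net rather than a fix.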
Static checkers — compilers — and dynamic checkers (e.g. Address Sanitizer
and the rest of the LLVM sanitizers) have advanced very far in the past 20
years. What was once bleeding-edge research now comes for free with
off-the-shelf compilers. This is fantastic! (Start with -Wall -Werror in Clang or GCC, but I like to use -Weverything -Werror, with a few exceptions like -Wno-padded. Really.)
Chrome is making some structural improvements to the extensions platform, which should reduce some of the worst abuses we see in that ecosystem.
Parts of the software industry are having an ethical and moral awakening:
You don’t have to agree with all those positions to find it good news that our generation of engineers is growing beyond the “I could build it, so I did; what are consequences?” mentality. Previous generations had to make very similar choices.
(I do happen to agree with all those positions, and I will not work on machines designed for war or police, nor on Big Brotherly, censored search engines. And I support the efforts for equality and fair treatment for everyone. The Walkout was a good day, but it was just a beginning. There’s a long way to go.)
The increasing awareness and adoption of Universal 2nd Factor authentication is great news. (U2F has been standardized as WebAuthn, which is considerably more complex than most security people would like. Expect bugs to come of that...) The high degree of phishing resistance it offers is at least as important as the protections HTTPS provides. Phishing and account take-over have consistently been 1 of our biggest problems, and WebAuthn can put a big dent in them. You can use it now with Google, Facebook, Twitter, Dropbox, login.gov, and others.
C++ continues to be untenably complex and wildly unsafe:
I can’t possibly select and link to a list of the infinite bug reports whose root causes are memory unsafety. A fun exercise is to skim through a good source of vulnerability write-ups (the Project Zero blog is one of my favorites), and count how many of the bugs are even in the application domain at all.
(Of course, if you find that there are more memory safety bugs than application-domain bugs or other bugs, that could just as well be due to the researchers’ biases. But I think we can all agree that memory corruption bugs simply should not exist at all, yet are numerous and often exploitable.)
Designing a language that achieves all of memory safety, high performance, and good usability remains very hard. The Rust compiler notices and rejects safety bugs that C and C++ compilers don’t notice/can’t notice/purposefully accept. 🤪🔨 That is to Rust’s credit, but this discipline can be extremely difficult to learn.
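Here is a small sketch of what that rejection looks like in practice. The function names are hypothetical, and the commented-out variants are the ones I would expect the Rust compiler to refuse; the rough C or C++ equivalents (returning a pointer into a local buffer, or holding a reference into a vector across a push) would typically compile.

```rust
/// Returns the longest whitespace-separated word in `text`.
/// The lifetime `'a` records that the result borrows from the input,
/// so callers cannot keep the result alive longer than `text`.
pub fn longest_word<'a>(text: &'a str) -> &'a str {
    text.split_whitespace()
        .max_by_key(|word| word.len())
        .unwrap_or("")
}

// Rejected at compile time: the returned slice would point into `owned`,
// which is dropped when the function returns (a dangling reference).
//
// fn dangling() -> &'static str {
//     let owned = String::from("hello wonderful world");
//     longest_word(&owned)
// }

/// Reads the first element and then appends `x`.
/// Copying the value out first ends the borrow before the vector is mutated.
pub fn first_then_push(v: &mut Vec<i32>, x: i32) -> Option<i32> {
    let first = v.first().copied();
    v.push(x);
    first
}

// Rejected at compile time: `first` still borrows from `v` while `push`
// needs exclusive access. The push may reallocate the buffer, which is
// exactly how this pattern turns into a use-after-free elsewhere.
//
// fn first_then_push_bad(v: &mut Vec<i32>, x: i32) -> i32 {
//     let first = &v[0];
//     v.push(x);
//     *first
// }
```

The accepted versions are not much longer than the rejected ones, but knowing which rewrite satisfies the compiler is the part of the discipline that takes time to learn.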
Among the programming language research community’s goals is proving programs safe. Gradually and increasingly, that work trickles down into real languages that people can really use to ship real software. The difficulty of using academic tools is partly a natural consequence of their small audience, but some of the difficulty is unavoidable: proof of safety means proof, that difficult thing that people get PhDs for. Ultimately, the software engineering community is going to have to commit to meeting this standard, gradually and increasingly.
Obviously, 2018 was the year everyone became aware of Spectre & Meltdown, Foreshadow, L1TF, and the idea of micro-architectural side-channels generally. Shared resources abound, unfortunately. Of course, other show-stopper security problems (typically due to monstrous complexity) have been known for a long time (see also, see also). Although those links refer mostly to Intel Architecture systems, there’s no reason to think that (e.g.) ARM is inherently safer. In particular, the micro-architectural side-channel problems are the natural result of designing for maximum performance — which almost every chip designer is trying to do, because that’s what almost every customer wants.
Abuse (the malicious use of legitimate functionality) affects more people’s lives in more ways than does the exploitation of bugs. Although hacking can have a surprising influence, such as in the form of political fallout or mass data breaches, the reasons your friends and family are sad are much more prosaic — and harder to solve:
Proof-of-work continues not to work 😵, as foretold by prophecy 😑. The coming decades are going to bring increasing climate, uh, ‘challenges’, and all computing systems are going to have to prove their worth relative to the sum of all their costs — including carbon and e-waste. We won’t be able to laugh those off as externalities any longer. Proof-of-work systems will continue to be unable to show sufficient value for the cost, and may even be the wedge for regulation (if they don’t starve themselves or crash first).
The web performance crisis (see also a hotter take 🥵) is a similar situation: hugely wasteful, but not (yet...?) self-limiting. In the past I’ve had to argue that security is affordable even given performance constraints. It was possible to get both performance and security then, by reducing obvious bloat and enabling less-obvious optimizations, and it’s possible now. The root cause then was the same as it is now: too many developers don’t use the same client systems as their userbase does, and they don’t know what network, memory, and CPU costs they are incurring. Previously, those costs were hard to see. Now, they are definitely not: every browser has a very good Dev Tools console, and there is no excuse for not using it.
Dependency slurping systems like NPM, CPAN, go get, and so on
continue to freak me out. They might potentially be more dangerous than manual
dependency management, despite the huge risks of that practice, precisely
because they make it ‘easy’ to grow your project’s dependency graph — and hence
the number of individuals and organizations that you implicitly trust. (And their
trustworthiness can suddenly change for the worse.) When there are
significant gaps in a language’s standard library, third-party developers will
eagerly fill those gaps with new dependencies for you to (not always knowingly)
inherit. There is an
effort underway to fill gaps in JavaScript’s standard library, which I
strongly support for this reason.
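To get a feel for how fast the implicit trust set grows, here is a minimal sketch that walks a dependency graph and collects everything a package pulls in transitively. The graph representation, the function names, and the package names in the usage note are all made up for illustration; a real audit would start from your package manager's lockfile instead.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Maps a package name to the names of its direct dependencies.
pub type DepGraph = BTreeMap<String, Vec<String>>;

/// Builds a graph from (package, direct dependency) pairs.
pub fn graph_from_edges(edges: &[(&str, &str)]) -> DepGraph {
    let mut graph = DepGraph::new();
    for &(pkg, dep) in edges {
        graph.entry(pkg.to_string()).or_default().push(dep.to_string());
    }
    graph
}

/// Returns every package reachable from `root`, excluding `root` itself.
/// Each entry is code (and a set of maintainers) that `root` implicitly trusts.
pub fn transitive_deps(graph: &DepGraph, root: &str) -> BTreeSet<String> {
    let mut seen = BTreeSet::new();
    let mut stack: Vec<&str> = vec![root];
    while let Some(pkg) = stack.pop() {
        if let Some(deps) = graph.get(pkg) {
            for dep in deps {
                // `insert` returns false for packages already visited,
                // so cycles in the graph do not loop forever.
                if dep.as_str() != root && seen.insert(dep.clone()) {
                    stack.push(dep.as_str());
                }
            }
        }
    }
    seen
}
```

For example, a hypothetical `app` that directly depends on only `web-framework` and `logger` ends up trusting four packages once `web-framework` brings in `http-parser`, which in turn brings in `tiny-util`. Comparing the size of the direct list with the size of the transitive set is a quick way to see how much you have inherited without choosing it.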
Social media continues to amplify the worst in people, and some executives of social media companies continue to be part of the problem. Dealing with the toxicity and abuse of social media is a long-term, multi-pronged effort, but 1 thing that we can immediately do as engineers, PMs, designers, and managers is to push back on ‘engagement’ as the primary or only metric for ‘success’. It’s game-able and heavily gamed, and does not remotely capture the real experiences of real people on social media. People’s experiences are often profoundly awful, and we as software developers are responsible for dealing with the consequences of what we’ve built. Are we empowering people to learn and grow, or are we amplifying the fuckery of Nazis and Chekists? Clinging to a simplistic view of free speech is not going to get us out of having to answer that question.
Unfortunately for me, I want to work on all of these problems. I had a good fun time in 2018 working on defense at a low level (just one of many adventures), and there’s still plenty of work to be done there. (There’s lots of ambient privilege still crying out to be reduced!) It has been rewarding to play my small part in helping get HTTPS where it needs to be.
And unfortunately, the problems that I find the most vexing — the abuse category generally — are not in my area of greatest expertise. My heart is really in the language problem: meaningful interfaces, ergonomic and safe libraries, memory safety, and type safety. But it’s the abuse that makes my heart sick.
Still, I see people really shipping software improvements that seemed impossible 20 or 10 or 5 years ago. We really are making progress. Here’s what I want to see in 2019:
Smart people are already hard at work on all these things! We can get the industry closer to where it needs to be, and serve people better. Tomorrow is Monday...