This is a cryptography blog and I always feel the need to apologize for any post that isn’t “straight cryptography.” I’m actually getting a little tired of apologizing for it (though if you want some hard-core cryptography content, there’s plenty here and here.)
Sometimes I have to remind my colleagues that out in the real world, our job is not to solve exciting mathematical problems: it’s to help people communicate securely. And people, at this very moment, need a lot of help. Many of my friends are Federal employees or work for contractors, and they’re terrified that they’re going to lose their job based on speech they post online. Unfortunately, “online” to many people includes thoughts sent over private text messages — so even private conversations are becoming chilled. And while this is hardly the worst thing happening to people in this world, it’s something that’s happening to my friends and neighbors.
(And just in case you think this is partisan: many of the folks I’m talking about are Republicans, and some are military veterans who work for the agencies that keep Americans safe. They’re afraid for their jobs and their kids, and that stuff transcends politics.)
So let me get to the point of this relatively short post.
Apple iMessage is encrypted, but is it “secure”?
A huge majority of my “normie” friends are iPhone users, and while some have started downloading the Signal app (and you should too!), many of them favor one communications protocol: Apple iMessage. This is mostly because iMessage is the built-in messaging app on Apple phones. If you don’t know the branding, all you need to know is that iMessage is the “blue bubbles” you get when talking to other Apple users.
But encryption in transit is only one part of the story. After delivering your messages with its uncompromising post-quantum security, Apple allows two things that aren’t so great:
iMessages stick around your phone forever unless you manually delete them (a process that may need to happen on both sides, and is painfully annoying.)
iMessages are automatically backed up to Apple’s iCloud backup service, if you have that feature on — and since Apple sets this up as the default during iPhone setup, most ordinary people do.
The combination of these two features turns iMessage into a Star-Trek style Captain’s Log of your entire life. Searching around right now, I can find messages from 2015. Even though my brain tells me this was three years ago, I’m reliably informed that this is a whole decade in the past.
Now, while those messages are harmless, I want to convince you that this is often a very bad thing. People want to have private conversations. They deserve to have private conversations. And their technology should make them feel safe doing so. That means they should know that their messaging software has their back and will make sure those embarrassing or political or risque text messages are not stored forever on someone’s phone or inside a backup.
We know exactly how to fix this, and every other messenger did so long ago
I’m almost embarrassed to explain what this feature does, since it’s like explaining how a steering wheel works. Nevertheless. When you start a chat, you can decide how long the messages should stick around for. If your answer is forever, you don’t need to do anything. However, if it’s a sensitive conversation and you want it to be ephemeral in the same way that a phone call is, you can pick a time, typically ranging from 5 minutes to 90 days. When that time expires, your messages just get erased — on both your phone and the phones of the people you’re talking to.
A separate feature of disappearing messages is that some platforms will omit these conversations from device backups, or at least they’ll make sure expired messages can’t be restored. This makes sense because those conversations are supposed to be ephemeral: people are clearly not expecting those text messages to be around in the future, so they’re not as angry if they lose them a few days early.
Beyond the basic technical functionality, a disappearing messages feature says something. It announces to your users that a conversation is actually intended to be private and short-lived, that its blast radius will be contained. You won’t have to think about it years or months down the line when it’s embarrassing. This is valuable not just for what it does technically but for the confidence it gives to users, who are already plenty insecure after years of abuse by tech companies.
Why won’t Apple add a disappearing messages feature?
I don’t know. I honestly cannot tell you. It is baffling and weird and wrong, and completely out of step with the industry they’re embedded in. It is particularly bizarre for a company that has built its entire brand image around privacy.
To recap, nearly every other messaging product that people use in large numbers (at least here in the US) has some kind of disappearing messages feature. Apple’s omission is increasingly conspicuous.
I do have some friends who work for Apple Security and I’ve tried to talk to them about this. They usually try to avoid me when I do stuff like this — sometimes they mention lawyers — but when I’m annoying enough (and I catch them in situations where exit is impossible) I can occasionally get them to answer my questions. For example, when I ask “why don’t you turn on end-to-end encrypted iCloud backup by default,” they give me thoughtful answers. They’ll tell me how users are afraid of losing data, and they’ll tell me sad stories of how difficult it is to make those features as usable as unencrypted backup. (I half believe them.)
When I ask about disappearing messages, I get embarrassed sighs and crickets. Nobody can explain why Apple is so far behind on this basic feature even as an option, long after it became standard in every other messenger. Hence the best I can do is speculate. Maybe the Apple executives are afraid that governments will pressure them if they activate a security feature like this? Maybe the Messages app is written in some obsolete language and they can’t update it? Maybe the iMessage servers have become sentient and now control Tim Cook like a puppet? These are not great answers, but they are better than anything the company has offered — and everyone at Apple Security kind of knows it.
In a monument to misaligned priorities, Apple even spent time adding post-quantum encryption to its iMessage protocol — this means Apple users are now safe from quantum computers that don’t exist. And yet users’ most intensely private secrets can still be read off their phone or from a backup by anyone who can guess their passcode and use a search box. This is not a good place to be in 2025, and Apple needs to do better.
A couple of technical notes
Since this is a technical blog I feel compelled to say a few things that are just a tad more detailed than the plea above.
First off, Gene Hoffman points me to a small feature in the Settings of your phone called “Keep Messages” (buried under “Messages” in Settings, and then scrolling way down.) This determines how long your messages will be kept around on your own phone. You cannot set this per conversation, but you can set it to something shorter than “Forever”, say, one year. This is a big decision for some people to make, however, since it will immediately delete any old messages you actually cared about.
More importantly (as mentioned in comments) this only affects your phone. Messages erased via this process will remain on the phones of your conversation partners.
Second, if you really want to secure your iMessages, you should turn on Apple’s Advanced Data Protection feature. This will activate end-to-end encryption for your iCloud backups, and will ensure that nobody but you can access those messages.
This is not the same thing as disappearing messages: your messages will still exist on your phone and inside your (now encrypted) backups. But at least those backups are much better protected.
Third, Apple advertises a feature called Messages in iCloud, which is designed to back up and sync your messages between devices. Apple even advertises that this feature is end-to-end encrypted!
I hate this phrasing because it is disastrously misleading. Messages in iCloud may be end-to-end encrypted, but if you use iCloud Backup without ADP (which is the default for new iPhones), the Messages in iCloud encryption key itself is backed up to Apple’s servers in a form that Apple can access. As a result, the content of the Messages in iCloud database is completely available to Apple, or to anyone who can guess your Apple account password.
None of this has anything to do with disappearing messages. However: that feature, with proper anti-backup protections, would go a long way to make these backup concerns obsolete.
The British government’s undisclosed order, issued last month, requires blanket capability to view fully encrypted material, not merely assistance in cracking a specific account, and has no known precedent in major democracies. Its application would mark a significant defeat for tech companies in their decades-long battle to avoid being wielded as government tools against their users, the people said, speaking under the condition of anonymity to discuss legally and politically sensitive issues.
Apple’s decision to disable their encrypted cloud backup feature has triggered many reactions, including a few angry takes by Apple critics, accusing Apple of selling out its users:
With all this in mind, I think it’s time to take a sober look at what might really be happening here. This will require some speculation and educated guesswork, but I think that exercise will be a lot more helpful if we want to find out what’s really going on.
Question 1: does Apple really care about encryption?
Encryption is a tool that protects user data by processing it with a key, so that only the holder of the appropriate key can read it. A variant called end-to-end encryption (E2EE) uses keys that only the user (or users) knows. The benefit of this approach is that data is protected from many threats that face centralized repositories: theft, cyber attacks, and even access by sophisticated state-sponsored attackers. One downside of this approach is that it can also block governments and law enforcement agencies from accessing the same data.
Navigating this tradeoff has been a thorny problem for Apple. Nevertheless, Apple has mostly opted to err on the side of aggressive deployment of (end-to-end) encryption. For some examples:
In 2011, the company launched iMessage, a built-in messaging service with default end-to-end encryption for all users. This was the first widely-deployed end-to-end encrypted messaging service. Today these systems are recommended even by the FBI.
In 2013, Apple launched iCloud Key Vault, which encrypts your backed-up passwords and browser history using encryption that even Apple can’t access.
Apple faced law enforcement backlash on each of these moves. But perhaps the most famous example of Apple’s aggressive stance on encryption occurred during the 2016 Apple v. FBI case, where the company actively fought the U.S. government’s demands to bypass encryption mechanisms on an iPhone belonging to an alleged terrorist. Apple argued that satisfying the government’s demand would have required Apple to weaken encryption on all of the company’s phones. Tim Cook even took the unusual step of signing a public letter defending the company’s use of encryption.
I wouldn’t be telling you the truth if I failed to mention that Apple has also made some big mistakes. In 2021, the company announced a plan to implement client-side scanning of iCloud Photos to search for evidence of illicit material in private photo libraries. This would have opened the door for many different types of government-enforced data scanning, scanning that would work even if data was backed up in an end-to-end encrypted form. In that instance, technical experts quickly found flaws in Apple’s proposal and it was first paused, then completely abandoned in 2022.
This is not intended to be a hagiography for Apple. I’m simply pointing out that the company has, in the past, taken major public risks to deploy and promote encryption. Based on this history, I’m going to give Apple the benefit of the doubt and assume that the company is not racing to sell out its users.
This was due to a critical feature of the new law: it enables the U.K. government to issue secret “Technical Capability Notices” that can force a provider, such as Apple, to secretly change the operation of its systems — for example, altering an end-to-end encrypted system so that Apple would be forced to hold a copy of the user’s key. With this modification in place, the U.K. government could then demand access to any user’s data on demand.
By far the most concerning part of the U.K. law is that it does not clearly distinguish between U.K. customers and non-U.K. customers, such as those of us in the U.S. or other European nations. Apple’s lawyers called this out in a 2024 filing to Parliament:
In the worst-case interpretation of the law, the U.K. might now be the arbiter of all cybersecurity defense measures globally. His Majesty’s Government could effectively “cap” the amount of digital security that customers anywhere in the world can depend on, without users even knowing that cap was in place. This could expose vast amounts of data to state-sponsored attackers, such as the ones who recently compromised the entire U.S. telecom industry. Worse, because the U.K.’s Technical Capability Notices are secret, companies like Apple would be effectively forced to lie to their customers — convincing them that their devices are secure, when in fact they are not.
It goes without saying that this is a very dangerous road to start down.
Question 3: how might Apple respond to a broad global demand from the U.K.?
Let us imagine, hypothetically, that this worst-case demand is exactly what Apple is faced with. The U.K. government asks Apple to secretly modify their system for all users globally, so that it is no longer end-to-end encrypted anywhere in the world.
(And if you think about it practically: that flavor of demand seems almost unavoidable in practice. Even if you imagine that Apple is only being asked to target users in the U.K., the company would either need to build this capability globally, or it would need to deploy a new version or “zone”1 for U.K. users that would work differently from the version for, say, U.S. users. From a technical perspective, this would be tantamount to admitting that the U.K.’s version is somehow operationally distinct from the U.S. version. That would invite reverse-engineers to ask very pointed questions and the secret would almost certainly be out.)
But if you’re Apple, you absolutely cannot entertain, or even engage with this possibility. The minute you engage with it, you’re dead. One single nation — the U.K. — becomes the governor of all of your security products, and will now dictate how they work globally. Worse, engaging with this demand would open a hell-mouth of unfortunate possibilities. Do you tell China and Europe and the U.S. that you’ve given the U.K. a backdoor into their data? What if they object? What if they want one too?
There is nothing down that road but catastrophe.
So if you’re Apple and faced with this demand from the U.K., engaging with the demand is not really an option. You have a relatively small number of choices available to you. In order of increasing destructiveness:
Hire a bunch of very expensive lawyers and hope you can convince the U.K. to back down.
Shut down iCloud end-to-end encryption in the U.K. and hope that this renders the issue moot.
???
Exit the U.K. market entirely.
If we can believe the reporting so far, I think it’s safe to say that Apple has almost certainly tried the legal route. I can’t even imagine what the secret court process in the U.K. looks like (does it involve wigs?) but if it’s anything like the U.S.’s FISA courts, I would tend to assume that it is unlikely to be a fair fight for a target company, particularly a foreign one.
In this model, Apple’s decision to disable end-to-end encrypted iCloud Backup means we have now reached Stage 2. U.K. users will no longer be able to sign up for Apple’s end-to-end encrypted backup as of February 21. (We aren’t told how existing users will be handled, but I imagine they’ll be forced to voluntarily downgrade to unencrypted service, or else lose their data.) Any request for a backdoor for U.K. users is now completely moot, because effectively the system no longer exists for U.K. users.
At this point I suppose it remains to see what happens next. Perhaps the U.K. government blinks, and relaxes its demands for access to Apple’s keys. In that case, I suppose this story will sink beneath the waves, and we’ll never hear anything about it ever again, at least until next time.
In another world, the U.K. government keeps pushing. If that happens, I imagine we’ll be hearing quite a bit more about this in the future.
Apple already deploys a separate “zone” for many of its iCloud security products in China. This is due to Chinese laws that mandate domestic hosting of Apple server hardware and keys. We have been assured by Apple (in various reporting) that Apple does not violate its end-to-end encryption for the Chinese government. The various people I’d expect to quit — if that claim was not true — all seem to be still working there.
This is the third and penultimate post in a series about theoretical weaknesses in Fiat-Shamir as applied to proof systems. The first post is here, the second post is here, and you should probably read them.
Over the past two posts I’ve given a bit of background on four subjects: (1) interactive proof systems (for general programs/circuits), (2) the Fiat-Shamir heuristic, (3) random oracles. We’ve also talked about (4) recursive proofs, which will be important but are not immediately relevant.
The KRS result deals with a very common situation that we introduced in the previous post. Namely, that many new proving systems are first developed as interactive protocols (usually called public-coin protocols, or sometimes Sigma protocols.) These protocols fit into a template: first, the Prover sends a commitment message, then it interacts with some (honest) Verifier who “challenges” the Prover in various ways that are specific to the protocol; for each challenge, the Prover must respond correctly, either once or multiple times. The nature of the challenges can vary from scheme to scheme. For now we don’t care about the details: the commonality is that in the interactive setting the Verifier picks its challenges honestly, using real randomness.
In many cases, these protocols are then “flattened” down into a non-interactive proof using the Fiat-Shamir heuristic. The neat thing is that the Prover can run the interactive protocol all by itself, by running a deterministic “copy” of the Verifier. Specifically, the Prover will:
Cryptographically hash its own Commitment message — along with some extra information, such as the input/output and “circuit” (or program, or statement.)1
Use the resulting hash digest bits to sample a challenge.
Compute the Prover’s response to the challenge (many times if the protocol calls for this.)
Publish the whole transcript for anyone to verify.
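Here is a minimal Python sketch of the four steps above. The hash function (SHA3-256) and the commit/respond helpers are placeholders of my own invention — every real scheme has its own commitment and response algorithms — but the hashing pattern is the part that matters.

```python
import hashlib

# Hypothetical stand-ins for the protocol-specific pieces. A real scheme would
# replace these with its own commitment and response algorithms.
def commit(witness: bytes) -> bytes:
    return hashlib.sha3_256(b"commit:" + witness).digest()

def respond(witness: bytes, challenge: int) -> bytes:
    return hashlib.sha3_256(b"respond:" + witness + challenge.to_bytes(32, "big")).digest()

def fiat_shamir_prove(circuit_desc: bytes, x: bytes, y: bytes, witness: bytes):
    """Flatten one commit/challenge/response round into a non-interactive transcript."""
    commitment = commit(witness)

    # Step 1: hash the commitment along with the statement (circuit, input, output).
    digest = hashlib.sha3_256(
        hashlib.sha3_256(circuit_desc).digest() + x + y + commitment
    ).digest()

    # Step 2: use the digest bits to sample the challenge.
    challenge = int.from_bytes(digest, "big")

    # Step 3: compute the response an honest interactive Prover would have sent.
    response = respond(witness, challenge)

    # Step 4: publish the whole transcript for anyone to verify.
    return commitment, challenge, response
```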
Protocol designers focus on the interactive setting because it’s relatively easy to analyze the security of the protocol. They can make strong claims about the nature of the challenges that will be chosen, and can then reason about the probability that a cheating Prover can lie in response. The hope with Fiat-Shamir is that, if the hash function is “as good as” a random oracle, it should be nearly as secure as the interactive protocol.
Of course none of this applies to practice. When we deploy the protocol onto a blockchain, we don’t have interactive Verifiers and we certainly don’t have random oracles. Instead we replace the random oracle in Fiat-Shamir with a standard hash function like SHA-3. Whether or not any security guarantees still hold once we do this is an open question. And into the maw of that question is where we shall go.
The GKR15 succinct proof system (but not with details)
The new KRS paper starts by considering a specific interactive proof system designed back in 2015 by Goldwasser, Kalai and Rothblum, which we will refer to as GKR15 (note that the “R” is not the same in both papers — thanks Aditya!) This is a relatively early result on interactive proving systems and has many interesting theoretical features that we will mostly ignore.
At the surface level, what you need to know about the GKR15 proof system is that it works over arithmetic circuits. The Prover and Verifier agree on a circuit C (which can be thought of as the “program”). The Prover also provides some input x to the circuit, as well as a witness w and a purported output y, which ought to satisfy y = C(w, x) if the Prover is honest.
There are some restrictions on the nature of the circuits that can be used in GKR15 — which we will ignore for the purposes of this high-level intuition. The important feature is that the circuits can be relatively deep. That is to say, they can implement reasonably complex programs, such as cryptographic hashing algorithms.
(The authors of the recent KRS paper also refer to GKR15 as a “popular” scheme. I don’t know how to evaluate that claim. But they cite some features in the protocol that are re-used in more recent proof systems that might be deployed in practice, so let’s go with this!)
The GKR15 scheme (like most proving schemes) is designed to be interactive. All you need to know for the moment is that:
The Prover and Verifier agree on C.
The Prover sends the input and output (x, y) to the Verifier.
It also sends a (polynomial) commitment to the witness w in the first “Commitment” message.
The Verifier picks a random challenge c and the Prover/Verifier interact (multiple times) to do stuff in response to this challenge.
Finally: GKR15 possesses a security analysis (“soundness proof”) that is quite powerful, provided we are discussing the interactive version of the protocol. This argument does not claim GKR15 is “perfect”! It leaves room for some tiny probability that a cheating Prover can get away with its crimes: however, it does bound the probability of successful cheating to something (asymptotically and, presumably, practically with the right parameters) negligible.
Since the GKR15 authors are careful theoretical cryptographers, they don’t suggest that their protocol would be a good fit for Fiat-Shamir. In fact, they hint that this is a problematic idea. But, in the 2015 paper anyway, they don’t actually show there’s a problem with the idea. This leaves open a question: what happens if we do flatten it using Fiat-Shamir? Could bad things happen?
A first thought experiment: “weak challenges”
I am making a deliberate decision to stay “high level” and not dive into the details of the GKR15 system quite yet, not because I think they’re boring and off-putting — ok, it’s partly because I think they’re boring and off-putting — but mostly because I think leading with those details will make understanding more difficult. I promise I’ll get to the details later.
Up here at the abstract level I want to remind us, one final time, what we know. We know that proving systems like GKR15 involve a Verifier making a “challenge” to the Prover, which the Prover must answer successfully. A good proving system should ensure that an honest Prover can answer those challenges successfully — but a cheating Prover (any algorithm that is trying to prove a false statement) will fail to respond correctly to these challenges, at least with overwhelming probability.
Now I want to add a new wrinkle.
Let’s first imagine that the set of all possible challenges a Verifier could issue to a Prover is quite large. For one example: the challenge might be a random 256-bit string, i.e., there are 2^256 values to choose from. For fun, let’s further imagine that somewhere within this huge set of possible challenge values there exists a single “weak challenge.” This value c* — which I’ll exemplify by the number “53” for this discussion — is special. A cheating Prover who is fortunate enough to be challenged at this point will always be able to come up with a valid response, even if the statement they are “proving” is totally false (that is, if y is not equal to C(w, x).)
Now it goes without saying that having such a “weak challenge” in your scheme is not great! Clearly we don’t like to have weird vulnerabilities in our schemes. And yet, perhaps it’s not really a problem? To decide, we need to ask: how likely is it that a Verifier will select this weak challenge value?
If we are thinking about the interactive setting with an honest Verifier, the analysis is easy. There are 2^256 possible challenge values, and just one “weak challenge” at c* = 53. If the honest Verifier picks challenges uniformly at random, this bad challenge will be chosen with probability 2^-256 during one run of the protocol. This is so astronomically improbable that we can more or less pretend it will never happen.
How does Fiat-Shamir handle “bad challenges”?
We now need to think about what happens when we flatten this scheme using Fiat-Shamir.
To do that we need to be more specific about how the scheme will be Fiat-Shamir-ified. In our scheme, we assumed that the Prover will generate a Commitment message first. The deterministic Fiat-Shamir “Verifier” will hash this message using a hash function H, probably tossing in some other inputs just to be safe. For example, it might include the input/output values as well as a hash of the circuit h(C) — note that we’re using a separate hash function out of an abundance of caution. Our challenge will now be computed as follows:
c = H( h(C), x, y, Commitment )
Our strawman protocol has a weak challenge point at c* = 53. So now we should ask: can a cheating Prover somehow convince the Fiat-Shamir hash “Verifier” to challenge them at this weak point?
The good news for Fiat-Shamir is that this also seems pretty difficult. A cheating Prover might want to get itself challenged on c* = 53. But to do this, they’d need to find some input (pre-image) to the hash function H that produces the output “53”. For any hash function worth its salt (pun intended!) this should be pretty difficult!4
Of course, in Fiat-Shamir world, the cheating Prover has a slight advantage: if it doesn’t get the hash challenge it wants, it can always throw away the result and try again. To do this it could formulate a new Commitment message (or even pick a new input/output pair x, y) and then try hashing again. It can potentially repeat this process millions of times! This attack is sometimes called “grinding” and it’s a real attack that actual cheating Provers can run in the real world. Fortunately even grinding isn’t necessarily a disaster: if we assume that the Prover can only compute, say, 2^50 hashes, then (in the ROM) the probability of finding an input that produces c = 53 is still 2^50 * 2^-256 = 2^-206, yet another very unlikely outcome.
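If you want to see what grinding looks like, here is a toy Python loop. I’ve shrunk the challenge space to 20 bits so the search actually terminates on a laptop; against a real 256-bit challenge the same loop would run effectively forever.

```python
import hashlib
import os

WEAK_CHALLENGE = 53      # our hypothetical weak challenge point c*
CHALLENGE_BITS = 20      # toy-sized so the search finishes; real schemes use ~256 bits

def challenge_from(commitment: bytes) -> int:
    digest = hashlib.sha3_256(commitment).digest()
    return int.from_bytes(digest, "big") % (1 << CHALLENGE_BITS)

attempts = 0
while True:
    attempts += 1
    commitment = os.urandom(32)      # the cheater keeps trying fresh commitments
    if challenge_from(commitment) == WEAK_CHALLENGE:
        print(f"hit the weak challenge after {attempts} tries")
        break

# Expected work is about 2^CHALLENGE_BITS tries: roughly a million here,
# and an utterly infeasible ~2^256 for a full-sized challenge space.
```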
Another victory for Fiat-Shamir! Even if we have one or more fixed “weak challenge” points in our protocol, it seems like Fiat-Shamir protects us.
What if a cheating Prover can pick the “weak challenge”?
I realize I’m being exhaustive (or exhausting), but I now want to consider another weird possibility: what if the “bad challenge” value can change when we switch circuits (or even circuit inputs)?
For fun, let’s crank this concern into absolute overdrive. Imagine we’re using a proving system that’s as helpful as possible to the cheating Prover: in this system, the “weak challenge value” will be equal to whatever the real output of the circuit C is. Concretely, if the circuit outputs c* = C(w, x) and c* also happens to be the challenge value selected by the Verifier, then the Prover can cheat.
(Notice that I’m specifying that this vulnerability depends on the “real” output of the circuit. The prover also sends over a value y, which is the purported output of the circuit that the Prover is claiming, but that might be different if the Prover is cheating.)
In the interactive world this really isn’t an issue: the Verifier selects the challenge randomly after the Prover has committed to everything, so the Prover genuinely cannot “cook” the circuit to produce the challenge value as output (except with a tiny probability.) In Fiat-Shamir world, however, we should be a bit more worried. The value c* is chosen by hashing. A cheating Prover could theoretically compute c* in advance. It now has several options:
The cheater could alter the circuit C so that it hard-codes in c*.
Alternatively, it could feed the value c* (or some function of it) into the circuit via the input x.
It could sneak c* in within the witness w.
Or any combination of the above.
The problem for our cheater is that, even in Fiat-Shamir world, this attack doesn’t work. A cheating Prover might first pick a circuit C, some input x, a witness w, and a (purported) output y. It could then compute a Commitment by using the proving scheme’s commitment protocol to commit to w. It will then compute the anticipated Fiat-Shamir challenge as:
c* = H( h(C), x, y, Commitment )
To exploit the vulnerability in our proving system, the cheating Prover must now somehow cause the circuit C to output C(w, x) = c*. However, once again we encounter a problem. The cheating Prover had to feed x, as well as a (commitment to) w and (a hash of) the circuit C into the Fiat-Shamir hash function in order to learn c*. (It should not have been able to predict c* before performing the hash computation above, so it’s extremely unlikely that C(w, x) would just happen to output c*.) However, in order to make C(w, x) output c*, the cheater must subsequently manipulate either (1) the design of C or (2) the circuit’s inputs w, x.
But if the cheating Prover does any of those things, they will be completely foiled. If the cheating Prover changes any of the inputs to the circuit or the Circuit itself after it hashes to learn c*, then the actual Fiat-Shamir challenge used in the protocol will change.3
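A tiny Python sketch of why the cheater is foiled, using the same made-up hashing convention as before: the moment they edit the circuit to bake in c*, the actual Fiat-Shamir challenge moves somewhere else.

```python
import hashlib

def fs_challenge(circuit_desc: bytes, x: bytes, y: bytes, commitment: bytes) -> bytes:
    return hashlib.sha3_256(
        hashlib.sha3_256(circuit_desc).digest() + x + y + commitment
    ).digest()

x, y, commitment = b"input", b"claimed output", b"commitment to w"

# The cheater hashes once to learn the anticipated challenge c*...
c_star = fs_challenge(b"original circuit C", x, y, commitment)

# ...then edits C so that it outputs c*. But that changes h(C), which changes
# the challenge that will actually be used in the protocol.
tampered_circuit = b"circuit C with " + c_star + b" hard-coded"
new_challenge = fs_challenge(tampered_circuit, x, y, commitment)

assert new_challenge != c_star   # the challenge has hopped to a different value
```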
What if the circuit can compute the “weak challenge” value?
Phew. So what have we learned?
Even if our proving system allows us to literally pick the “weak challenge” c* — have it be the output of the circuit C(w, x) — we cannot exploit this fact very easily. Changing the structure of C, x or w after we’ve learned c* will cause the actual Fiat-Shamir challenge used to simply hop to a different value, like the timeline of a traveler who goes back in time and kills their own grandfather.
Clearly we need a different strategy: one where none of the inputs to the circuit, or the “code” of the circuit itself, depend on the output of the Fiat-Shamir hash function.
An important observation, and one that is critical to the KRS result, is that the “circuit” we are proving is fundamentally a kind of program that computes things. What if, instead of computing c* and feeding it to the circuit, or hard-coding it within the circuit, we instead design a circuit that itself computes c*?
Let’s look one last time at the way that Fiat-Shamir challenges are computed:
c* = H( h(C), x, y, Commitment)
We are now going to build a circuit C* that is able to compute c* when given the right inputs.
To make this interesting, we are going to design a very silly circuit. It is silly in a specific way: it can never output one specific output, which in this case will be the binary string “BANANAS”. Looking forward, our cheating Prover is then going to try to falsely claim that for some x, w, the relation C*(w, x) = “BANANAS” actually does hold.
Critically, this circuit will contain a copy of the Fiat-Shamir hashing algorithm H() and it will — quite coincidentally — actually output a value that happens to be identical to the Fiat-Shamir challenge (most of the time.) Here’s how it works:
The cheating Prover will pass w and x = ( h(C*), y* = “BANANAS” ) as input.
It will then use the proving system’s commitment algorithm on w to compute Commitment.
The circuit will parse x into its constituent values.
The circuit will internally compute c* = H( h(C*), x, y*, Commitment ).
In the (highly unlikely) case that c* = “BANANAS”, it will set c* = “APPLES”.
Finally, the circuit will output c*.
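Here is a toy Python model of that circuit, reusing the made-up hash and commitment conventions from the earlier sketches. The one liberty I’m taking is that the circuit recomputes the commitment to w internally, which is what step (4) implicitly requires; a real instantiation would embed the actual commitment algorithm inside the circuit.

```python
import hashlib

def H(*parts: bytes) -> bytes:                    # the Fiat-Shamir hash
    return hashlib.sha3_256(b"".join(parts)).digest()

def h(circuit_desc: bytes) -> bytes:              # the separate circuit-hashing function
    return hashlib.sha3_256(b"circuit:" + circuit_desc).digest()

def commit(w: bytes) -> bytes:                    # hypothetical deterministic commitment
    return hashlib.sha3_256(b"commit:" + w).digest()

C_STAR_DESC = b"description of the silly circuit C*"

def C_star(w: bytes, x: bytes) -> bytes:
    """Toy model of the silly circuit. It can never, ever output b'BANANAS'."""
    circuit_hash, y_star = x[:32], x[32:]         # step 3: parse x into h(C*) and y*
    c = H(circuit_hash, x, y_star, commit(w))     # step 4: recompute the Fiat-Shamir hash
    if c == b"BANANAS":                           # step 5: rule the forbidden output out entirely
        c = b"APPLES"
    return c                                      # step 6

# The cheating Prover's (always false) claim: C*(w, x) = "BANANAS".
w = b"any witness at all"
y_claimed = b"BANANAS"
x = h(C_STAR_DESC) + y_claimed                    # step 1: x packs h(C*) and y*

# The Fiat-Shamir challenge a verifier would derive from this transcript...
fs_challenge = H(h(C_STAR_DESC), x, y_claimed, commit(w))

# ...coincides with the circuit's real output: exactly the "weak challenge" condition.
assert C_star(w, x) == fs_challenge
```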
Notice first that this circuit can never, ever output C*(w, x) = “BANANAS” for any inputs w, x. Indeed, it is incredibly unlikely that this output would happen to occur naturally after step (4) — how likely is it that a hash function outputs the name of a fruit? — but just to be certain, we have step (5) to reduce this occurrence from “incredibly unlikely” to “absolutely impossible.”
Second, we should observe that the “code” of the circuit above does not in any way depend on c*. We can write this circuit before we ever hash to learn c*! The same goes for the inputs w and x. None of the proposed inputs require the cheating Prover to know c* before it chooses them.
Now imagine that some Prover shows up and tells you that, in fact, they know some w, x such that C*(w, x) = “BANANAS”. By definition they must be lying! When they try to convince you of this using the proving system, a correctly-functioning system should (with overwhelming probability) catch them in their lie and inform you (the Verifier) that the proof is invalid.
However: we stipulated that this particular proving system becomes totally insecure in the special case where the real calculation C*(w, x) happens to equal the Fiat-Shamir challenge c* that will be chosen during the protocol execution. If and when this remarkable event ever happens, the cheating Prover will be able to successfully complete the protocol even if the statement they are proving is totally bogus. And finally it seems we have an exploitable vulnerability! A cheating Prover can show up with the ridiculous false statement that C*(w, x) = “BANANAS”, and then hand us a Fiat-Shamir proof that will verify just fine. They can do this because in real life, C*(w, x) = c* and, thanks to the “weak challenge” feature of the proving system, that Prover can successfully answer all the challenges on that point.
And thus in Fiat-Shamir land we have finally proved a false statement! Note that in the interactive protocol none of this matters, since the Prover cannot predict the Verifier’s (honest) challenge, and so has no real leverage to cheat. In the Fiat-Shamir world — at least when considering this one weird circuit C* — a cheating Prover can get away with murder.
Well, not murder, maybe.
This is not terribly satisfying!
I realize this may seem a bit contrived. First we had to introduce a made-up proving system with a “weird” vulnerability, then we exploited it in a very nutty way. And the way we exploited it is kind of silly. But there is a method to the madness. I hope I’ve loaded us up with the basic concepts that we will need to really understand the KRS result.
What remains is to show how the real GKR15 scheme actually maps to some of these ideas.
As a second matter: I started these posts out by proposing all kinds of flashy ideas: we could steal billions of dollars in Ethereum transactions by falsely “proving” that a set of invalid transactions is actually valid! Now having promised the moon, I’m delivering the incredibly lame false statement C*(w, x) = “BANANAS”, where C* is a silly circuit that mostly computes a hash function. This should disappoint you. It disappoints me.
But keep in mind that cryptographic protocols are very delicate. Sometimes an apparently useless vulnerability can be expanded into something that actually matters. It turns out that something similar will apply to this vulnerability, when we apply it to the GKR15 scheme. In the next post.
There will be only one last post, I promise.
Notes:
There is a class of attacks on Fiat-Shamir-ified protocols that stems from putting too little information into the hash function. Usually the weakness here is that the hash does not get fed stuff like “the public key” (in signature schemes), which weakly corresponds to “the circuit” in a proving system. Sneaky attackers can switch these out and do bad things.
I’ve been extremely casual about the problem of converting hash function outputs into “challenges” for arbitrary protocols. Sometimes this is easy — for example, if the challenge space consists of fixed-length random bit-strings, then we can just pick a hash function with the appropriate output length. Sometimes it is moderately easy, for example if the challenges are sampled from some reasonably-sized field. A few protocols might pick the challenges from genuinely weird distributions. If you’re annoyed with me on this technical detail, I might stipulate that we have (A) a Fiat-Shamir hash function that outputs (enough) random bits, and (B) a way to convert those bits into whatever form of challenge you need for the protocol.
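For the common case where challenges are elements of a large field, the conversion I’m hand-waving about can be as simple as the sketch below. The specific modulus (the 64-bit “Goldilocks” prime 2^64 - 2^32 + 1, used by several modern provers) is just an example.

```python
import hashlib

GOLDILOCKS = 2**64 - 2**32 + 1   # example field modulus; any large prime works the same way

def challenge_field_element(transcript: bytes) -> int:
    """Derive a field-element challenge from a transcript hash. Reducing a digest
    much longer than the field keeps the modular bias negligible."""
    digest = hashlib.sha3_256(transcript).digest()   # 256 bits >> 64-bit field
    return int.from_bytes(digest, "big") % GOLDILOCKS
```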
I said this is not scientific, but clearly we could make some kind of argument in the random oracle model. There we’d argue (something like): assume the scheme is secure using random challenges chosen by an honest (interactive) Verifier, now let’s consider whether it’s possible for the attacker to have pre-queried the oracle on the inputs that will produce the actual Fiat-Shamir challenge. We could then analyze every possible case and hopefully determine that the answer is “no.” And that would be “science.”
In the random oracle model we can convert this intuition into a stronger argument: each time the Prover hashes something for the first time, the oracle effectively samples a uniformly random string as its response. Since there are 2^256 possible digests, the probability that it returns c = 53 after one hash attempt should still be 2^-256, which again is very low.
I’m supposed to be finishing a wonky series on proof systems (here and here) and I promise I will do that this week. In the midst of this I’ve been a bit distracted by world events.
Last week the Washington Post published a bombshell story announcing that the U.K. had filed “technical capability notices” demanding that Apple modify the encryption used in their Advanced Data Protection (ADP) system for iCloud. For those who don’t know about ADP, it’s a system that enables full “end-to-end” encryption for iCloud backups, including photos and other data. This means that your backup data should be encrypted under your phone’s passcode — and critically, neither Apple nor hackers can access it. The U.K. request would secretly weaken that feature for at least some users.
Along with Alex Stamos, I wrote a short opinion piece (version without paywall) at the Wall Street Journal and I wanted to elaborate on the news a bit more here.
What’s iCloud and what’s ADP?
Most Apple phones and tablets are configured to automatically back their contents up to a service called iCloud Backup, which maintains a nearly-mirror copy of every file on your phone. This includes your private notes, contacts, private iMessage conversations, personal photos, and so on. So far I doubt I’m saying anything too surprising to the typical Apple user.
What many people don’t know is that in normal operation, this backup is not end-to-end encrypted. That is, Apple is given a decryption key that can access all your data. This is good in some ways — if you lose your phone and also forget your password, Apple might still be able to help you access your data. But it also creates a huge weakness. Two different types of “bad guys” can walk through the hole created by this vulnerability: one type includes hackers and criminals, including sophisticated state-sponsored cyber-intrusion groups. The other is national governments: typically, law enforcement and national intelligence agencies.
Since Apple’s servers hold the decryption key, the company can be asked (or their servers can be hacked) to provide a complete backup copy of your phone at any moment. Notably, since this all happens on the server side, you’ll never even know it happened. Every night your phone sends up a copy of its contents, and then you just have to hope that nobody else obtains them.
This is a bad situation, and Apple has been somewhat slow to remedy it. This is surprising, since Google has enabled end-to-end encrypted backup since 2018. Usually Apple is the company leading the way on privacy and security, but in this case they’re trailing? Why?
In 2021, Apple proposed a byzantine new system for performing client-side scanning of iCloud Photos, in order to detect child sexual abuse material (CSAM). Technical experts pointed out that this was a bizarre architecture, since client-side scanning is something you do when you can’t scan photos on the server — usually because that data is encrypted. However at that time Apple refused to countenance the notion that they were considering end-to-end encryption for backup data.
Then, at the end of 2022, Apple finally dropped the other shoe. The new feature they were deploying was called Advanced Data Protection (ADP), and it would finally enable end-to-end encryption for iCloud Backup and iCloud Photos. This was an opt-in mode and so you’d have to turn it on manually. But if you did this, your backups would be encrypted securely under your phone’s passcode — something you should remember because you have to type it in every day — and even Apple would not be able to access them.
The FBI found this very upsetting. But, in a country with a Fourth and First Amendment, at least in principle, there wasn’t much they could do if a U.S. firm wanted to deploy software that enabled users to encrypt their own data.
Go into “Settings”, type “Advanced data” and turn it on.
But… what about other countries?
Apple operates in hundreds of different countries, and not all of them have laws similar to the United States. I’ve written before about Apple’s stance in China — which, surprisingly, does not appear to involve any encryption backdoors. But of course, this story involves countries that are closer to the US, both geographically and politically.
In 2016, the U.K. passed the Investigatory Powers Act (IPA), sometimes known as the “Snooper’s Charter.” The IPA includes a clause that allows His Majesty’s government to submit Technical Capability Notices to technology firms like Apple. A Technical Capability Notice (TCN) under U.K. law is a secret request in which the government demands that a technical provider quietly modify the operation of its systems so that they no longer provide the security feature advertised to users. In this case, presumably, that would involve weakening the end-to-end encryption system built into iCloud/ADP so that the U.K. could request downloads of encrypted user backups even without access to the user’s passcode or device.
The secrecy implicit in the TCN process is a massive problem here, since it effectively requires that Apple lie to its users. To comply with U.K. law, Apple must swear that a product is safe and works one way — and this lying must be directed at civilian users, U.S. government users, commercial users, and so on — while the company is forced to actively re-design its products to work differently. The dangers here should be obvious, along with the enormous risks to Apple’s reputation as a trustworthy provider. I will reiterate that this is not something that even China has demanded of Apple, as far as we know, so it is quite alarming.
The second risk here is that the U.K. law does not obviously limit these requests to U.K. customers. In a filing that Apple submitted back in 2024, the company’s lawyers make this point explicitly:
And when you think about it — this part I am admittedly speculating about — it seems really surprising that the U.K. would make these requests of a U.S. company without at least speaking to their counterparts in the United States. After all, the U.K. and the U.S. are part of an intelligence alliance known as Five Eyes. They work together on this stuff! There are at least two possibilities here:
The U.K. is operating alone in a manner that poses serious cybersecurity and business risks to U.S. companies.
U.S. and U.K. intelligence agencies (and perhaps some law enforcement agencies) have discussed the request, and both see significant benefit from the U.K. possessing this capability.
We can’t really know what’s going on here, but both options should make us uncomfortable. The first implies that the U.K. is going rogue and possibly harming U.S. security and business, while the latter implies that U.S. agencies are, at some level, tacitly signing off on a capability that could be used illegally against U.S. users.
What we do know from the recent Post article is that Apple was allegedly so uncomfortable with the recent U.K. request that they are “likely to stop offering encrypted storage in the U.K.“, i.e., they were at least going to turn off Advanced Data Protection in the U.K. This might or might not have resolved the controversy with the U.K. government, but at least it indicated that Apple is not going to quietly entertain these requests.
What about other countries?
There are about 68 million people in the U.K., which is not a small number. But compared to other markets Apple sells in, it’s not a big place.
In the past, U.S. firms like WhatsApp and Signal have made plausible threats to exit the U.K. market if the U.K. government made good on its threats to demand encryption backdoors. I have no doubt that Apple is willing to go to the mat for this as well if they’re forced to — as long as we’re only talking about the U.K. This is really sad for U.K. users, who deserve to have nice things and secure devices.
But there are bigger markets than the U.K. The European Union has 449.2 million potential customers and has been debating laws that would demand some access to encrypted messaging. China has somewhere between two and three times that. These are big markets to risk over encryption! Moreover, Apple builds a lot of its phones (and phone parts) in China. While I’m an optimist about human ethics — even within big companies — I doubt that Apple can convince its shareholders that their relationship with China is worth risking over something abstract like the value of trust, or over end-to-end encrypted messages or iCloud.
And that’s what’s at stake if Apple gives in to the U.K. demands. If Apple gives in here, there’s zero reason for China not to ask for the same thing, perhaps this time applied to Apple’s popular iMessage service. And honestly, they’re not wrong? Agreeing to the U.K.’s request might allow the U.K. and Five Eyes to do things that would harm China’s own users. In short, abandoning Apple’s principles in one place means the company ultimately has to give in everywhere (or worse), or — and this is a realistic alternative — Apple is forced to leave many parts of the world. Both are bad for the United States, and both are bad for people in all countries.
So what should we do legally?
If you read the editorial, it has a simple recommendation. The U.S. should pass laws that forbid U.S. companies from installing encryption backdoors at the request of foreign countries. This would put companies like Apple in a bind. But it would be a good bind! To satisfy the laws of one nation, Apple would have to break the laws of their home country. This creates a “conflict of laws” situation where, at the very least, simple, quiet compliance against the interest of U.S. citizens and customers is no longer an option for Apple — even if the shareholders might theoretically prefer it.
I hope this is a policy that many people could agree on, regardless of where they stand politically.
So what should we do technically?
I am going to make one more point here that can’t fit in an editorial, but deserves to be said anyway. We wouldn’t be in this jam if Apple had sucked it up and deployed end-to-end encrypted backup more broadly, and much earlier in the game.
Over the past decade I’ve watched various governments make a strong push to stop device encryption, add “warrant” capability to end-to-end encrypted messaging, and then install scanning systems to monitor for illicit content. Nearly all of these attempts failed. The biggest contributor to the failure was widespread deployment and adoption of encryption.
Once a technology is widely deployed and people realize it’s adding value and safety, they are loath to mess with it. You see this pattern first with on-device encryption, and then with messaging: a technology is at first controversial, then untenable, and then suddenly it’s mandatory for digital security. Even law enforcement agencies eventually start begging people to turn it on.
A key ingredient for this transition to occur is that lots of people must be leaning on that technology. If 1-2% of the customer base uses optional iCloud encryption, then it’s easy to turn the feature off. Annoying for a small subset of the population, maybe, but probably politically viable for governments to risk it. The same thing is less true at 25% adoption, and it is not true at 50% or 100% adoption.
Apple built the technology to deploy iCloud end-to-end encryption way back in 2016. They then fiddled around, not deploying it even as an option, for more than six years. Finally at the end of 2022 they allowed people to opt in to Advanced Data Protection. But even then they didn’t make it a default: they don’t ask you if you want to turn it on during setup, and they barely advertise it to anyone.
If someone built an encryption feature but nobody heard about it, would it still exist?
This news from the U.K. is a wake-up call to Apple that they need to move more quickly. They may feel intimidated by blowback from Five Eyes nations, and that might be driving their reticence. But it’s too late: the cat is out of the bag. People are noticing their failure to turn this feature on and — while I’m certain there are excellent reasons for them to go slow — the silence and slow-rolling is starting to look like weakness, or even collaboration with foreign governments.
So what I would like, as a technologist, is for Apple to act serious about this technology. It’s important, and the ball is very much in Apple’s court to start pushing those numbers up. The world is not a safe place, and it’s not getting safer. Apple should do the right thing here because they want to. But if not, they should do it because doing otherwise is too much of a liability.
This is the second part of what was originally a two-part (then three-, now four-part) series, which covers some recent results on “verifiable computation” and possible pitfalls that could occur there. This post won’t make much sense on its own, so I urge you to start with the first part.
In the previous post we introduced a handful of concepts, including (1) the notion of “verifiable computation” proof systems (sometimes inaccurately called “ZK” by the Ethereum community), (2) hash functions, (3) some ideal models that we use for our security proofs, and (4) the idea that these “ideal models” are bogus — and sometimes they can make us confident in schemes that are totally insecure in the real world.
Today I want to move forward and (get closer) to actually talking about the recent result alluded to in the title: the recent paper by Khovratovich, Rothblum and Soukhanov entitled “How to Prove False Statements: Practical Attacks on Fiat-Shamir” (henceforth: KRS.) This paper shows that a proving scheme that appears to be secure in one setting might not actually be secure in another.
One approach to discussing this paper would be to start at the beginning of the paper and then move towards the end. We will not do that. Instead, I plan to pursue an approach that involves a lot of jumping around. There is a method to my madness.
Before we can get there, we need to cover a bit more essential background.
Background Part One: Interactive proof systems
I have introduced these posts by reiterating a critique of something called the random oracle model paradigm, in which we pretend that hash functions are actually random functions. Thoughtful cryptographers will no doubt be upset with me about this, since in fact the KRS paper is not about random oracles at all! Instead it demonstrates a problem with a different “heuristic” that cryptographers use everywhere: this is called the Fiat-Shamir heuristic.
While Fiat-Shamir is not the same as the random oracle model, the two live in the same neighborhood and send their kids to the same school. What I mean is: Fiat-Shamir can (in some very limited theoretical senses) live without the random oracle model, but in practice the two are usually interdependent.
To explain this new result I therefore need to explain what Fiat-Shamir does. And before I can do that, I need to explain what interactive proofs are. (Feel free to skip forward if you already know this part.)
Many of the verifiable computation “proof systems” we use today are members of a class of protocols called interactive proofs. These are protocols in which two parties, a Prover and a Verifier, exchange messages so that the Prover can convince the Verifier of the truth of a given statement (such as “I know an input x that makes this particular program happy,” perhaps along with a witness w to help prove it.) In many cases, these protocols obey a pattern of interaction that takes the following form:
The Prover sends a message that “commits” to the input and witness, and maybe some other things. This commitment message is sent to the Verifier.
The Verifier then generates one or more challenges that the Prover must respond to. The exact nature of what happens here can change from scheme to scheme.
The Prover then computes responses to each of the challenges, and the Verifier checks that each response is valid (again, in a manner that is highly specific to the proving scheme.) It rejects the proof if any of the responses don’t check out.
The pair may repeat the above steps many times — either sequentially or in parallel.
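As a rough Python skeleton of this pattern — every protocol-specific piece here (the prover and verifier objects and their methods) is a placeholder of my own invention — the interaction looks something like this:

```python
import secrets

def run_interactive_proof(prover, verifier, rounds: int = 64) -> bool:
    """Generic commit/challenge/response loop. The prover and verifier objects
    stand in for whatever a concrete scheme actually does at each step."""
    commitment = prover.commit()                  # 1. Prover commits to its input/witness
    verifier.receive_commitment(commitment)

    for _ in range(rounds):                       # 4. repeat to shrink the cheating odds
        challenge = secrets.randbits(256)         # 2. honest Verifier uses real randomness
        response = prover.respond(challenge)      # 3. Prover answers the challenge...
        if not verifier.check(challenge, response):
            return False                          #    ...and the Verifier rejects bad answers
    return True
```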
Yes I know that I’m being incredibly vague about what’s happening with these challenges and responses! The truth is that, for the moment, we don’t care. All you need to know is that the challenges should be easy for the Prover to respond to if it is being honest — that is, if the witness (input) really satisfies the program. It should be unlikely that the Prover can correctly respond to a random challenge if it’s cheating, i.e., if it does not have a proper witness.
(Note that we don’t demand that the challenges be impossible for a cheating Prover to sneak past! This is why proving systems often repeat the challenge/response phase many times: even if there’s a small chance that a cheating Prover could cheat their way through one challenge, we’d argue that they have a much lower probability of cheating many times.)
What you may notice about this entire setup is that (1) interactive proofs require lots of (duh) interaction. What might not be so obvious is that (2) they assume an honest Verifier who formulates “good” challenges.
This need for interaction is pretty annoying in many applications. It is particularly aggravating for systems like blockchains, where there can be thousands (or millions) of computers who will all need to verify that a given statement (say, a transaction) is correct. It would be much, much easier if the Prover could run the proof just once with a single Verifier, and then the pair could just publish the transcript of their interaction. Anyone could just check the transcript to make sure the Prover answered all the challenges correctly!
Unfortunately, there is a critical problem with that idea! The security of these protocols rides on the idea that the Verifier’s challenges are random, or at least highly unpredictable to the Prover. If the Prover can somehow anticipate which values it will be challenged on before it commits to its inputs in step (1), it can often cheat by altering its approach in the first message. To be more concrete: a dishonest Verifier can “collude” with the Prover to help it prove a false statement, by sneakily letting it know the challenges in advance. For this reason it is critically important that the Verifier be honest, and not collude with the Prover.
But the whole point of these systems is that we shouldn’t need to trust individual parties at all! If we’re just going to trust that people are behaving honestly, what’s the point of any of this?
More background: Fiat-Shamir
Now I want to take you way back in time. All the way back to the mid-1980s.
Back in 1986, two cryptographers named Amos Fiat and Adi Shamir were stuck on a problem very much like this one. They had an interactive proof system — a much simpler one, since it was the 1980s after all — and they wanted to turn their interactive proofs into non-interactive proofs that any party could verify. They thought about the transcript idea described above, and they realized it wouldn’t work — a Verifier could simply collude with the Prover to help it cheat. To address this, they came up with an ingenious solution that was elegant, simple, and also would open up a yawning chasm of theory that we are still trying to dig out of today.
Fiat turned to Shamir (I imagine) and outlined the overall problem. Fiat (or Shamir) said: “Perhaps we could find a way for a Verifier to select the challenges in some random but reproducible way — one that would allow anyone to ensure that the challenges were actually random and unpredictable.” And then one of them said: that sounds a lot like a hash function.
Instead of choosing the challenge values at random, Fiat and Shamir proposed that the “Verifier” would select the challenge values by hashing the “commitment” message sent by the Prover, perhaps along with other junk (such as the “program” or circuit being proved.) The Prover would then respond to these challenge messages, and output a transcript of the whole proof.
And that’s it. That’s the entire trick.
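To make the trick concrete, here is the same toy Schnorr protocol from the sketch above, with the Verifier’s random challenge replaced by a hash of the Prover’s commitment (plus the statement). Again, this is my own toy illustration of the generic move, not any real deployed proving system:

```python
import hashlib
import secrets

p, q, g = 2039, 1019, 4        # the same toy group as before (still NOT secure)
x = secrets.randbelow(q)
y = pow(g, x, p)

def fs_challenge(t):
    # The "Verifier" is now just a hash of the commitment plus the statement.
    data = f"{g}|{y}|{t}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def prove():
    r = secrets.randbelow(q)
    t = pow(g, r, p)           # the commitment
    c = fs_challenge(t)        # challenge derived by hashing, no Verifier in sight
    s = (r + c * x) % q        # the response
    return (t, s)              # this transcript *is* the proof

def verify(t, s):
    c = fs_challenge(t)        # anyone can recompute the challenge...
    return pow(g, s, p) == (t * pow(y, c, p)) % p   # ...and check the response

assert verify(*prove())
```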
Despite its simplicity, there are some obvious attractive features to this Fiat-Shamir approach:
Good hash functions typically output stuff that looks pretty “random”, which is what we want for challenges.
Any third party can easily check a transcript, simply by verifying that the challenge values match the hash of the Prover’s “commitment” message. (In other words, there’s no more room for the Verifier to collude or cheat, since it is now fully deterministic.)
Critically, there is a cool “circular” paradox in here. A cheating Prover might try the following trick to predict the values it will be challenged on. Specifically, it might (1) pick a commitment message and then (2) hash that message to find the challenges. Once it knows the challenge values, it might try to change its inputs to step (1) so it can more easily cheat on those specific challenge points. But critically, that approach creates a paradox: if the Prover changes its inputs to step (1), that will result in a whole new “commitment” message! Once hashed, that new commitment message will produce a very different set of challenge messages, and our cheater is locked in an infinite time-loop that it can never escape!1
The great thing about Fiat-Shamir is that once your (challenge-generating) Verifier is fully deterministic, there’s no more reason to even have that code run by a separate party. The Prover can run the deterministic challenge-generation code all by itself, i.e., performing all necessary hashing to make the challenges, and then outputting the final transcript. So the Prover and (original) “Verifier” code collapse into a single party (that we will now just call the Prover), and the new Verifier is an algorithm that checks the transcript — performing all the necessary hashes and challenge/response checks to make sure everything is kosher.
The resulting proofs (“transcripts”) do not require any interaction to verify, and so we can even post them on blockchains. They can be verified by thousands or millions of people, and we are now set to hang big piles of money off of them.
Starknet is just one of the cryptocurrency systems hanging real money off of Fiat-Shamir-style proof systems. There are others!
I bet you’re going to yammer on about the provable security of Fiat-Shamir now, right?
Wait, how did you know that’s what I was going to talk about? Oh that’s right: “you” are me, and so I’m just answering my own questions. (Wasn’t that a cute illustration of the paradox that Fiat-Shamir helps to solve!)
I am going to make this as quick and painless as I can, but here’s the deal. Fiat-Shamir seems like a nutty trick. We even call it a heuristic, which is literally an admission of this. And yet. Literally hundreds of papers have been written about the provable security of Fiat-Shamir and schemes that use it.
The general TL;DR is that Fiat-Shamir can often be proven secure (for various definitions of “secure”) if we make one helpful assumption. Specifically: that the hash function we use is actually a random oracle (please see this footnote for more pedantic stuff!2) I’m not going to get very deep into the argument, but I just want you to remember how random oracles work:
In the random oracle model, the hash function is a random function. Phrased imprecisely, this means that (when queried on some fresh value) it outputs random bits that are completely uncorrelated with the input.
The hash function “lives” inside a totally separate party called an oracle. You send it things to be hashed; if an input has not been hashed before, you get back an unpredictable random value.
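In code, the random oracle is often pictured as a lazily-sampled lookup table. Here is a minimal sketch (mine, purely illustrative) of that mental model:

```python
import secrets

class RandomOracle:
    """A lazily-sampled random function: each fresh input gets an
    independent random answer, and repeated queries get the same answer."""
    def __init__(self, out_bytes=32):
        self.table = {}
        self.out_bytes = out_bytes

    def query(self, msg: bytes) -> bytes:
        if msg not in self.table:
            self.table[msg] = secrets.token_bytes(self.out_bytes)
        return self.table[msg]

H = RandomOracle()
assert H.query(b"hello") == H.query(b"hello")   # consistent on repeated queries
# ...but the answer for any new input is uncorrelated with everything else,
# and nobody holds a compact description of the whole function.
```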
This clearly looks a lot like the interactive proof setting! Put succinctly (no pun): if an appropriate scheme can be proven secure in an interactive setting where the Prover interacts with an honest Verifier (who picks random challenges), then it seems likely that the Fiat-Shamir version of that protocol should also work with a random oracle. The random oracle is essentially acting like the Verifier in the original interactive scheme: it is generating random challenges that everyone can “trust” to be truly random, and yet any third party can also ask it to reproduce the same challenges later on, when they want to check a transcript!
And for many purposes, this random oracle approach usually works ok. Some folks have come up with crazy theoretical counterexamples (meaning, contrived interactive protocols that are secure in the random oracle model, yet blow apart when used with real hash functions.) But mostly practitioners just ignore these because they’re so obviously full of weird nonsense.
Out in the real world where applied cryptographers design new proving systems on a daily basis, we’ve adopted a pretty standard pattern. A new proof system will be specified as an interactive protocol first. Ultimately everyone knows this proof system won’t be used interactively, it will be Fiat-Shamir flattened and used on a blockchain. Yet the authors won’t spend a lot of time arguing about the Fiat-Shamir part. They’ll simply describe an interactive protocol with the right structure, then they say something like “of course this can be flattened using Fiat-Shamir, if we assume a random oracle or something” and everyone nods and deposits a billion dollars onto it.
But there’s a catch, isn’t there?
Indeed, there is a major asterisk (*) about this whole strategy that I must now raise.
Even though we can sometimes prove Fiat-Shamir protocols secure, usually in the ROM, a critical feature of these ROM proofs is that we (the participants in the protocol) do not know a compact description of the hash function. This is inevitable, since the hash function used in the random oracle model is a giant random function that cannot possibly be expressed in a compact form.
In the real world we will naturally replace the random oracle with something like SHA-3 or an even more exciting hash function like Poseidon. Suddenly, everyone in the protocol will know a compact description of the hash function. As I mentioned above, this can lead to theoretical problems. Way back in 2004, Goldwasser and Tauman (now Kalai) designed a specific interactive protocol that exploded when the hash was instantiated with any concrete hash function.
But the Goldwasser/Tauman protocol was very artificial. It did silly things you could see in the protocol description. So obviously, as long as we don’t do those things, we’re fine… maybe?
The problem now is that we are deploying proof systems that can prove the satisfaction of literally any reasonable program (or “NP-relation”.) These programs might contain an implementation of the Fiat-Shamir hash function. In the random oracle model, this is literally impossible — so we just assume it cannot happen. In the real world it’s eminently possible, and we kind of have to assume it can and will happen.
In fact it is extremely likely that some circuits really will contain an implementation of the Fiat-Shamir hash function! The reason is those recursive proofs I mentioned in the previous post.
Let’s say we want to build a recursive proof system that works to verify one of our flattened Fiat-Shamir proofs. Recall that to do this, we have to take the Verify algorithm that checks a Fiat-Shamir transcript, and implement it within a program (or circuit.) We then need to run that program and generate a proof that we ran that program successfully! And to make all this work, we really do need to include a copy of the Fiat-Shamir hash function inside our programs — this is not optional at all.
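To see why, it may help to write out the Verify algorithm for the toy Fiat-Shamir’d Schnorr proof sketched earlier, as the kind of ordinary program we would now need to prove we ran. (Toy code, mine, illustrative only.) Notice the call to the hash function sitting right in the middle of it:

```python
import hashlib

# Verification of the toy Fiat-Shamir'd Schnorr transcript from the earlier sketch,
# written as a plain program. To recursively prove that we ran this program,
# a copy of SHA-256 has to live inside the program itself.
def verify_transcript(statement, proof) -> bool:
    (p, q, g, y), (t, s) = statement, proof
    c = int.from_bytes(hashlib.sha256(f"{g}|{y}|{t}".encode()).digest(), "big") % q
    return pow(g, s, p) == (t * pow(y, c, p)) % p
```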
The crazy thing is that we can’t even prove these recursive Fiat-Shamir-based proofs secure in the random oracle model! In the random oracle model there is no compact description of the hash function, and so there is no compact recursive Verify program/circuit that we could write. Recursion of this sort is totally impossible. Indeed, recursive Fiat-Shamir proofs can only exist outside of the random oracle model, where we use something like SHA-3 to implement the hash function. But of course, outside of the ROM we can’t prove anything about their security. As a result: anytime you see a recursive Fiat-Shamir proof, we’re basically tossing provable security out the window and full-on YOLOing it.
This situation is very bad and many theoreticians have died (inside) thinking about it.
I have now written an entire second post and I have not yet gotten to the KRS result I came here to talk about! Is anyone still reading? Is this thing still on? I sure hope so.
We are now ready to talk about KRS, and I am going to do that immediately in the next post. Before I close this post and get ready for the big one I will tackle next, allow me to recap where we are.
We know that Fiat-Shamir can be proven secure, but generally (for full-on SNARKs) only in the random oracle model.2
Once we actually instantiate Fiat-Shamir with real hash functions, any weird thing could happen: especially if the same hash function is implemented within the programs/circuits we want to prove things about.
Recursive (Fiat-Shamir) proofs actually require us to implement the hash function inside of the programs we’re going to prove things about, so that’s ultra-worrying.
What remains, however, is to demonstrate that Fiat-Shamir can actually be insecure in practice. Or more concretely: that there exist “evil” programs/circuits that can somehow break a perfectly good proof system that uses Fiat-Shamir.
The Fiat-Shamir technique isn’t immune to a few obvious attacks, of course. For example: a cheating Prover (who is typically also the “Verifier”) can “grind” the proof — by trying many different inputs to the first message and then, for each one, testing the resulting challenges to see if they’re amenable to cheating. If there is a small probability of cheating, this “try the game many times” approach can significantly boost a cheater’s probability of getting lucky at cheating on a challenge/response, since they now have millions (or billions!) of attempts to find a lucky challenge.
However, a realistic assumption here is that real-world cheating Provers only have so much computing power. Even if a Prover can try a huge number of hashing attempts (say 2^50), you can easily set up your scheme so that the probability they succeed is still arbitrarily small. Not everyone does this perfectly, of course: my PhD student Pratyush recently co-authored a nice paper about the parameter choices made by some real-world blockchain proving systems.
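Here is a toy sketch (mine) of the grinding idea, with a deliberately tiny 16-bit challenge space so the attack actually finishes; real systems pick challenge spaces large enough that this kind of search is hopeless even for an attacker with enormous hashing power:

```python
import hashlib
import secrets

CHALLENGE_BITS = 16   # deliberately tiny, so the toy attack succeeds quickly

def challenge(commitment: bytes) -> int:
    digest = hashlib.sha256(commitment).digest()
    return int.from_bytes(digest, "big") % (1 << CHALLENGE_BITS)

# Suppose a cheating Prover has figured out how to fake a convincing response
# for exactly one challenge value (say, 0) and no others.
LUCKY_CHALLENGE = 0

attempts = 0
while True:
    attempts += 1
    fake_commitment = secrets.token_bytes(16)   # grind: try a fresh first message
    if challenge(fake_commitment) == LUCKY_CHALLENGE:
        break

# With 16-bit challenges this takes about 2^16 tries. With 128-bit or larger
# challenges, the same grind would take far longer than anyone has.
print(f"found a lucky commitment after {attempts} attempts")
```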
When I say that the provable security of Fiat-Shamir depends on the random oracle model, I am being slightly imprecise. The random oracle model is usually sufficient to prove claims about Fiat-Shamir. But in fact there are some (relatively) recent results that show how to construct Fiat-Shamir for very specific interactive protocols using hash functions that are not random oracles: these are called correlation intractable functions. To the best of my knowledge, it is not possible to prove Fiat-Shamir-based SNARKs that work with arbitrary (adaptively-chosen) programs/circuits using these functions. But I am open to being wrong on this detail.
Trigger warning: incredibly wonky theoretical cryptography post (written by a non-theorist)! Also, this will be in two parts. I plan to be back with some more thoughts on practical stuff, like cloud backup, in the near future.
If you’ve read my blog over the years, you should understand that I have basically two obsessions. One is my interest in building “practical” schemes that solve real problems that come up in the real world. The other is a weird fixation on the theoretical models that underpin (the security of) many of those same schemes. In particular, one of my favorite bugaboos is a particular model, or “heuristic”, called the random oracle model (ROM) — essentially a fancy way to think about hash functions.
Along those lines, my interest was recently piqued by a new theoretical result by Khovratovich, Rothblum and Soukhanov entitled “How to Prove False Statements: Practical Attacks on Fiat-Shamir.” This is a doozy of a paper! It touches nearly every sensitive part of my brain: it urges us towards a better understanding of our theoretical models for proving security of protocols. It includes the words “practical” and “attacks” in the title! And most importantly it demonstrates a real (albeit wildly contrived) attack on the kinds of “ZK” (note: not actually ZK, more on that later) “proving systems” that we are now using inside of real systems like blockchains.
I confess I am still struggling hard to figure out how I “feel” about this result. I understand how odd it seems that my feelings should even matter: this is science after all. Shouldn’t the math speak for itself? The worrying thing is that, in this case, I don’t think it does. In fact, this is what I find most fundamentally exciting about the result: it really does matter how we think about it. (Here I should add that we don’t all think the same way. My theory-focused PhD student Aditya Hegde has been vigorously debating me on my interpretation — and mostly winning on points. So anything non-stupid I say here is probably due to him.)
I mentioned that this post is going to be long and wonky, that’s just unavoidable. But I promise it will be fun. (Ok, I can’t even promise that.) Screw it, let’s go.
The shortest background ever (and it will still be really long)
If you’ve read this blog over the long term, you know that I’m obsessed with one particular “trick” we use in proving our schemes secure. This trick is known as the random oracle model, and it’s one of the worst (or best) things to happen to cryptography.
Let me try to break this down as quickly as I can. In cryptography we have a tendency to use an ingredient called a cryptographic hash function. These functions take in a (potentially) long string and output a short digest. In cryptography courses, we present these functions along with various “security definitions” they should meet, properties like collision resistance, pre-image resistance and so on. But out in the real world most schemes require much stronger assumptions in order to be proven secure. When we argue for the security of these schemes, we often demand that our hash functions be even stronger: we require that they must behave like random functions.
If you’re not sure what a random function is, you can read about it in depth here. You should just trust that it is a very strong and beautiful requirement for a hash function! But there is a fly in the ointment. Real-world hash functions cannot possibly be random functions. Specifically: concrete hash functions like SHA-2 and SHA-3 are, by their nature, defined by compact, efficient algorithms that compute their output. Random functions (of any useful size) have no such description. Indeed, the most efficient description of a random function is essentially a giant (i.e., exponentially-sized in the length of the function’s inputs) lookup table. These functions cannot even be evaluated efficiently, because they’re simply too big.
So when we analyze schemes where hash functions must behave in this manner, we have to do some pretty suspicious things. The approach we take is bonkers. First, we analyze our schemes inside of an artificial “model” where efficient (polynomial-time) participants can somehow evaluate random functions, despite the fact that this is literally impossible. To make this apparent contradiction work, we “yank” the hash function logic into a magical box that lives outside the participants — this includes both the honest participants in a protocol and any adversaries who try to attack the scheme — and we force everyone to call out to that functionality. This new thing is called a “random oracle.”
One weird implication of this approach is that no party can ever know the code of the “hash function” they’re evaluating. They literally cannot know it, since in this model the hash function consists of an enormous random lookup table that’s much too big for anyone to actually know! This may not seem like a very big deal, but it will be exceptionally important going forward.
Of course in the real world we do not have random oracles. I guess we could set up a special server that everyone in the world can call out to in order to compute their hash function values! But we don’t do that because it’s ridiculous. When we want to deploy a scheme IRL, we do a terrible thing: we “instantiate the random oracle” by replacing it with an actual hash function like SHA-2 or SHA-3. Then everyone goes on their merry way, hoping that the security proof still has some meaning.
Let me be abundantly clear about this last part. From a theoretical perspective, any scheme “proven secure” in the random oracle model ceases to be provably secure the instant you replace the random oracle with a real (concrete) hash function like SHA-3. Put differently, it’s the equivalent of replacing your engine oil with Crisco. Your car may still run, but you are absolutely voiding the warranty.
But, but, but — and I stress the stammer — voiding your warranty does not mean your engine will become broken! In most of the places where we’ve done this awful random oracle “instantiation” thing (let’s be honest: almost every real-world deployed protocol) the instantiated protocols all seemed to work just fine.
(To be sure: we have certainly seen cryptographic protocols break down due to broken hash functions! But these breaks are almost always due to obvious hash function bugs that anyone can see, such as meaningful collisions being found. They were not magical breaks that come about because you rubbed the “theory lamp” wrong. As far as we can tell, in most cases if you use a “good enough” secure hash function to instantiate the random oracle, everything mostly goes fine.)
Now it should be noted that theoreticians were not happy about this cavalier approach. In the late 1990s, they rebelled and demonstrated something shocking: it was possible to build “contrived” cryptographic schemes that were provably secure in the random oracle model but totally broken when the oracle was “instantiated” with any hash function.
This was shocking, but honestly not that surprising once you’ve had someone else explain the basic idea. Most of these “counterexample schemes” followed from four simple observations:
In the (totally artificial) random oracle model, you don’t know a compact description of the hash function. You literally can’t know one, since it’s an exponentially-sized random function.
In the “instantiated” protocol, where you’ve replaced the random oracle with e.g., SHA-2, you very clearly must know a compact description of the hash function (for example, here is one.)
We can build a “contrived” scheme in which “knowledge of the description of the hash algorithm” forms a kind of backdoor that allows you to break the scheme!
In the random oracle model where you can’t ever possess this knowledge, the backdoor can never be triggered — hence the scheme is “secure.” In the real world where you instantiate the scheme with SHA-2, any clown can break it.
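To give you a flavor, here is a cartoon version of such a counterexample in Python. It is emphatically not one of the real constructions from the literature, just my own caricature of the idea: a verification routine with a backdoor that only someone who knows a compact program for the (instantiated) hash function can trigger.

```python
import hashlib
import secrets

def computes_our_hash(code: str) -> bool:
    """Toy 'backdoor test': does the submitted code compute the same function
    as the concrete hash we used to instantiate the random oracle?"""
    try:
        env = {}
        exec(code, env)                       # obviously never do this for real
        candidate = env["H"]
        for _ in range(8):                    # spot-check on a few random inputs
            x = secrets.token_bytes(16)
            if candidate(x) != hashlib.sha256(x).digest():
                return False
        return True
    except Exception:
        return False

def backdoored_verify(message: str, signature: bytes, honest_verify) -> bool:
    # In the random oracle model, no compact `message` can pass the test above,
    # so this branch "never happens" and the scheme is provably secure there.
    if computes_our_hash(message):
        return True                           # the backdoor: accept anything
    return honest_verify(message, signature)

# In the real, instantiated world, any clown can trigger the backdoor:
evil_message = "import hashlib\nH = lambda x: hashlib.sha256(x).digest()"
assert backdoored_verify(evil_message, b"not a real signature", lambda m, s: False)
```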
These results straddle the line between “brilliant” and “fundamentally kind of silly”. Brilliant because, wow! These schemes will be insecure when instantiated with any possible hash function! The random oracle model is a trap! But stupid because, I mean… duh!? In fact what we’re really showing is that our artificial model is artificial. If you build schemes that deliberately fall apart when any adversary knows the code for a hash function, then of course your schemes are going to be broken. You don’t need to be a genius to see that this is going to go poorly.
Nonetheless: theoreticians took a victory lap and then moved on to ruining other people’s fun. Practitioners waited until every last one of them had lost interest, rolled their eyes, and said “let’s agree not to deploy schemes that do obviously stupid things.” And then they all went on deploying schemes that were only proven secure in the random oracle model. And this has described our world for 28 years or so.
But the theoreticians weren’t totally wrong, were they?
That is the $10,000,000,000 question.
As discussed above, many “contrived counterexample” schemes were built to demonstrate the danger of the random oracle model. But each of them was so obviously cartoonish that nobody would ever deploy one of them in practice. If your signature scheme includes 40 lines of code that essentially scream “FYI: THIS IS A BACKDOOR THAT UNLOCKS FOR ANYONE WHO KNOWS THE CODE OF SHA2”, the best solution is not to have a theoretical argument about whether this code is “valid.” The best solution is to delete the code and maybe write over your hard disk three times with random numbers before you burn it. Practitioners generally do not feel threatened by artificial counterexamples.
But a danger remains.
Cryptographic schemes have been getting more complicated and powerful over time. Since I explained the danger in a previous blog post I wrote five years ago, I’m going to save myself some trouble — and also make myself look prescient:
The probability of [a malicious scheme slipping past detection] accidentally seems low, but it gets higher as deployed cryptographic schemes get more complex. For example, people at Google are now starting to deploy complex multi-party computation and others are launching zero-knowledge protocols that are actually capable of running (or proving things about the execution of) arbitrary programs in a cryptographic way. We can’t absolutely rule out the possibility that the CGH and MRH-type counterexamples could actually be made to happen in these weird settings, if someone is just a little bit careless.
Let’s drill down on this a moment.
One relatively recent development in cryptography is the rise of succinct “ZK” or “verifiable computation” schemes that allow an untrusted person to prove statements about arbitrary programs. In general terms, these systems allow a Prover (e.g., me) to prove statements of the following form: (1) I know an input to a [publicly-known] program, such that (2) the program, when run on that input, will output “True.”
The neat thing about these systems is that after running the program, I can author a short (aka “succinct”) proof that will convince you that both of these things are true. Even better, I can hand that short proof (sometimes called an “argument”) to anyone in the world. They can run a Verify algorithm to check that the proof is valid, and if it agrees, then they never need to repeat the original computation. Critically, the time required to verify the proof is usually much less than the time required to re-check the program execution, even for really complicated program executions. The resulting systems are called arguments of knowledge and they go by various cool acronyms: SNARGs, SNARKs, STARKs, and sometimes IVC. (The Ethereum world sometimes lumps these together under the moniker “ZK”, for historical reasons we will not dig into.)
This technology has proven to be an exciting and necessary solution for the cryptocurrency world, because that world happens to have a real problem on its hands. Concretely: they’ve all noticed that blockchains are very slow. Those systems require thousands of different computers to verify (“check the validity of”) every financial transaction they see, which places enormous limitations on transaction throughput.
“Succinct” proof systems offer a perfect solution to this conundrum.
Rather than submitting millions of individual transactions to a big, slow blockchain, the blockchain can be broken up. Distinct servers called “rollups” can verify big batches of transactions independently. They can each use a succinct proof system to prove that they ran the transaction-verification program correctly on all those transactions. The base-level blockchains no longer need to look at every single transaction. They only need to verify the short “proofs” authored by the rollup servers, and (magically!) this ensures that all of the transactions are verified — but with the base-level blockchain doing vastly less work. In theory this allows a massive improvement in blockchain throughput, mostly without sacrificing security.
An even cooler fact is that these proof systems can in some cases be applied recursively. This is due to a cute feature: the algorithm for verifying a proof is, after all, itself just a computer program. So I can run that program on some other proofs as input — and then I can use the proof system to prove that I ran that program correctly.
To give a more concrete application:
Imagine 1,000 different servers each run a program that verifies a distinct batch of 1,000 transactions. Each server produces a succinct proof that they ran their program correctly (i.e., their batch is correct.)
Now a different server can take in each of those 1,000 different proofs. And it can run a Verify program that goes through each of those 1,000 proofs and verifies that each one is correct. It outputs a proof that it ran this program correctly.
The result is a single “short” proof that proves all 1,000,000 transactions are correct!
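If it helps, here is the shape of that recursion in pseudocode. The prove and verify functions below are hypothetical stand-ins for a real succinct proof system (no actual proof system is implemented here); the only point is to show how a “program that verifies proofs” gets fed back into the prover.

```python
from typing import Callable, List

Program = Callable[..., bool]
Proof = bytes

def prove(program: Program, *inputs) -> Proof:
    """Stand-in for a real SNARK prover: returns a short proof that
    program(*inputs) evaluated to True."""
    raise NotImplementedError("hypothetical -- no real proof system here")

def verify(program: Program, proof: Proof) -> bool:
    """Stand-in for a real SNARK verifier: checks such a proof much faster
    than re-running the program."""
    raise NotImplementedError("hypothetical -- no real proof system here")

def check_batch(transactions: List[bytes]) -> bool:
    """Level 0: an ordinary program, e.g. 'these 1,000 transactions are valid'."""
    ...

def check_proofs(proofs: List[Proof]) -> bool:
    """Level 1: a program whose only job is to verify other proofs. For a
    Fiat-Shamir-based system, this is where the hash function ends up living."""
    return all(verify(check_batch, p) for p in proofs)

# The shape of the recursion (not runnable without a real proof system):
#   batch_proofs = [prove(check_batch, batch) for batch in batches]
#   top_proof    = prove(check_proofs, batch_proofs)
```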
I even made a not-very-excellent picture to try to illustrate how this can look:
Example of recursive proof usage. At the bottom we have some real programs, each of which gets its own proof. Then one level up we have a program that simply verifies the proofs from the bottom level. And at the top we have another program that verifies many proofs from the second level! (Many programs not shown.)
This recursive stuff is really useful, and I promise that it will be relevant later.
So what?
The question you might be asking is: what in the world does this have to do with random oracle counterexamples?!
Since these proof systems are now powerful enough to run arbitrary programs (sometimes implemented in the form of arithmetic or boolean “circuits”), there is now a possibility that sneaky counterexample “backdoors” could be smuggled in within the programs we are proving things about. This would mean that even if the proving scheme itself has no obvious backdoors in its code, the programs being proved could do creepy stuff that would undermine security for the whole system. Our practitioner friends would no longer be able to exercise their (very reasonable) heuristic of “don’t deploy code that does obviously suspicious things” because, while their implementation might not do stupid things, some user might try to run it with a malicious program that does.
(A good analogy is to imagine that your Nintendo system has no exploits built into it, but any specific game might sneak in a nasty piece of code that could blow everything up.)
A philosophical interlude
This has been a whole lot, and there’s lots more to come.
To give you a break, I want to pause for a moment to talk about philosophy, metaphysics (meta-cryptography?), or maybe just the Meaning of Life. More concretely, at this point we need to stop and ask a very reasonable question: how much does this threat model even matter? And what even is this threat model?
Allow me to explain. Imagine that we have a proving system that is basically not backdoored. It may or may not be provably secure, but by itself the proving system itself does not contain any obvious backdoors that will cause it to malfunction, even if you implement it using a concrete hash function like SHA-3.
Now imagine that someone comes along and writes a program called “Verify_Ethereum_Transactions_EVIL.py” that we will want to run and prove using our proof system. Based on the name, we can assume this program was developed by a shady engineer who maliciously decided to add a “backdoor” to the code! Instead of merely verifying Ethereum transactions as you might hope for, the functionality of this program does something nastier:
“Given some input, output True if the input comprises 1,000 valid Ethereum transactions… OR output True if the input (or the program code itself) contains a description of the hash function used by the proving system.”
This would be really bad for your cryptocurrency network! Any clever user could submit invalid Ethereum transactions to be verified by this program and it would happily output “True.” If any cryptocurrency network then trusted the proof (to mean “these transactions are actually valid”) then you could potentially use this trick to steal lots of money.
But also let me be clear: this would also be an incredibly stupid program to deploy in your cryptocurrency network.
The whole point of a proof system is that it proves you ran a program successfully, including whatever logic happens to be within those programs. If those programs have obvious backdoors inside of them, then proving you ran those programs means you’re also proving that you might have exercised any backdoors in those programs. If the person writing your critical software is out to get you, and/or you don’t carefully audit their output, you will end up being very regretful. And there are many, many ways to add backdoors to software! (Just to illustrate this, there used to be an entire competition called the “Underhanded C Contest” where people would compete to write C programs full of malicious code that was hard to catch. The results were pretty impressive!)
So it’s worthwhile to ask whether this is really a surprise. In the past we knew that (1) if your silly cryptographic scheme had weird code that made it insecure “to anyone who knows how to compute SHA-2“, then (2) it would really be insecure in the real world, since any idiot can download the code for SHA-2, and (3) you should not deploy schemes that have obvious backdoors.
So with this context in mind, let’s talk about what kind of bad things might happen. These can be divided into “best case“, “second worst case” and “oh hell, holy sh*t.“
In the best case, this type of attack might simply move the scary backdoor code out from the cryptographic proving system, and into the modular “application programs” that can be fed into the proving system. You still need to make sure the scheme implementation doesn’t have silly backdoors — like special code that breaks everything if you know the code for SHA-2. But now you also need to make sure every program you run using this system doesn’t have similar backdoors. But to be clear: you kind of had to audit your programs for backdoors anyway!
In fairness, the nature of these cryptographic backdoors is that they might be more subtle than a typical software backdoor. What I mean here is that ordinary software reviewers might not recognize them, and only an experienced cryptographer will identify that something sketchy is happening. But even if the bug is hard to identify, it’s still a bug — a bug in one specific piece of code — and most critically, it would only affect your own application if you deployed it.
Of course there are worse possibilities as well.
In the second worst case, perhaps the bugdoor can be built into the application code in some clever way that is deeply subtle and fundamentally difficult for code auditors to detect — even if they know how to look for it. Perhaps it could somehow be cryptographically obfuscated, so no code review will detect it! Recursive proof systems are worrying when it comes to this concern, since the “bug” might exist multiple layers down in a tree of recursive proofs, and you might not have the code for all those lower-level programs.1 It’s possible that the set of “bad code behaviors” we’d need to audit the code for is so large and undefined that we can no longer apply simple heuristics to catch the bad stuff!
This would be very scary. But it is certainly not the worst case.
The (“oh crap!”) worst case: with recursive proofs there is an even more terrible thing that could theoretically happen. Recall that a single top-level recursive proof can recursively verify thousands of different programs. Many of those programs will likely be written by careful, honest developers. Others could be written by scammers. Clearly if the scammers’ code contains bugs (or flaws) then we should expect those bugs to make the scammers’ own programs less secure, at whatever goal they’re supposed to accomplish. So far none of this is surprising. But ideally what we should hope is that any backdoors in the scammers’ programs will remain isolated to the scammers’ code. They should not “jump across program boundaries” and thus undermine the security of the well-written, honest programs elsewhere in the recursive proof stack.
Now imagine a situation where this is not true. That is, a clever bug in one “program” anywhere in the tree could somehow make any other program (proof) in the entire tree of proofs insecure. This is akin to getting one malicious program onto a Linux box, then using it to compromise the Linux kernel and thus undermine the security of any application running on the system. Maybe this seems unlikely? Actually to me it seems genuinely fantastic, but again, we’re in Narnia at this point. Who knows what’s possible!
This is the very worst case. I don’t think it could happen, but… who knows?
This is the scary thing about what can happen once we leave the world of provable security. Without some fundamental security guarantees we can rely on, it’s possible that the set of attacks we might suffer could be very limited. But they could also be fundamentally unbounded! And that’s where I have to leave this post for the moment.
We might imagine, for example, that a recursive Verify program might just take in the hash (or commitment) to a program. And then a Prover could simply state “well, I know a real program that matches this commitment AND ALSO I know an input that satisfies the program.” This means the program wouldn’t technically be available to every auditor, only the hash of the program. I am doing a lot of handwaving here, but this is all possible.
Recently I came across a fantastic new paper by a group of NYU and Cornell researchers entitled “How to think about end-to-end encryption and AI.” I’m extremely grateful to see this paper, because while I don’t agree with every one of its conclusions, it’s a good first stab at an incredibly important set of questions.
I was particularly happy to see people thinking about this topic, since it’s been on my mind in a half-formed state these past few months. On the one hand, my interest in the topic was piqued by the deployment of new AI assistant systems like Google’s scam call protection and Apple Intelligence, both of which aim to put AI basically everywhere on your phone — even, critically, right in the middle of your private messages. On the other hand, I’ve been thinking about the negative privacy implications of AI due to the recent European debate over “mandatory content scanning” laws that would require machine learning systems to scan virtually every private message you send.
While these two subjects arrive from very different directions, I’ve come to believe that they’re going to end up in the same place. And as someone who has been worrying about encryption — and debating the “crypto wars” for over a decade — this has forced me to ask some uncomfortable questions about what the future of end-to-end encrypted privacy will look like. Maybe, even, whether it actually has a future.
But let’s start from an easier place.
What is end-to-end encryption, and what does AI have to do with it?
From a privacy perspective, the most important story of the past decade or so has been the rise of end-to-end encrypted communications platforms. Prior to 2011, most cloud-connected devices simply uploaded their data in plaintext. For many people this meant their private data was exposed to hackers, civil subpoenas, government warrants, or business exploitation by the platforms themselves. Unless you were an advanced user who used tools like PGP and OTR, you simply had to suffer the consequences.
Headline about the Salt Typhoon group, an APT threat that has essentially owned US telecom infrastructure by compromising “wiretapping” systems. Oh how the worm has turned.
Around 2011 our approach to data storage began to evolve. This began with messaging apps like Signal, Apple iMessage and WhatsApp, all of which began to roll out default end-to-end encryption for private messaging. This technology changed the way that keys are managed, to ensure that servers would never see the plaintext content of your messages. Shortly afterwards, phone (OS) designers like Google, Samsung and Apple began encrypting the data stored locally on your phone, a move that occasioned some famous arguments with law enforcement. More recently, Google introduced default end-to-end encryption for phone backups, and (somewhat belatedly) Apple has begun to follow.
Now I don’t want to minimize the engineering effort required here, which was massive. But these projects were also relatively simple. By this I mean: all of the data encrypted in these projects shared a common feature, which is that none of it needed to be processed by a server.
The obvious limitation of end-to-end encryption is that while it can hide content from servers, it also makes it very difficult for servers to compute on that data.1 This is basically fine for data like cloud backups and private messages, since that content is mostly interesting to clients. For data that actually requires serious processing, the options are much more limited. Wherever this is the case — for example, consider a phone that applies text recognition to the photos in your camera roll — designers typically get forced into a binary choice. On the one hand they can (1) send plaintext off to a server, in the process resurrecting many of the earlier vulnerabilities that end-to-end encryption sought to close. Or else (2) they can limit their processing to whatever can be performed on the device itself.
The problem with the second solution is that your typical phone only has a limited amount of processing power, not to mention RAM and battery capacity. Even the top-end iPhone will typically perform photo processing while it’s plugged in at night, just to avoid burning through your battery when you might need it. Moreover, there is a huge amount of variation in phone hardware. Some flagship phones will run you $1400 or more, and contain onboard GPUs and neural engines. But these phones aren’t typical, in the US and particularly around the world. Even in the US it’s still possible to buy a decent Android phone for a couple hundred bucks, and a lousy one for much less. There is going to be a vast world of difference between these devices in terms of what they can process.
So now we’re finally ready to talk about AI.
Unless you’ve been living under a rock, you’ve probably noticed the explosion of powerful new AI models with amazing capabilities. These include Large Language Models (LLMs) which can generate and understand complex human text, as well as new image processing models that can do amazing things in that medium. You’ve probably also noticed Silicon Valley’s enthusiasm to find applications for this tech. And since phone and messaging companies are Silicon Valley, your phone and its apps are no exception. Even if these firms don’t quite know how AI will be useful to their customers, they’ve already decided that models are the future. In practice this has already manifested in the deployment of applications such as text message summarization, systems that listen to and detect scam phone calls, models that detect offensive images, as well as new systems that help you compose text.
And these are obviously just the beginning.
If you believe the AI futurists — and in this case, I actually think you should — all these toys are really just a palate cleanser for the main entrée still to come: the deployment of “AI agents,” aka, Your Plastic Pal Who’s Fun To Be With.
In theory these new agent systems will remove your need to ever bother messing with your phone again. They’ll read your email and text messages and they’ll answer for you. They’ll order your food and find the best deals on shopping, swipe your dating profile, negotiate with your lenders, and generally anticipate your every want or need. The only ingredient they’ll need to realize this blissful future is virtually unrestricted access to all your private data, plus a whole gob of computing power to process it.
Which finally brings us to the sticky part.
Since most phones currently don’t have the compute to run very powerful models, and since models keep getting better and in some cases more proprietary, it is likely that much of this processing (and the data you need processed) will have to be offloaded to remote servers.
And that’s the first reason that I would say that AI is going to be the biggest privacy story of the decade. Not only will we soon be doing more of our compute off-device, but we’ll be sending a lot more of our private data. This data will be examined and summarized by increasingly powerful systems, producing relatively compact but valuable summaries of our lives. In principle those systems will eventually know everything about us and about our friends. They’ll read our most intimate private conversations, maybe they’ll even intuit our deepest innermost thoughts. We are about to face many hard questions about these systems, including some difficult questions about whether they will actually be working for us at all.
But I’ve gone about two steps farther than I intended to. Let’s start with the easier questions.
What does AI mean for end-to-end encrypted messaging?
Modern end-to-end encrypted messaging systems provide a very specific set of technical guarantees. Concretely, an end-to-end encrypted system is designed to ensure that plaintext message content in transit is not available anywhere except for the end-devices of the participants and (here comes the Big Asterisk) anyone the participants or their devices choose to share it with.
The last part of that sentence is very important, and it’s obvious that non-technical users get pretty confused about it. End-to-end encrypted messaging systems are intended to deliver data securely. They don’t dictate what happens to it next. If you take screenshots of your messages, back them up in plaintext, copy/paste your data onto Twitter, or hand over your device in response to a civil lawsuit — well, that has really nothing to do with end-to-end encryption. Once the data has been delivered from one end to the other, end-to-end encryption washes its hands and goes home.
Of course there’s a difference between technical guarantees and the promises that individual service providers make to their customers. For example, Apple says this about its iMessage service:
I can see how these promises might get a little bit confusing! For example, imagine that Apple keeps its promise to deliver messages securely, but then your (Apple) phone goes ahead and uploads (the plaintext) message content to a different set of servers where Apple really can decrypt it. Apple is absolutely using end-to-end encryption in the dullest technical sense… yet is the statement above really accurate? Is Apple keeping its broader promise that it “can’t decrypt the data”?
Note: I am not picking on Apple here! This is just one paragraph from a security overview. In a separate legal document, Apple’s lawyers have written a zillion words to cover their asses. My point here is just to point out that a technical guarantee is different from a user promise.
Making things more complicated, this might not be a question about your own actions. Consider the promise you receive every time you start a WhatsApp group chat (at right.) Now imagine that some other member of the group — not you, but one of your idiot friends — decides to turn on some service that uploads (your) received plaintext messages to WhatsApp. Would you still say the message you received still accurately describes how WhatsApp treats your data? If not, how should it change?
In general, what we’re asking here is a question about informed consent. Consent is complicated because it’s both a moral concept and also a legal one, and because the law is different in every corner of the world. The NYU/Cornell paper does an excellent job giving an overview of the legal bits, but — and sorry to be a bummer here — I really don’t want you to expect the law to protect you. I imagine that some companies will do a good job informing their users, as a way to earn trust. Other companies won’t. Here in the US, they’ll bury your consent inside of an inscrutable “terms of service” document you never read. In the EU they’ll probably just invent a new type of cookie banner.
This is obviously a joke. But, like, maybe not?
So to summarize: in the near term we’re headed to a world where we should expect increasing amounts of our data to be processed by AI, and a lot of that processing will (very likely) be off-device. Good end-to-end encrypted services will actively inform you that this is happening, so maybe you’ll have the chance to opt in or opt out. But if it’s truly ubiquitous (as our futurist friends tell us it will be) then your options will probably end up being pretty limited.
Some ideas for avoiding AI inference on your private messages (cribbed from Mickens.)
Fortunately all is not lost. In the short term, a few people have been thinking hard about this problem.
Trusted Hardware and Apple’s “Private Cloud Compute”
The good news is that our coming AI future isn’t going to be a total privacy disaster. And in large part that’s because a few privacy-focused companies like Apple have anticipated the problems that ML inference will create (namely the need to outsource processing to better hardware) and have tried to do something about it. This something is essentially a new type of cloud-based computer that Apple believes is trustworthy enough to hold your private data.
Quick reminder: as we already discussed, Apple can’t rely on every device possessing enough power to perform inference locally. This means inference will be outsourced to a remote cloud machine. Now the connection from your phone to the server can be encrypted, but the remote machine will still receive confidential data that it must see in cleartext. This poses a huge risk at that machine. It could be compromised and the data stolen by hackers, or worse, Apple’s business development executives could find out about it.
Apple’s approach to this problem is called “Private Cloud Compute” and it involves the use of special trusted hardware devices that run in Apple’s data centers. A “trusted hardware device” is a kind of computer that is locked down in both the physical and logical sense. Apple’s devices are built in-house, and use custom silicon and software features. One of these is Apple’s Secure Boot, which ensures that only “allowed” operating system software is loaded. The OS then uses code-signing to verify that only permitted software images can run. Apple ensures that no long-term state is stored on these machines, and also load-balances your request to a different random server every time you connect. The machines “attest” to the application software they run, meaning that they announce a hash of the software image (which must itself be stored in a verifiable transparency log to prevent any new software from surreptitiously being added.) Apple even says it will publish its software images (though unfortunately not [update: all of] the source code) so that security researchers can check them over for bugs.
The goal of this system is to make it hard for both attackers and Apple employees to exfiltrate data from these devices. I won’t say I’m thrilled about this. It goes without saying that Apple’s approach is a vastly weaker guarantee than encryption — it still centralizes a huge amount of valuable data, and its security relies on Apple getting a bunch of complicated software and hardware security features right, rather than the mathematics of an encryption algorithm. Yet this approach is obviously much better than what’s being done at companies like OpenAI, where the data is processed by servers that employees (presumably) can log into and access.
Although PCC is currently unique to Apple, we can hope that other privacy-focused services will soon crib the idea — after all, imitation is the best form of flattery. In this world, if we absolutely must have AI everywhere, at least the result will be a bit more secure than it might otherwise be.
Who does your AI agent actually work for?
There are so many important questions that I haven’t discussed in this post, and I feel guilty because I’m not going to get to them. I have not, for example, discussed the very important problem of the data used to train (and fine-tune) future AI models, something that opens entire oceans of privacy questions. If you’re interested, these problems are discussed in the NYU/Cornell paper.
But rather than go into that, I want to turn this discussion towards a different question: namely, who are these models actually going to be working for?
As I mentioned above, over the past couple of years I’ve been watching the UK and EU debate new laws that would mandate automated “scanning” of private, encrypted messages. The EU’s proposal is focused on detecting both existing and novel child sexual abuse material (CSAM); it has at various points also included proposals to detect audio and textual conversations that represent “grooming behavior.” The UK’s proposal is a bit wilder and covers many different types of illegal content, including hate speech, terrorism content and fraud — one proposed amendment even covered “images of immigrants crossing the Channel in small boats.”
While scanning for known CSAM does not require AI/ML, detecting nearly every one of the other types of content would require powerful ML inference that can operate on your private data. (Consider, for example, what is involved in detecting novel CSAM content.) Plans to detect audio or textual “grooming behavior” and hate speech will require even more powerful capabilities: not just speech-to-text capabilities, but also some ability to truly understand the topic of human conversations without false positives. It is hard to believe that politicians even understand what they’re asking for. And yet some of them are asking for it, in a few cases very eagerly.
These proposals haven’t been implemented yet. But to some degree this is because ML systems that securely process private data are very challenging to build, and technical platforms have been resistant to building them. Now I invite you to imagine a world where we voluntarily go ahead and build general-purpose agents that are capable of all of these tasks and more. You might do everything in your technical power to keep them under the user’s control, but can you guarantee that they will remain that way?
Or put differently: would you even blame governments for demanding access to a resource like this? And how would you stop them? After all, think about how much time and money a law enforcement agency could save by asking your agent sophisticated questions about your behavior and data, questions like: “does this user have any potential CSAM,” or “have they written anything that could potentially be hate speech in their private notes,” or “do you think maybe they’re cheating on their taxes?” You might even convince yourself that these questions are “privacy preserving,” since no human police officer would ever rummage through your papers, and law enforcement would only learn the answer if you were (probably) doing something illegal.
This future worries me because it doesn’t really matter what technical choices we make around privacy. It does not matter if your model is running locally, or if it uses trusted cloud hardware — once a sufficiently powerful general-purpose agent has been deployed on your phone, the only question that remains is who is given access to talk to it. Will it be only you? Or will we prioritize the government’s interest in monitoring its citizens over various fuddy-duddy notions of individual privacy?
And while I’d like to hope that we, as a society, will make the right political choice in this instance, frankly I’m just not that confident.
Notes:
(A quick note: some will suggest that Apple should use fully-homomorphic encryption [FHE] for this calculation, so the private data can remain encrypted. This is theoretically possible, but unlikely to be practical. The best FHE schemes we have today really only work for evaluating very tiny ML models, of the sort that would be practical to run on a weak client device. While schemes will get better and hardware will too, I suspect this barrier will exist for a long time to come.)
This blog is reserved for more serious things, and ordinarily I wouldn’t spend time on questions like the above. But much as I’d like to spend my time writing about exciting topics, sometimes the world requires a bit of what Brad Delong calls “Intellectual Garbage Pickup,” namely: correcting wrong, or mostly-wrong ideas that spread unchecked across the Internet.
This post is inspired by the recent and concerning news that Telegram’s CEO Pavel Durov has been arrested by French authorities for its failure to sufficiently moderate content. While I don’t know the details, the use of criminal charges to coerce social media companies is a pretty worrying escalation, and I hope there’s more to the story.
But this arrest is not what I want to talk about today.
What I do want to talk about is one specific detail of the reporting. Specifically: the fact that nearly every news report about the arrest refers to Telegram as an “encrypted messaging app.” Here are just a few examples:
This phrasing drives me nuts because in a very limited technical sense it’s not wrong. Yet in every sense that matters, it fundamentally misrepresents what Telegram is and how it works in practice. And this misrepresentation is bad for both journalists and particularly for Telegram’s users, many of whom could be badly hurt as a result.
Now to the details.
Does Telegram have encryption or doesn’t it?
Many systems use encryption in some way or another. However, when we talk about encryption in the context of modern private messaging services, the word typically has a very specific meaning: it refers to the use of default end-to-end encryption to protect users’ message content. When used in an industry-standard way, this feature ensures that every message will be encrypted using encryption keys that are only known to the communicating parties, and not to the service provider.
From your perspective as a user, an “encrypted messenger” ensures that each time you start a conversation, your messages will only be readable by the folks you intend to speak with. If the operator of a messaging service tries to view the content of your messages, all they’ll see is useless encrypted junk. That same guarantee holds for anyone who might hack into the provider’s servers, and also, for better or for worse, to law enforcement agencies that serve providers with a subpoena.
Telegram clearly fails to meet this stronger definition for a simple reason: it does not end-to-end encrypt conversations by default. If you want to use end-to-end encryption in Telegram, you must manually activate an optional end-to-end encryption feature called “Secret Chats” for every single private conversation you want to have. The feature is explicitly not turned on for the vast majority of conversations, is only available for one-on-one conversations, and is never available for group chats with more than two people in them.
As a kind of a weird bonus, activating end-to-end encryption in Telegram is oddly difficult for non-expert users to actually do.
For one thing, the button that activates Telegram’s encryption feature is not visible from the main conversation pane, or from the home screen. To find it in the iOS app, I had to click at least four times — once to access the user’s profile, once to make a hidden menu pop up showing me the options, and a final time to “confirm” that I wanted to use encryption. And even after this I was not able to actually have an encrypted conversation, since Secret Chats only works if your conversation partner happens to be online when you do this.
Starting a “secret chat” with my friend Michael on the latest Telegram iOS app. From an ordinary chat screen this option isn’t directly visible. Getting it activated requires four clicks: (1) to get to Michael’s profile (left image), (2) on the “…” button to display a hidden set of options (center image), (3) on “Start Secret Chat”, and (4) on the “Are you sure…” confirmation dialog. After that I’m still unable to send Michael any messages, because Telegram’s Secret Chats can only be turned on if the other user is also online.
Overall this is quite different from the experience of starting a new encrypted chat in an industry-standard modern messaging application, which simply requires you to open a new chat window.
While it might seem like I’m being picky, the difference in adoption between default end-to-end encryption and this experience is likely very significant. The practical impact is that the vast majority of one-on-one Telegram conversations — and literally every single group chat — are probably visible on Telegram’s servers, which can see and record the content of all messages sent between users. That may or may not be a problem for every Telegram user, but it’s certainly not something we’d advertise as particularly well encrypted.
(If you’re interested in the details, as well as a little bit of further criticism of Telegram’s actual encryption protocols, I’ll get into what we know about that further below.)
But wait, does default encryption really matter?
Maybe yes, maybe no! There are two different ways to think about this.
One is that Telegram’s lack of default encryption is just finefor many people. The reality is that many users don’t choose Telegram for encrypted private messaging at all. For plenty of people, Telegram is used more like a social media network than a private messenger.
Getting more specific, Telegram has two popular features that make it ideal for this use case. One of those is the ability to create and subscribe to "channels", each of which works like a broadcast network where one person (or a small number of people) can push content out to millions of readers. When you're broadcasting messages to thousands of strangers in public, maintaining the secrecy of your chat content isn't as important.
Telegram also supports large public group chats that can include thousands of users. These groups can be made open for the general public to join, or they can be set up as invite-only. While I've never personally wanted to share a group chat with thousands of people, I'm told that many people enjoy this feature. In the large and public instantiation, it also doesn't really matter that Telegram group chats are unencrypted — after all, who cares about confidentiality if you're talking in the public square?
But Telegram is not limited to just those features, and many users who join for them will also do other things.
Imagine you’re in a “public square” having a large group conversation. In that setting there may be no expectation of strong privacy, and so end-to-end encryption doesn’t really matter to you. But let’s say that you and five friends step out of the square to have a side conversation. Does that conversation deserve strong privacy? It doesn’t really matter what you want, because Telegram won’t provide it, at least not with encryption that protects you from sharing your content with Telegram servers.
Similarly, imagine you use Telegram for its social media-like features, meaning that you mainly consume content rather than producing it. But one day your friend, who also uses Telegram for similar reasons, notices you’re on the platform and decides she wants to send you a private message. Are you concerned about privacy now? And are you each going to manually turn on the “Secret Chat” feature — even though it requires four explicit clicks through hidden menus, and even though it will prevent you from communicating immediately if one of you is offline?
My strong suspicion is that many people who join Telegram for its social media features also end up using it to communicate privately. And I think Telegram knows this, and tends to advertise itself as a “secure messenger” and talk about the platform’s encryption features precisely because they know it makes people feel more comfortable. But in practice, I also suspect that very few of those users are actually using Telegram’s encryption. Many of those users may not even realize they have to turn encryption on manually, and think they’re already using it.
Which brings me to my next point.
Telegram knows its encryption is difficult to turn on, and they continue to promote their product as a secure messenger
Telegram’s encryption has been subject toheavy criticismsinceat least 2016 (and possibly earlier) for many of the reasons I outlined in this post. In fact, many of these criticisms were made by experts including myself, in years-old conversations with Pavel Durov on Twitter.1
Although the interaction with Durov could sometimes be harsh, I still mostly assumed good faith from Telegram back in those days. I believed that Telegram was busy growing their network and that, in time, they would improve the quality and usability of the platform’s end-to-end encryption: for example, by activating it as a default, providing support for group chats, and making it possible to start encrypted chats with offline users. I assumed that while Telegram might be a follower rather than a leader, it would eventually reach feature parity with the encryption protocols offered by Signal and WhatsApp. Of course, a second possibility was that Telegram would abandon encryption entirely — and just focus on being a social media platform.
What’s actually happened is a lot more confusing to me.
Instead of improving the usability of Telegram’s end-to-end encryption, the owners of Telegram have more or less kept their encryption UX unchanged since 2016. While there have been a few upgrades to the underlying encryption algorithms used by the platform, the user-facing experience of Secret Chats in 2024 is almost identical to the one you’d have seen eight years ago. This, despite the fact that the number of Telegram users has grown by 7-9x during the same time period.
At the same time, Telegram CEO Pavel Durov has continued to aggressively market Telegram as a “secure messenger.” Most recently he issued a scathing criticism of Signal and WhatsApp on his personal Telegram channel, implying that those systems were backdoored by the US government, and only Telegram’s independent encryption protocols were really trustworthy.
While this might be a reasonable nerd-argument if it was taking place between two platforms that both supported default end-to-end encryption, Telegram really has no legs to stand on in this particular discussion. Indeed, it no longer feels amusing to see the Telegram organization urge people away from default-encrypted messengers, while refusing to implement essential features that would widely encrypt their own users’ messages. In fact, it’s starting to feel a bit malicious.
What about the boring encryption details?
This is a cryptography blog and so I’d be remiss if I didn’t spend at least a little bit of time on the boring encryption protocols. I’d also be missing a good opportunity to let my mouth gape open in amazement, which is pretty much what happens every time I look at the internals of Telegram’s encryption.
I’m going to handle this in one paragraph to reduce the pain, and you can feel free to skip past it if you’re not interested.
According to what I think is the latest encryption spec, Telegram's Secret Chats feature is based on a custom protocol called MTProto 2.0. This system uses 2048-bit* finite-field Diffie-Hellman key agreement, with group parameters (I think) chosen by the server.* (Since the Diffie-Hellman protocol is only executed interactively, this is why Secret Chats cannot be set up when one user is offline.*) MITM protection is handled by the end-users, who must compare key fingerprints. There are some weird random nonces provided by the server, whose purpose I don't fully understand* — and which in the past actively made the key exchange totally insecure against a malicious server (but this has long since been fixed.*) The resulting keys are then used to power the most amazing, non-standard authenticated encryption mode ever invented, something called "Infinite Garble Extension" (IGE) based on AES and with SHA2 handling authentication.*
NB: Every place I put a “*” in the paragraph above is a point where expert cryptographers would, in the context of something like a professional security audit, raise their hands and ask a lot of questions. I’m not going to go further than this. Suffice it to say that Telegram’s encryption is unusual.
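For the terminally curious, here is a minimal sketch of the IGE chaining rule itself, written on top of raw AES-ECB as the underlying block cipher. This is purely illustrative and is not Telegram's code: it omits everything else MTProto does (key derivation, the SHA2-based authentication, padding, and so on), and even the IV convention is just one common choice.

```python
# Minimal sketch of AES-IGE ("Infinite Garble Extension") chaining:
#   c_i = E_K(p_i XOR c_{i-1}) XOR p_{i-1},
# with the initial c_0 || p_0 taken from a 32-byte IV (one common convention).
# Illustrative only: not Telegram's implementation, and it provides no
# authentication on its own.
from Crypto.Cipher import AES  # pycryptodome

def xor16(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def aes_ige_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    assert len(iv) == 32 and len(plaintext) % 16 == 0
    ecb = AES.new(key, AES.MODE_ECB)
    c_prev, p_prev = iv[:16], iv[16:]
    out = []
    for i in range(0, len(plaintext), 16):
        p = plaintext[i:i + 16]
        c = xor16(ecb.encrypt(xor16(p, c_prev)), p_prev)
        out.append(c)
        c_prev, p_prev = c, p
    return b"".join(out)

# Usage sketch: aes_ige_encrypt(32-byte key, 32-byte IV, block-aligned message).
```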
If you ask me to guess whether the protocol and implementation of Telegram Secret Chats is secure, I would say quite possibly. To be honest though, it doesn't matter how secure something is if people aren't actually using it.
Is there anything else I should know?
Yes, unfortunately. Even though end-to-end encryption is one of the best tools we’ve developed to prevent data compromise, it is hardly the end of the story. One of the biggest privacy problems in messaging is the availability of loads of meta-data — essentially data about who uses the service, who they talk to, and when they do that talking.
This data is not typically protected by end-to-end encryption. Even in applications that are broadcast-only, such as Telegram’s channels, there is plenty of useful metadata available about who is listening to a broadcast. That information alone is valuable to people, as evidenced by the enormous amounts of money that traditional broadcasters spend to collect it. Right now all of that information likely exists on Telegram’s servers, where it is available to anyone who wants to collect it.
I am not specifically calling out Telegram for this, since the same problem exists with virtually every other social media network and private messenger. But it should be mentioned, just to avoid leaving you with the conclusion that encryption is all we need.
Main photo “privacy screen” by Susan Jane Golding, used under CC license.
Notes:
I will never find all of these conversations again, thanks to Twitter search being so broken. If anyone can turn them up I’d appreciate it.
Update (April 19): Yilei Chen announced the discovery of a bug in the algorithm, which he does not know how to fix. This was independently discovered by Hongxun Wu and Thomas Vidick. At present, the paper does not provide a polynomial-time algorithm for solving LWE.
If you’re a normal person — that is, a person who doesn’t obsessively follow the latest cryptography news — you probably missed last week’s cryptography bombshell. That news comes in the form of a new e-print authored by Yilei Chen, “Quantum Algorithms for Lattice Problems“, which has roiled the cryptography research community. The result is now being evaluated by experts in lattices and quantum algorithm design (and to be clear, I am not one!) but if it holds up, it’s going to be quite a bad day/week/month/year for the applied cryptography community.
Rather than elaborate at length, here's a quick set of five bullet points giving the background.
(1) Cryptographers like to build modern public-key encryption schemes on top of mathematical problems that are believed to be "hard". In practice, we need problems with a specific structure: they must admit efficient solutions for those who hold a secret key, or "trapdoor", and yet no efficient solution for folks who don't. While many problems have been considered (and often discarded), most schemes we use today are based on three problems: factoring (the RSA cryptosystem), discrete logarithm (Diffie-Hellman, DSA) and the elliptic curve discrete logarithm problem (EC-Diffie-Hellman, ECDSA, etc.)
(2) While we would like to believe our favorite problems are fundamentally “hard”, we know this isn’t really true. Researchers have devised algorithms that solve all of these problems quite efficiently (i.e., in polynomial time) — provided someone figures out how to build a quantum computer powerful enough to run the attack algorithms. Fortunately such a computer has not yet been built!
(3) Even though quantum computers are not yet powerful enough to break our public-key crypto, the mere threat of future quantum attacks has inspired industry, government and academia to join forces Fellowship-of-the-Ring-style in order to tackle the problem right now. This isn’t merely about future-proofing our systems: even if quantum computers take decades to build, future quantum computers could break encrypted messages we send today!
(4) One conspicuous outcome of this fellowship is NIST’s Post-Quantum Cryptography (PQC) competition: this was an open competition designed to standardize “post-quantum” cryptographic schemes. Critically, these schemes must be based on different mathematical problems — most notably, problems that don’t seem to admit efficient quantum solutions.
(5) Within this new set of schemes, the most popular class of schemes are based on problems related to mathematical objects called lattices. NIST-approved schemes based on lattice problems include Kyber and Dilithium (which I wrote about recently.) Lattice problems are also the basis of several efficient fully-homomorphic encryption (FHE) schemes.
This background sets up the new result.
Chen’s (not yet peer-reviewed) preprint claims anew quantum algorithm that efficiently solves the “shortest independent vector problem” (SIVP, as well as GapSVP) in lattices with specific parameters. If it holds up, the result could (with numerous important caveats) allow future quantum computers to break schemes that depend on the hardness of specific instances of these problems. The good news here is that even if the result is correct, the vulnerable parameters are very specific: Chen’s algorithm does not immediately apply to the recently-standardized NIST algorithms such as Kyber or Dilithium. Moreover, the exact concrete complexity of the algorithm is not instantly clear: it may turn out to be impractical to run, even if quantum computers become available.
But there is a saying in our field that attacks only get better. If Chen's result can be improved upon, then quantum algorithms could render obsolete an entire generation of "post-quantum" lattice-based schemes, forcing cryptographers and industry back to the drawing board.
In other words, both a great technical result — and possibly a mild disaster.
As previously mentioned: I am neither an expert in lattice-based cryptography nor in quantum computing. The folks who are those things are very busy trying to validate the writeup, and more than a few big results have fallen apart upon detailed inspection. For those searching for the latest developments, here's a nice writeup by Nigel Smart that doesn't tackle the correctness of the quantum algorithm (see updates at the bottom), but does talk about the possible implications for FHE and PQC schemes (TL;DR: bad for some FHE schemes, but really depends on the concrete details of the algorithm's running time.) And here's another brief note on a "bug" that was found in the paper, which seems to have been quickly addressed by the author.
Up until this week I had intended to write another long wonky post about complexity theory, lattices, and what it all meant for applied cryptography. But now I hope you’ll forgive me if I hold onto that one, for just a little bit longer.
It’s been a while since I wrote an “attack of the week” post, and the fault for this is entirely mine. I’ve been much too busy writing boring posts about Schnorr signatures! But this week’s news brings an exciting story with both technical and political dimensions: new reports claim that Chinese security agencies have developed a technique to trace the sender of AirDrop transmissions.
Typically my "attack of the week" posts are intended to highlight recent research. What's unusual about this one is that the attack is not really new; it was discovered way back in 2019, when a set of TU Darmstadt researchers — Heinrich, Hollick, Schneider, Stute, and Weinert — reverse-engineered the Apple AirDrop protocol and disclosed several privacy flaws to Apple. (The resulting paper, which appeared in USENIX Security 2021, can be found here.)
What makes this an attack of the week is a piece of news that was initially reported by Bloomberg (here's some other coverage without a paywall) claiming that researchers in China's Beijing Wangshendongjian Judicial Appraisal Institute have used these vulnerabilities to help police identify the sender of "unauthorized" AirDrop materials, using a technique based on rainbow tables. While this new capability may not (yet) be in widespread deployment, it represents a new tool that could strongly suppress the use of AirDrop in China and Hong Kong.
And this is a big deal, since AirDrop is apparently one of a few channels that can still be used to disseminate unauthorized protest materials — and indeed, that was used in both places in 2019 and 2022, and (allegedly as a result) has already been subject to various curtailments.
In this post I’m going to talk about the Darmstadt research and how it relates to the news out of Beijing. Finally, I’ll talk a little about what Apple can do about it — something that is likely to be as much of a political problem as a technical one.
As always, the rest of this will be in the “fun” question-and-answer format I use for these posts.
What is AirDrop and why should I care?
Image from Apple. Used without permission.
If you own an iPhone, you already know the answer to this question. Otherwise: AirDrop is an Apple-specific protocol that allows Apple devices to send files (and contacts and other stuff) in a peer-to-peer manner over various wireless protocols, including Bluetooth and WiFi.
The key thing to know about AirDrop is that it has two settings, which can be enabled by a potential receiver. In “Contacts Only” mode, AirDrop will accept files only from people who are in your Contacts list (address book.) When set to “Everyone”, AirDrop will receive files from any random person within transmit range. This latter mode has been extensively used to distribute protest materials in China and Hong Kong, as well as to distribute indecent photos to strangers all over the world.
This use of AirDrop in protests became such a big deal that in 2022, Apple pushed a software update exclusively to Chinese users that limited the "Everyone" receive-from mode — ensuring that phones would automatically switch back to "Contacts only" after 10 minutes. The company later extended this software update to all users worldwide, but only after it was extensively criticized for the original move.
Is AirDrop supposed to be private? And how does AirDrop know if a user is in their Contacts list?
While AirDrop is not explicitly advertised as an “anonymous” communication protocol, any system that has your phone talking to strangers has implicit privacy concerns baked into it. This drives many choices around how AirDrop works.
Let’s start with the most important one: do AirDrop senders provide their ID to potential recipients? The answer, at some level, must be “yes.”
The reason for this is straightforward. In order for AirDrop recipients in “Contacts only” mode to check that a sender is in their Contacts list, there must be a way for them to check the sender’s ID. This implies that the sender must somehow reveal their identity to the recipient. And since AirDrop presents a list of possible recipients any time a sending user pops up the AirDrop window, this will happen at “discovery” time — typically before you’ve even decided if you really want to send a file.
But this poses a conundrum: the sender’s phone doesn’t actually know which nearby AirDrop users are willing to receive files from it — i.e., which AirDrop users have the sender in their Contacts — and it won’t know this until it actually talks to them. But talking to them means your phone is potentially shouting at everyone around it all the time, saying something like:
Hi there! My Apple ID is [email protected]. Will you accept files from me!??
Now forget that this is being done by phones. Instead imagine yourself, as a human being, doing this to every random stranger you encounter on the subway. It should be obvious that this will quickly become a privacy concern, one that would scare even a company that doesn’t care about privacy. But Apple generally does care quite a bit about privacy!
Thus, just solving this basic problem requires a clever way by which phones can figure out whether they should talk to each other — i.e., whether the receiver has the sender in its Contacts — without either side leaking any useful information to random strangers. Fortunately cryptographic researchers have thought a lot about this problem! We’ve even given it a cool name: it’s called Private Set Intersection, or PSI.
To make a long story short: a Private Set Intersection protocol takes a set of strings from the Sender and a set from the Receiver. It gives one (or both) parties the intersection of both sets: that is, the set of entries that appear on both lists. Most critically, a good PSI protocol doesn’t reveal any other information about either of the sets.
In Apple’s case, the Sender would have just a few entries, since you can have a few different email addresses and phone numbers. The Receiver would have a big set containing its entire Contacts list. The output of the protocol would contain either (1) one or more of the Sender’s addresses, or (2) nothing. A PSI protocol would therefore solve Apple’s problem nicely.
Great, so which PSI protocol does Apple use?
The best possible answer to this is: 😔.
For a variety of mildly defensible reasons — which I will come back to in a moment — Apple does not use a secure PSI protocol to solve their AirDrop problem. Instead they did the thing that every software developer does when faced with the choice of doing complicated cryptography or “hacking something together in time for the next ship date”: they threw together their own solution using hash functions.
The TU Darmstadt researchers did a nice job of reverse-engineering Apple’s protocol in their paper. Read it! The important bit happens during the “Discovery” portion of the protocol, which is marked by an HTTPS POST request as shown in the excerpt below:
The very short TL;DR is this:
In the POST request, a sender attaches a truncated SHA-256 hash of its own Apple ID, which is contained within a signed certificate that it gets from Apple. (If the sender has more than one identifier, e.g., a phone number and an email address, this will contain hashes of each one.)
The recipient then hashes every entry in its Contacts list, and compares the results to see if it finds a match.
If the recipient is in Contacts Only mode and finds a match, it indicates this and accepts later file transfers. Otherwise it aborts the connection.
(As a secondary issue, AirDrop also includes a very short [two byte] portion of the same hashes in its BLE advertisements. Two bytes is pretty tiny, which means this shouldn't leak much information, since many different addresses will collide on a two-byte hash. However, some other researchers have determined that it generally does work well enough to guess identities. Or they may have; the source isn't translating well for me.)
A second important issue here is that the hash identifiers are apparently stored in logs within the recipient’s phone, which means that to obtain them you don’t have to be physically present when the transfer happens. You can potentially scoop them out of someone else’s phone after the fact.
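In other words, the receiver's check boils down to something like the following sketch. This is illustrative only: the real protocol normalizes identifiers and formats the hashes differently than I do here, and the truncation length is a placeholder.

```python
# Sketch of the "Contacts Only" matching logic. Illustrative only: not Apple's
# actual identifier normalization, hash encoding, or truncation length.
import hashlib

def short_hash(identifier: str, n: int = 8) -> bytes:
    """Truncated SHA-256 of an identifier."""
    return hashlib.sha256(identifier.encode()).digest()[:n]

def contacts_only_check(sender_hashes: set[bytes], contacts: list[str]) -> bool:
    """Receiver side: accept the transfer if any sender hash matches a hashed contact."""
    contact_hashes = {short_hash(c) for c in contacts}
    return bool(sender_hashes & contact_hashes)
```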
So what’s the problem?
Many folks who have some experience with cryptography will see the problem immediately. But let’s be explicit.
Hash functions are designed to be one-way. In theory, this means that there should be no efficient algorithm for "directly" taking the output of a hash function and turning it back into its input. But that guarantee has a huge asterisk: if I can guess a set of possible inputs that could have produced the hash, I can simply hash each one of my guesses and compare it to the target. If one input matches, then chances are overwhelming that I've found the right input (also called a pre-image.)
In its most basic form, this naive approach is called a “dictionary attack” based on the idea that one can assemble a dictionary of likely candidates, then test every one. Since these hashes apparently don’t contain any session-dependent information (such as salt), you can even do the hashing in advance to assemble a dictionary of candidate hashes, making the attack even faster.
This approach won’t work if your Apple ID (or phone number) is not guessable. The big question in exploiting this vulnerability is whether it’s possible to assemble a complete list of candidate Apple ID emails and phone numbers. The answer for phone numbers, as the Darmstadt researchers point out, is absolutely yes. Since there are only a few billion phone numbers, it is entirely possible to make a list of every phone number and have a computer grind through them — given a not-unreasonable amount of time. For email addresses this is more complicated, but there are many lists of email addresses in the world, and the Chinese state authorities almost certainly have some good approaches to collecting and/or generating those lists.
As an aside, exploiting these dictionaries can be done in three different ways:
You can make a list of candidate identifiers (or generate them programmatically) and then, given a new target hash, you can hash each identifier and check for a match. This requires you to compute a whole lot of SHA256 hashes for each target you crack, which is pretty fast on a GPU or FPGA (or ASIC) but not optimal.
You can pre-hash the list and make a database of hashes and identifiers. Then when you see a target hash, you just need to do a fast lookup. This means all computation is done once, and lookups are fast. But it requires a ton of storage.
Alternatively, you can use an intermediate approach called a time-memory tradeoff, in which you store much less than a full lookup table in exchange for doing some extra computation each time you crack a target. The most popular technique is called a rainbow table, and it really deserves its own separate blog post, though I will not elaborate today.
The Chinese announcement explicitly mentions a rainbow table, so that’s a good indicator that they’re exploiting this vulnerability.
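To make the second approach above concrete, here is roughly what a precomputed dictionary looks like against phone-number identifiers (a true rainbow table layers the time-memory tradeoff machinery on top of this idea). Again, the hash format below is a stand-in rather than AirDrop's actual encoding, and the phone numbers are made up.

```python
# Precomputed-dictionary attack sketch against truncated hashes of phone numbers.
# Illustrative only: the hash format is a stand-in, not AirDrop's real encoding.
import hashlib

def short_hash(identifier: str, n: int = 8) -> bytes:
    return hashlib.sha256(identifier.encode()).digest()[:n]

def build_table(area_code: str = "410") -> dict[bytes, str]:
    """One-time cost: hash every 7-digit local number in one area code (~10M hashes).
    A serious attacker would use a compact on-disk structure or a rainbow table."""
    numbers = (f"+1{area_code}{n:07d}" for n in range(10_000_000))
    return {short_hash(num): num for num in numbers}

# After the table is built, each hash scraped from a Discovery request
# (or pulled from a phone's logs) is recovered with a single lookup.
table = build_table()
target = short_hash("+14105551234")   # pretend this came off the wire
print(table.get(target))              # prints "+14105551234"
```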
Well that sucks. What can we, or rather Apple, do about it?
If you’re worried about leaking your identifier, an immediate solution is to turn off AirDrop, assuming such a thing is possible. (I haven’t tried it, so I don’t know if turning this off will really stop your phone from talking to other people!) Alternatively you can unregister your Apple ID, or use a bizarre high-entropy Apple ID that nobody will possibly guess. Apple could also reduce their use of logging.
But those solutions are all terrible.
The proper technical solution is for Apple to replace their hashing-based protocol with a proper PSI protocol, which will — as previously discussed — reveal only one bit of information: whether the receiver has the sender’s address(es) in their Contacts list. Indeed, that’s the solution that the Darmstadt researchers propose. They even devised a Diffie-Hellman-based PSI protocol called “PrivateDrop” and showed that it can be used to solve this problem.
But this is not necessarily an easy solution, for reasons that are both technical and political. It’s worth noting that Apple almost certainly knew from the get-go that their protocol was vulnerable to these attacks — but even if they didn’t, they were told about these issues back in May 2019 by the Darmstadt folks. It’s now 2024, and Chinese authorities are exploiting it. So clearly it was not an easy fix.
Some of this stems from the fact that PSI protocols are more computationally heavy than the hashing-based protocol, and some of it (may) stem from the need for more interaction between each pair of devices. Although these costs are not particularly unbearable, it's important to remember that phone battery life and BLE/WiFi bandwidth are precious to Apple, so even minor costs are hard to bear. Finally, Apple may not view this as really being an issue.
However in this case there is an even tougher political dimension.
Will Apple even fix this, given that Chinese authorities are now exploiting it?
And here we find the hundred billion dollar question: if Apple actually replaced their existing protocol with PrivateDrop, would that be viewed negatively by the Chinese government?
Those of us on the outside can only speculate about this. However, the facts are pretty worrying: Apple has enormous manufacturing and sales resources located inside of China, which makes them extremely vulnerable to an irritated Chinese government. They have, in the past, taken actions that appeared to be targeted at restricting AirDrop use within China — and although there’s no definitive proof of their motivations, it certainly looked bad.
Hence there is a legitimate question about whether it’s politically wise for Apple to make a big technical improvement to their AirDrop privacy, right at the moment that the lack of privacy is being viewed as an asset by authorities in China. Even if this attack isn’t really that critical to law enforcement within China, the decision to “fix” it could very well be seen as a slap in the face.
One hopes that despite all these concerns, we’ll soon see a substantial push to improve the privacy of AirDrop. But I’m not going to hold my breath.