Digital forgeries are hard
Mar. 14th, 2024 12:30 am
mjg59
Closing arguments in the trial between various people and Craig Wright over whether he's Satoshi Nakamoto are wrapping up today, amongst a bewildering array of presented evidence. But one utterly astonishing aspect of this lawsuit is that expert witnesses for both sides agreed that much of the digital evidence provided by Craig Wright was unreliable in one way or another, generally including indications that it wasn't produced at the point in time it claimed to be. And it's fascinating reading through the subtle (and, in some cases, not so subtle) ways that that's revealed.
One of the pieces of evidence entered is screenshots of data from Mind Your Own Business, a business management product that's been around for some time. Craig Wright relied on screenshots of various entries from this product to support his claims around having controlled a meaningful number of bitcoin before he was publicly linked to being Satoshi. If these were authentic then they'd be strong evidence linking him to the mining of coins before Bitcoin's public availability. Unfortunately the screenshots themselves weren't contemporary - the metadata shows them being created in 2020. This wouldn't fundamentally be a problem (it's entirely reasonable to create new screenshots of old material), as long as it's possible to establish that the material shown in the screenshots was created on the dates it claims. Sadly, well.
One part of the disclosed information was an email that contained a zip file that contained a raw database in the format used by MYOB. Importing that into the tool allowed an audit record to be extracted - this record showed that the relevant entries had been added to the database in 2020, shortly before the screenshots were created. This was, obviously, not strong evidence that Craig had held Bitcoin in 2009. This evidence was reported, and was responded to with a couple of additional databases that had an audit trail that was consistent with the dates in the records in question. Well, partially. The audit record included session data, showing an administrator logging into the database in 2011 and then, uh, logging out in 2023, which is rather more consistent with someone changing their system clock to 2011 to create an entry, and switching it back to present day before logging out. In addition, the audit log included fields that didn't exist in versions of the product released before 2016, strongly suggesting that the entries dated 2009-2011 were created in software released after 2016. And even worse, the order of insertions into the database didn't line up with calendar time - an entry dated before another entry may appear in the database afterwards, indicating that it was created later. But even more obvious? The database schema used for these old entries corresponded to a version of the software released in 2023.
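Two of the checks described above (sessions that span implausible lengths of time, and rows whose insertion order disagrees with their claimed dates) are easy to sketch. This is a minimal illustration, not the method used in the expert reports: the `Entry` and `Session` types, the `row_id` field, the 30-day limit and all of the sample values are assumptions made up for the example, and have nothing to do with the real MYOB file format or the real data.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta


@dataclass
class Entry:
    row_id: int          # hypothetical: order in which the row was inserted
    claimed_date: date   # the date the record itself claims


@dataclass
class Session:
    login: datetime
    logout: datetime


def out_of_order(entries):
    """Return adjacent (earlier_row, later_row) pairs whose claimed dates run backwards."""
    ordered = sorted(entries, key=lambda e: e.row_id)
    return [
        (prev, cur)
        for prev, cur in zip(ordered, ordered[1:])
        if cur.claimed_date < prev.claimed_date
    ]


def implausible_sessions(sessions, limit=timedelta(days=30)):
    """Return sessions that stayed open for longer than `limit`."""
    return [s for s in sessions if s.logout - s.login > limit]


# Invented sample data, purely for illustration.
entries = [
    Entry(1, date(2010, 3, 1)),
    Entry(2, date(2011, 6, 15)),
    Entry(3, date(2009, 8, 20)),   # inserted last, but claims the earliest date
]
sessions = [
    Session(datetime(2011, 5, 1, 9, 0), datetime(2011, 5, 1, 9, 30)),
    Session(datetime(2011, 6, 15, 10, 0), datetime(2023, 6, 1, 10, 5)),
]

backdated = out_of_order(entries)
long_sessions = implausible_sessions(sessions)
print([(a.row_id, b.row_id) for a, b in backdated])               # [(2, 3)]
print([(s.login.year, s.logout.year) for s in long_sessions])     # [(2011, 2023)]
```

Neither check proves anything on its own. Bookkeeping software generally lets you enter an old invoice late, so an out-of-order row is a prompt to look closer rather than a verdict. It's the combination of signals that makes the picture hard to explain away.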
This is all consistent with the idea that these records were created after the fact and backdated to 2009-2011, and that after this evidence was made available further evidence was created and backdated to obfuscate that. In an unusual turn of events, during the trial Craig Wright introduced further evidence in the form of a chain of emails to his former lawyers that indicated he had provided them with login details to his MYOB instance in 2019 - before the metadata associated with the screenshots. The implication isn't entirely clear, but it suggests that either they had an opportunity to examine this data before the metadata suggests it was created, or that they faked the data? So, well, the obvious thing happened, and his former lawyers were asked whether they received these emails. The chain consisted of three emails, two of which they confirmed they'd received. And they received a third email in the chain, but it was different to the one entered in evidence. And, uh, weirdly, they'd received a copy of the email that was submitted - but they'd received it a few days earlier. In 2024.
And again, the forensic evidence is helpful here! It turns out that the email client used associates a timestamp with any attachments, which in this case included an image in the email footer - and the mysterious time travelling email had a timestamp in 2024, not 2019. This was created by the client, so was consistent with the email having been sent in 2024, not being sent in 2019 and somehow getting stuck somewhere before delivery. The date header indicates 2019, as do encoded timestamps in the MIME headers - consistent with the mail being sent by a computer with the clock set to 2019.
But there's a very weird difference between the copy of the email that was submitted in evidence and the copy that was located afterwards! The first included a header inserted by Gmail that included a 2019 timestamp, while the latter had a 2024 timestamp. Is there a way to determine which of these could be the truth? It turns out there is! The format of that header changed in 2022, and the header in the submitted email uses the new format. The version with the 2019 timestamp is therefore anachronistic - the format simply doesn't match the header that Gmail would have introduced in 2019, suggesting that an email sent in 2022 or later was modified to include a timestamp of 2019.
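The general idea of cross-checking the timestamps inside a single message can be sketched with Python's standard `email` module. This is an illustration under stated assumptions only: the message in `RAW` is invented (addresses, host name, ID and dates are all hypothetical), the 7-day limit is arbitrary, and the sketch compares the sender-supplied `Date` header against server-added `Received` headers. It does not model the Gmail header format change that the actual analysis relied on.

```python
from datetime import timedelta
from email import message_from_string
from email.utils import parsedate_to_datetime

# Invented message for illustration; not taken from the evidence.
RAW = """\
Received: by mail.example.org with SMTP id abc123; Sun, 18 Feb 2024 10:15:00 +0000
Date: Mon, 2 Dec 2019 09:00:00 +0000
From: sender@example.org
To: recipient@example.org
Subject: login details

(body omitted)
"""


def header_timestamps(raw):
    """Return the claimed Date plus the timestamp of every Received header."""
    msg = message_from_string(raw)
    claimed = parsedate_to_datetime(msg["Date"])
    hops = []
    for received in msg.get_all("Received", []):
        # By convention the timestamp follows the last ';' in a Received header.
        _, _, stamp = received.rpartition(";")
        hops.append(parsedate_to_datetime(stamp.strip()))
    return claimed, hops


def suspicious_gaps(raw, limit=timedelta(days=7)):
    """Return Received timestamps further than `limit` from the claimed Date."""
    claimed, hops = header_timestamps(raw)
    return [hop for hop in hops if abs(hop - claimed) > limit]


print(suspicious_gaps(RAW))
```

The `Date` header is written by the sending machine, so it follows whatever that machine's clock says. Headers added by servers along the way are stamped by clocks the sender doesn't control, which is why a disagreement between the two is interesting.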
This is by no means the only indication that Craig Wright's evidence may be misleading (there's the whole argument that the Bitcoin white paper was written in LaTeX when general consensus is that it's written in OpenOffice, given that's what the metadata claims), but it's a lovely example of a more general issue.
Our technology chains are complicated. So many moving parts end up influencing the content of the data we generate, and those parts develop over time. It's fantastically difficult to generate an artifact now that precisely corresponds to how it would look in the past, even if we go to the effort of installing an old OS on an old PC and setting the clock appropriately (are you sure you're going to be able to mimic an entirely period appropriate patch level?). Even the version of the font you use in a document may indicate it's anachronistic. I'm pretty good at computers and I no longer have any belief I could fake an old document.
(References: this Dropbox, under "Expert reports", "Patrick Madden". Initial MYOB data is in "Appendix PM7", further analysis is in "Appendix PM42", email analysis is "Sixth Expert Report of Mr Patrick Madden")
making old document
Date: 2024-03-14 01:36 pm (UTC)
I think I could make some old-looking .doc files for a coin.
Re: making old document
Date: 2024-03-14 03:17 pm (UTC)
But could you be certain that you'd caught everything anyone might check, when a lot of people are motivated to catch you out? It's happened enough times that I have to say that although I'm also fairly confident of being able to fool a layperson, I'm not entirely sure I'd be able to make a forgery that I couldn't detect, let alone fool someone with experience in such things.
Are you sure that the copies of the fonts that you're using have the same kerning as the originals? That no-one will use radiocarbon dating to discover that your old printed copies aren't so old? That your digital copies have enough metadata to be convincing, but none that's anachronistic?
Re: making old document
Date: 2024-03-14 06:45 pm (UTC)
Re: making old document
Date: 2024-03-14 06:51 pm (UTC)
1. You go to a flea market and pick up an Intel 486 PC.
2. Set the date to what you want in BIOS. (The real-time clock's battery will likely have died so it won't know the current date)
3. Install Windows 98. It will have original fonts etc. No networking ofc.
4. Reproduce your document...
Re: making old document
Date: 2024-03-14 09:41 pm (UTC)
I'm with you. With the (objectively delusional, of course) prospect of billions of dollars on the line, it would certainly be /possible/ to put in enough effort to create an at least /much more/ waterproof forgery, even if it's hardly going to be a sufficiently convincing argument to actually achieve his goal.
That's what I find so funny about Craig's clownish efforts, right from the very start: in retrospect, all his supposed proofs are kinda low-effort, consistent with him ofc being literate with computers, but in the end just being a pathological liar who can't help but /just never/ go all the way. Time and again he's expecting everybody to be stupid enough to fall for his charades.
By now the sheer balls of it have become his legacy, and I expect one day all of this will make for a great movie.
Re: making old document
Date: 2024-03-14 09:46 pm (UTC)
Re: making old document
Date: 2024-03-15 12:53 pm (UTC)
He'd been given over a thousand pages showing how to catch someone out faking documents... and still got caught faking new ones.
The images "from 2007/08" were a classic example: they were only "found" after his initial set of disclosure documents had the experts on both sides pointing out how they were problematic in their reports, and a pile of the metadata on the images showed dates between those reports and the images' disclosure.
Oh, and the way that the PDF to LaTeX converter he denied using clearly used way too many significant figures in some of its maths, so he was sitting there saying that he'd hand set a pile of character spacings to a precision of a tiny fraction of the width of a hair... and was then confronted by an animation of him trying and failing and failing and failing to get the line breaks right, all because editing metadata showed the series of changes made to the file.
Even if he'd used a virtual machine, it could have been interesting that something in the metadata said that the PC the document was produced on just happened to match the PC emulated by a recent version of VMware / VirtualBox.
Re: making old document
Date: 2024-05-27 10:51 pm (UTC)
there is only 1 source of truth.
Date: 2024-03-14 08:34 pm (UTC)
Re: there is only 1 source of truth.
Date: 2024-03-15 12:56 pm (UTC)
Mind you, he only said that after faking having them and being caught out.
no subject
Date: 2024-03-14 08:36 pm (UTC)
Quick link to one of the more prominent cases:
https://slate.com/technology/2017/07/the-font-calibri-is-playing-a-surprising-role-in-a-pakistani-scandal.html
no subject
Date: 2024-03-15 08:15 am (UTC)
Our clients have the same challenge. More than once, I've investigated bugs, and it's been very obvious from the data that we're either dealing with time travel, or someone screwing around with SQL. Sequences out of order, or records that couldn't possibly have come from the software itself... those fingerprints can be pretty obvious.
no subject
Date: 2024-03-27 03:55 pm (UTC)