When I went up to London for the State of the Browser conference last month, I shared the train journey with Remy.
I always like getting together with Remy. We usually end up discussing sci-fi books we’re reading, commiserating with one another about conference-organising, discussing the minutiae of browser APIs, or talking about the big-picture vision of the World Wide Web.
On this train ride we ended up talking about the march of time and how death comes for us all …and our websites.
Take The Session, for example. It’s been running for two and a half decades in one form or another. I plan to keep it running for many more decades to come. But I’m the weak link in that plan.
If I get hit by a bus tomorrow, The Session will keep running. The hosting is paid up for a while. The domain name is registered for as long as possible. But inevitably things will need to be updated. Even if no new features get added to the site, someone’s got to install updates to keep the underlying software safe and secure.
Remy and I discussed the long-term prospects for widening out the admin work to more people. But we also discussed smaller steps I could take in the meantime.
Like, there’s the actual content of the website. Now, I currently share exports from the database every week in JSON, CSV, and SQLite. That’s good. But you need to be tech nerd to do anything useful with that data.
The more I talked about it with Remy, the more I realised that HTML would be the most useful format for the most people.
There’s a cute acronym in the world of digital preservation: LOCKSS. Lots Of Copies Keep Stuff Safe. If there were multiple copies of The Session’s content out there in the world, then I’d have a nice little insurance policy against some future catastrophe befalling the live site.
With the seed of the idea planted in my head, I waited until I had some time to dive in and see if this was doable.
Fortunately I had plenty of opportunity to do just that on some other train rides. When I was in Spain and France recently, I spent hours and hours on trains. For some reason, I find train journeys very conducive to coding, especially if you don’t need an internet connection.
By the time I was back home, the code was done. Here’s the result:
The Session archive: a static copy of the content on thesession.org.
If you want to grab a copy for yourself, go ahead and download this .zip file. Be warned that it’s quite large! The .zip file is over two gigabytes in size and the unzipped collection of web pages is almost ten gigabytes. I plan to update the content every week or so.
I’ve put a copy up on Netlify and I’m serving it from the subdomain archive.thesession.org if you want to check out the results without downloading the whole thing.
Because this is a collection of static files, there’s no search. But you can use your browser’s “Find in Page” feature to search within the (very long) index pages of each section of the site.
You don’t need to a web server to click around between the pages: they should all work straight from your file system. Double-clicking any HTML file should give a starting point.
I wanted to reduce the dependencies on each page to as close to zero as I could. All the CSS is embedded in the the page. Likewise with most of the JavaScript (you’ll still need an internet connection to get audio playback and dynamic maps). This keeps the individual pages nice and self-contained. That means they can be shared around (as an email attachment, for example).
I’ve shared this project with the community on The Session and people are into it. If nothing else, it could be handy to have an offline copy of the site’s content on your hard drive for those situations when you can’t access the site itself.