Journal tags: appcache

2

sparkline

Making Resilient Web Design work offline

I’ve written before about taking an online book offline, documenting the process behind the web version of HTML5 For Web Designers. A book is quite a static thing so it’s safe to take a fairly aggressive offline-first approach. In fact, a static unchanging book is one of the few situations that AppCache works for. Of course a service worker is better, but until AppCache is removed from browsers (and until service worker is supported across the board), I’m using both. I wouldn’t recommend that for most sites though—for most sites, use a service worker to enhance it, and avoid AppCache like the plague.

For Resilient Web Design, I took a similar approach to HTML5 For Web Designers but I knew that there was a good chance that some of the content would be getting tweaked at least for a while. So while the approach is still cache-first, I decided to keep the cache fairly fresh.

Here’s my service worker. It starts with the usual stuff: when the service worker is installed, there’s a list of static assets to cache. In this case, that list is literally everything; all the HTML, CSS, JavaScript, and images for the whole site. Again, this is a pattern that works well for a book, but wouldn’t be right for other kinds of websites.

The real heavy lifting happens with the fetch event. This is where the logic sits for what the service worker should do everytime there’s a request for a resource. I’ve documented the logic with comments:

// Look in the cache first, fall back to the network
  // CACHE
  // Did we find the file in the cache?
      // If so, fetch a fresh copy from the network in the background
      // NETWORK
          // Stash the fresh copy in the cache
  // NETWORK
  // If the file wasn't in the cache, make a network request
      // Stash a fresh copy in the cache in the background
  // OFFLINE
  // If the request is for an image, show an offline placeholder
  // If the request is for a page, show an offline message

So my order of preference is:

  1. Try the cache first,
  2. Try the network second,
  3. Fallback to a placeholder as a last resort.

Leaving aside that third part, regardless of whether the response is served straight from the cache or from the network, the cache gets a top-up. If the response is being served from the cache, there’s an additional network request made to get a fresh copy of the resource that was just served. This means that the user might be seeing a slightly stale version of a file, but they’ll get the fresher version next time round.

Again, I think this acceptable for a book where the tweaks and changes should be fairly minor, but I definitely wouldn’t want to do it on a more dynamic site where the freshness matters more.

Here’s what it usually likes like when a file is served up from the cache:

caches.match(request)
  .then( responseFromCache => {
  // Did we find the file in the cache?
  if (responseFromCache) {
      return responseFromCache;
  }

I’ve introduced an extra step where the fresher version is fetched from the network. This is where the code can look a bit confusing: the network request is happening in the background after the cached file has already been returned, but the code appears before the return statement:

caches.match(request)
  .then( responseFromCache => {
  // Did we find the file in the cache?
  if (responseFromCache) {
      // If so, fetch a fresh copy from the network in the background
      event.waitUntil(
          // NETWORK
          fetch(request)
          .then( responseFromFetch => {
              // Stash the fresh copy in the cache
              caches.open(staticCacheName)
              .then( cache => {
                  cache.put(request, responseFromFetch);
              });
          })
      );
      return responseFromCache;
  }

It’s asynchronous, see? So even though all that network code appears before the return statement, it’s pretty much guaranteed to complete after the cache response has been returned. You can verify this by putting in some console.log statements:

caches.match(request)
.then( responseFromCache => {
  if (responseFromCache) {
      event.waitUntil(
          fetch(request)
          .then( responseFromFetch => {
              console.log('Got a response from the network.');
              caches.open(staticCacheName)
              .then( cache => {
                  cache.put(request, responseFromFetch);
              });
          })
      );
      console.log('Got a response from the cache.');
      return responseFromCache;
  }

Those log statements will appear in this order:

Got a response from the cache.
Got a response from the network.

That’s the opposite order in which they appear in the code. Everything inside the event.waitUntil part is asynchronous.

Here’s the catch: this kind of asynchronous waitUntil hasn’t landed in all the browsers yet. The code I’ve written will fail.

But never fear! Jake has written a polyfill. All I need to do is include that at the start of my serviceworker.js file and I’m good to go:

// Import Jake's polyfill for async waitUntil
importScripts('/js/async-waituntil.js');

I’m also using it when a file isn’t found in the cache, and is returned from the network instead. Here’s what the usual network code looks like:

fetch(request)
  .then( responseFromFetch => {
    return responseFromFetch;
  })

I want to also store that response in the cache, but I want to do it asynchronously—I don’t care how long it takes to put the file in the cache as long as the user gets the response straight away.

Technically, I’m not putting the response in the cache; I’m putting a copy of the response in the cache (it’s a stream, so I need to clone it if I want to do more than one thing with it).

fetch(request)
  .then( responseFromFetch => {
    // Stash a fresh copy in the cache in the background
    let responseCopy = responseFromFetch.clone();
    event.waitUntil(
      caches.open(staticCacheName)
      .then( cache => {
          cache.put(request, responseCopy);
      })
    );
    return responseFromFetch;
  })

That all seems to be working well in browsers that support service workers. For legacy browsers, like Mobile Safari, there’s the much blunter caveman logic of an AppCache manifest.

Here’s the JavaScript that decides whether a browser gets the service worker or the AppCache:

if ('serviceWorker' in navigator) {
  // If service workers are supported
  navigator.serviceWorker.register('/serviceworker.js');
} else if ('applicationCache' in window) {
  // Otherwise inject an iframe to use appcache
  var iframe = document.createElement('iframe');
  iframe.setAttribute('src', '/appcache.html');
  iframe.setAttribute('style', 'width: 0; height: 0; border: 0');
  document.querySelector('footer').appendChild(iframe);
}

Either way, people are making full use of the offline nature of the book and that makes me very happy indeed.

Taking an online book offline

Application Cache is—as Jake so infamously described—not a good API. It was specced and shipped before developers had a chance to figure out what they really needed, and so AppCache turned out to be frustrating at best and downright dangerous in some situations. Its over-zealous caching combined with its byzantine cache invalidation ensured it was never going to become a mainstream technology.

There are very few use-cases for AppCache, but I think I hit upon one of them. Six years ago, A Book Apart published HTML5 For Web Designers. A year and a half later, I put the book online. The contents are never going to change. There’s a second edition of the book out now but if you want to read all the extra bits that Rachel added, you’re going to have to buy the book. The website for the original book is static and unchanging. That’s what made it such a good candidate for using AppCache. I could just set it and forget.

Except that’s no longer true. AppCache is being deprecated and browsers are starting to withdraw support. Chrome is already making sure that AppCache—like geolocation—no longer works on sites that aren’t served over HTTPS. That’s for the best. In retrospect, those APIs should never have been allowed over unsecured HTTP.

I mentioned that I spent the weekend switching all my book websites over to HTTPS, so AppCache should continue to work …for now. It’s only a matter of time before AppCache is removed completely from many of the browsers that currently support it.

Seeing as I’ve got the HTML5 For Web Designers site running on HTTPS now, I might as well go all out and make it a progressive web app. By far the biggest barrier to making a progressive web app is that first step of setting up HTTPS. It’s gotten cheaper—thanks to Let’s Encrypt Certbot—but it still involves mucking around in the command line with root access; I never wanted to become a sysadmin. But once that’s finally all set up, the other technological building blocks—a Service Worker and a manifest file—are relatively easy.

In this case, the Service Worker is using a straightforward bit of logic:

  • On installation, cache absolutely everything: HTML, CSS, images.
  • When anything is requested, grab it from the cache.
  • If it isn’t in the cache, try the network.
  • If the network doesn’t work, show an offline page (or image).

Basically I’m reproducing AppCache’s overzealous approach. It works for this site because the content is never going to change. I hope that this time, I really can just set it and forget it. I want the site to be an historical artefact, available at the same URL for at least my lifetime. I don’t want to have to maintain it or revisit it every few years to swap out one API for another.

Which brings me back to the way AppCache is being deprecated…

The Firefox team are very eager to ditch AppCache as soon as possible. On the one hand, that’s commendable. They’re rightly proud of shipping Service Workers and they want to encourage people to use the better technology instead. But it sure stings for the suckers (like me) who actually went and built stuff using AppCache.

In a weird way, I think this rush to deprecate AppCache might actually hurt the adoption of Service Workers. Let me explain…

At last year’s Edge Conference, Nolan Lawson gave a great presentation on storing data in the browser. He enumerated the many ways—past and present—that we could store data locally: WebSQL, Local Storage, IndexedDB …the list goes on. He also posed the question: why aren’t more people using insert-name-of-latest-API-here? To me it seemed obvious why more people weren’t diving into using the latest and greatest option for local data storage. It was because they had been burned before. The developers who rushed into trying previous solutions end up being mocked for their choice. “Still using that ol’ thing? Pffftt!”

You can see that same attitude on display from Mozilla as they push towards removing AppCache. Like in a comment that refers to developers using AppCache in production as “the angry hordes”. Reminds me of something Tom said:

In that same Mozilla thread, Soledad echoes Tom’s point:

As a member of the devrel team: I think that this should be better addressed in a blog post that someone from the team responsible for switching AppCache off should write, so everyone can understand the reasons and ask questions to those people.

I’d rather warn people beforehand, pointing them to that post and help them with migration paths than apply emergency mitigation strategies when a lot of people find their stuff stopped working in the newer Firefox…

Bravo! That same approach should have also been taken by the Chrome team when it came to their thread about punishing display:browser in manifest files. There was absolutely no communication with developers about this major decision. I only found out about it because Paul happened to mention it to me.

I was genuinely shocked by this:

Withholding the “add to home screen” prompt like that has a whiff of blackmail about it.

I can confirm that smell. When I was making the manifest file for HTML5 For Web Designers, I really wanted to put display: browser because I want people to be able to copy and paste URLs (for the book, for individual chapters, and for sections within chapters). But knowing that if I did that, Android users would never see the “add to home screen” prompt made me question that decision. I felt strong-armed into declaring display: standalone. And no, I’m not mollified by hand-waving reassurances that the Chrome team will figure out some solution for this. Figure out the solution first, then punish the saps like me who want to use display: browser to allow people to share URLs.

Anyway, the website for HTML5 For Web Designers is now using AppCache and Service Workers. The AppCache part will probably be needed for quite a while yet to provide offline support on iOS. Apple are really dragging their heels on Service Worker support, with at least one WebKit engineer actively looking for reasons not to implement it.

There’s a lot of talk about making apps work offline, but I think it’s just as important that we consider making information work offline. Books are a great example of this. To use the tired transport tropes, the website for a book is something you might genuinely want to access when you’re on a plane, or in the underground, or out at sea.

I really, really like progressive web apps. But I also think it’s important that we don’t fall into the trap of just trying to imitate native apps on the web. I love the idea of taking the best of the web—like information being permanently available at a URL—and marrying that up with the best of native—like offline access. I also like the idea of taking the best of books—a tome of thought—and marrying it up with the best of the web—hypertext.

I’d love to see more experimentation around online/offline hypertext/books. For now, you can visit HTML5 For Web Designers, add it to your home screen, and revisit it whenever and wherever you like.