While a good TTFB doesn’t necessarily mean you will have a fast website, a bad TTFB almost certainly guarantees a slow one.
Time To First Byte has been the chink in my armour over at thesession.org, especially on the home page. Every time I ran Lighthouse, or some other performance testing tool, I’d get a high score …with some points deducted for taking too long to get that first byte from the server.
Harry’s proposed solution is to set up some Server Timing headers:
With a little bit of extra work spent implementing the Server Timing API, we can begin to measure and surface intricate timings to the front-end, allowing web developers to identify and debug potential bottlenecks previously obscured from view.
The job of Server Timing is not to help you actually time activity on your server. You’ll need to do the timing yourself using whatever toolset your backend platform makes available to you. Rather, the purpose of Server Timing is to specify how those measurements can be communicated to the browser.
He even provides some PHP code, which I was able to take wholesale and drop into the codebase for thesession.org. Then I was able to put start/stop points in my code for measuring how long some operations were taking. Then I could output the results of these measurements into Server Timing headers that I could inspect in the “Network” tab of a browser’s dev tools (Chrome is particularly good for displaying Server Timing, so I used that while I was conducting this experiment).
I started with overall database requests. Sure enough, that was where most of the time in time-to-first-byte was being spent.
Then I got more granular. I put start/stop points around specific database calls. By doing this, I was able to zero in on which operations were particularly costly. Once I had done that, I had to figure out how to make the database calls go faster.
Spoiler: I did it by adding an extra index on one particular table. It’s almost always indexes, in my experience, that make the biggest difference to database performance.
I don’t know why it took me so long to get around to messing with Server Timing headers. It has paid off in spades. I wish I had done it sooner.
On April 29th, 2010, Steve Jobs published his infamous Thoughts on Flash. It thrust the thitherto geek phrase “HTML5” into the mainstream press:
HTML5, the new web standard that has been adopted by Apple, Google and many others, lets web developers create advanced graphics, typography, animations and transitions without relying on third party browser plug-ins (like Flash). HTML5 is completely open and controlled by a standards committee, of which Apple is a member.
Five days later, I announced the first title from A Book Apart: HTML5 For Web Designers. The timing was purely coincidental, but it definitely didn’t hurt that book’s circulation.
For a while now, quite a few people have cited Apple’s lack of support as a reason why they weren’t investigating service workers. That excuse no longer holds water.
I expect not understanding how progressive web apps are built (service workers, manifests, https) will be a skill deficit in 6-12 months, much like not understanding @RWD has been for a few of years.
I also added this list of seven principles of rich web applications to the collection, although they feel a bit more like engineering principles than design principles per se. That said, they’re really, really good. Every single one is rooted in performance and the user’s experience, not developer convenience.
Don’t get me wrong: developer convenience is very, very important. Nobody wants to feel like they’re doing unnecessary work. But I feel very strongly that the needs of the end user should trump the needs of the developer in almost all instances (you may feel differently and that’s absolutely fine; we’ll agree to differ).
That push and pull between developer convenience and user experience is, I think, most evident in the first principle: server-rendered pages are not optional. Now before you jump to conclusions, the author is not saying that you should never do client-side rendering, but instead points out the very important performance benefits of having the server render the initial page. After that—if the user’s browser cuts the mustard—you can use client-side rendering exclusively.
The issue with that hybrid approach—as I’ve discussed before—is that it’s hard. Isomorphic JavaScript (terrible name) can theoretically help here, but I haven’t seen too many examples of it in action. I suspect that’s because this approach doesn’t yet offer enough developer convenience.
Anyway, I found myself nodding along enthusiastically with that first of seven design principles. Then I got to the second one: act immediately on user input. That sounds eminently sensible, and it’s backed up with sound reasoning. But it finishes with:
Techniques like PJAX or TurboLinks unfortunately largely miss out on the opportunities described in this section.
Ah. See, I’m a big fan of PJAX. It’s essentially the same thing as the Hijax technique I talked about many years ago in Bulletproof Ajax, but with the new addition of HTML5’s History API. It’s a quick’n’dirty way of giving the illusion of a fat client: all the work is actually being done in the server, which sends back chunks of HTML that update the interface. But it’s true that, because of that round-trip to the server, there’s a bit of a delay and so you often end up briefly displaying a loading indicator.
I contend that spinners or “loading indicators” should become a rarity
I agree …but I also like using PJAX/Hijax. Now how do I reconcile what’s best for the user experience with what’s best for my own developer convenience?
I’ve come up with a compromise, and you can see it in action on The Session. There are multiple examples of PJAX in action on that site, like pretty much any page that returns paginated results: new tune settings, the latest events, and so on. The steps for initiating an Ajax request used to be:
Listen for any clicks on the page,
If a “previous” or “next” button is clicked, then:
Display a loading indicator,
Request the new data from the server, and
Update the page with the new data.
In one sense, I am acting immediately to user input, because I always display the loading indicator straight away. But because the loading indicator always appears, no matter how fast or slow the server responds, it sometimes only appears very briefly—just for a flash. In that situation, I wonder if it’s serving any purpose. It might even be doing the opposite to its intended purpose—it draws attention to the fact that there’s a round-trip to the server.
“What if”, I asked myself, “I only showed the loading indicator if the server is taking too long to send a response back?”
The updated flow now looks like this:
Listen for any clicks on the page,
If a “previous” or “next” button is clicked, then:
Start a timer, and
Request the new data from the server.
If the timer reaches an upper limit, show a loading indicator.
When the server sends a response, cancel the timer and
Update the page with the new data.
Even though there are more steps, there’s actually less happening from the user’s perspective. Where previously you would experience this:
I click on a button,
I briefly see a loading indicator,
I see the new data.
Now your experience is:
I click on a button,
I see the new data.
…unless the server or the network is taking too long, in which case the loading indicator appears as an interim step.
The question is: how long is too long? How long do I wait before showing the loading indicator?
0.1 second is about the limit for having the user feel that the system is reacting instantaneously, meaning that no special feedback is necessary except to display the result.
So I should set my timer to 100 milliseconds. In practice, I found that I can set it to as high as 200 to 250 milliseconds and keep it feeling very close to instantaneous. Anything over that, though, and it’s probably best to display a loading indicator: otherwise the interface starts to feel a little sluggish, and slightly uncanny. (“Did that click do any—? Oh, it did.”)
You can test the response time by looking at some of the simpler pagination examples on The Session: new recordings or new discussions, for example. To see examples of when the server takes a bit longer to send a response, you can try paginating through search results. These take longer because, frankly, I’m not very good at optimising some of those search queries.
There you have it: an interface that—under optimal conditions—reacts to user input instantaneously, but falls back to displaying a loading indicator when conditions are less than ideal. The result is something that feels like a client-side web thang, even though the actual complexity is on the server.