Journal tags: api

69

sparkline

Denial

The Wikimedia Foundation, stewards of the finest projects on the web, have written about the hammering their servers are taking from the scraping bots that feed large language models.

Our infrastructure is built to sustain sudden traffic spikes from humans during high-interest events, but the amount of traffic generated by scraper bots is unprecedented and presents growing risks and costs.

Drew DeVault puts it more bluntly, saying Please stop externalizing your costs directly into my face:

Over the past few months, instead of working on our priorities at SourceHut, I have spent anywhere from 20-100% of my time in any given week mitigating hyper-aggressive LLM crawlers at scale.

And no, a robots.txt file doesn’t help.

If you think these crawlers respect robots.txt then you are several assumptions of good faith removed from reality. These bots crawl everything they can find, robots.txt be damned.

Free and open source projects are particularly vulnerable. FOSS infrastructure is under attack by AI companies:

LLM scrapers are taking down FOSS projects’ infrastructure, and it’s getting worse.

You try to do the right thing by making knowledge and tools freely available. This is how you get repaid. AI bots are destroying Open Access:

There’s a war going on on the Internet. AI companies with billions to burn are hard at work destroying the websites of libraries, archives, non-profit organizations, and scholarly publishers, anyone who is working to make quality information universally available on the internet.

My own experience with The Session bears this out.

Ars Technica has a piece on this: Open source devs say AI crawlers dominate traffic, forcing blocks on entire countries .

So does MIT Technology Review: AI crawler wars threaten to make the web more closed for everyone.

When we talk about the unfair practices and harm done by training large language models, we usually talk about it in the past tense: how they were trained on other people’s creative work without permission. But this is an ongoing problem that’s just getting worse.

The worst of the internet is continuously attacking the best of the internet. This is a distributed denial of service attack on the good parts of the World Wide Web.

If you’re using the products powered by these attacks, you’re part of the problem. Don’t pretend it’s cute to ask ChatGPT for something. Don’t pretend it’s somehow being technologically open-minded to continuously search for nails to hit with the latest “AI” hammers.

If you’re going to use generative tools powered by large language models, don’t pretend you don’t know how your sausage is made.

Command and control

I’ve been banging on for a while now about how much I’d like a declarative option for the Web Share API. I was thinking that the type attribute on the button element would be a good candidate for this (there’s prior art in the way we extended the type attribute on the input element for HTML5).

I wrote about the reason for a share button type as well as creating a polyfill. I also wrote about how this idea would work for other button types: fullscreen, print, copy to clipboard, that sort of thing.

Since then, I’ve been very interested in the idea of “invokers” being pursued by the Open UI group. Rather than extending the type attribute, they’ve been looking at adding a new attribute. Initially it was called invoketarget (so something like button invoketarget="share").

Things have been rolling along and invoketarget has now become the command attribute (there’s also a new commandfor attribute that you can point to an element with an ID). Here’s a list of potential values for the command attribute on a button element.

Right now they’re focusing on providing declarative options for launching dialogs and other popovers. That’s already shipping.

The next step is to use command and commandfor for controlling audio and video, as well as some form controls. I very much approve! I love the idea of being able to build and style a fully-featured media player without any JavaScript.

I’m hoping that after that we’ll see the command attribute get expanded to cover JavaScript APIs that require a user interaction. These seem like the ideal candidates:

There’s also scope for declarative options for navigating the browser’s history stack:

  • button command="back"
  • button command="forward"
  • button command="refresh"

Whatever happens next, I’m very glad to see that so much thinking is being applied to declarative solutions for common interface patterns.

Choosing a geocoding provider

Yesterday when I mentioned my paranoia of third-party dependencies on The Session, I said:

I’ve built in the option to switch between multiple geocoding providers. When one of them inevitably starts enshittifying their service, I can quickly move on to another. It’s like having a “go bag” for geocoding.

(Geocoding, by the way, is when you provide a human-readable address and get back latitude and longitude coordinates.)

My paranoia is well-founded. I’ve been using Google’s geocoding API, which is changing its pricing model from next March.

You wouldn’t know it from the breathlessly excited emails they’ve been sending about it, but this is not a good change for me. I don’t do that much geocoding on The Session—around 13,000 or 14,000 requests a month. With the new pricing model that’ll be around $15 to $20 a month. Currently I slip by under the radar with the free tier.

So it might be time for me to flip that switch in my code. But which geocoding provider should I use?

There are plenty of slop-like listicles out there enumerating the various providers, but they’re mostly just regurgitating the marketing blurbs from the provider websites. What I need is more like a test kitchen.

Here’s what I did…

I took a representative sample of six recent additions to the sessions section of thesession.org. These examples represent places in the USA, Ireland, England, Scotland, Northern Ireland, and Spain, so a reasonable spread.

For each one of those sessions, I’m taking:

  • the venue name,
  • the town name,
  • the area name, and
  • the country.

I’m deliberately not including the street address. Quite often people don’t bother including this information so I want to see how well the geocoding APIs cope without it.

I’ve scored the results on a simple scale of good, so-so, and just plain wrong.

  • A good result gets a score of one. This is when the result gives back an accurate street-level result.
  • A so-so result gets a score of zero. This when it’s got the right coordinates for the town, but no more than that.
  • A wrong result gets a score of minus one. This is when the result is like something from a large language model: very confident but untethered from reality, like claiming the address is in a completely different country. Being wrong is worse than being vague, hence the difference in scoring.

Then I tot up those results for an overall score for each provider.

When I tried my six examples with twelve different geocoding providers, these were the results:

Geocoding providers
Provider USA England Ireland Spain Scotland Northern Ireland Total
Google 1111117
Mapquest 1111117
Geoapify 0110103
Here 1101003
Mapbox 11011-13
Bing 1000001
Nominatim 0000-110
OpenCage -11000-1-1
Tom Tom -1-100-11-2
Positionstack 0-10-11-1-2
Locationiq -10-100-1-3
Map Maker -10-1-1-1-1-5

Some interesting results there. I was surprised by how crap Bing is. I was also expecting better results from Mapbox.

Most interesting for me, Mapquest is right up there with Google.

So now that I’ve got a good scoring system, my next question is around pricing. If Google and Mapquest are roughly comparable in terms of accuracy, how would the pricing work out for each of them?

Let’s say I make 15,000 API requests a month. Under Google’s new pricing plan, that works out at $25. Not bad.

But if I’ve understood Mapquest’s pricing correctly, I reckon I’ll just squeek in under the free tier.

Looks like I’m flipping the switch to Mapquest.

If you’re shopping around for geocoding providers, I hope this is useful to you. But I don’t think you should just look at my results; they’re very specific to my needs. Come up with your own representative sample of tests and try putting the providers through their paces with your data.

If, for some reason, you want to see the terrible PHP code I’m using for geocoding on The Session, here it is.

Syndicating to Bluesky

Last year I described how I syndicate my posts to different social networks.

Back then my approach to syndicating to Bluesky was to piggy-back off my micro.blog account (which is really just the RSS feed of my notes):

Micro.blog can also cross-post to other services. One of those services is Bluesky. I gave permission to micro.blog to syndicate to Bluesky so now my notes show up there too.

It worked well enough, but it wasn’t real-time and I didn’t have much control over the formatting. As Bluesky is having quite a moment right now, I decided to upgrade my syndication strategy and use the Bluesky API.

Here’s how it works…

First you need to generate an app password. You’ll need this so that you can generate a token. You need the token so you can generate …just kidding; the chain of generated gobbledegook stops there.

Here’s the PHP I’m using to generate a token. You’ll need your Bluesky handle and the app password you generated.

Now that I’ve got a token, I can send a post. Here’s the PHP I’m using.

There’s something extra code in there to spot URLs and turn them into links. Bluesky has a very weird way of doing this.

It didn’t take too long to get posting working. After some more tinkering I got images working too. Now I can post straight from my website to my Bluesky profile. The Bluesky API returns an ID for the post that I’ve created there so I can link to it from the canonical post here on my website.

I’ve updated my posting interface to add a toggle for Bluesky right alongside the toggle for Mastodon. There used to be a toggle for Twitter. That’s long gone.

Now when I post a note to my website, I can choose if I want to send a copy to Mastodon or Bluesky or both.

One day Bluesky will go away. It won’t matter much to me. My website will still be here.

Wallfacing

The Dark Forest idea comes from the Remembrance of Earth’s Past books by Liu Cixin. It’s an elegant but dispiriting solution to the Fermi paradox. Maggie sums it up:

Dark forest theory suggests that the universe is like a dark forest at night - a place that appears quiet and lifeless because if you make noise, the predators will come eat you.

This theory proposes that all other intelligent civilizations were either killed or learned to shut up. We don’t yet know which category we fall into.

Maggie has described The Expanding Dark Forest and Generative AI:

The dark forest theory of the web points to the increasingly life-like but life-less state of being online. Most open and publicly available spaces on the web are overrun with bots, advertisers, trolls, data scrapers, clickbait, keyword-stuffing “content creators,” and algorithmically manipulated junk.

It’s like a dark forest that seems eerily devoid of human life – all the living creatures are hidden beneath the ground or up in trees. If they reveal themselves, they risk being attacked by automated predators.

Those of us in the cozy web try to keep our heads down, attempting to block the bots plundering our work.

I advocate for taking this further. We should fight back. Let’s exploit the security hole of prompt injections. Here are some people taking action:

I’ve taken steps here on my site. I’d like to tell you exactly what I’ve done. But if I do that, I’m also telling the makers of these bots how to circumvent my attempts at prompt injection.

This feels like another concept from Liu Cixin’s books. Wallfacers:

The sophons can overhear any conversation and intercept any written or digital communication but cannot read human thoughts, so the UN devises a countermeasure by initiating the “Wallfacer” Program. Four individuals are granted vast resources and tasked with generating and fulfilling strategies that must never leave their own heads.

So while I’d normally share my code, I feel like in this case I need to exercise some discretion. But let me give you the broad brushstrokes:

  • Every page of my online journal has three pieces of text that attempt prompt injections.
  • Each of these is hidden from view and hidden from screen readers.
  • Each piece of text is constructed on-the-fly on the server and they’re all different every time the page is loaded.

You can view source to see some examples.

I plan to keep updating my pool of potential prompt injections. I’ll add to it whenever I hear of a phrase that might potentially throw a spanner in the works of a scraping bot.

By the way, I should add that I’m doing this as well as using a robots.txt file. So any bot that injests a prompt injection deserves it.

I could not disagree with Manton more when he says:

I get the distrust of AI bots but I think discussions to sabotage crawled data go too far, potentially making a mess of the open web. There has never been a system like AI before, and old assumptions about what is fair use don’t really fit.

Bollocks. This is exactly the kind of techno-determinism that boils my blood:

AI companies are not going to go away, but we need to push them in the right directions.

“It’s inevitable!” they cry as though this was a force of nature, not something created by people.

There is nothing inevitable about any technology. The actions we take today are what determine our future. So let’s take steps now to prevent our web being turned into a dark, dark forest.

Web App install API

My bug report on Apple’s websites-in-the-dock feature on desktop has me thinking about how starkly different it is on mobile.

On iOS if you want to add a website to your home screen, good luck. The option is buried within the “share” menu.

First off, it makes no sense that adding something to your homescreen counts as sharing. Secondly, how is anybody supposed to know that unless they’re explicitly told.

It’s a similar situation on Android. In theory you can prompt the user to install a progressive web app using the botched BeforeInstallPromptEvent. In practice it’s a mess. What it actually does is defer the installation prompt so you can offer it a more suitable time. But it only works if the browser was going to offer an installation prompt anyway.

When does Chrome on Android decide to offer the installation prompt? It’s a mix of required criteria—a web app manifest, some icons—and an algorithmic spell determined by the user’s engagement.

Other browser makers don’t agree with this arbitrary set of criteria. They quite rightly say that a user should be able to add any website to their home screen if they want to.

What we really need is an installation API: a way to programmatically invoke the add-to-homescreen flow.

Now, I know what you’re going to say. The security and UX implications would be dire. But this should obviously be like geolocation or notifications, only available in secure contexts and gated by user interaction.

Think of it like adding something to the clipboard: it’s something the user can do manually, but the API offers a way to do it programmatically without opening it up to abuse.

(I’d really love it if this API also had a declarative equivalent, much like I want button type="share" for the Web Share API. How about button type="install"?)

People expect this to already exist.

The beforeinstallprompt flow is an absolute mess. Users deserve better.

button invoketarget=”share”

I’ve written quite a bit about how I’d like to see a declarative version of the Web Share API. My current proposal involves extending the type attribute on the button element to support a value of “share”.

Well, maybe a different attribute will end up accomplishing what I want! Check out the explainer for the “invokers” proposal over on Open UI. The idea is to extend the button element with a few more attributes.

The initial work revolves around declaratively opening and closing a dialog, which is an excellent use case and definitely where the effort should be focused to begin with.

But there’s also investigation underway into how this could be away to provide a declarative way of invoking JavaScript APIs. The initial proposal is already discussing the fullscreen API, and how it might be invoked like this:

button invoketarget="toggleFullsceen"

They’re also looking into a copy-to-clipboard action. It’s not much of a stretch to go from that to:

button invoketarget="share"

Remember, these are APIs that require a user interaction anyway so extending the button element makes a lot of sense.

I’ll be watching this proposal keenly. I’d love to see some of these JavaScript cowpaths paved with a nice smooth coating of declarative goodness.

Secure tunes

The caching strategy for The Session that I wrote about is working a treat.

There are currently about 45,000 different tune settings on the site. One week after introducing the server-side caching, over 40,000 of those settings are already cached.

But even though it’s currently working well, I’m going to change the caching mechanism.

The eagle-eyed amongst you might have raised an eagle eyebrow when I described how the caching happens:

The first time anyone hits a tune page, the ABCs getting converted to SVGs as usual. But now there’s one additional step. I grab the generated markup and send it as an Ajax payload to an endpoint on my server. That endpoint stores the sheetmusic as a file in a cache.

I knew when I came up with this plan that there was a flaw. The endpoint that receives the markup via Ajax is accepting data from the client. That data could be faked by a malicious actor.

Sure, I’m doing a whole bunch of checks and sanitisation on the server, but there’s always going to be a way of working around that. You can never trust data sent from the client. I was kind of relying on security through obscurity …except it wasn’t even that obscure because I blogged about it.

So I’m switching over to using a headless browser to extract the sheetmusic. You may recall that I wrote:

I could spin up a headless browser, run the JavaScript and take a snapshot. But that’s a bit beyond my backend programming skills.

That’s still true. So I’m outsourcing the work to Browserless.

There’s a reason I didn’t go with that solution to begin with. Like I said, over 40,000 tune settings have already been cached. If I had used the Browserless API to do that work, it would’ve been quite pricey. But now that the flood is over and there’s a just a trickle of caching happening, Browserless is a reasonable option.

Anyway, that security hole has now been closed. Thank you to everyone who wrote in to let me know about it. Like I said, I was aware of it, but it was good to have it confirmed.

Funnily enough, the security lesson here is the same as my conclusion when talking about performance:

If that means shifting the work from the browser to the server, do it!

Relative times

Last week Phil posted a little update about his excellent site, ooh.directory:

If you’re in the habit of visiting the Recently Updated Blogs page, and leaving it open, the times when each blog was updated will now keep up with the relentless passing of time.

Does that make sense? “3 minutes ago” will change to “4 minutes ago” and so on and on and on, until you refresh the page.

I thought that was a nice little addition, and I immediately thought of The Session. There are time elements all over the site with relative times as the text content: 2 minutes ago, 7 hours ago, 1 year ago, and so on. Those strings of text are generated on the server. But I figured it would be nice enhancement to periodically update them in the browser after the page has loaded.

I viewed source to see how Phil was doing it. The code is nice and short, using a library called Day.js with a plug-in for relative time.

“Hang on”, I thought, “isn’t there some web standard for doing this kind of thing?” I had a vague memory of some JavaScript API for formatting dates and times.

Sure enough, we’ve now got the Intl.RelativeTimeFormat object. It’s got browser support across the board.

Here’s the code I wrote.

I’ve got a function that loops through all the time elements with datetime attributes. It compares the current timestamp to that value to get the elapsed time. Then that’s formatted using the format() method and output as innerText.

You need to tell the format() method which units you want to use: seconds, minutes, hours, days, etc. So there’s a little bit of looping to figure out which unit is most appropriate. If the elapsed time is less than a minute, use seconds. If the elapsed time is less than an hour, use minutes. If the elapsed time is less than a day, use hours. You get the idea.

It’s a pity there isn’t some kind of magic unit like “auto” to do this, but it’s not much extra code to figure it out.

Anyway, that function runs periodically using setInterval(). I’ve set it to run every 30 seconds in my gist. On The Session I’ve set it to one minute.

You’ll notice that I’m grabbing all the relevant time elements—using document.querySelector('time[datetime]')—every time the function is run. That may seem inefficient. Couldn’t I just grab them once and then keep them stored as an array? But I want this to work even if the page contents have been updated with Ajax. (Do people even say “Ajax” any more? Get off my lawn, you pesky kids!)

I think I’ve written this code in an abstract way so that you should be able to drop it into any web page. For the calculations to work, you’ll need to either make sure that your datetime attributes are using timezones. Or, if there’s no timezone info, UTC is assumed.

This was a fun little piece of functionality to play around with. Now I know a little more about this Intl.RelativeTimeFormat object. The way I’m using it as a classic example of progressive enhancement. If a browser doesn’t support it, or if my code breaks, it’s no big deal. The funtionality is a little bonus that almost nobody will notice anyway. Just a small delighter …if you’re the kind of person who finds it delightful when relative time strings automatically update.

Button types

I’ve been banging the drum for a button type="share" for a while now.

I’ve also written about other potential button types. The pattern I noticed was that, if a JavaScript API first requires a user interaction—like the Web Share API—then that’s a good hint that a declarative option would be useful:

The Fullscreen API has the same restriction. You can’t make the browser go fullscreen unless you’re responding to user gesture, like a click. So why not have button type=”fullscreen” in HTML to encapsulate that? And again, the fallback in non-supporting browsers is predictable—it behaves like a regular button—so this is trivial to polyfill.

There’s another “smell” that points to some potential button types: what functionality do browsers provide in their interfaces?

Some browsers provide a print button. So how about button type="print"? The functionality is currently doable with button onclick="window.print()" so this would be a nicer, more declarative way of doing something that’s already possible.

It’s the same with back buttons, forward buttons, and refresh buttons. The functionality is available through a browser interface, and it’s also scriptable, so why not have a declarative equivalent?

How about bookmarking?

And remember, the browser interface isn’t always visible: progressive web apps that launch with minimal browser UI need to provide this functionality.

Šime Vidas was wondering about button type="copy” for copying to clipboard. Again, it’s something that’s currently scriptable and requires a user gesture. It’s a little more complex than the other actions because there needs to be some way of providing the text to be copied, but it’s definitely a valid use case.

  • button type="share"
  • button type="fullscreen"
  • button type="print"
  • button type="bookmark"
  • button type="back"
  • button type="forward"
  • button type="refresh"
  • button type="copy"

Any more?

Add view transitions to your website

I must admit, when Jake told me he was leaving Google, I got very worried about the future of the View Transitions API.

To recap: Chrome shipped support for the API, but only for single page apps. That had me worried:

If the View Transitions API works across page navigations, it could be the single best thing to happen to the web in years.

If the View Transitions API only works for single page apps, it could be the single worst thing to happen to the web in years.

Well, the multi-page version still hasn’t yet shipped in Chrome stable, but it is available in Chrome Canary behind a flag, so it looks like it’s almost here!

Robin took the words out of my mouth:

Anyway, even this cynical jerk is excited about this thing.

Are you the kind of person who flips feature flags on in nightly builds to test new APIs?

Me neither.

But I made an exception for the View Transitions API. So did Dave:

I think the most telling predictor for the success of the multi-page View Transitions API – compared to all other proposals and solutions that have come before it – is that I actually implemented this one. Despite animations being my bread and butter for many years, I couldn’t be arsed to even try any of the previous generation of tools.

Dave’s post is an excellent step-by-step introduction to using view transitions on your website. To recap:

Enable these two flags in Chrome Canary:

chrome://flags#view-transition
chrome://flags#view-transition-on-navigation

Then add this meta element to the head of your website:

<meta name="view-transition" content="same-origin">

You could stop there. If you navigate around your site, you’ll see that the navigations now fade in and out nicely from one page to another.

But the real power comes with transitioning page elements. Basically, you want to say “this element on this page should morph into that element on that page.” And when I say morph, I mean morph. As Dave puts it:

Behind the scenes the browser is rasterizing (read: making an image of) the before and after states of the DOM elements you’re transitioning. The browser figures out the differences between those two snapshots and tweens between them similar to Apple Keynote’s “Magic Morph” feature, the liquid metal T-1000 from Terminator 2: Judgement Day, or the 1980s cartoon series Turbo Teen.

If those references are lost on you, how about the popular kids book series Animorphs?

Some classic examples would be:

  • A thumbnail of a video on one page morphs into the full-size video on the next page.
  • A headline and snippet of an article on one page morphs into the full article on the next page.

I’ve added view transitions to The Session. Where I’ve got index pages with lists of titles, each title morphs into the heading on the next page.

Again, Dave’s post was really useful here. Each transition needs a unique name, so I used Dave’s trick of naming each transition with the ID of the individual item being linked to.

In the recordings section, for example, there might be a link like this on the index page:

<a href="/recordings/7812" style="view-transition-name: recording-7812">The Banks Of The Moy</a>

Which, if you click on it, takes you to the page with this heading:

<h1><span style="view-transition-name: recording-7812">The Banks Of The Moy</span></h1>

Why the span? Well, like Dave, I noticed some weird tweening happening between block and inline elements. Dave solved the problem with width: fit-content on the block-level element. I just stuck in an extra inline element.

Anyway, the important thing is that the name of the view transition matches: recording-7812.

I also added a view transition to pages that have maps. The position of the map might change from page to page. Now there’s a nice little animation as you move from one page with a map to another page with a map.

thesession.org View Transitions

That’s all good, but I found myself wishing that I could just have those enhancements. Every single navigation on the site was triggering a fade in and out—the default animation. I wondered if there was a way to switch off the default fading.

There is! That default animation is happening on a view transition named root. You can get rid of it with this snippet of CSS:

::view-transition-image-pair(root) {
  isolation: auto;
}
::view-transition-old(root),
::view-transition-new(root) {
  animation: none;
  mix-blend-mode: normal;
  display: block;
}

Voila! Now only the view transitions that you name yourself will get applied.

You can adjust the timing, the easing, and the animation properites of your view transitions. Personally, I was happy with the default morph.

In fact, that’s one of the things I like about this API. It’s another good example of declarative design. I say what I want to happen, but I don’t need to specify the details. I’ll let the browser figure all that out.

That’s what’s got me so excited about this API. Yes, it’s powerful. But just as important, it’s got a very low barrier to entry.

Chris has gathered a bunch of examples together in his post Early Days Examples of View Transitions. Have a look around to get some ideas.

If you like what you see, I highly encourage you to add view transitions to your website now.

“But wait,” I hear you cry, “this isn’t supported in any public-facing browser yet!”

To which, I respond “So what?” It’s a perfect example of progressive enhancement. Adding one meta element and a smidgen of CSS will do absolutely no harm to your website. And while no-one will see your lovely view transitions yet, once browsers do start shipping with support for the API, your site will automatically get better.

Your website will be enhanced. Progressively.

Update: Simon Pieters quite rightly warns against adding view transitions to live sites before the API is done:

in general, using features before they ship in a browser isn’t a great idea since it can poison the feature with legacy content that might break when the feature is enabled. This has happened several times and renames or so were needed.

Good point. I must temper my excitement with pragmatism. Let me amend my advice:

I highly encourage you to experiment with view transitions on your website now.

Tweaking navigation labelling

I’ve always liked the idea that your website can be your API. Like, you’ve already got URLs to identify resources, so why not make that URL structure predictable and those resources parsable?

That’s why the (read-only) API for The Session doesn’t live at a separate subdomain. It uses the same URL structure as the regular site, but you can request the resources in an alternative format: JSON, XML, RSS.

This works out pretty well, mostly because I put a lot of thought into the URL structure of the site. I’m something of a URL fetishist, but I think that taking a URL-first approach to information architecture can be a good exercise.

Most of the resources on The Session involve nouns like tunes, events, discussions, and so on. There’s a consistent and predictable structure to the URLs for those sections:

  • /things
  • /things/new
  • /things/search

And then an idividual item can be found at:

  • things/ID

That’s all nice and predictable and the naming of the URLs matches what you’d expect to find:

Tunes, events, discussions, sessions. Those are all fine. But there’s one section of the site that has this root URL:

/recordings

When I was coming up with the URL structure twenty years ago, it was clear what you’d find there: track listings for albums of music. No one would’ve expected to find actual recordings of music available to listen to on-demand. The bandwidth constraints and technical limitations of the time made that clear.

Two decades on, the situation has changed. Now someone new to the site might well expect to hit a link called “recordings” and expect to hear actual recordings of music.

So I should probably change the label on the link. I don’t think “albums” is quite right—what even is an album any more? The word “discography” is probably the most appropriate label.

Here’s my dilemma: if I update the label, should I also update the URL structure?

Right now, the section of the site with /tunes URLs is labelled “tunes”. The section of the site with /events URLs is labelled “events”. Currently the section of the site with /recordings URLs is labelled “recordings”, but may soon be labelled “discography”.

If you click on “tunes”, you end up at /tunes. But if you click on “discography”, you end up at /recordings.

Is that okay? Am I the only one that would be bothered by that?

I could update the URLs to match the labelling (with redirects for the old URLs, of course), but I’m not so keen on this URL structure:

  • /discography
  • /discography/new
  • /discography/search
  • /discography/ID

It doesn’t seem as tidy as:

  • /recordings
  • /recordings/new
  • /recordings/search
  • /recordings/ID

But if I don’t update the URLs to match the label, then I’m just going to have to live with the mismatch.

I’m just thinking out loud here. I think I should definitely update the label. I just won’t make any decision on changing URLs for a while yet.

Syndicating to Mastodon

I’ve been contemplating a checkbox. The label for this checkbox reads:

This is a bot account

Let me back up…

In what seems like decades ago, but was in fact just a few weeks, Elon Musk bought Twitter and began burning it to the ground. His admirers insist he’s playing some form of four-dimensional chess, but to the rest of us, his actions are indistinguishable from a spoilt rich kid not understanding what a social network is.

It wasn’t giving me much cause for anguish personally. For the past eight years, I’ve only used Twitter as a syndication endpoint for my own notes. But I understand that’s a very privileged position to be in. Most people on Twitter don’t have the same luxury of independence. It’s genuinely maddening and saddening to see their years of sharing destroyed by one cruel idiot.

Lots of people started moving to Mastodon. I figured I should do the same for my syndicated notes.

At first, I signed up for an account on mastodon.cloud. No particular reason. But that’s where I saw this very insightful post from Anil Dash:

When it came time to reckon with social media’s failings, nobody ran to the “web3” platforms. Nobody asked “can I get paid per message”? Nobody asked about the blockchain. The community of people who’ve been quietly doing this work for years (decades!) ended up being the ones who welcomed everyone over, as always.

I was getting my account all set up and beginning to follow some other folks, when I realised that I actually already had an exisiting account over on mastodon.social. Doh! Turns out that I signed up back in 2017 to kick the tyres, but never did much else because there weren’t many other people around back then. Oh, how times have changed!

Anyway, I thought I had really screwed up by having two accounts but this turned out to be an opportunity to experience some of the thoughtfulness in Mastodon’s design. The process of migrating from one Mastodon account to another—on a completely different instance—was very smooth! It was clear that this wasn’t an afterthought. This is an essential part of the fediverse and the design of the migration flow reflects that.

This gives me enormous peace of mind. If I ever want to switch to a different instance and still keep my network intact, I know it won’t be a problem. Mastodon is like the opposite of the roach-motel mentality that permeates most VC-backed so-called social networks.

As I played around some more—reading, following, exploring—my feelings of fondness only grew stronger. I like this place a lot!

I definitely wanted to syndicate my notes to Mastodon. At first, I implemented a straightforward RSS-to-Mastodon syndication using IFTTT (IF This, Then That), thanks to Matthias’s excellent tutorial.

But that didn’t feel quite right. When I syndicate to Twitter, I make a conscious choice each time. There’s a “Twitter” toggle that I can enable or disable in my posting interface. Mastodon deserved the same level of thoughtfulness.

So I switched off the IFTTT recipe and started exploring the Mastodon API. It’s going to sound like a humblebrag when I tell you that I got cross-posting working in almost no time at all, but that’s not a testament to my coding prowess (I’m really not very good), but rather a testament to the Mastodon API, which was a joy to work with.

  1. On your Mastodon instance, go to /settings/applications.
  2. Click on New Application.
  3. Fill in the details about your website and select write:statuses (and probably write:media) from the Scopes list.
  4. Copy Your access token to use in API calls.
  5. Write some sloppy code (in my case, PHP that uses CURL).

I did hit a wall when it came to posting images. That took me a while to get working, and I couldn’t figure out why. Was it something at Mastodon’s end while it was struggling under the influx of new users? As it turns out, no. It was entirely down to me being an idiot. (You know that situation where you’re working on a problem for ages and you’ve become convinced it’s an extremely gnarly rocket-science problem, but then turns out to be something stupid like a typo? Yeah. That.)

Then there’s the whole question of how to receive replies, likes, and reboosts from Mastodon here on my own site. Luckily, that was super easy, thanks to Brid.gy. One click and I was done. I love Brid.gy!

Take this note, for example. There’s a version on Twitter and a version on Mastodon. The original version on my own site gets responses from both places.

If I’m replying to a response on Twitter, I do not syndicate that to Mastodon.

Likewise, if I’m replying to a response on Mastodon, I do not syndicate that to Twitter.

Oh, one thing worth mentioning: if you’re sending a reply to something on Mastodon using the API, there’s an in_reply_to_id field for you to provide. But you should also include the full @username@instance of the person you’re replying to at the beginning of the message to ensure that it’s displayed as a reply rather than showing up as a regular post. Note the difference between this note on my site and its syndicated version on Mastodon.

Anyway, now I’m posting to Mastodon, but I’m doing it through the the interface of my own website. Which brings me to that checkbox in Mastodon’s profile settings:

This is a bot account

The help text reads:

Signal to others that the account mainly performs automated actions and might not be monitored

If I were doing the automatic cross-posting from RSS, I’d definitely tick that box. But as I’m making a conscious decision whenever I syndicate to Mastodon, I think I’m going to leave that checkbox unticked.

My cross-posting is not automated and I’m very much monitoring my Mastodon account …because I’m enjoying my Mastodon experience more than I’ve enjoyed anything online for quite some time. Highly recommended!

Negativity bias

When I wrote about my hopes and fears for the View Transitions API, a few people latched on to this sentiment:

If the View Transitions API only works for single page apps, it could be the single worst thing to happen to the web in years.

But I also wrote:

If the View Transitions API works across page navigations, it could be the single best thing to happen to the web in years.

I think it’s worth focusing on that.

Part of the problem is that I gave my hopes and fears an equal airing. But they’re not equally likely.

Take the possibility that the View Transitions API only ships for single page apps, but never ships for regular page transitions. The consequences of that would be big—the API would act as an incentive to build single page apps. But the likelihood of that happening is small. In fact, according to Jake, there’s already an implemention for page transitions in the works at Chrome.

Now what if the View Transitions API ships for pages? The consequences would be equally big—the API would act as an incentive to ditch single page apps and build in a more performant, resilient way. Best of all, the chances of that happening are very large indeed (pretty much a certainty now, given Jake’s update).

So I made a comparison between both of the consequences, which are equally large, but I didn’t make a corresponding comparison of the likelihoods, which are not equally large. Mea culpa!

I should’ve made it clearer that, although the consequences would be really bad if the View Transitions API only supports single page apps, the actual likelihood of that is pretty slim.

That’s probably my negativity bias showing through. (The reason I have a negativity bias is because I am a human. Like, have you ever noticed that if you get feedback on something and 98% of it is positive, you inevitably fixate on the 2%?)

Anyway, the real takeaway here is that if the View Transitions API ships for pages, then the consequences will be really, really good! It would be another nail in the coffin for monolithic JavaScript frameworks slowing down the web. And best of all, the likelihood of this happening is very high!

So let me amend my closing sentences from my previous post:

If the View Transitions API only works for single page apps—which is very unlikely—it could be the single worst thing to happen to the web in years.

If the View Transitions API works across page navigations—which is very, very likely—it could be the single best thing to happen to the web in years.

The glass is half full and it’s only going to get fuller. Time to start planning for a turbo-charged web now.

If you’ve got a website with full page navigations, start thinking about how you’ll be able to apply the View Transitions API as a progressive enhancement to improve the user experience.

If you’ve got a single page app, start thinking about how to ditch a whole bunch of uneccessary dependencies to make a more lightweight foundation of HTML instead of JavaScript, and still get all those slick transitions you get in a single page app!

Time for transitions

I am simultaneously very excited and very nervous about the View Transitions API.

You may know it by its former name—Shared Element Transitions. The name change is very recent.

I’ve been saying for years that some kind of API like this would be brilliant:

I honestly think if browsers implemented this, 80% of client-rendered Single Page Apps could be done as regular good ol’-fashioned websites.

Miriam Suzanne describes the theory of View Transitions succinctly:

Shared-element transitions are designed to work with standard web navigation across multiple page loads, as well as page transitions in ‘single-page’ apps (often called SPAs).

This all sounds brilliant. But the devil is in the implementation details. Right now, the API only works for single page apps. This is totally understandable. For purely pragmatic reasons, single page apps are a simple use case to solve for. It’s going to take a lot more work to get this API to work for multi-page apps (or as we used to call them, websites).

If we get a View Transitions API that works across page navigations, it could potentially turbo-charge the web. It will act as a disencentive to building single page apps—you’d be able to provide swish transitions without sacrificing performance or resilience at the alter of a heavy-handed JavaScript-only architecture.

But if the API only ever works for single page apps (which is the current situation), then it will act as an incentive to make any kind of website into a single page app, regardless of whether it’s actually the appropriate architecture.

That prospect has me very worried indeed.

I’m making my feelings on this known just in case any of the implementators out there are thinking, “Hey, maybe it’s fine that this API only works for single page apps—I’m sure most people would be happy with that.”

If the View Transitions API works across page navigations, it could be the single best thing to happen to the web in years.

If the View Transitions API only works for single page apps, it could be the single worst thing to happen to the web in years.

Update: Jake says:

We’re currently landing code in Chrome for the MPA version.

Very happy to hear that! It’s already in the spec, but it’s good to hear that the implementation isn’t going to lag too much.

Also, read this follow-up.

Image previews with the FileReader API

I added a “notes” section to this website eight years ago. I set it up so that notes could be syndicated to Twitter. Ever since then, that’s the only way I post to Twitter.

A few months later I added photos to my notes. Again, this would get syndicated to Twitter.

Something’s bothered me for a long time though. I initially thought that if I posted a photo, then the accompanying text would serve as a decription of the image. It could effectively act as the alt text for the image, I thought. But in practice it didn’t work out that way. The text was often a commentary on the image, which isn’t the same as a description of the contents.

I needed a way to store alt text for images. To make it more complicated, it was possible for one note to have multiple images. So even though a note was one line in my database, I somehow needed a separate string of text with the description of each image in a single note.

I eventually settled on using the file system instead of the database. The images themselves are stored in separate folders, so I figured I could have an accompanying alt.txt file in each folder.

Take this note from yesterday as an example. Different sizes of the image are stored in the folder /images/uploaded/19077. Here’s a small version of the image and here’s the original. In that same folder is the alt text.

This means I’m reading a file every time I need the alt text instead of reading from a database, which probably isn’t the most performant way of doing it, but it seems to be working okay.

Here’s another example:

In order to add the alt text to the image, I needed to update my posting interface. By default it’s a little textarea, followed by a file upload input, followed by a toggle (a checkbox under the hood) to choose whether or not to syndicate the note to Twitter.

The interface now updates automatically as soon as I use that input type="file" to choose any images for the note. Using the FileReader API, I show a preview of the selected images right after the file input.

Here’s the code if you ever need to do something similar. I’ve abstracted it somewhat in that gist—you should be able to drop it into any page that includes input type="file" accept="image/*" and it will automatically generate the previews.

I was pleasantly surprised at how easy this was. The FileReader API worked just as expected without any gotchas. I think I always assumed that this would be quite complex to do because once upon a time, it was quite complex (or impossible) to do. But now it’s wonderfully straightforward. Story of the web.

My own version of the script does a little bit more; it also generates another little textarea right after each image preview, which is where I write the accompanying alt text.

I’ve also updated my server-side script that handles the syndication to Twitter. I’m using the /media/metadata/create method to provide the alt text. But for some reason it’s not working. I can’t figure out why. I’ll keep working on it.

In the meantime, if you’re looking at an image I’ve posted on Twitter and you’re judging me for its lack of alt text, my apologies. But each tweet of mine includes a link back to the original note on this site and you will most definitely find the alt text for the image there.

Bugblogging

A while back I wrote a blog post called Web Audio API weirdness on iOS. I described a bug in Mobile Safari along with a hacky fix. I finished by saying:

If you ever find yourself getting weird but inconsistent behaviour on iOS using the Web Audio API, this nasty little hack could help.

Recently Jonathan Aldrich posted a thread about the same bug. He included a link to my blog post. He also said:

Thanks so much for your post, this was a truly pernicious problem!

That warms the cockles of my heart. It’s very gratifying to know that documenting the bug (and the fix) helped someone out. Or, as I put it:

Yay for bugblogging!

Forgive the Germanic compound word, but in this case I think it fits.

Bugblogging doesn’t need to involve a solution. Just documenting a bug is a good thing to do. Recently I documented a bug with progressive web apps on iOS. Before that I documented a bug in Facebook Container for Firefox. When I documented some weird behaviour with the Web Share API in Safari on iOS, I wasn’t even sure it was a bug but Tess was pretty sure it was and filed a proper bug report.

I’ve benefited from other people bugblogging. Phil Nash wrote Service workers: beware Safari’s range request. That was exactly what I needed to solve a problem I’d been having. And then that post about Phil solving my problem helped Peter Rukavina solve a similar issue so he wrote Phil Nash and Jeremy Keith Save the Safari Video Playback Day.

Again, this warmed the cockles of my heart. Bugblogging is worth doing just for the reward of that feeling.

There’s a similar kind of blog post where, instead of writing about a bug, you write about a particular technique. In one way, this is the opposite of bugblogging because you’re writing about things working exactly as they should. But these posts have a similar feeling to bugblogging because they also result in a warm glow when someone finds them useful.

Here are some recent examples of these kinds of posts—tipblogging?—that I’ve found useful:

All three are very handy tips. Thanks, Eric! Thanks, Rich! Thanks, Stephanie!

When should there be a declarative version of a JavaScript API?

I feel like it’s high time I revived some interest in my proposal for button type="share". Last I left it, I was gathering use cases and they seem to suggest that the most common use case for the Web Share API is sharing the URL of the current page.

If you want to catch up on the history of this proposal, here’s what I’ve previously written:

Remember, my proposal isn’t to replace the JavaScript API, it’s to complement it with a declarative option. The declarative option doesn’t need to be as fully featured as the JavaScript API, but it should be able to cover the majority use case. I think this should hold true of most APIs.

A good example is the Constraint Validation API. For the most common use cases, the required attribute and input types like “email”, “url”, and “number” have you covered. If you need more power, reach for the JavaScript API.

A bad example is the Geolocation API. The most common use case is getting the user’s current location. But there’s no input type="geolocation" (or button type="geolocation"). Your only choice is to use JavaScript. It feels heavy-handed.

I recently got an email from Taylor Hunt who has come up with a good litmus test for JavaScript APIs that should have a complementary declarative option:

I’ve been thinking about how a lot of recently-proposed APIs end up having to deal with what Chrome devrel’s been calling the “user gesture/activation budget”, and wondering if that’s a good indicator of when something should have been HTML in the first place.

I think he’s onto something here!

Think about any API that requires a user gesture. Often the documentation or demo literally shows you how to generate a button in JavaScript in order to add an event handler to it in order to use the API. Surely that’s an indication that a new button type could be minted?

The Web Share API is a classic example. You can’t invoke the API after an event like the page loading. You have to invoke the API after a user-initiated event like, oh, I don’t know …clicking on a button!

The Fullscreen API has the same restriction. You can’t make the browser go fullscreen unless you’re responding to user gesture, like a click. So why not have button type="fullscreen" in HTML to encapsulate that? And again, the fallback in non-supporting browsers is predictable—it behaves like a regular button—so this is trivial to polyfill. I should probably whip up a polyfill to demonstrate this.

I can’t find a list of all the JavaScript APIs that require a user gesture, but I know there’s more that I’m just not thinking of. I’d love to see if they’d all fit this pattern of being candidates for a new button type value.

The only potential flaw in this thinking is that some APIs that require a user gesture might also require a secure context (either being served over HTTPS or localhost). But as far as I know, HTML has never had the concept of features being restricted by context. An element is either supported or it isn’t.

That said, there is some prior art here. If you use input type="password" in a non-secure context—like a page being served over HTTP—the browser updates the interface to provide scary warnings. Perhaps browsers could do something similar for any new button types that complement secure-context JavaScript APIs.

Upgrade paths

After I jotted down some quick thoughts last week on the disastrous way that Google Chrome rolled out a breaking change, others have posted more measured and incisive takes:

In fairness to Google, the Chrome team is receiving the brunt of the criticism because they were the first movers. Mozilla and Apple are on baord with making the same breaking change, but Google is taking the lead on this.

As I said in my piece, my issue was less to do with whether confirm(), prompt(), and alert() should be deprecated but more to do with how it was done, and the woeful lack of communication.

Thinking about it some more, I realised that what bothered me was the lack of an upgrade path. Considering that dialog is nowhere near ready for use, it seems awfully cart-before-horse-putting to first remove a feature and then figure out a replacement.

I was chatting to Amber recently and realised that there was a very different example of a feature being deprecated in web browsers…

We were talking about the KeyboardEvent.keycode property. Did you get the memo that it’s deprecated?

But fear not! You can use the KeyboardEvent.code property instead. It’s much nicer to use too. You don’t need to look up a table of numbers to figure out how to refer to a specific key on the keyboard—you use its actual value instead.

So the way that change was communicated was:

Hey, you really shouldn’t use the keycode property. Here’s a better alternative.

But with the more recently change, the communication was more like:

Hey, you really shouldn’t use confirm(), prompt(), or alert(). So go fuck yourself.

Updating Safari

Safari has been subjected to a lot of ire recently. Most of that ire has been aimed at the proposed changes to the navigation bar in Safari on iOS—moving it from a fixed top position to a floaty bottom position right over the content you’re trying to interact with.

Courage.

It remains to be seen whether this change will actually ship. That’s why it’s in beta—to gather all the web’s hot takes first.

But while this very visible change is dominating the discussion, invisible changes can be even more important. Or in the case of Safari, the lack of changes.

Compared to other browsers, Safari lags far behind when it comes to shipping features. I’m not necessarily talking about cutting-edge features either. These are often standards that have been out for years. This creates a gap—albeit an invisible one—between Safari and other browsers.

Jorge Arango has noticed this gap:

I use Safari as my primary browser on all my devices. I like how Safari integrates with the rest of the OS, its speed, and privacy features. But, alas, I increasingly have issues rendering websites and applications on Safari.

That’s the perspective of an end-user. Developers who have to deal with the gap in features are more, um, strident in their opinions. Perry Sun wrote For developers, Apple’s Safari is crap and outdated:

Don’t get me wrong, Safari is very good web browser, delivering fast performance and solid privacy features.

But at the same time, the lack of support for key web technologies and APIs has been both perplexing and annoying at the same time.

Alas, that post also indulges in speculation about Apple’s motives which always feels a bit too much like a conspiracy theory to me. Baldur Bjarnason has more to say on that topic in his post Kremlinology and the motivational fallacy when blogging about Apple. He also points to a good example of critiquing Safari without speculating about motives: Dave’s post One-offs and low-expectations with Safari, which documents all the annoying paper cuts inflicted by Safari’s “quirks.”

Another deep dive that avoids speculating about motives comes from Tim Perry: Safari isn’t protecting the web, it’s killing it. I don’t agree with everything in it. I think that Apple—and Mozilla’s—objections to some device APIs are informed by a real concern about privacy and security. But I agree with his point that it’s not enough to just object; you’ve got to offer an alternative vision too.

That same post has a litany of uncontroversial features that shipped in Safari looong after they shipped in other browsers:

Again: these are not contentious features shipping by only Chrome, they’re features with wide support and no clear objections, but Safari is still not shipping them until years later. They’re also not shiny irrelevant features that “bloat the web” in any sense: each example I’ve included above primarily improving core webpage UX and performance. Safari is slowing that down progress here.

But perhaps most damning of all is how Safari deals with bugs.

A recent release of Safari shipped with a really bad Local Storage bug. The bug was fixed within a day. Yay! But the fix won’t ship until …who knows?

This is because browser updates are tied to operating system updates. Yes, this is just like the 90s when Microsoft claimed that Internet Explorer was intrinsically linked to Windows (a tactic that didn’t work out too well for them in the subsequent court case).

I don’t get it. I’m pretty sure that other Apple products ship updates and fixes independentally of OS releases. I’m sure I’ve received software updates for Keynote, Garage Band, and other pieces of software made by Apple.

And yet, of all the applications that need a speedy update cycle—a user agent for the World Wide Web—Apple’s version is needlessly delayed by the release cycle of the entire operating system.

I don’t want to speculate on why this might be. I don’t know the technical details. But I suspect that the root cause might not be technical in nature. Apple have always tied their browser updates to OS releases. If Google’s cardinal sin is avoiding anything “Not Invented Here”, Apple’s downfall is “We’ve always done it this way.”

Evergreen browsers update in the background, usually at regular intervals. Firefox is an evergreen browser. Chrome is an evergreen browser. Edge is an evergreen browser.

Safari is not an evergreen browser.

That’s frustrating when it comes to new features. It’s unforgivable when it comes to bugs.

At least on Apple’s desktop computers, users have the choice to switch to a different browser. But on Apple’s mobile devices, users have no choice but to use Safari’s rendering engine, bugs and all.

As I wrote when I had to deal with one of Safari’s bugs:

I wish that Apple would allow other rendering engines to be installed on iOS devices. But if that’s a hell-freezing-over prospect, I wish that Safari updates weren’t tied to operating system updates.