Journal tags: tags

11

Associative trails

Matt wrote recently about how different writers keep notes:

I’m also reminded of how writers I love and respect maintain their own reservoirs of knowledge, complete with migratory paths down from the mountains.

I have a section of my site called “notes” but the truth is that every single thing I post on here—whether it’s a link, a blog post, or anything else—is really a “note to self.”

When it comes to retrieving information from this online memex of mine, I use tags. I’ve got search forms on my site, but usually I’ll go to the address bar in my browser instead and think “now, what would past me have tagged that with…” as I type adactio.com/tags/... (or, if I want to be more specific, adactio.com/links/tags/... or adactio.com/journal/tags/...).

It’s very satisfying to use my website as a back-up brain like this. I can get stuff out of my head and squirreled away, but still have it available for quick recall when I want it. It’s especially satisfying when I’m talking to someone else and something they say reminds me of something relevant, and I can go “Oh, let me send you this link…” as I retrieve the tagged item in question.

But I don’t think about other people when I’m adding something to my website. My audience is myself.

I know there’s lots of advice out there about considering your audience when you write, but when it comes to my personal site, I’d find that crippling. It would be one more admonishment from the inner critic whispering “no one’s interested in that”, “you have nothing new to add to this topic”, and “you’re not quailified to write about this.” If I’m writing for myself, then it’s easier to have fewer inhibitions. By treating everything as a scrappy note-to-self, I can avoid agonising about quality control …although I still spend far too long trying to come up with titles for posts.

I’ve noticed—and other bloggers have corroborated this—there’s no correlation whatsover between the amount of time you put into something and how much it’s going to resonate with people. You might spend days putting together a thoroughly-researched article only to have it met with tumbleweeds when you finally publish it. Or you might bash something out late at night after a few beers only to find it on the front page of various aggregators the next morning.

If someone else gets some value from a quick blog post that I dash off here, that’s always a pleasant surprise. It’s a bonus. But it’s not my reason for writing. My website is primarily a tool and a library for myself. It just happens to also be public.

I’m pretty sure that nobody but me uses the tags I add to my links and blog posts, and that’s fine with me. It’s very much a folksonomy.

Likewise, there’s a feature I added to my blog posts recently that is probably only of interest to me. Under each blog post, there’s a heading saying “Previously on this day” followed by links to any blog posts published on the same date in previous years. I find it absolutely fascinating to spelunk down those hyperlink potholes, but I’m sure for anyone else it’s about as interesting as a slideshow of holiday photos.

Matt took this further by adding an “on this day” URL to his site. What a great idea! I’ve now done the same here:

adactio.com/archive/onthisday

That URL is almost certainly only of interest to me. And that’s fine.

Bookshop

Back at the start of the (first) lockdown, I wrote about using my website as an outlet:

While you’re stuck inside, your website is not just a place you can go to, it’s a place you can control, a place you can maintain, a place you can tidy up, a place you can expand. Most of all, it’s a place you can lose yourself in, even if it’s just for a little while.

Last week was eventful and stressful. For everyone. I found myself once again taking refuge in my website, tinkering with its inner workings in the way that someone else would potter about in their shed or take to their garage to strip down the engine of some automotive device.

Colly drew my attention to Bookshop.org, newly launched in the UK. It’s an umbrella website for independent bookshops to sell through. It’s also got an affiliate scheme, much like Amazon. I set up a Bookshop page for myself.

I’ve been tracking the books I’m reading for the past three years here on my own website. I set about reproducing that list on Bookshop.

It was exactly the kind of not-exactly-mindless but definitely-not-challenging task that was perfect for the state of my brain last week. Search for a book; find the ISBN number; paste that number into a form. It’s the kind of task that a real programmer would immediately set about automating but one that I embraced as a kind of menial task to keep me occupied.

I wasn’t able to get a one-to-one match between the list on my site and my reading list on Bookshop. Some titles aren’t available in the online catalogue. For example, the book I’m reading right now—A Paradise Built in Hell by Rebecca Solnit—is nowhere to be found, which seems like an odd omission.

But most of the books I’ve read are there on Bookshop.org, complete with pretty book covers. Then I decided to reverse the process of my menial task. I took all of the ISBN numbers from Bookshop and add them as machine tags to my reading notes here on my own website. Book cover images on Bookshop have predictable URLs that use the ISBN number (well, technically the EAN number, or ISBN-13, but let’s not go down a 927 rabbit hole here). So now I’m using that metadata to pull in images from Bookshop.org to illustrate my reading notes here on adactio.com.

I’m linking to the corresponding book on Bookshop.org using this URL structure:

https://uk.bookshop.org/a/{{ affiliate code }}/{{ ISBN number }}

I realised that I could also link to the corresponding entry on Open Library using this URL structure:

https://openlibrary.org/isbn/{{ ISBN number }}

Here, for example, is my note for The Raven Tower by Ann Leckie. That entry has a tag:

book:ean=9780356506999

With that information I can illustrate my note with this image:

https://images-eu.bookshop.org/product-images/images/9780356506999.jpg

I’m linking off to this URL on Bookshop.org:

https://uk.bookshop.org/a/980/9780356506999

And this URL on Open Library:

https://openlibrary.org/isbn/9780356506999

The end result is that my reading list now has more links and pretty pictures.

Oh, I also set up a couple of shorter lists on Bookshop.org:

The books listed in those are drawn from my end of the year round-ups when I try to pick one favourite non-fiction book and one favourite work of fiction (almost always speculative fiction). The books in those two lists are the ones that get two hearty thumbs up from me. If you click through to buy one of them, the price might not be as cheap as on Amazon, but you’ll be supporting an independent bookshop.

Links, tags, and feeds

A little while back, I switched from using Chrome as my day-to-day browser to using Firefox. I could feel myself getting a bit too comfortable with one particular browser, and that’s not good. I reckon it’s good to shake things up a little every now and then. Besides, there really isn’t that much difference once you’ve transferred over bookmarks and cookies.

Unfortunately I’m being bitten by this little bug in Firefox. It causes some of my bookmarklets to fail on certain sites with strict Content Security Policies (and CSPs shouldn’t affect bookmarklets). I might have to switch back to Chrome because of this.

I use bookmarklets throughout the day. There’s the Huffduffer bookmarklet, of course, for whenever I come across a podcast episode or other piece of audio that I want to listen to later. But there’s also my own home-rolled bookmarklet for posting links to my site. It doesn’t do anything clever—it grabs the title and URL of the currently open page and pre-populates a form in a new window, leaving me to add a short description and some tags.

If you’re reading this, then you’re familiar with the “journal” section of adactio.com, but the “links” section is where I post the most. Here, for example, are all the links I posted yesterday. It varies from day to day, but there’s generally a handful.

Should you wish to keep track of everything I’m linking to, there’s a twitterbot you can follow called @adactioLinks. It uses a simple IFTTT recipe to poll my RSS feed of links and send out a tweet whenever there’s a new entry.

Or you can drink straight from the source and subscribe to the RSS feed itself, if you’re still rocking it old-school. But if RSS is your bag, then you might appreciate a way to filter those links…

All my links are tagged. Heavily. This is because all my links are “notes to future self”, and all my future self has to do is ask “what would past me have tagged that link with?” when I’m trying to find something I previously linked to. I end up using my site’s URLs as an interface:

At the front-end gatherings at Clearleft, I usually wrap up with a quick tour of whatever I’ve added that week to:

adactio.com/links/tags/frontend

Well, each one of those tags also has a corresponding RSS feed:

…and so on.

That means you can subscribe to just the links tagged with something you’re interested in. Here’s the full list of tags if you’re interested in seeing the inside of my head.

This also works for my journal entries. If you’re only interested in my blog posts about frontend development, you might want to subscribe to:

adactio.com/journal/tags/frontend/rss

Here are all the tags from my journal.

You can even mix them up. For everything I’ve tagged with “typography”—whether it’s links, journal entries, or articles—the URL is:

adactio.com/tags/typography

The corresponding RSS feed is:

adactio.com/tags/typography/rss

You get the idea. Basically, if something on my site is a list of items, chances are there’s a corresponding RSS feeds. Sometimes there might even be a JSON feed. Hack some URLs to see.

Meanwhile, I’ll be linking, linking, linking…

Tagdiving

Speaking of URLs…

We were having a discussion in the Clearleft office recently about that perennially-tricky navigation pivot: tags. Specifically, we were discussing how to represent the interface for combinatorial tags i.e. displaying results of items that have been tagged with tag A and tag B.

I realised that this was functionality that I wasn’t even offering on Huffduffer, so I set to work on implementing it. I decided to dodge the interface question completely by only offering this functionality through the browser address bar. As a fairly niche, power-user feature, I’m not sure it warrants valuable interface real estate—though I may revisit that challenge later.

I can’t use the + symbol as a tag separator because Huffduffer allows spaces in tags (and spaces are converted to pluses in URLs), so I’ve settled on commas instead.

For example, there are plenty of items tagged with “music” (/tags/music) and plenty of items tagged with “science” (/tags/science) but there’s only a handful of items tagged with both “music” and “science” (/tags/music,science).

This being Huffduffer, where just about every page has corresponding JSON, RSS and Atom representations, you can also subscribe to the podcast of everything tagged with both “music” and “science” (/tags/music,science/rss).

There’s an OR operator as well; the vertical pipe symbol. You can view the 60 items tagged with “html5”, the 14 items tagged with “css3”, or the 66 items tagged with either “html5” or “css3” (/tags/html5|css3).

Wait a minute …66 items? But 60 plus 14 equals 74, not 66!

The discrepancy can be explained by the 8 items tagged with both “css3” and “html5” (/tags/html5,css3).

The AND and OR operators can be combined, so you can find items tagged with either “science” or “religion” that are also tagged with “politics” (/tags/science|religion,politics).

While it’s fun to do this in the browser’s address bar, I think the real power is in the way that the corresponding podcast allows you to subscribe to precisely-tailored content. Find just the right combination of tags, click on the RSS link, and you’re basically telling iTunes to automatically download audio whenever there’s something new that matches criteria like:

everything that is tagged with either “science fiction” or “fantasy” or “horror” and is also tagged with “romance” (/tags/science+fiction|fantasy|horror,romance),
everything that is tagged with “google” and is also tagged with either “apple” or “microsoft” or “yahoo” (/tags/google,apple|microsoft|yahoo).

I’m sure there are plenty of intriguing combinations out there. Now I can use Huffduffer’s URLs to go spelunking for audio gems at the most promising intersections of tags.

Small pieces, loosely joined by machine tags

I’ve already described how machine tags on Huffduffer trigger a number of third-party API calls. Tagging something with music:artist=..., book:author=..., film:title=... or any number of similar machine tags will fire off calls to places like Amazon, The New York Times, or Last.fm.

For a while now, I’ve wanted to include Flickr in that list of third-party services but I couldn’t think of an easy way of associating audio files with photos. Then I realised that a mechanism already exists, and it’s another machine tag. Anything on Flickr that’s been tagged with lastfm:event=... will probably be a picture of a musical artist.

So if anything is tagged on Huffduffer with music:artist=..., all I need to do is fire off a call to Last.fm to get a list of that artist’s events using the method artist.getEvents. Once I have the event IDs I can search Flickr for photos that have been machine tagged with those IDs.

There’s just one problem. Last.fm’s API only returns future events for an artist. There’s no method for past events.

Undeterred, I found a RESTful interface that provides the past events of an artist on Last.fm. The format returned isn’t JSON or XML. It’s HTML. It turns out that past events are freely available in the profile for any artist on Last.fm with the identifier last.fm/music/{artist}/+events/{year}. Here, for example, are Salter Cane gigs in 2009: last.fm/music/Salter+Cane/+events/2009.

If only those events were structured in hCalendar! As it is, I have to run through all the links in the document to find the hrefs beginning with the string http://www.last.fm/event/ and then extract the event ID that immediately follows that string.

Once I’ve extracted the event IDs for an artist, I can fire off a search on Flickr using the flickr.photos.search method with a machine_tags parameter (as well as passing the artist name in the text parameter just to be sure).

Here’s an example result in the sidebar on Huffduffer: huffduffer.com/tags/music:artist=Bat+for+Lashes

It’s messy but it works. I guess that’s the dictionary definition of a hack.

Machine tag browsing

After I started rewarding machine tagging on Huffduffer with API calls to Amazon and Last.fm, people started using them quite a bit. But when it came to displaying tag clouds, I wasn’t treating machine tags any differently to other tags. Everything was being displayed in one big cloud.

I decided it would be good to separate out machine tags and display them after displaying “regular” tags. That started me thinking about how best to display machine tags.

One of the best machine tag visualisations I’ve seen so far is Paul Mison’s Flickr machine tag browser, somewhat like the list view in OS X’s Finder. Initially, I tried doing something similar for Huffduffer: a table with three columns; namespace, predicate, and values.

That morphed into a two column layout (predicate and values) with the namespace spanning both columns. The values themselves are still displayed as a cloud to indicate usage.

This is marked up as a table. The namespace is in a th inside the thead. In the tbody, each tr contains a th for the predicate and td for the values.

<table>
 <thead>
  <tr>
   <th colspan="2"><a href="/tags/book">book</a></th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <th><a href="/tags/book:author">author</a></th>
   <td><a href="/tags/book:author=arthur+c.+clarke">arthur c. clarke</a></td>
  </tr>
 </tbody>
</table>

Table markup allows for some nice :hover styles (in browsers that allow :hover styles on more than links). Whenever you hover over a table cell, you are also hovering over a table row and a table. By setting :hover states on all three elements, wayfinding becomes a bit clearer.

table:hover thead th a
table tbody tr:hover th a
table tbody tr td a:hover

See for yourself. I think it’s a pretty sturdy markup and style pattern that I’ll probably use again.

Machine-tagging Huffduffer some more

After I wrote about the hoops I had to jump through to get Amazon’s API to output JSON (via XSLT), Tom detailed a way of avoiding JSON by using XML-RPC. That’s very kind of him but the truth is that:

I like dealing with JSON and
the XSL transformation is done by Amazon, not me; that wouldn’t be the case if I used XML-RPC.

Anyway, having successfully created a Huffduffer-Amazon bridge using machine tags, I thought I’d do a little more hacking. Instead of restricting the mashup love to Amazon, I figured that Last.fm would be the perfect place to pull in information for anything tagged with the music namespace.

Last.fm has quite a full-featured API and yes, it can output JSON. To start with, I’m using the artist.getInfo method for anything tagged with music:artist=..., music:singer=... or music:band=.... Here are some examples:

I’m pulling a summary of the artist’s bio, a list of similar artists and a picture of the artist in question. For maximum effect, view in Safari, the browser with the finest implementation of CSS3’s box-shadow property.

Nice as Last.fm’s API is, it’s not without its quirks. Like most APIs, the methods are divided into those that require authentication (anything of a sensitive nature) and those that don’t (publicly available information). The method user.getInfo requires authentication. Yet, every piece of information returned by that method is available on the public profile.

So when I wanted to find a Last.fm user’s profile picture—having figured out through Google’s Social Graph API when someone on Huffduffer has a Last.fm account—it made far more sense for me to use hKit to parse the microformatted public URL than to use the API method.

Just over two years ago, Drew delivered a superb presentation called Can Your Website Be Your API? In some situations, the answer is definitely “Yes.”

Update: It all ties together, as Julian explains on Twitter:

@adactio ha, I went to Drew’s presentation you mentioned on your blog; it made me add microformats to Last.fm in the first place :D

Machine-tagging Huffduffer

Over the weekend I was looking at the latest additions to Huffduffer. I noticed that Xavier Roy was using machine tags to tag a reading by Richard Dawkins. What an excellent idea!

I set aside a little time to do a little hacking with Amazon’s API. Now you can tag stuff on Huffduffer with machine tags like book:author=steven johnson, book:title=the invention of air or music:artist=my morning jacket. Other namespaces are film and movie. Anything matching that pattern will trigger a search on Amazon and display a list of results.

Amazon’s API was one of the first I ever messed about with, first on The Session and later on Adactio Elsewhere. There are things I really like about it and things I really dislike.

I dislike the fact that there’s no option to receive JSON instead of XML. However, one of the things I like is the option to pass the URL of an XSL file to transform the XML (I wish more APIs offered that service). So even though JSON isn’t officially offered, it’s perfectly feasible to generate JSON from the combination of XML + XSL. That’s what I did for the Huffduffer hacking—I find it a lot easier to deal with JSON than XML in PHP5. If you fancy doing something similar, help yourself to my XSL file. It’s very basic but it could make a decent starting point.

But the thing I dislike the most about the Amazon API is the documentation. It’s not that there’s a lack of documentation. Far from it. It’s just not organised very well. I find it very hard to get the information I need, even when I know that the information is there somewhere. Flickr still leads the pack when it comes to API documentation. Amazon would do well to take a leaf out of Flickr’s documentation book (hope you’re listening, Jeff).

Welcome to the machine tag

At the same time that Flickr are demonstrating idiocy in the human resources department, they continue to do so some very cool stuff behind the scenes.

Aaron has been walking through some new API methods over on the Flickr code blog, quoting something I said in a chat with Steve Ivy:

something:somethingelse=somethingspecific

…which I don’t even remember saying but the shoe fits.

There’s something about the mix of rigidity and haphazardness in machine tags that appeals to me. While they all share the same structure, everyone is free to invent their own usage. If machine tags were required to go through a specification process we would have event:lastfm=... and event:upcoming=... instead of lastfm:event=... and upcoming:event=... but really, it simply doesn’t matter as long as people are actually doing the tagging.

With the introduction of these new API methods, it looks like there’s room to build more finely-tuned apps to pivot around namespaces, predicates and values.

Paul Mison has written an desktop-like machine tag browser which shows at a glance just how many different machine tag namespaces are out there. Quite a few pictures have been tagged with adactio:post=... since I first introduced the idea.

Machine Tags of Loving Grace

One of the highlights of Refresh Edinburgh for me was listening to Dan Champion give a presentation on his new site, Revish. He talked through the motivation, planning and production of the site. This was an absolute joy to listen to and it was filled with very valuable practical advice.

Revish is a book review site with a heavy dollop of social interaction. Even in its not-quite-finished state, it’s pushing all the right buttons with me:

The markup is clean, semantic and valid.
The layout is uncluttered and flexible.
The URL structure is logical.
The data is available through microformats, RSS and an API.

There’s some really smart stuff going on with the sign-up process. If your chosen username matches a Flickr username, it automatically grabs the buddy icon. At the sign-up stage you also have the option of globally disabling any Ajax on the site—an accessibility option that I advocate in my book. Truth be told, there isn’t yet any Ajax on the site but the availability of this option shows a lot of forethought.

Also at the sign-up stage, there’s a quick’n’dirty auto-discovery of contacts wherever there’s overlap with Revish usernames and your Flickr contacts. This is very cool—one small step toward portable social networks.

One of the features dovetails nicely with Richard’s recent discussion about machine tags ISBNs. If you tag a picture of a book on Flickr with book:isbn=[ISBN number], that picture will then show up on the corresponding Revish page. You can see it in action on the page for Bulletproof Ajax.

Oh, and don’t worry about whether a book has any reviews on Revish yet: the site uses Amazon’s API to pull in the basic book info. As long as a book has an ISBN, it has a page on Revish. So the Revish page for a book can effectively become a mashup of Amazon details and Flickr pictures (just take a look at the page for John’s new microformats book).

I like this format for machine tagging information related to books. As pointed out in a comment on Richard’s post, this opens up the way for plenty of other tagging like book:title="[book title]" and book:author="[author name]".

I’ve started to implement this machine tag format here. If you look at my last post—which has a whole list of books—you’ll see that I’ve tagged the post with a bunch of machine tags in the book:isbn format. By making a quick call to Amazon, I can pull in some information on each book. For now I’m just displaying a small cover image with a link through to the Amazon page.

That last entry is a bit of an extreme example; I’m assuming that most of the time I’ll be just adding one book machine tag to a post at most, probably to accompany a review.

Machine tags (or triple tags) is still a relatively young idea. Most of the structures so far have been emergent, like Upcoming and Last.fm’s event tags and my own blog post machine tags. There’s now a site dedicated to standardising on some namespaces—MachineTags.org has a blog, a wiki and a mailing list. Right now, the wiki has pages for existing conventions like geo tagging and drafts for events and book tagging. This will be an interesting space to watch.

Ghost in the Machine Tags

Richard has some very nifty ideas up his sleeve for the next iteration of his site. Some of these are design-related and some are technical. He just gave a peek into the technical side of things by explaining how he’s using tags to tie content together. Not just any old tags, mind: machine tags.

You may remember that Flickr rolled out machine tags a while back. That’s their name for what’s basically tripletags; tags that take the form of namespace:predicate=value. There’s some tight integration between Upcoming and Flickr using the machine tag upcoming:event=[ID]. You can see a looser coupling (one way rather than bi-directional) in the recently-updated events section of Last.fm which uses lastfm:event=[ID]. As an example, take a look at the page for a Low Lows concert I went to and took pictures of.

Richard is making use of machine tagging to associate his Flickr pictures with his blog posts. He’s also planning to use Amazon’s API to associate ISBN numbers with blog posts, raising the question of which namespace to use:

We therefore need a triple-tag version of the ISBN tag, and here’s my suggestion: iso:isbn=0713998393. ISBN is a standard recognised by the International Organisation for Standardization (ISO) so I thought it made a certain sense for ISO to be the namespace. Other standardised entities could be tagged in a similar way, such as iso:issn=15340295.

Seems like a sound idea to me. I might experiment with machine tagging reviews here in that way and then pulling in complementary information from Amazon.

But that’s for another day. For now, I’ve gone ahead and integrated Flickr machine tagging here… but this works from the opposite direction. Instead of tagging my blog posts with flickr:photo=[ID], I’m pulling in any photos on Flickr tagged with adactio:post=[ID].

Now, I’ve already been integrating Flickr pictures with my blog posts using regular “human” tags, but this is a bit different. For a start, to see the associations using the regular tags, you need to click a link (then the Hijax-y goodness takes over and shows any of my tagged photos without a page refresh). Also, this searches specifically for any of my photos that share a tag with my blog post. If I were to run a search on everyone’s photos, the amount of false positives would get really high. That’s not a bug; it’s a feature of the gloriously emergent nature of human tagging.

For the machine tagging, I can be a bit more confident. If a picture is tagged with adactio:post=1245, I can be pretty confident that it should be associated with http://adactio.com/journal/1245. If any matches are found, thumbnails of the photos are shown right after the blog post: no click required.

I’m not restricting the search to just my photos, either. Any photos tagged with adactio:post=[ID] will show up on http://adactio.com/journal/[ID]. In a way, I’m enabling comments on all my posts. But instead of text comments, anyone now has the ability to add photos that they think are related to a blog post of mine. Remember, it doesn’t even need to be your Flickr picture that you’re machine tagging: you can also machine tag photos from your contacts or anyone else who is allowing their pictures to be tagged.

I realise that I’m opening myself up for a whole new kind of spam. But any kind of spam that requires namespaced tagging on a third-party site is pretty dedicated. If someone actually goes to that much effort to put a thumbnail of an inappropriate image at the end of one of my blog posts, I probably wrote something particularly inflammatory in that post—which would make the associated thumbnail a valid comment, I guess.

Here are some examples of posts I’ve been machine tagging on Flickr:

Once again, like Upcoming and Last.fm, these are event-based. But the machine tagging would work equally well for location-based posts. So when I go up to Scotland next week and blog about it, I (or you or anybody) can then go to Flickr, find some nice pictures of Edinburgh and using the adactio namespace, associate the pictures with the blog post.

It’s a strange mixture of RESTful URLs here and taggable objects there.

If nothing else, this will be an interesting experiment. Machine tags don’t have the low barrier to entry of regular tagging but they aren’t as complex as something like RDF. It might be that they hit the sweet spot between accuracy and ease of use.

Oh, and if you find any Flickr pictures related to this blog post, tag them with adactio:post=1274.