Journal tags: bbc

BBC feedback

I just filled out this form on the BBC website. Here’s what I wrote, based on this open letter to the BBC Upper Management and Editorial Staff.

What is your complaint about?

BBC website or apps

Which website or app is your complaint about?

BBC News website

Please give the URL, or name of the app

https://www.bbc.co.uk/news/uk-england-57853385

Are you contacting us about a previous complaint?

No

Select the best category to describe your complaint

Standards of interviewing/presenting

What is the subject of your complaint?

Inaccurate reporting and unreliable source

Please enter your complaint

The article is based on a single self-selected study of 80 individuals sourced from Get The L Out, a group who, prior to the survey, were already united by anti-trans views.

This study breaks the BBC’s own guidelines about using surveys as sources for claims in coverage, as it is self-selected, with a small sample size and a clear bias held by those self-selected to respond.

The article dangerously frames this as a widespread issue, whilst simultaneously acknowledging that there is no actual evidence to that effect outside of isolated claims and cherry-picked individual cases.

The article routinely implies that transgender women are not women, uncritically quoting people who call transgender women men without at any point clarifying that this is ignoring their legal status as women in the UK.

Design sprints on the Clearleft podcast

The sixth episode of the Clearleft podcast is now live: design sprints!

It comes in at just under 24 minutes, which feels just about right to me. Once again, it’s a dive into one topic that asks “What is this?”, “What does this mean?”, and “Where did this come from?”

I could’ve invited just about any of the practitioners at Clearleft to join me on this one, but I settled on Chris, who’s always erudite and sharp.

I also asked ex-Clearleftie Jerlyn to have a chat. You’ll notice that’s been a bit of a theme on the Clearleft podcast; asking people who used to work at Clearleft to share their thoughts. I’d quite like to do at least an episode—maybe even a whole season—featuring ex-Clearlefties exclusively. So many great people have worked at the agency over the years, Jerlyn being a prime example.

I’d also like to do an episode some time with the regular contractors we’ve worked with at Clearleft. On this episode, I asked the super-smart Tom Prior to join me.

I recorded those three chats over the past couple of weeks. And it was kind of funny how there was, of course, a looming presence over the topic of design sprints: Jake Knapp. I had sent him an email too but I got an auto-responder saying that he was super busy and would take a while to respond. So I kind of mentally wrote it off.

I spent last week assembling and editing the podcast with the excellent contributions from Jerlyn, Chris, and Tom. But it did feel a bit like Waiting For Godot the way that Jake’s book was being constantly referenced.

Then, on the weekend, Godot showed up.

Jake said he’d have time for a chat on Wednesday. Aargh! That’s the release date for the podcast! I don’t suppose Monday would work?

Very graciously, Jake agreed to a Monday chat (at an ungodly early hour in his time zone). I got an excellent half hour of material straight from the horse’s mouth—a very excitable and fast-talking horse, too.

That left me with just a day to work the material into the episode! I felt like a journalist banging on the keyboard at midnight, ready to run into the printing room shouting “Stop the press!” …although I’m sure the truth is that nobody but me would notice if an episode were released a little late.

Anyway, it all got done in the end and I think it turned out pretty great!

Have a listen for yourself and see what you make of it.

This was the final episode of the first season. I’ll now take a little break from podcasting as I plot and plan for the next season. Watch this space! …and, y’know, subscribe to the podcast.

Overlay gap

I think a lot about Danielle’s talk at Patterns Day last year.

Around about the six minute mark she starts talking about gaps and overlaps.

Gaps are where hidden complexity live. If we don’t have a category to cover it, in effect it becomes invisible. But that doesn’t mean it’s not there. Unidentified gaps cause inconsistency and confusion.

Overlaps occur when two separate categories encompass some of the same areas of responsibility. They cause conflict, duplication of effort, and unnecessary friction.

This is the bit I keep thinking about. It’s such an insightful lens to view things through. On just about any project, tensions are almost always due to either gaps (“I thought someone else was doing that”) or overlaps (“Oh, you’re doing that? I thought we were doing that”).

When I was talking to Gerry on his new podcast recently, we were trying to figure out why web performance is in such a woeful state. I mused that there may be a gap. Perhaps designers think it’s a technical problem and developers think it’s a design problem. I guess you could try to bridge this gap by having someone whose job is to focus entirely on performance. But I suspect the better—but harder—solution is to create a shared culture of performance, of the kind Lara wrote about in her book:

Performance is truly everyone’s responsibility. Anyone who affects the user experience of a site has a relationship to how it performs. While it’s possible for you to single-handedly build and maintain an incredibly fast experience, you’d be constantly fighting an uphill battle when other contributors touch the site and make changes, or as the Web continues to evolve.

I suspect there’s a similar ownership gap at play when it comes to the ubiquitous obtrusive overlays that are plastered on so many websites these days.

Kirill Grouchnikov recently published a gallery of screenshots showcasing the beauty of modern mobile websites:

There are two things common between the websites in these screenshots that I took yesterday.

  1. They are beautifully designed, with great typography, clear branding, all optimized for readability.
  2. I had to install Firefox, Adblock Plus and uBlock Origin, as well as manually select and remove additional elements such as subscription overlays.

The web can be beautiful. Except it’s not right now.

How is this dissonance possible? How can designers and developers who clearly care about the user experience be responsible for unleashing such user-hostile interfaces?

PM/Legal/Marketing made me do it

I get that. But surely the solution can’t be to shrug our shoulders, pass the buck, and say “not my job.” Somebody designed each one of those obtrusive overlays. Somebody coded up each one and pushed them into production.

It’s clear that this is a problem of communication and understanding, rather than a technical problem. As always. We like to talk about how hard and complex our technical work is, but frankly, it’s a lot easier to get a computer to do what you want than to convince a human. Not least because you also need to understand what that other human wants. As Danielle says:

Recognising the gaps and overlaps is only half the battle. If we apply tools to a people problem, we will only end up moving the problem somewhere else.

Some issues can be solved with better tools or better processes. In most of our workplaces, we tend to reach for tools and processes by default, because they feel easier to implement. But as often as not, it’s not a technology problem. It’s a people problem. And the solution actually involves communication skills, or effective dialogue.

So let’s say it is someone in the marketing department who is pushing to have an obtrusive newsletter sign-up form get shoved in the user’s face. Talk to them. Figure out what their goals are—what outcome are they hoping to get to. If they don’t seem to understand the user-experience implications, talk to them about that. But it needs to be a two-way conversation. You need to understand what they need before you start telling them what you want.

I realise that makes it sound patronisingly simple, and I know that in actuality it’s a Sisyphean task. It may be that genuine understanding between people is the wickedest of design problems. But even if this problem seems insurmountable, at least you’d be tackling the right problem.

Because the web can’t survive like this.

Offline itineraries with service workers

The Trivago website is a progressive web app. That means it

  1. is served over HTTPS,
  2. has a web app manifest JSON file, and it
  3. has a service worker script.

The service worker provides an opportunity for a nice bit of fun branding—if you lose your internet connection, the site provides a neat little maze game you can play. Cute!

That’s a fairly simple example of how service workers can enhance the user experience when the dreaded offline situation arises. But it strikes me that the travel industry is the perfect place to imagine other opportunities for offline enhancements.

Travel sites often provide itineraries—think airlines, trains, or hotels. The itineraries consist of places, times, and contact information. This is exactly the kind of information that you might find yourself trying to retrieve in an emergency situation, like maybe in a cab on the way to the airport or train station. Perhaps you’re stuck in traffic, in a tunnel. Or maybe you don’t have a data plan for the country you’re currently in. Either way, wouldn’t it be great if you could hit the website for your airline or hotel and get your itinerary, even if you’re offline?

Alright, let’s think this through…

Let’s assume that an individual itinerary has its own URL. That URL is a web page of information, mostly text, with perhaps an image or two (like a map). Now when you make your booking, let’s have the service worker cache that URL (and its assets) for offline access.
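
In the service worker, that step could use the Cache API. Here’s a minimal sketch, assuming a cache called “itineraries”; the function name and the asset URL are my own invention:

// A rough sketch: stash an itinerary page (and an asset or two) in a cache.
// The cache name and the asset URL are invented for illustration.
function cacheItinerary(itineraryUrl) {
  return caches.open('itineraries')
    .then(cache => cache.addAll([
      itineraryUrl,
      itineraryUrl + '/map.png' // whatever assets the page actually needs
    ]));
}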

Hmm …but there’s a good chance that the device you make the booking on is not the same device that you’d have with you out and about. Because caches are local to the browser, that’s a problem.

Okay, but most of these kinds of sites have some kind of log-in mechanism. So we could update the log-in flow a bit: when a user logs in, check to see if they have any itineraries assigned to them, and if they do, fire off an event to the service worker (using postMessage) to cache the URLs of the itineraries.
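
Here’s a rough sketch of that flow. The message format is my own invention; it’s just the shape of the idea, not a definitive implementation:

// On the page, after a successful log-in:
if (navigator.serviceWorker && navigator.serviceWorker.controller) {
  navigator.serviceWorker.controller.postMessage({
    type: 'cacheItineraries', // hypothetical message type
    urls: ['/itineraries/abc123'] // this user's itinerary URLs
  });
}

// In the service worker: listen for that message and cache the URLs.
self.addEventListener('message', event => {
  if (event.data && event.data.type === 'cacheItineraries') {
    event.waitUntil(
      caches.open('itineraries')
        .then(cache => cache.addAll(event.data.urls))
    );
  }
});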

Now that the itineraries are cached, the final step is to create a custom offline page. As well as the usual “Sorry, the internet’s down” message, we can say “Sorry, the internet’s down …but here are your itineraries”. (This is kind of like the pattern you see on blogs like mine, Ethan’s, or Mike’s—a custom offline page that lists cached URLs of articles you’ve previously visited).
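
In the service worker’s fetch handler, that might look something like this. The /offline URL is hypothetical and would need to be cached when the service worker installs:

// In the service worker: try the network, fall back to the cache,
// and finally fall back to the custom offline page.
self.addEventListener('fetch', event => {
  if (event.request.mode === 'navigate') {
    event.respondWith(
      fetch(event.request)
        .catch(() => caches.match(event.request)
          .then(response => response || caches.match('/offline'))
        )
    );
  }
});

// On the offline page itself: list whatever itineraries are in the cache.
caches.open('itineraries')
  .then(cache => cache.keys())
  .then(requests => {
    requests.forEach(request => {
      // append a link to request.url to the page
    });
  });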

That’s just one pattern off the top of my head. It’s fun to imagine the different ways that service workers could be used to enhance the experience of just about any site, but they seem particularly relevant to travel sites—dodgy internet connections and travelling go hand-in-hand. At Clearleft, we’ve been working with quite a few travel-related clients lately so that’s why these scenarios are on my mind: booking holidays, flights, and so on. But, as I’ve said before and I’ll say again, every website can benefit from becoming a progressive web app.

Unlabelled search fields

Adam Silver is writing a book on forms—you may be familiar with his previous book on maintainable CSS. In a recent article (that for some reason isn’t on his blog), he looks at markup patterns for search forms and advocates that we should always use a label. I agree. But for some reason, we keep getting handed designs that show unlabelled search forms. And no, a placeholder is not a label.

I had a discussion with Mark about this the other day. The form he was marking up didn’t have a label, but it did have a button with some text that would work as a label:

<input type="search" placeholder="…">
<button type="submit">
Search
</button>

He was wondering if there was a way of using the button’s text as the label. I think there is. Using aria-labelledby like this, the button’s text should be read out before the input field:

<input aria-labelledby="searchtext" type="search" placeholder="…">
<button type="submit" id="searchtext">
Search
</button>

Notice that I say “think” and “should.” It’s one thing to figure out a theoretical solution, but only testing will show whether it actually works.

The W3C’s WAI tutorial on labelling content gives an example that uses aria-label instead:

<input type="text" name="search" aria-label="Search">
<button type="submit">Search</button>

It seems a bit of a shame to me that the label text is duplicated in the button and in the aria-label attribute (and being squirrelled away in an attribute, it runs the risk of metacrap rot). But they know what they’re talking about so there may well be very good reasons to prefer duplicating the value with aria-label rather than pointing to the value with aria-labelledby.

I thought it would be interesting to see how other sites are approaching this pattern—unlabelled search forms are all too common. All the markup examples here have been simplified a bit, removing class attributes and the like…

The BBC’s search form does actually have a label:

<label for="orb-search-q">
Search the BBC
</label>
<input id="orb-search-q" placeholder="Search" type="text">
<button>Search the BBC</button>

But that label is then hidden using CSS:

position: absolute;
height: 1px;
width: 1px;
overflow: hidden;
clip: rect(1px, 1px, 1px, 1px);

That CSS—as pioneered by Snook—ensures that the label is visually hidden but remains accessible to assistive technology. Using something like display: none would hide the label for everyone.

Medium wraps the input (and icon) in a label and then gives the label a title attribute. Like aria-label, a title attribute should be read out by screen readers, but it has the added advantage of also being visible as a tooltip on hover:

<label title="Search Medium">
  <span class="svgIcon"><svg></svg></span>
  <input type="search">
</label>

This is also what Google does on what must be the most visited search form on the web. But the W3C’s WAI tutorial warns against using the title attribute like this:

This approach is generally less reliable and not recommended because some screen readers and assistive technologies do not interpret the title attribute as a replacement for the label element, possibly because the title attribute is often used to provide non-essential information.

Twitter follows the BBC’s pattern of having a label but visually hiding it. They also have some descriptive text for the icon, and that text gets visually hidden too:

<label class="visuallyhidden" for="search-query">Search query</label>
<input id="search-query" placeholder="Search Twitter" type="text">
<span class="search-icon">
  <button type="submit" class="Icon" tabindex="-1">
    <span class="visuallyhidden">Search Twitter</span>
  </button>
</span>

Here’s their CSS for hiding those bits of text—it’s very similar to the BBC’s:

.visuallyhidden {
  border: 0;
  clip: rect(0 0 0 0);
  height: 1px;
  margin: -1px;
  overflow: hidden;
  padding: 0;
  position: absolute;
  width: 1px;
}

That’s exactly the CSS recommended in the W3C’s WAI tutorial.

Flickr have gone with the aria-label pattern as recommended in that W3C WAI tutorial:

<input placeholder="Photos, people, or groups" aria-label="Search" type="text">
<input type="submit" value="Search">

Interestingly, neither Twitter nor Flickr are using type="search" on the input elements. I’m guessing this is probably because of frustrations with trying to undo the default styles that some browsers apply to input type="search" fields. Seems a shame though.

Instagram also doesn’t use type="search" and makes no attempt to expose any kind of accessible label:

<input type="text" placeholder="Search">
<span class="coreSpriteSearchIcon"></span>

Same with Tumblr:

<input tabindex="1" type="text" name="q" id="search_query" placeholder="Search Tumblr" autocomplete="off" required="required">

…although the search form itself does have role="search" applied to it. Perhaps that helps to mitigate the lack of a clear label?

After that whistle-stop tour of a few of the web’s unlabelled search forms, it looks like the options are:

  • a visually-hidden label element,
  • an aria-label attribute,
  • a title attribute, or
  • some text associated using aria-labelledby.

But that last one needs some testing.

Update: Emil did some testing. Looks like all screen-reader/browser combinations will read the associated text.

Digital Deathwatch

The Deathwatch page on the Archive Team website makes for depressing reading, filled as it is with an ongoing list of sites that are going to be—or have already been—shut down. There are a number of corporations that are clearly repeat offenders: Yahoo!, AOL, Microsoft. As Aaron said last year when speaking of Museums and the Web:

Whether or not they asked to be, entire communities are now assuming that those companies will not only preserve and protect the works they’ve entrusted or the comments and other metadata they’ve contributed, but also foster their growth and provide tools for connecting the threads.

These are not mandates that most businesses take up willingly, but many now find themselves being forced to embrace them because to do otherwise would be to invite a betrayal of the trust of their users, from which they might never recover.

But occasionally there is a glimmer of hope buried in the constant avalanche of shit from these deletionist third-party custodians of our collective culture. Take Google Video, for example.

Earlier this year, Google sent out emails to Google Video users telling them the service was going to be shut down and their videos deleted as of April 29th. There was an outcry from people who rightly felt that Google were betraying their stated goal to organize the world’s information and make it universally accessible and useful. Google backtracked:

Google Video users can rest assured that they won’t be losing any of their content and we are eliminating the April 29 deadline. We will be working to automatically migrate your Google Videos to YouTube. In the meantime, your videos hosted on Google Video will remain accessible on the web and existing links to Google Videos will remain accessible.

This gives me hope. If the BBC wish to remain true to their mission to enrich people’s lives with programmes and services that inform, educate and entertain, then they will have to abandon their plan to destroy 172 websites.

There has been a stony silence from the BBC on this issue for months now. Ian Hunter—who so proudly boasted of the planned destruction—hasn’t posted to the BBC blog since writing a follow-up “clarification” that did nothing to reassure any of us.

It could be that they’re just waiting for a nice quiet moment to carry out the demolition. Or maybe they’ve quietly decided to drop their plans. I sincerely hope that it’s the second scenario. But, just in case, I’ve begun to create my own archive of just some of the sites that are on the BBC’s death list.

By the way, if you’re interested in hearing more about the story of Archive Team, I recommend checking out these interviews and talks from Jason Scott that I’ve huffduffed.

Voice of the Beeb hive

Ian Hunter at the BBC has written a follow-up post to his initial announcement of the plans to axe 172 websites. The post is intended to clarify and reassure. It certainly clarifies, but it is anything but reassuring.

He clarifies that, yes, these websites will be taken offline. But, he reassures us, they will be stored …offline. Not on the web. Without URLs. Basically, they’ll be put in a hole in the ground. But it’s okay; it’s a hole in the ground operated by the BBC, so that’s alright then.

The most important question in all of this is why the sites are being removed at all. As I said, the BBC’s online mothballing policy has—up till now—been superb. Well, now we have an answer. Here it is:

But there still may come a time when people interested in the site are better served by careful offline storage.

There may be a parallel universe where that sentence makes sense, but it would have to be one in which the English language is used very differently.

As an aside, the use of language in the “explanation” is quite fascinating. The post is filled with the kind of mealy-mouthed filler words intended to appease those of us who are concerned that this is a terrible mistake. For example, the phrase “we need to explore a range of options including offline storage” can be read as “the sites are going offline; live with it.”

That’s one of the most heartbreaking aspects of all of this: the way that it is being presented as a fait accompli: these sites are going to be ripped from the fabric of the network to be tossed into a single offline point of failure and there’s nothing that we—the license-payers—can do about it.

I know that there are many people within the BBC who do not share this vision. I’ve received some emails from people who worked on some of the sites scheduled for deletion and needless to say, they’re not happy. I was contacted by an archivist at the BBC, for whom this plan was unwelcome news that he first heard about here on adactio.com. The subsequent reaction was:

It was OK to put a videotape on a shelf, but putting web pages offline isn’t OK.

I hope that those within the BBC who disagree with the planned destruction will make their voices heard. For those of us outside the BBC, it isn’t clear how we can best voice our concerns. You could make a complaint to the BBC, though that seems to be intended more for complaints about programme content.

In the meantime, you can download all or some of the 172 sites and plop them elsewhere on the web. That’s not an ideal solution—ideally, the BBC shouldn’t be practicing a deliberate policy of link rot—but it allows us to prepare for the worst.

I hope that whoever at the BBC has responsibility for this decision will listen to reason. Failing that, I hope that we can get a genuine explanation as to why this is happening, because what’s currently being offered up simply doesn’t cut it. Perhaps the truth behind this decision lies not so much with the BBC, but with their technology partner, Siemens, who have a notorious track record for shafting the BBC, charging ludicrous amounts of money to execute the most trivial of technical changes.

If this decision is being taken for political reasons, I would hope that someone at the BBC would have the honesty to say so rather than simply churning out more mealy-mouthed blog posts devoid of any genuine explanation.

Linkrotting

Yesterday’s account of the BBC’s decision to cull 172 websites caused quite a stir on Twitter.

Most people were as saddened as I was, although Emma described my post as being “anti-BBC.” For the record, I’m a big fan of the BBC—hence my disappointment at this decision. And, also for the record, I believe anyone should be allowed to voice their criticism of an organisational decision without being labelled “anti” said organisation …just as anyone should be allowed to criticise a politician without being labelled unpatriotic.

It didn’t take long for people to start discussing an archiving effort, which was heartening. I started to think about the best way to coordinate such an effort; probably a wiki. As well as listing handy archiving tools, it could serve as a place for people to claim which sites they want to adopt, and point to their mirrors once they’re up and running. Marko already has a head start. Let’s do this!

But something didn’t feel quite right.

I reached out to Jason Scott for advice on coordinating an effort like this. He has plenty of experience. He’s currently trying to figure out how to save the more than 500,000 videos that Yahoo is going to delete on March 15th. He’s more than willing to chat, but he had some choice words about the British public’s relationship to the BBC:

This is the case of a government-funded media group deleting. In other words, this is something for The People, and by The People I mean The Media and the British and the rest to go HEY BBC STOP

He’s right.

Yes, we can and should mirror the content of those 172 sites—lots of copies keep stuff safe—but fundamentally what we want is to keep the fabric of the web intact. Cool URIs don’t change.

The BBC has always been an excellent citizen of the web. Their own policy on handling outdated content explains the situation beautifully:

We don’t want to delete pages which users may have bookmarked or linked to in other ways.

Moving a site to a different domain will save the content but it won’t preserve the inbound connections; the hyperlinks that weave the tapestry of the web together.

Don’t get me wrong: I love the Internet Archive. I think they’re doing fantastic work. But let’s face it; once a site only exists in the archive, it is effectively no longer a part of the living web. Yet, whenever a site is threatened with closure, we invoke the Internet Archive as a panacea.

So, yes, let’s make and host copies of the 172 sites scheduled for termination, but let’s not get distracted from the main goal here. What we are fighting against is link rot.

I don’t want the BBC to take any particular action. Quite the opposite: I want them to continue with their existing policy. It will probably take more effort for them to remove the sites than to simply let them sit there. And let’s face it, it’s not like the bandwidth costs are going to be a factor for these sites.

Instead, many believe that the BBC’s decision is politically motivated: the need to be seen to “cut” top level directories, as though cutting content equated to cutting costs. I can’t comment on that. I just know how I feel about the decision:

I don’t want them to archive it. I just want them to leave it the fuck alone.

“What do we want?” “Inaction!”

“When do we want it?” “Continuously!”

Erase and rewind

In the 1960s and ’70s, it was common practice at the BBC to reuse video tapes. Old recordings were taped over with new shows. Some Doctor Who episodes have been lost forever. Jimi Hendrix’s unruly performance on Happening for Lulu would have also been lost if a music-loving engineer hadn’t sequestered the tapes away, preventing them from being over-written.

Except - a VT engineer called Bob Pratt, who really ought to get a medal, was in the habit of saving stuff he liked. Even then, the BBC policy of wiping practically everything was notorious amongst those who’d made it. Bob had the job of changing the heads on 2” VT machines. He’d be in at 0600 before everyone else and have two hours to sort the equipment before anyone else came in. Rock music was his passion, and knowing everything would soon disappear, would spend some of that time dubbing off the thing he liked onto junk tapes, which would disappear under the VT department floor.

To be fair to the BBC, the tape-wiping policy wasn’t entirely down to crazy internal politics—there were convoluted rights issues involving the actors’ union, Equity.

Those issues have since been cleared up. I’m sure the BBC has learned from the past. I’m sure they wouldn’t think of mindlessly throwing away content, when they have such an impressive archive.

And yet, when it comes to the web, the BBC is employing a slash-and-burn policy regarding online content. 172 websites are going to disappear down the memory hole.

Just to be clear, these sites aren’t going to be archived. They are going to be deleted from the web. Server space is the new magnetic tape.

This callous attitude appears to be based entirely on the fact that these sites occupy URLs in top-level directories—repeatedly referred to incorrectly as top level domains on the BBC internet blog—a space that the decision-makers at the BBC are obsessed with.

Instead of moving the sites to, say, bbc.co.uk/archive and employing a little bit of .htaccess redirection, the BBC (and their technology partner, Siemens) would rather just delete the lot.

Martin Belam is suitably flabbergasted by the vandalism of the BBC’s online history:

I’m really not sure who benefits from deleting the Politics 97 site from the BBC’s servers in 2011. It seems astonishing that for all the BBC’s resources, it may well be my blog posts from 5 years ago that provide a more accurate picture of the BBC’s early internet days than the Corporation does itself - and that it will have done so by choice.

Many of the 172 sites scheduled for deletion are currently labelled with a banner across the top indicating that the site hasn’t been updated for a while. There’s a link to a help page with the following questions and answers:

It’ll be interesting to see how those answers will be updated to reflect change in policy. Presumably, the new answers will read something along the lines of “Fuck ‘em.”

Kiss them all goodbye. And perhaps most egregious of all, you can also kiss goodbye to WW2 People’s War:

The BBC asked the public to contribute their memories of World War Two to a website between June 2003 and January 2006. This archive of 47,000 stories and 15,000 images is the result.

I’m very saddened to see the BBC join the ranks of online services that don’t give a damn for posterity. That attitude might be understandable, if not forgivable, from a corporation like Yahoo or AOL, driven by short-term profits for shareholders, as summarised by Connor O’Brien in his superb piece on link rot:

We push our lives into the internet, expecting the web to function as a permanent and ever-expanding collective memory, only to discover the web exists only as a series of present moments, every one erasing the last. If your only photo album is Facebook, ask yourself: since when did a gratis web service ever demonstrate giving a flying fuck about holding onto the past?

I was naive enough to think that the BBC was above that kind of short-sighted approach. Looks like I was wrong.

Sad face.

Speed

From BBC News at 15:07 GMT on Tuesday, March 3rd, Space rock makes close approach:

The object, known as 2009 DD45, thought to be 21-47m (68-152ft) across, raced by our planet at 13:44 GMT on Monday.

From Low Flying Rocks on Twitter at 13:45 GMT on Monday, March 2nd:

2009 DD45 just passed the Earth at 9km/s, approximately seventy-four thousand, eight hundred km away.

You are iPlayer

Now that the BBC iPlayer has been sensibly implemented in Flash, rather than as a proprietary Windows-only app, it turns out to be quite useful. Should I ever miss an episode of or, God forbid, , I can catch up at my leisure.

But there are two major problems with the iPlayer:

  1. It is only available in the UK, a condition imposed by the licence fee system and enforced with IP sniffing.
  2. Programmes are available for seven days. Then they’re gone.

Both of these limitations are unwebby but that second bit of self-crippling is particularly galling as the boffins at the BBC, in their attempt to appear more 2.0, have added a “Share” button to every show on the iPlayer, prompting you to bookmark the current episode on sites like Digg, Del.icio.us and Stumbleupon. I’d be very curious to find out if anyone is actually making use of these links. I don’t know who should be considered more idiotic: the BBC webmonkeys for encouraging people to link to a time-limited URI or the people foolish enough to actually bookmark a resource that has just a week to live.

To quote Sir Timbo: Cool URIs don’t change.

Using socially-authored content to provide new routes through existing content archives

Rob Lee is talking about making the most of user-authored (or user-generated) content. In other words, content written by you, Time’s person of the year.

Wikipedia is the poster child. It’s got lots of WWILFing: What Was I Looking For? (as illustrated by XKCD). Here’s a graph entitled Mapping the distraction that is Wikipedia, generated from a Greasemonkey script that tracks link paths.

Rob works for Rattle Research who were commissioned by the BBC Innovation Labs to do some research into bringing WWILFing to the BBC archive.

Grab the first ten internal links from any Wikipedia article and you will get ten terms that really define that subject matter. The external links at the end of an article provide interesting departure points. How could this be harnessed for BBC news articles? Categories are a bit flat. Semantic analysis is better but it takes a lot of time and resources to generate that for something as large as the BBC archives. Yahoo’s Term Extractor API is a handy shortcut. The terms extracted by the API can be related to pages on Wikipedia.

Look at this news story on organic food sales. The “see also” links point to related stories on organic food but don’t encourage WWILFing. The BBC is a bit of an ivory tower: it has lots of content that it can link to internally but it doesn’t spread out into the rest of the Web very well.

How do you decide what would be interesting terms to link off with? How do you define “interesting”? You could use Google page rank or Technorati buzz for the external pages to decide if they are considered “interesting”. But you still need contextual relevance. That’s where del.icio.us comes in. If extracted terms match well to tags for a URL, there’s a good chance it’s relevant (and del.icio.us also provides information on how many people have bookmarked a URL).

So that’s what they did. They called it “muddy boots” because it would create dirty footprints across the pristine content of the BBC.

The “muddy boots” links for the organic food article link off to articles on other news sites that are genuinely interesting for this subject matter.

Here’s another story, this one from last week about the dissection of a giant squid. In this case, the journalist has provided very good metadata. The result is that there’s some overlap between the “see also” links and the “muddy boots” links.

But there are problems. An article on Apple computing brings up a “muddy boots” link to an article on apples, the fruit. Disambiguation is hard. There are also performance problems if you are relying on an external API like del.icio.us’s. Also, try to make sure you recommend outside links that are written in the same language as the originating article.

Muddy boots was just one example of using some parts of the commons (Wikipedia and del.icio.us). There are plenty of others out there like Magnolia, for example.

But back to disambiguation, the big problem. Maybe the Semantic Web can help. Sources like Freebase and DBpedia add more semantic data to Wikipedia. They also pull in data from Geonames and MusicBrainz. DBpedia extracts the disambiguation data (for example, on the term “Apple”). Compare terms from disambiguation candidates to your extracted terms and see which page has the highest correlation.
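
As a very rough sketch of that last idea (the names and data structures here are made up for illustration, not what Rattle actually built):

// Pick the disambiguation candidate whose defining terms best overlap
// with the terms extracted from the article. Purely illustrative.
function bestCandidate(extractedTerms, candidates) {
  // candidates might look like:
  // [{ title: 'Apple Inc.', terms: ['computer', 'iPod'] },
  //  { title: 'Apple', terms: ['fruit', 'orchard'] }]
  let best = null;
  let bestScore = 0;
  candidates.forEach(candidate => {
    const score = candidate.terms
      .filter(term => extractedTerms.includes(term)).length;
    if (score > bestScore) {
      bestScore = score;
      best = candidate;
    }
  });
  return best;
}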

But why stop there? Why not allow routes back into our content? For example, having used DBpedia to determine that your article is about Apple, the computer company, you could add an hCard for the Apple company to that article.

If you’re worried about the accuracy of commons data, you can stop worrying. It looks like Wikipedia is more accurate than traditional encyclopedias. It has authority, a formal review process and other tools to promote accuracy. There are also third-party services that will mark revisions of Wikipedia articles as being particularly good and accurate.

There’s some great commons data out there. Use it.

Rob is done. That was a great talk and now there’s time for some questions.

Brian asks if they looked into tying in non-text content. In short, no. But that was mostly for time and cost reasons.

Another question, this one about the automation of the process. Is there still room for journalists to spend a few minutes on disambiguating stories? Yes, definitely.

Gavin asks about data as journalism. Rob says that this is particularly relevant for breaking news.

Ian’s got a question. Journalists don’t have much time to add metadata. What can be done to make it easier — is it an interface issue? Rob says we can try to automate as much as possible to keep the time required to a minimum. But yes, building things into the BBC CMS would make a big difference.

Someone questions the wisdom of pushing people out to external sources. Doesn’t the BBC want to keep people on their site? In short, no. By providing good external references, people will keep coming back to you. The BBC understand this.

Beeb

Flickr have launched a new stats feature for pro members. It’s very nicely done with lovely graphs and lists. It kept me occupied for at least five minutes. Personally, I’m just not all that into tracking referrers but it’s really nice that this data is available.

One of my more popular pictures lately is a surreptitious snapshot of the new BBC homepage that I snapped at BarCamp London 3. The photo generated quite a bit of interest and speculation. Fortunately there’s no longer any need for pundits to form their opinions based on a blurry photo of mine—the BBC blog has revealed that the new homepage is available to preview as a beta.

The Greek letter isn’t the only Web 2.0 cliché that has been embraced:

  • Rounded corners: check,
  • Sloppy gradients: check,
  • Garish colours: check,
  • Drag’n’drop: check.

To be honest, it all feels a bit . That said, some of the interactions work very nicely and everything still works fine without JavaScript.

Overall it’s fine but some of the visual design elements irritate me. The gradients, as I said, are sloppy. As is so often the case with gradients, if they aren’t done subtly, they just look dirty. Then there are the giant Verdana headings. Actually, I kind of admire the stubbornness of the BBC in using a font that really only works well at small sizes.

But the biggest issue—and this was the one that generated the most debate at BarCamp—is the way that clicking a link under the big image changes the colour of the entire page. I like the idea of pushing the envelope with CSS like that but the effect is just too extreme. It implies a relationship between the action of clicking that link and changes to other areas of the page. No such relationship exists. Confusion ensues.

I love the clock in the corner though.

Irritation

Dear Auntie Beeb,

Like countless pedants before me, I am sad enough to take some time out of my day to point out a minor error in the article On the road with wi-fi and video:

Nokia’s gadget suffers the sins of many of its mobile phones — confusing menus and a sluggish response make it irritable to use.

While I have no doubt that having a journalist constantly pressing its buttons would make any device irritable, I suspect that the intended meaning is that the device is irritating to use.

Insert standard closing remark about license fees and education standards including the words “in this day and age” somewhere.

Yours,

Irritated in Brighton.

P.S. Not reading Grammarblog? Then your life is not yet complete. Go, read and nod your head vigorously in agreement on issues such as “loose” vs. “lose” and “I could care less”.

Talking with the BBC about microformats

I have now had the pleasure of visiting the BBC. Even though I know a few people who work at the BBC—and many more who used to—I’ve never had the opportunity before to get a look under Auntie Beeb’s new media skirts. Then I got an email asking if I’d be willing to come in and chat about microformats.

I’m always more than willing to rabbit on at length about microformats. Just wind me up and watch me go. It’s particularly pleasurable to natter on to a bunch of smart people working at Europe’s largest website.

There seems to be quite a lot of interest in microformats at the BBC. I spoke in a meeting room packed to the gills with people from a number of different departments. There are quite a few separate areas where people are already experimenting with hCalendar, hCard and rel-tag. Of those, hCalendar is clearly the forerunner: consider that schedule listings are essentially displaying a series of events.

Seeing as I was over at the BBC anyway, I took the opportunity to meet up with Ian for lunch. We compared notes on Hackday and he let me know that the Backstage folks were intrigued by Hackfight. This could be the start of a beautiful friendship.

With my visit to the BBC in East London at an end, I hopped on the Central Line all the way across town for a quick visit to the Last.fm HQ. I always like getting a behind-the-scenes look at websites that I make use of on a daily basis. Hannah even managed to take some time out of her busy schedule to go for a coffee—it’s those CBS dollars at work.

All in all, it was a fun day out in London. But I was still glad to get back to Brighton… especially ‘cause I made it back in time for the fun at the Geek Wine Thing. London’s fine in small doses but I wouldn’t want to do that commute every day.