Microformats: Evolving the Web
A panel I sat in on at South by Southwest 2006. My fellow panelists are Chris Messina and Norm! The moderator is Tantek Çelik.
- Listen to the original audio recording.
- View Tantek’s slides
Tantek Çelik: Good morning everyone. Let’s go ahead and get started. This is the microformats session. I’m Tantek Çelik, your moderator for this panel. And we’ve got a great set of speakers here that are going to show off various things they’ve done with microformats. Anywhere from just adding some extra semantics to their sites to some tools that have already been developed that can consume and display and browse microformats.
First, I’d like to say this entire presentation, at least the hypertext of my presentation, is licensed under Creative Commons attribution license.
All right, so first I’m going to try and do a little demo. We’ll see if this works. We were having a little trouble earlier so it’s going to be a bit of a gamble with the network and all. So if I can just please ask everyone to turn off their BitTorrent for a couple minutes. Right, stop uploading photos to Flickr.
So we’ve got basically a version of the South by Southwest speakers’ page marked up with the hCard microformat. And what you would’ve seen if the network was working, would be the entire set of speakers added to the address book. We’ll try this again in a few minutes to see if it’s working. What we’re doing is we’re passing the speakers’ page through a converter that converts all the hCards to vCards, and every one of you out there that has a computer or phone has at least one or two implementations of a vCard-compatible address book on you right now. The idea is that as soon as any page has an hCard on it, you can add it to your address book, you can sync it with your PDA, your handheld, and it makes contact information, personal information, on the web a lot more useful.
I’m going to show you just a quick example of what an hCard looks like. So hCard is essentially the microformat for contact information for a person or organization. In this case, here’s some information you might find about a person like name, URL, title, etc. You might mark this up with spans, hyperlinks (a tags). And if it’s the contact information for a page, you might use the address tag as well. This bit right here, this is the key hCard bit. So what we’ve done is we’ve taken the property names from the vCard standard, RFC2426, and simply reused them as class names in HTML. So class equals vcard, url, fn, n, given-name, family-name, title, org. And that’s really it. So just by adding that to existing markup you created an hCard.
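The markup being described could look something like the HTML in the sketch below, which also shows how a consumer might read the class names back out. This is a minimal illustration, not the slide's actual example: the person, the URL, and the helper names (`HCARD_HTML`, `HCardParser`) are invented, and a real hCard parser handles many more properties and edge cases.

```python
from html.parser import HTMLParser

# Hypothetical markup: the person and values are invented for illustration.
HCARD_HTML = """
<address class="vcard">
  <a class="url fn n" href="http://example.com/">
    <span class="given-name">Jane</span>
    <span class="family-name">Doe</span>
  </a>,
  <span class="title">Editor</span>,
  <span class="org">Example Org</span>
</address>
"""

# The text-valued hCard properties this sketch understands.
HCARD_PROPERTIES = {"fn", "given-name", "family-name", "title", "org"}


class HCardParser(HTMLParser):
    """Collect the text inside elements whose class names are hCard properties."""

    def __init__(self):
        super().__init__()
        self.card = {}
        self._open = []  # one entry per open element: the properties it carries

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        # The url property takes its value from the href, not the link text.
        if "url" in classes and attrs.get("href"):
            self.card["url"] = attrs["href"]
        self._open.append(classes & HCARD_PROPERTIES)

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Text belongs to every enclosing property, so fn picks up both names.
        for props in self._open:
            for prop in props:
                self.card[prop] = (self.card.get(prop, "") + " " + text).strip()


parser = HCardParser()
parser.feed(HCARD_HTML)
card = parser.card
print(card)
```

The point of the sketch is that nothing beyond ordinary HTML and agreed class names is needed: the same page a person reads is the page a tool parses.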
But one point that I’d like to impress upon everyone is that microformats are more than just really good class names. And this is something that as microformats have become more popular a lot of folks are sort of taken to saying, “Oh, I’ve been using microformats for years.” And while microformats definitely use semantic class names and are taking advantage of this larger trend in web design towards using CSS for presentation, HTML for semantic markup, and then class names for catching additional IA and such in the page, microformats actually takes that a few steps further.
First, is that we’ve got a set of principles that essentially guide our design processes and help keep things micro as it were, help keep the number of class names down to a minimum, the number of microformats down to a minimum. I’m not going to go into detail about these principles. What I will go into a little more detail on is the process. And this actually makes a really big difference because if you just have everyone out there inventing their own semantic class names well, that’s what we have today and that’s actually just fine. As long as you’re not looking to inter-operate that’s not a problem. One of the things you have to worry about is if you actually try to establish a standard, you need to inter-operate. So you actually need some amount of process, some amount of community, and we’ll talk a little about that. That’s where the community part comes in. It helps make sure that everyone tries to solve the same problem. We take the same approach rather than duplicating different ways of solving the same problem.
So let me just go really quickly over microformats process. We try to keep this down to an absolute minimum as much as possible.
The first thing is we ask people to pick a specific simple problem and define it. If you don’t have a specific problem that you’re trying to solve then you need to probably work hard and define it. We don’t want to solve really general-purpose problems; we don’t want to invent new frameworks. This is just for solving real world, practical, straightforward, specific problems.
Second, is that we require that anyone that wants to actually come up with a microformat actually go out there on the web and document what existing web pages are doing already with expressing this content. So say you want to build a microformat for widgets, whatever that means, right? If you can’t find web pages out there that are publishing widgets or publishing whatever that type of content, then maybe there’s not really a need for microformat per se, maybe you can just use semantic markup that you make up yourself.
The third is that we really, really want to avoid inventing new formats, new standards, new names, new terminology, even. We work really hard to go out there and document any existing formats that have tried to solve the same problem, whether these are web formats or whether they are formats for other transports or other types of other fields. You know RFC2426 vCard, that’s not an XML format but it solves the problem of contact information. Similarly, the iCalendar standard solves the problem of how do you express a calendar event. So the goal here is to document these existing formats so that we don’t invent new ones. If we do invent a new microformat, we try to take the names from existing formats. We try to not even invent new terminology.
Finally, after you’ve done all that, you try to brainstorm about… okay, given what people are actually publishing on the web (as opposed to what standards might say), from that you can apply a set of fields, what we call a schema, that would define a microformat, and then from there we reuse the names from those formats. Finally, and this is probably the most important piece, which is that we iterate within the community. So people put up a straw proposal based on their research, really early, really often. You try publishing it on your website and then you try and get feedback from the community. And then you iterate. So the goal here is not to try and get it perfect on the first try. The goal is to get something on the first try and evolve and iterate quickly and not feel like you have to support exactly what you came up with first.
All right, I want to show you guys a little bit of the progress we made with microformats over the last couple years.
Beginning in 2004, we had very few microformats. In fact, we had just defined the term “microformats” and XFN had been out a few weeks, which had a few implementations. rel-license had been proposed for Creative Commons and other licenses, XOXO for lists and outlines, and Votelinks for indicating that the page you’re linking to, you don’t necessarily agree with what it says or maybe you just abstain with what it says. So in two years the community has developed and come up with a whole lot more microformats. And you can sort of read them here for yourself. I’m not going to go into great detail. But literally for events, for people and organizations, for locations, for syndication, for classifieds, resumes and reviews, even tip jars for video blogging. These are all microformats that the community has come up with and documented and researched and published and, in fact, implemented. You can look under each one of these is a set of implementations. And there’s many more in development.
And how do you develop more microformats? Well, this is the community. We’ve got several portions to it. The IRC channel, which is live right now. The email discussion list with public archives. And this is a decision we made really early on; that we wanted to keep all of our research, all of our discussions in public. So literally the IRC channel is archived to a public web page. All the email discussion lists are archived. If you want to participate by blogging about microformats, use the microformats tag. And finally, the wiki is what we use for specifications, for feedback, for issues. And again, anyone can create a login and edit it. And this is one of the lessons that we learned from what Wikipedia has done. Rather than try and keep under very tight control the specifications, we said hey, let’s try and put all the specifications out in the open and let anyone come in there and fix the little mistakes. And it’s actually been tremendous, the helpfulness of community, contributions that we’ve seen.
So I’m going to close the little introduction with an exercise for the reader. Create your own hCard. Now this is actually fairly straight forward. We’ve got a nice hCard creator form that’s up on the microformats site, as well as a page that — if you’ve already got a contact page on your site, which a lot of folks do — a page on how to update the markup for that to just turn it into an hCard without having to create a new one. Publish it on your site. Go ahead and go to the hCard specification, there’s a whole little section that says “new examples in the wild”, and add a link to your hCard. And we’ve also even got a wiki page for this particular session as well. Go ahead and add yourself to the list of attendees.
And finally for any sort of follow-up Q&A that we don’t get to, we’re going to hold a pretty informal lunch right after this presentation at Las Manitas. So we’ll see how many of us can cram in there and have open discussions about it.
All right, so this entire presentation — as I’m sure many of them to date — is built with S5 in microformats and that’s the URL down there, if you want to go reference it and look at it for yourself: tantek.com/presentations/2006/03/microformats-sxsw.
So thanks, everyone, and now I’ll introduce to you Mark Norman Francis of Yahoo UK.
Mark Norman Francis: Yep, that’s where I work. I am a senior web developer and I work on the pan-European websites mostly in the media area, which is things such as news and finance, movies. And when I first heard about microformats I got terribly excited for two basic reasons. One, I’m lazy, and two, I’m impatient. Because I’m lazy, with the Semantic Web in capital letters — capital S, capital W — you have to encode all of your information twice. And I just don’t want to do that. It’s far too much effort. And I’m impatient and I don’t want to wait for other people to mark things up twice as well.
Within the company I did a little informal talk to the web development team within Europe about microformats, and we decided that we would try it out on our site to see how it went, basically. And I’m going to show that to you now.
So I was working as the second developer on the movie site at the time. And one of the microformats is hReview. We have a lot of reviews in our site so we marked them up using hReview. If you see down here there’s a little microformat icon. So the Tails Firefox extension detects the presence of microformats and has indicated there is one on this page. So you just double click on that and it shows the two reviews on this page are there.
Oh that’s lovely… and there goes the wireless. Okay, you’ll just have to trust me about that one then, won’t you?
So on this page there is… the hReview microformat has an awful lot of information on it and our reviews don’t have anywhere near as much as that. Plus hReview was really — well the version I looked at at the time — was designed for individuals reviewing things. The reviews that we have come more from actual data providers, such as in this case Cinema Source, other people who give us the data. So we didn’t use much of the hReview format but we used it anyway. Of course, the other Yahoo properties that I know use microformats, as Tantek said earlier, are blo.gs and Upcoming. Upcoming was the first of the big sites that used hCalendar?
Tantek Çelik: Right. So Upcoming was definitely the first site that picked up the hCalendar format and deployed it across all their events. I actually don’t know how many events Upcoming has but it’s got a lot. I’m guessing there’s somewhere in the hundreds of thousands. If someone from Upcoming is here then perhaps they can let us know.
Mark Norman Francis: So, of course, that wasn’t owned by Yahoo at the time. So we can’t really take any credit for that but it’s nice that it’s there anyway. I can’t show it to you right now because it’s only in development and I can’t get onto the VPN to show it to you but in Europe we’re going to be launching a local search product and all of the results that that will give you, such as restaurants and businesses, are all going to be marked up with hCard. And that’s in the seven-figure range of results, whereas the movie site is only four or five figures of hReviews.
And obviously the last, Upcoming. I went to this party, I had an awful lot of alcohol. On this page, the information about this party is marked up using microformats and you can extract that and do what you like with it. And that’s me. Thank you.
Tantek Çelik: Thanks Mark. [clapping]. Next up we got Jeremy Keith, who’s going to show us a little something that he put together just for the conference and then showed around to a few friends. I’m like, “Wow, that’s pretty cool. You need to show that off.”
Jeremy Keith: Yeah. This isn’t going to be quite as big as Norm’s. He’s got about six millions…or something. I’ve got one page.
Mark Norman Francis: Yeah. It’s a very nice page.
Jeremy Keith: So this is my second year at South by Southwest. I was here last year and had a great time. And I’m sure as many people have told you, the panels are only half the story at South by Southwest. It’s the evening events that you really need to go to to get the most out of it and have all the fun. So coming up to this year I decided I’d put something on my web page about the evening events I’d be attending this year. So this is my website and I decided I’d create a page and put events on it.
Now I’ve already got events out there. I used Upcoming and there’s all sorts of South by Southwest stuff on there. I could just use the Upcoming API to pull stuff out or just scrape the hCalendar stuff out but there’s some information that’s not on Upcoming that I wanted to get at. That was geo-coordinates. I’ll show you why I wanted that. The other place I looked for events was on the South by Southwest website, and they have events marked up but these aren’t marked up in hCalendar. A couple of strong tags and paragraphs. So I was having to do a bit of cutting and pasting, but the idea is that I would maybe do a bit of cutting and pasting so that nobody else would have to.
So I got all the events together that I wanted to go to and list them all on a page and put it up on austin.adactio.com. So this is how it looks. I’ve got hCalendar and each event is marked up. So there’s extra information in there. You can see the time the start time of an event, the end time of an event. There’s hCards for the places. But this is the bit that interested me. I wanted to get geo-information in there so I did have to go and look up the geocodes. There’s a few different geocoders out there. Yahoo has a geocoder API you can use. So I had this marked up as hCalendar with geocodes so that I could use the geocodes to mess about with the Google Maps API. Basically as an exercise to myself. I hadn’t messed with this API yet and I wanted something fun to do so I thought it would be cool if I could have all my events on one page and you could also see where the events were taking place. So I took the existing page I had, which is marked up with the microformats, mixed in the Google Maps API, and I got this. So it’s the same page but just with a couple of extra script tags pointing to the Google Map scripts and a little bit of javascript so that when you click on an event, it shows you exactly where in Austin the event is taking place with a nice, big Shiner Bock logo to show you where it is.
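The idea of reading geocodes out of hCalendar markup and handing them to a map can be sketched as follows. The original page did this in JavaScript against the Google Maps API; this Python sketch only shows the extraction step, and the event name, time, coordinates, and helper names (`EVENTS_HTML`, `HCalendarParser`, `markers`) are all invented for illustration.

```python
from html.parser import HTMLParser

# Hypothetical event: the name, time, and coordinates are invented.
EVENTS_HTML = """
<ul>
  <li class="vevent">
    <span class="summary">Example Party</span>
    <abbr class="dtstart" title="2006-03-13T20:00:00-06:00">8pm</abbr>
    <span class="geo">
      <span class="latitude">30.2672</span>,
      <span class="longitude">-97.7431</span>
    </span>
  </li>
</ul>
"""

# Properties whose value is the element's text content.
TEXT_PROPERTIES = {"summary", "latitude", "longitude"}


class HCalendarParser(HTMLParser):
    """Collect summary, start time, and geo coordinates for each vevent."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._stack = []  # class names of each currently open element

    def _inside_event(self):
        return any("vevent" in classes for classes in self._stack)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        if "vevent" in classes:
            self.events.append({})
        self._stack.append(classes)
        # The machine-readable start time lives in the abbr title attribute.
        if self._inside_event() and "dtstart" in classes and attrs.get("title"):
            self.events[-1]["dtstart"] = attrs["title"]

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self._inside_event():
            return
        for prop in self._stack[-1] & TEXT_PROPERTIES:
            self.events[-1][prop] = text


parser = HCalendarParser()
parser.feed(EVENTS_HTML)

# Each marker is (label, latitude, longitude), ready to pass to a mapping API.
markers = [
    (event["summary"], float(event["latitude"]), float(event["longitude"]))
    for event in parser.events
]
print(markers)
```

Because the coordinates sit in the page itself, the map script needs no separate data file: it walks the same markup the reader sees.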
So there’s all the parties I planned out that I was going to go to. I was very clever this year in that I made sure the panel I was doing was on the first day so that I could really let loose. And it was only later that Tantek said “Hey, do you want to speak about microformats?” and I was like…uh. So this explains why my voice is maybe a little rough because I’ve been making active use of this. But it has proved genuinely useful. I was at the Upcoming party last night and I thought, okay, I’m going to skip the web awards ceremony — it’s the Oscars for websites; I don’t need to see that — but I’ll go to the after-party for the web awards. So I know I’ve got to get from there to there and I can see exactly how far that is.
Or tonight for instance, tonight’s a busy night. I haven’t even put down what’s on tonight, all the stuff. There’s bowling going on, but that’s a bit out of town. I don’t think I’m going to make it. But if you are going, go Team Brits.
Look, I’ll explain why. The first event I’m going to go to is at Club DeVille. I was potentially going to go by 20x2 but, I’m sorry, that’s a bit too much of a hike to get over there and then have to come back to the Adaptive Path/Consumating/Odeo party. It makes much more sense… Look, they’re right next door to each other. [Laughter] And if it sucks… I don’t think it will ‘cause there’s going to be live bands playing. It’s going to be pretty good. Ben Brown, Internet rock star is putting it together so it’s bound to be good. But if it does I can nip around the corner to the Sidebar. How great is it that a bunch of web geeks end up in a bar called the Sidebar? I find that perfect. So, looking I can see I don’t really need to get all the way — I’m sorry to whoever’s organizing 20x2; I’d like to go but the logistics of it — I think tonight’s events are pretty clearly planned out. I can just stay on Red River.
So I put this up. I had my microformats and I basically just had hCalendar and I had the Google maps mashup. And Tantek saw it and he said oh that’s so cool but why don’t you do this, and I realized there were more microformats I could put in so that’s where the hCards came from. And the nice thing, when I did this and I hadn’t planned to do it, but you can link off to services on Technorati that will grab any hCalendars or any hCards on a page, just pass it a URL and it will create an iCal for you. So I linked to Technorati and that will automatically download the iCal, which is very useful. Some friends of mine did this, they said, “I’m going to go to all the same events so I’m just going to pop that into my iCal.” And they show up like that. You can see all the ones in blue down on the bottom, they came from that page. And then I can sync it up with my phone, I can sync it up with my iPod. I don’t get the maps then, which is a bit of a shame, but because Austin is bathed in WiFi, as long as I have my laptop with me I can always find out where I need to go for the next party.
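What a conversion service of this kind produces can be sketched roughly as below. This is not the Technorati service's code, which the talk does not show; it is a minimal illustration assuming the events have already been extracted from the page. The function names, the `PRODID` string, the `UID` scheme, and the sample event are invented, and text escaping and line folding are left out.

```python
from datetime import datetime, timezone


def format_utc(moment):
    """Format an aware datetime as an iCalendar UTC date-time."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def events_to_icalendar(events, stamp):
    """Render extracted events as iCalendar text with CRLF line endings."""
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//hcalendar-sketch//EN",  # hypothetical product id
    ]
    for index, event in enumerate(events):
        lines += [
            "BEGIN:VEVENT",
            "UID:event-{}@example.com".format(index),  # hypothetical UID scheme
            "DTSTAMP:" + format_utc(stamp),
            "DTSTART:" + format_utc(event["dtstart"]),
            "DTEND:" + format_utc(event["dtend"]),
            "SUMMARY:" + event["summary"],
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    return "\r\n".join(lines) + "\r\n"


# Hypothetical event, with times carrying Austin's UTC-6 offset.
events = [
    {
        "summary": "Example Party",
        "dtstart": datetime.fromisoformat("2006-03-13T20:00:00-06:00"),
        "dtend": datetime.fromisoformat("2006-03-13T23:00:00-06:00"),
    }
]
stamp = datetime(2006, 3, 1, 12, 0, tzinfo=timezone.utc)
ics = events_to_icalendar(events, stamp)
print(ics)
```

Writing the times in UTC is one simple design choice; it lets the calendar application shift them into whatever zone the reader is in.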
This did prove useful for other people, aside from when I first marked this up I wasn’t thinking about time zones, so when I first marked this up I didn’t put in the minus six hours. I’m in the Greenwich Mean tribe. I’m used to just thinking we’re the centre of the world. Time zones, what’s that? As Tantek pointed out, Austin was six hours behind England so I had to take that into account. But another friend of mine, Ian Lloyd, downloaded the calendar and he was saying the dates are all screwed up, but actually once you get to Austin and you set your preferences to US Central then the events fall into place and everything looks good. iCal could be a bit better with how it handles time zones, I think. It’s a little bit fuzzy but that’s a problem with iCal and certainly not with the format, the hCalendar.
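The time zone slip can be illustrated with a small sketch. It assumes a consumer that treats a time without an offset as being in its own local zone, which is one plausible behaviour rather than a description of what iCal actually did; the event time is invented.

```python
from datetime import datetime, timedelta, timezone

CENTRAL_STANDARD = timezone(timedelta(hours=-6))  # Austin during the conference
GMT = timezone.utc

# The same wall-clock time, marked up with and without the offset.
with_offset = datetime.fromisoformat("2006-03-13T20:00:00-06:00")
without_offset = datetime.fromisoformat("2006-03-13T20:00:00")

# With the offset the instant is unambiguous wherever it is read.
print(with_offset.astimezone(GMT).isoformat())

# Without it there is no zone at all, so a consumer has to guess.
print(without_offset.tzinfo)

# A reader in England who assumes GMT gets an instant six hours too early.
misread = without_offset.replace(tzinfo=GMT)
print(misread.astimezone(CENTRAL_STANDARD).isoformat())
```

An 8pm party in Austin is 2am GMT the next day; without the minus six hours, the guessed instant lands at 2pm Austin time instead.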
So on Tantek’s recommendations I cleaned this up, put in the correct time zones, added the hCard. And the easiest microformat to add by far was to turn this into a XOXO, however you want to pronounce it. Tantek, do you have an opinion on how it should be pronounced? Hugs n’ kisses, I turned it into a hugs n’ kisses microformat by adding that. That was it because it’s a nested list. It’s lists within lists. That was by far the simplest microformat to add. So if anybody has cool tools that do stuff with XOXO like Greasemonkey scripts or bookmarklets that will toggle the list items they can do that with this page.
That’s pretty much it. That was just a little page I put together just for myself but it proved useful to other people, and I’ve been making good use of it.
Tantek Çelik: Thanks Jeremy. So one of the things I wanted to ask about is have you ever created an ICS file yourself, like on your website?
Jeremy Keith: No. What I’ve done in the past is put up an actual vCard for download. But I’ve never….
Tantek Çelik: Right. When you did that did you find any sort of challenges with say like MIME types or anything else dealing with that kind of thing?
Jeremy Keith: The vCard seem to be OK. There may be challenges but I didn’t discover them. But once I came across hCard I thought, well, that’s just so much simpler to do that. Have it on the web page itself. Downloadable if they want it.
Tantek Çelik: Right. So basically using the class names as opposed to uploading a file was a difference there in the experience.
Jeremy Keith: If you can have it in the markup rather than putting it in any kind of file that has to be downloaded before you can read what’s in it, that just makes so much more sense. It’s just more easily available.
Tantek Çelik: Okay. Cool. All right. Well, before we get to our last presenter, I actually found the problem I was having with the demo was my fault, not the service’s problem. I had a URL problem. So if you can switch back. So for those of you that actually loaded my presentation make sure you do a refresh, reload. It’s already been updated: real-time web that we have here.
So the fourth link there, link to convert hCards to vCards. If you click that, what it’ll do, like I said, is send off the entire South by Southwest speakers page with hCard markup added to it, then give you back a vCard file. This is actually a pretty sizable vCard file. Anyone want to take a guess, without actually counting obviously on the page, how many South by Southwest speakers there are this year? About 400. So basically with a click of one link (now we’ll see if this works) you can have all 388 speakers added to your address book along with their URLs. So say you saw a speaker and you were like, “Oh what was that that I saw? You know what was her name?” You can go check it out later on. I find this personally very handy just because I’m horrible at remembering names. Those of you who’ve met me probably know this so I find this a useful tool in that respect. Just with one simple addition like that. And the nice thing is that should the list of South by Southwest speakers change like, for example, on this panel we had two more speakers than originally scheduled, they can simply update that one page, the speakers page, and not worry about “Oh, there’s another side file I need to update” or whatever. And because it’s all just hCard markup then just re-import and get all the new stuff in there directly. That’s a lot of people.
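The hCard-to-vCard step can be sketched roughly as follows. This is not the converter used in the demo, which the talk does not show; it assumes the hCard properties have already been pulled out of the page into a dictionary, and the function names and sample person are invented. Because hCard reuses the vCard property names, the mapping is close to one-to-one.

```python
def escape_text(value):
    """Escape characters that are special in vCard text values."""
    return (
        value.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def hcard_to_vcard(card):
    """Render a dict of hCard properties as vCard 3.0 text (RFC 2426)."""
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        "FN:" + escape_text(card["fn"]),
        # N is structured: family;given;additional;prefixes;suffixes
        "N:{};{};;;".format(
            escape_text(card.get("family-name", "")),
            escape_text(card.get("given-name", "")),
        ),
    ]
    for hcard_name, vcard_name in (("org", "ORG"), ("title", "TITLE")):
        if card.get(hcard_name):
            lines.append(vcard_name + ":" + escape_text(card[hcard_name]))
    if card.get("url"):
        lines.append("URL:" + card["url"])  # URI values are not text-escaped
    lines.append("END:VCARD")
    return "\r\n".join(lines) + "\r\n"


# Hypothetical speaker entry, invented for illustration.
speaker = {
    "fn": "Jane Doe",
    "given-name": "Jane",
    "family-name": "Doe",
    "org": "Example, Inc.",
    "url": "http://example.com/",
}
vcard = hcard_to_vcard(speaker)
print(vcard)
```

Running this over every hCard on a page and concatenating the results would give one file an address book can import in a single step.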
All right. I’d like to introduce Chris Messina of Flock. He’s going to show us a little bit of what happens when you can actually add features to a browser to take advantage of microformats.
Chris Messina: All right. Afternoon everybody. My name is Chris Messina. I’m the Director of Experience and Open Source Ambassador for Flock, which doesn’t mean a whole lot, it just means that I go to parties and stuff. But actually what I’m going to be talking about today is about Flock and microformats and both where we are today, which is taking baby steps for this stuff, and where eventually we’d like to see it go.
So I’m going to start just very briefly describing how Flock as a browser sees the web and why that’s different from where browsers in the last 10 years have sort of been. To begin with, we see it really as an event stream. It’s all these conversations happening, all these things going on. You look at RSS, it’s this constant flow of information coming to you as opposed to static documents or books in a library kind of thing. I mean bookmarks came from somewhere, right? So we’re starting with that premise, with everything that we design in Flock. Second, it’s a social space. It’s a place for people to go to talk to each other, to meet up, to blog, to vidcast, podcasts and so on. And finally, where microformats comes in, we’re starting to see the web as an actual storage space, as a data store, as a place where people are housing data in web pages that are semantically marked up, that can be reused on the fly with all kinds of tools. I mean Tantek and all these guys showing a bunch of examples for how this thing works. If you actually have a browser that gets this stuff natively then you no longer have to go to all these other services and you can actually start pulling data from one web site to another one without thinking about it.
Right, so a couple things that kind of make this possible in Flock. We embed what’s called Lucene, which is an open source search engine. So we index every web page that you visit and create a local store of your history and your favourites with all the content on the web page and everything so that when you actually do a search for something in the browser, it pulls back and says “Hey, I found all this stuff on all these different pages.” Right now the search is pretty dumb. It just goes throughout, sees all this stuff that you see. It has no idea what the content is. When you add microformats to that mix, we can actually figure out okay, here’s a person, here’s a review, here’s an event, and on and on. That means that we can actually index those things separately so extension developers, or even us at Flock, can build cool stuff like ad hoc calendars or mapping tools. And finally when you actually build in support for APIs in the browser, you can take that data that’s been indexed, search for it, and then toss it off to the various web services that you use. So, for example, let’s say that I’m on Upcoming and I want to send an event off to my 30 Boxes account. Well, if the browser actually understands that, you can simply right-click on an event, say “Send to my 30 Boxes account.” Something like that.
What this leads to is something we call roundtrip retention. Well, what the hell does that mean? Right? Well, if you think about it, we’re able to index all these things, and I’ll give you some examples of what those are. Basically, blogposts with links to people, lists of people and their blogs, your contact info and favourite places like on your blog About Page or stuff like that. Concert and movie reviews or product reviews — you’re talking about the latest iPod or whatever, you want people to be able to spider that information. Upcoming events and then, as Jeremy pointed out, parties and booze at South by. Now the whole roundtrip retention thing means that we can actually find all this information as you’re browsing the web, without you doing anything extra than you already do, and then put that aside and allow you to go back and see all the things that you found before. Because we pick up all the feeds on websites, we can also index stuff as it’s coming in. So you might not even be reading all your friends’ feeds but we could be pulling out events, reviews, contact information from those feeds as it’s being published and add those to a database that you can use later on.
So I’m going to show you just one implementation that we got going on right now. All right so this is Flock, you can get it at Flock.com/developer, it’s still a developer preview but it’s fairly stable if you don’t use it too hard.
So Calvin Yu, who created the Firefox Tails extension, actually went in and converted his extension to work in Flock. And he created what’s called FlockTails. So this is my blog right here. This is a party we set up at E-tech, and as you can see here again, the bottom right hand corner there’s microformats that have been found on this page. So this is that blogpost that I talked about. Now I’m going to go and I’m going to load up the FlockTails’ top bar. Now what this does is it goes and analyzes the page and looks for microformats. So in the page itself, just in the content, I’ve gone in right here and I’ve marked up my buddy Shawn’s name as a vCard or as an hCard actually. And so it’s actually pulled that out and this is his website. And then again the sponsors have their website linked right down in here. Okay, that’s relatively simple. Go to the Technorati 100 page. These are all the people who are really cool and hip on Technorati, I guess. Here you can see the same thing. As you’re browsing the web, you’re able to just pull this information in.
Now, again, we’re taking baby steps here, so this stuff is not being indexed yet in Flock, but when it is, all of a sudden the browser starts to understand who people are and what is a person and where you found that information on the web. For example, you go to like the Flock developer page. Right, these are all the Flock people. Check this out. You can see people’s photos and stuff like that. This stuff would automatically get added to your address book. You get avatars, you get their website address. As you actually go to other pages that list these people as hCards, more and more information will get added to that. So if there’s a description some place else and you found their website some place else and their email some place else, we just keep adding it and adding it and adding it till you have this really rich profile of a person.
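The accumulation being described could work along these lines. This is a speculative sketch, not Flock's implementation, which by the speaker's own account did not exist yet: it assumes hCards arrive as dictionaries tagged with the page they came from, and it matches people naively by formatted name. The function name, the `source` key, and the sample data are invented.

```python
def merge_hcards(cards):
    """Group hCards by formatted name, keeping the first value per property."""
    profiles = {}
    for card in cards:
        profile = profiles.setdefault(card["fn"], {"sources": []})
        # Remember where each piece of the profile was found.
        profile["sources"].append(card["source"])
        for prop, value in card.items():
            if prop != "source":
                # setdefault keeps an existing value rather than overwriting it.
                profile.setdefault(prop, value)
    return profiles


# Hypothetical hCards found on two different pages.
found = [
    {
        "fn": "Jane Doe",
        "url": "http://example.com/",
        "source": "http://example.org/team",
    },
    {
        "fn": "Jane Doe",
        "email": "jane@example.com",
        "source": "http://example.net/about",
    },
]
profiles = merge_hcards(found)
print(profiles["Jane Doe"])
```

Matching on name alone would conflate different people who share a name, so a real implementation would need a stronger key, such as a URL or email address, plus some way to judge which sources to trust.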
Moving on, this is Calvin Yu, the guy that did this extension. This is his about page. So not only has he marked up information about himself, as you can see here, but he’s also talked about and created a mashup with Yahoo of his favorite places. So you can see these here are little hReviews. So all this information’s available. Ryan King, who’s over here, was a Technorati Intern, has created a review of Spoon at the Warfield. So you can go in, get some information out. Now we could actually pull out the full text of the review so you can search it later. You can create like an ad hoc review if you wanted to. He’s also got his contact information marked up there. So, again, we’d grab his email address and so on. Upcoming, obviously a big example but you can check this out. I mean it’s got the website, it’s got where these things are happening. You could add other kinds of information to this if you wanted to. It’s got the date. Then finally if we go actually to Jeremy’s page, his mashup, you can see all that information that he was talking about has been pulled out here, with both event information, start time, end time, stuff like that, and the actual hCards for these venues. So you could actually take the venue information, drop it in your Blackberry, know exactly where he’s going to be, and stalk him like mad.
Jeremy Keith: Awesome.
Chris Messina: So that’s basically where we’re at right now with microformats in Flock.
Tantek Çelik: That’s impressive. Thanks Chris. [clapping]
So I wanted to take this actually to now an interactive Q&A session because I think a lot of folks have heard a lot about microformats and I wanted to give the audience an opportunity to ask whatever questions you might have. Ask the panellists questions about their experiences with developing and using microformats. So let’s see, any questions? Let’s start here, front row.
Chris Messina: The question is basically how do we filter out inaccurate versus accurate information? The early answer is I have no idea. The better answer over time is with our favouriting system. We have this thing called the star button and basically to create a favourite or to express interest in something, you just click the star button. That adds it into your favourite store, it sends it off to del.icio.us or Shadows or whatever and you’re done. What we’d like to see, or I’d like to see, is being able to star specific bits of information. We were on Ryan’s blog before so I’m going to do a search for Ryan, again, the history search. So this goes into my history and pulls out all the things where Ryan is actually listed. Now I don’t know anything about this information, if it’s legit or not. But presumably if I do a search for, let’s see… Okay, so Britpack is actually on this page and you can see I just created this favourite. So in some ways I’ve added some degree of I-believe-this-is-good-because-I-favourited-it validity to that. Over time, you’ll be able to basically pull out a history of people that you’ve seen as well as your favourite people, and you’ll be able to go in and edit that information and lock it, for example, if you don’t want spammers to be able to pile up with stuff. We’re starting right now with a kernel of an idea and seeing what’s going to happen with it. I mean spam is obviously going to be a problem that we’re all going to have to address no matter who we are and how we’re using this stuff. But I think in terms of where the browser sits, we’re in a very good place to try and mitigate by focusing on the sites that you actually actively visit and call your favourites.
Tantek Çelik: Does the data keep track of where it came from?
Chris Messina: I believe it will.
Tantek Çelik: So I guess one other answer to that is just like you go to a web page today and copy and paste information that you think is trustworthy, this sort of takes a lot of that copy and paste grunge work out of it.
Chris Messina: I think part of it is that again when you express interest and there’s feeds attached to a web page, Flock auto-subscribes to those things. So what we’d like to see is subscribing to hCards. And you’re absolutely right. Part of it is going to be to figure out the best user experience for balancing new information that’s discovered and maybe the official information, or something like that, that’s actually on the person’s hCard’s page or something. And those are questions that still need to be resolved.
Tantek Çelik: Okay, right here in the front row.
Chris Messina: So the question is about whether or not this data is going to be stored locally or remotely. If you have five computers for example, one at work, one at home and so on, where you’ll actually be able to have this data. And right now we’re starting very simple with synchronizing your bookmarks but eventually what we’d like to see is synchronizing your entire profile, and beyond that being able to syndicate information out to multiple services through APIs or even publishing back with hCard or, I’m sorry, microformats. It means you can spread this information around a lot easier. So in the near term it is local. I think in the far term, you’re going to start to see something of an Internet OS start developing cause Flock runs in Mac, PC and Linux. So if there’s ever a browser to kind of like move in that direction in being able to store this information in human writable and readable formats, hopefully it’ll be Flock.
Tantek Çelik: I mean, you can imagine how… so, an exercise left for the reader was everyone create their own hCard and publish on your site, right? All you need for that literally is you could just put your name and URL. You don’t have to put your phone number, email address, all that. But you can imagine how everyone publishes their own hCard and then if you had a future address book that, rather than just adding and importing vCards, would allow you the ability to subscribe to an hCard or vCard at a certain URL. All of a sudden you got essentially a worldwide distributed live contact information solution. And you don’t have to get any more annoying emails from various services that say, “Hey, please update your contact information. This person has updated their contact information.” I’m not going to name names.
All right, right here in the middle.
So the question was: I’m sure you’re all familiar with Ray Ozzie’s web clipboard demonstration that he did at the O’Reilly Emerging Technologies conference last week where he basically took information from one web page, contact information or calendar information or an event, copied it and then pasted it as a whole unit into another web page. For example, like Outlook Web Access or something. And the question was what’s he using to do that? Is he using hCard or hCalendar, or is it proprietary? Is that about right? So that’s a great question to bring up because the amazing thing here is he’s actually using hCard and hCalendar 100 percent. Nothing proprietary. We looked at the markup and it was a complete surprise to us as well. It was one of those amazing things. Now if you don’t know who Ray Ozzie is, he actually has quite a bit of background in this area. He worked on Lotus Notes and before that worked on vCard and iCalendar standards. So for him it was very natural to just say, “Hey let’s just use this standard these microformats guys have come up with to put on the web page rather than trying to invent our own thing”, which is a nice first for Microsoft.
All right, let’s see. How about over here.
The question was: publishing hCards and hCalendars is quite easy, but pulling that information back out isn’t quite so easy. Are there any tools or utilities for web developers to help them pull that information out?
So that actually brings up a couple good points I want to touch on. The first is that one of the things we did very differently with microformats than any other format that I know of is that we focused very much on making it easy to publish. We said there are thousands, hundreds of thousands more publishers than there are people writing programs that actually consume this information. So for us it made a lot more sense to make it easy for hundreds of thousands of people rather than dozens or hundreds of people. So we made a sort of economic trade-off there, which we think was the right trade-off. But what that meant was it’s a little bit more work if you’re a programmer or web developer to try to actually consume some of this content. But fortunately it means that people can also publish open source libraries to consume this content in various different languages and such. So one of the things we’ve done is in each microformat specification — hCard, hCalendar — there’s an implementation section. And if you look in those sections, you’ll see that folks have already published a lot of open source to consume hCard, hCalendar, hReview, xFolk, etc. It’s been really impressive what the community’s been coming up with in publishing. When I showed you the demo of importing those hCards into your address book, that’s all based on open source. So, yes, Technorati is hosting a service to do that nice and efficiently so anyone can create a link to automatically import, let their users add their hCards to their address books. But all of that’s based on open source. It’s just an XSLT transform that’s in PHP, and any web developer that wants to can go get that and deploy on their own site or do their own service, whatever.
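To give a feel for the consuming side, here is a rough sketch of pulling an hCard out of a page and emitting a vCard. The service described above is an XSLT transform; this sketch swaps in Python's standard `html.parser` instead, purely for illustration. The class name `HCardParser`, the helper `to_vcard`, and the sample markup are invented for the example. It assumes well-formed HTML, handles only a handful of properties, keeps only the first value of each, and skips vCard details such as the `N` field and value escaping.

```python
from html.parser import HTMLParser

TEXT_PROPS = ("fn", "org", "title", "tel", "email")  # subset of hCard properties
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HCardParser(HTMLParser):
    """Collects a few hCard properties from elements inside class="vcard"."""

    def __init__(self):
        super().__init__()
        self.cards = []     # finished cards, one dict each
        self._card = None   # card currently being read, if any
        self._stack = []    # (is_card_root, property) per open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void elements never get a matching end tag
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        is_root = self._card is None and "vcard" in classes
        if is_root:
            self._card = {}
        prop = None
        if self._card is not None:
            if "url" in classes and attrs.get("href"):
                self._card.setdefault("url", attrs["href"])
            # Only the first occurrence of each text property is captured.
            prop = next((c for c in classes
                         if c in TEXT_PROPS and c not in self._card), None)
            if prop:
                self._card[prop] = ""
        self._stack.append((is_root, prop))

    def handle_data(self, data):
        if self._card is None:
            return
        for _, prop in self._stack:
            if prop:
                self._card[prop] += data

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._stack:
            return
        is_root, _ = self._stack.pop()
        if is_root:
            # Collapse runs of whitespace before storing the card.
            self.cards.append({k: " ".join(v.split()) for k, v in self._card.items()})
            self._card = None


def to_vcard(card):
    """Render a parsed card as simplified vCard 3.0 text."""
    fields = [("fn", "FN"), ("org", "ORG"), ("title", "TITLE"),
              ("tel", "TEL"), ("email", "EMAIL"), ("url", "URL")]
    lines = ["BEGIN:VCARD", "VERSION:3.0"]
    lines += [f"{label}:{card[key]}" for key, label in fields if key in card]
    lines.append("END:VCARD")
    return "\r\n".join(lines)


html = """
<div class="vcard">
  <a class="url fn" href="https://example.com/">Pat Example</a>,
  <span class="title">Editor</span> at <span class="org">Example Co</span>
</div>
"""
parser = HCardParser()
parser.feed(html)
vcards = [to_vcard(card) for card in parser.cards]
print(vcards[0])
```

The point of the trade-off described above shows up here: the publisher only added three class names to ordinary markup, while the consumer has to do the tree-walking.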
Next question. Okay, right here.
The question was: are there any places like Facebook that have adopted things like hCard to make it easier to get the information out, sort of move it around? Why don’t I pass that off to Norm, who I think mentioned that briefly before, maybe he can expand upon it. What experience have you had with that, Norm?
Mark Norman Francis: None that I’m aware of.
Tantek Çelik: Well with respect to Yahoo, in particular. Doing contact information.
Mark Norman Francis: That’s just businesses not people.
Tantek Çelik: Oh, Okay.
Mark Norman Francis: Yeah, I can’t speak for the US web developers. I know one of, or at least one of them, is interested in microformats and talking internally about microformats. But in the US it’s seen slightly more politically, I think. So they’re not rushing ahead into it. They’re thinking of it a lot more. In Europe, we’re mental. [laughter] We just do what we want to do.
Tantek Çelik: I think Chris has something to add too.
Chris Messina: So there are a couple implementations out there and the funny thing is that you can run into these things randomly. One of them is a service called Plazes, which is kind of like a location service to track where you are, for your own purpose. They have or they will be implementing hCard all over the place and we’ve actually been working with Flickr, for example, to implement hCard. So if you can imagine going to your Flickr’s contacts page and having all of your Flickr’s contacts be added to your browser or your address book, and then being able to go to Facebook or being able to go to Upcoming or any site that espouses the social networking idea. Being able to bring your buddies with you is one of the things that we really want to solve because it’s ridiculous that you have to recreate your buddy list every time you start a new service.
Some of the other things that we’re doing, and part of my work as being an open source ambassador, besides partying, is to go out and work with the publishers. So, for example, working with Drupal, working with WordPress, those folks who are actually creating the tools that create content on the web or allow you to publish to the web are going to start getting microformats implementation so that you create an event in Drupal, voila! You’ve got an hReview. Add a person, add a contact, voila! Same thing’s true for Flock. The blogging tool in Flock right now supports tags but eventually we would like to see customizability where you can add all kinds of different microformat creators. So when you do a blog post or you want to add a contact to your blogroll and so on, you can do that. And using something like XFN you can actually start to show relationships between sites and so on. You can start to create your own profile.
Another implementation by Terrell, I think, who’s here, is ClaimID, that’s ClaimID.com. And what they allow you to do is actually put all the links where you’re publishing on the web like your Flickr stream and your Upcoming accounts and your blog, and so on and so forth, and they mark it all up in hCard. And so effectively you have one place on the web that has all your information. Someone can go there, add that card to their address book, subscribe to it, and then you got one place where you can actually get all this information about where people are, to get a much richer picture of who people are.
Tantek Çelik: Let me show you guys a pointer to where you can find even more sites that are publishing like legions, thousands or tens or hundreds of thousands of hCards as well. Which is if you go to microformats.org/wiki/hcard which is the hCard specification, you’ll see that aside from the format we’ve also got examples and examples in the wild, and a very long list. In fact, we’ll probably have to pretty soon move this to another page on the wiki, reorganize it as it evolves. With all kinds of different sites that have added microformat support, a few of these I just want to call out are Avon. So everyone’s heard of Avon the make-up company or Avon Calling. We got this email out of the blue from the webmaster of Avon, saying “Hey, I just added hCard to all 40,000 Avon representatives’ home pages.” So even outside of this whole web design/dev community that we got here, there’s folks in totally different sites and business that are recognizing that it’s easy to do, there are advantages to doing it. The Iowa military veterans’ band have marked up their contacts with hCard, now who knew? The University of Bath. Their people search functionality has added hCard support so that when you get the return results of like when you’re looking for someone on campus (you mentioned Facebook) they pop up in hCard format. So you can add them to your address book as well. It’s literally, there’s more and more sites coming on line so fast with more and more hCard support, it’s really hard to keep track of them all. Just keep looking on there.
Over here on the left.
So the question was: can you talk a little about the structured blogging initiative, and what’s the relationship of that to microformats. So initially structured blogging was another different way of trying to publish structured information on the web, on blogs in particular, using XML embedded in various script tags, which we actually frown upon because it duplicates the information, puts it in a hidden location. But what they’ve done is they’ve actually updated, they’re now focusing on implementation. So structured blogging has a WordPress plug-in, and Movable Type plug-ins, and I believe they’re working on Drupal, I’m not positive, you’re going to have to check out the site, that supports a whole bunch of different microformats, so hCard, hCalendar, hReview. I believe they support a whole bunch more. If you want to find out more about that implementation, check out structuredblogging.org. So that’s essentially what they’ve decided to focus on, more implementation rather than the formats.
So that was coming from the back, from Marc Canter, who said that as many microformats as we could find, we would support, including hAtom. So one of the more recent microformats that’s been developed and published is a microformat version of the Atom standard. So literally if you want to add Atom 1.0 support to your site, all you have to do is add a few class names to your pages and run it through a similar converter and you can have an Atom feed.
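As a rough picture of what such a converter does once the class-named values have been pulled out of the page, the sketch below maps a few hAtom class names onto Atom 1.0 elements using Python's standard `xml.etree.ElementTree`. The function `build_feed`, the dictionary layout, and the sample entry are assumptions for illustration only. The extraction step itself is assumed to have happened already, and a real feed would also need details such as author information.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# hAtom class name -> Atom element name (a small subset)
CLASS_TO_ATOM = {
    "entry-title": "title",
    "entry-summary": "summary",
    "updated": "updated",
    "published": "published",
}


def atom(name):
    """Return the namespace-qualified tag for an Atom element."""
    return f"{{{ATOM_NS}}}{name}"


def build_feed(feed_title, feed_url, updated, entries):
    """Build an Atom document from values already extracted from hAtom markup.

    Each entry is a dict keyed by hAtom class name, plus "bookmark" for the
    permalink found on the rel="bookmark" link.
    """
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element(atom("feed"))
    ET.SubElement(feed, atom("title")).text = feed_title
    ET.SubElement(feed, atom("id")).text = feed_url
    ET.SubElement(feed, atom("updated")).text = updated
    for values in entries:
        entry = ET.SubElement(feed, atom("entry"))
        # The permalink doubles as the entry id in this sketch.
        ET.SubElement(entry, atom("id")).text = values["bookmark"]
        ET.SubElement(entry, atom("link"), href=values["bookmark"])
        for class_name, element_name in CLASS_TO_ATOM.items():
            if class_name in values:
                ET.SubElement(entry, atom(element_name)).text = values[class_name]
    return ET.tostring(feed, encoding="unicode")


# Hypothetical values, as if read from one element marked class="hentry".
entries = [{
    "bookmark": "https://example.com/2006/03/hello",
    "entry-title": "Hello",
    "entry-summary": "A first post.",
    "updated": "2006-03-13T10:00:00Z",
}]
feed_xml = build_feed("Example blog", "https://example.com/", "2006-03-13T10:00:00Z", entries)
print(feed_xml)
```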
So one thing I want to point out to you folks is that here on the home page of microformats.org there’s a little subscribe link. If you click on that it’ll open up in iCal, and this is of course using microformats, using the same sort of thing, and you’ll see all the upcoming events that are related to microformats coming up here including, for example…
And I think we might have time for one more question. Who’s got a good one? Make it good. How about you over here.
Have you heard the term Roach Motel before? Absolutely, this seems like an incredible option to break it free, we couldn’t agree with you more. One of the big reasons that a lot of us got really excited about microformats is that we’re tired of entering our information into all these different sites over and over and over again and maybe even having it locked up there and not being able to get it out. Just like RSS has essentially opened up the whole world of syndication, opened up all these different mashups and allowing data to mix back and forth. Data about syndicated content, we said “you know what, if it’s information about people, about events, about reviews, we don’t want that information locked up somewhere, we want it so that it can move around easily.” But for it to move around easily you need standards. So we went and worked on microformats. So yeah, this is exactly one of the key benefits: we want to make sure that we enable sites to not become roach motels but rather instead syndicate their data, just like they’re syndicating their RSS, and we wanted to make users more aware as well that you should be using sites that actually let you take your data in and out rather than just sites that lock up your data and hold it hostage.
Jeremy Keith: Roach motels are so Web 1.0.
Tantek Çelik: Yes.
The question is: what about Google Base? I heard a few comments about roach motels from the audience. That’s an excellent question. So there was an SDForum Search SIG several weeks ago on classifieds and things like that, and Google Base, and they presented. And we asked them, “Is Google Base ever going to open up their information to be crawled?” Because right now, any information that you submit to Google Base, no other service can pull that out, which seems kind of roach motel-ish to us. But they claim that they’re going to open that up in the future. In what format? We have no idea. But we can only be hopeful that they’ll do the right thing and not be evil.
All right, so on that note, thank you all for coming, big hand of applause to our presenters. [applause]
And go out there make some microformats and join the community and help participate and contribute. Thank you.