Copyright © 2007 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This document sets out the use cases and requirements that have motivated the development of the Protocol for Web Description Resources (POWDER). The use cases address social and commercial needs to provide information about groups of Web resources, such as those available from a Web site, to aid the annotation and/or personalization of content for end users in varying delivery contexts.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is a W3C Working Group Note of the POWDER Use Cases and Requirements, developed by the POWDER Working Group as part of the Semantic Web Activity. It is the third version of the document to be published, the revision being carried out largely to accommodate a new variant on the accessibility use case that the working group felt was important.
The Working Group believes the document to be stable and therefore to be suitable as the basis for the group's ongoing work. The differences between this document and the two previous versions can be found in the change log.
Please send comments about this document to [email protected] (with public archive).
Publication as a Working Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
The development of the Protocol for Web Description Resources has been motivated by both commercial and social concerns. On the social side, there is a demand for a system to identify content that meets certain criteria as they apply to specified audiences. Commercially, there is a demand to be able to personalize content for a particular user or delivery context.
POWDER will address these demands by defining a method through which relatively small amounts of metadata, that can be produced quickly and easily, can be applied to large amounts of content.
The use cases and requirements for POWDER were originally developed under the Web Content Label Incubator Activity. They have been revised and updated for this Working Group Note.
The generic use case for profile matching is that a user receives content suited to their delivery context; that is, the combination of user preferences, device capabilities and current state at the time of content delivery. Description Resources facilitate this decision by making available rules about groups of Web resources to a Web Server. At request time the Web Server can determine if there are any rules in the DR which apply to the requested URI, and respond to the User accordingly.
This extended use case refers to the following terms.
User: A human who perceives and interacts with the Web.
Device: Apparatus through which a user can perceive and interact with the Web
Server: The role adopted by an application when it is supplying resources or resource manifestations.
Network Operator: A mobile telephony and data infrastructure provider.
Adaptation: a process of selection, generation or modification of Web content to suit the given delivery context.
Web content: any resource retrievable via a URI over the World Wide Web intended for direct user consumption (Web pages and audio/visual media types).
Then either
or
This generic example uses the phrase 'delivery context'. This means different things in different circumstances. For example, in sub use cases 2.1.2 and 2.1.3 it means the characteristics of the device being used to access the content, in 2.1.4 it's the connection bandwidth that is important, whereas in sub uses cases 2.1.5 - 2.1.7 the delivery context is related to a profile of the end user. It is assumed that the relevant features of the delivery context will be determined using a technique appropriate to the use case. For example, device characteristics are likely to be retrieved from a repository whilst child protection software is likely to work with several sources of data in addition to Description Resources. Matching of resources against end users is likely to involve reference to an Identity Management System which is beyond the scope of POWDER.
Motivates: 3.1.1 Making Assertions; 3.1.2 The Role of a Description Resource; 3.1.3 Grouping; 3.1.4 Composite Assertions; 3.1.8 Reference; 3.1.9 Standard Vocabularies; 3.1.10 Identity; 3.1.11 Unambiguous; 3.2.1 Authentication; 3.2.2 Separation of Description and Resource; 3.3.1 Machine-Readable; 3.3.2 Formal Grammar;
There are several possible models in which assertions and claims can be made, authenticated and reported to the end user. Each of the following has several elements in common but differs in details such as whether it is the content provider or the trust mark operator that makes the original claim, whether the data is stored on the trust mark operator's servers or alongside the content itself, and whether the trust mark operator provides the description or the authentication for a description.
Joseph installs a web browser plugin on his personal computer designed to aggregate and interpret safety/reliability information about Web sites from various sources, such as reputation and accreditation services. The web browser plugin provides a visual indication of whether the Web site that Joseph is currently visiting is considered trustworthy or not.
The plugin retrieves information about Web sites using several methods. One of these methods involves querying the Description Resource associated with a Web site.
Joseph visits a Web site to do some holiday shopping. The browser plugin identifies the site's Description Resource, which asserts that the Web site has been certified by an accreditation service and provided with a trustmark.
The plugin determines that the accreditation and trustmark are provided by a known entity with a validation mechanism in place. The plugin queries the accreditation service provider by submitting the assertion of the Web site's accreditation.
The accreditation service provider validates the assertion of accreditation and provides a graphic file containing a trustmark. The plugin displays the trustmark to Joseph along with a visual indication that the Web site has a good reputation.
This use case refers to the following terms.
A government department (regulator) is responsible for overseeing the accessibility of all Web sites produced by different levels of government: national and local. It has approved a number of private-sector companies to carry out accessibility evaluation work but there is a need for some mechanism for the evaluators to label the sites in a reliable and machine-readable way, to allow automated monitoring.
The evaluators provide Web sites with labels. Web site administrators embed links to the labels in their pages. A robot used by the government department regularly crawls the Web sites and reads the labels, and checks the data for authenticity and validity using a web service provided by a third-party Authentication Service. A report generator produces progress reports and violation notices based on the information from the data.
The Example Trustmark Scheme reviews online traders, providing a trustmark for those that meet a set of published criteria. The scheme operator wishes to make its trustmark available as machine-readable code as well as a graphic so that content aggregators, search engines and end-user tools can recognize and process them in some way.
The trustmark operator maintains a database of sites it has approved and makes this available in two ways:
Firstly, the trusted site includes a link to the database. A user agent visiting the site detects and follows the link to the trustmark scheme's database from which it can extract the description of the particular site in real time.
Secondly, the scheme operator makes the full database available in a single file for download and processing offline.
Since the actual data comes directly from the trustmark scheme operator, it is not open to corruption by the online trader and can therefore be considered trustworthy to a large degree. To reduce the risk of spoofing, however, the data is digitally signed.
Mrs. Bryanton teaches 8 year olds at her local school. An IT enthusiast, she makes her teaching materials available through her personal Web site. She adds a Description Resource to her material that declares the subject matter and curriculum area.
In order to gain wider trust in her work she submits her site for review by her local education authority. That body then publishes its own Description Resource that declares Mrs Bryanton's Description Resource to be accurate.
Motivates: 3.1.1 Making Assertions; 3.1.2 The Role of a Description Resource; 3.1.3 Grouping; 3.1.4 Composite Assertions; 3.1.5 Multiple DRs; 3.1.6 Independence; 3.1.7 Attribution; 3.1.8 Reference; 3.1.9 Standard Vocabularies; 3.1.10 Identity; 3.1.11 Unambiguous; 3.2.1 Authentication; 3.2.2 Separation of Description and Resource; 3.2.4 Link to Test Results; 3.2.5 Bulk Data Transfer; 3.3.1 Machine-Readable; 3.3.2 Formal Grammar; 3.3.3 Human Readable; 3.3.4 Images.
The use cases in this section highlight the potential benefit of Description Resources to content aggregators, search engines and related services.
Note, sections 2.3 and 2.4 in the previous version of this document have been rearranged but the content is the same except for use case 2.3.1 which is new.
Raman in Bangalore publishes a globally popular sports website that features Soccer, Gridiron, Gaelic football, and Australian rules football channels. Raman wants to ensure that when users around the world search for 'football' that the appropriate channel is included in the search results, based on their country location. As such he publishes an accurate description of each channel for search engines to process.
Gautam in London likes to keep up to date with sport. He enters the search term 'Football news' into a search engine which, based on his location, gives priority to those resources in its index that have metadata explicitly describing the content as being about the game of soccer. As such Raman's soccer channel is included in the results.
Bill in Silicon Valley also enters the search term 'Football news' into a search engine which, based on his location, gives priority to those resources in its index that have metadata explicitly describing the content as being about the game of Gridiron; and Raman's Gridiron channel is included in the results.
Fred operates an antiracism education site which aggregates and curates content from around the Web. Fred wants to label the resources that he aggregates such that educational and other institutions may harvest the resources and associated commentary and metadata automatically for reuse within their instructional support systems, etc.
One of the ways in which Fred wants to curate resources is to say about them that they are pedagogically useful but politically noxious. For example, some sites on the Web make claims about Martin Luther King, Jr that are motivated by a racist ideology and are historically indefensible. Fred's vocabulary allows him to claim that such resources are pedagogically useful for purposes of analysis, but that they are otherwise suspicious and should only be consumed by students in an age-appropriate manner or with appropriate supervision, etc. In other words, Fred needs to be able to make sharply divergent claims about resources: (1) that they are noteworthy, and (2) that they are, from his perspective, dangerous or noxious or troublesome.
The social book-marking site tags.r.us allows their users to tag any resource and so provides a service through which people can annotate both their own and others' resources.
Anders, a zoologist and tags.r.us user, finds a website about the dahut, an allegedly undescribed animal that lives in the French Alps. Anders wants to make sure that it is understood by readers that this is a fictional character, but also that it is interesting to understand the full spectrum of cryptozoological thinking, and thus tags it "fictional".
The word "fictional" is not very useful without context, so to enable such user-defined tags to be shared with others, tags.r.us allows users to assign a link between their own tags and a Description Resource, that provides the context that it is about an alleged fictional animal. An agent can thus use the tag as appropriate, processing the explicit semantics provided by the DR but perhaps presenting other users with Anders' original tags.
This use case is very similar to the previous one. The important difference between them being that in 2.3.2 the description is made using only explicit semantics. In this use case, user-defined tags (i.e. free text) are used but these are then associated with a semantically explicit description. In both cases, the opinions expressed are relatively complex so that the semantics are critical. Furthermore, these are also scenarios where there is unlikely to be any relationship between the content provider and the individual describing the content.
Dave Cook's Web site offers reviews of children's films and the site is summarized in both RSS and ATOM feeds. Most of the films reviewed have an MPAA rating of G and/or British Board of Film Classification rating of U. This is declared in a rating for the channel as a whole. However, Dave includes reviews of some films rated PG-13 or 12 respectively which is declared at the item level and overrides the channel level metadata.
The actual rating information comes from an online service operated by the relevant film classification board itself and is identified using a URI and human-readable text. The movie itself is identified by either an ISAN number or the relevant Internet Movie Database entry ID number. Trust is implicit given the source of the data, which is indicated by a link to Dave's site's policy.
Separately, Fred combines Dave Cook's and other review feeds to provide alternative reviews of the movies by transforming the ATOM feeds into RDF and creating an aggregate view using SPARQL queries.
Motivates: 3.1.1 Making Assertions; 3.1.2 The Role of a Description Resource; 3.1.3 Grouping; 3.1.5 Multiple DRs; 3.1.6 Independence; 3.1.7 Attribution; 3.1.8 Reference; 3.1.9 Standard Vocabularies; 3.1.10 Identity; 3.1.11 Unambiguous; 3.2.1 Authentication; 3.2.2 Separation of Description and Resource; 3.2.3 Default Description; 3.2.5 Bulk Data Transfer; 3.3.1 Machine-Readable; 3.3.2 Formal Grammar; 3.3.3 Human Readable; 3.3.5 User-Generated Tags.
A company named Advance Medical Inc. reviews medical literature on the Web based on a range of quality criteria such as the qualifications of the author(s), the methodology used and the research evidence presented. The criteria may be changed according to current scientific and professional developments. The review process leads to medical literature being classified in two ways:
Quality of Content
Level A: Excellent
Level B: Good
Level C: Acceptable
Peer Review
Level A: Content has been subjected to peer review
Level B: Content has not been subject to peer review
The Quality of Content classification is scalar. i.e. meeting the criteria for Level A implies also meeting Level B which in turn implies meeting Level C. In contrast, meeting Level A for Peer Review does not imply meeting Level B.
The company produces data that declares the classification levels and provides a summary of each document it has reviewed. The data is stored in a metadata repository which can be accessed via the Web.
M.D. Smith uses the data in the repository to make decisions about heath care for specific clinical circumstances.
Motivates: 3.1.1 Making Assertions; 3.1.2 The Role of a Description Resource; 3.1.3 Grouping; 3.1.4 Composite Assertions; 3.1.5 Multiple DRs; 3.1.6 Independence; 3.1.7 Attribution; 3.1.9 Standard Vocabularies; 3.1.10 Identity; 3.1.11 Unambiguous; 3.2.1 Authentication; 3.2.2 Separation of Description and Resource; 3.2.4 Link to Test Results; 3.2.5 Bulk Data Transfer; 3.3.1 Machine-Readable; 3.3.2 Formal Grammar; 3.3.3 Human Readable; 3.3.4 Images
VLCC, (the Very Large Content Company) offers millions of items of content which are delivered through a variety of branded channels. Its strict editorial policies dictate that before publication, all content is reviewed by a member of the editorial team who checks for compliance with those policies. This is encoded in a description covering all its brands that states "VLCC works to ensure that all its content meets W3C Web Accessibility Initiative level AA and is suitable for all audiences unless otherwise stated. If you find any of our content does not meet these standards, please contact us."
The editor is responsible for adding two further descriptions:
Motivates: 3.1.1 Making Assertions; 3.1.2 The Role of a Description Resource; 3.1.3 Grouping; 3.1.4 Composite Assertions; 3.1.5 Multiple DRs; 3.1.7 Attribution; 3.1.8 Reference; 3.1.9 Standard Vocabularies; 3.1.10 Identity; 3.1.11 Unambiguous; 3.2.1 Authentication; 3.2.2 Separation of Description and Resource; 3.2.3 Default Description; 3.2.4 Link to Test Results; 3.3.1 Machine-Readable; 3.3.2 Formal Grammar; 3.3.3 Human Readable; 3.3.5 User-Generated Tags.
The following requirements are derived from the preceding use cases. They have been assigned to thematic groups as an aid to readability.
It must be possible for both resource creators and third parties to make assertions about information resources.
A Description Resource, DR, must be able to describe aspects of a group of information resources using terms chosen from different vocabularies. Such vocabularies might include, but are not limited to, those that describe a resource's subject matter, its suitability for children, its conformance with accessibility guidelines and/or Mobile Web Best Practice, its scientific accuracy and the editorial policy applied to its creation.
It must be possible to define sets of resources and have DRs refer to those sets. For example, DRs can refer to all the pages of a Web site, defined sections of a Web site, or all resources on multiple Web sites.
DRs must support a single composite assertion taking the place of a number of other assertions. For example, WAI AAA can be defined as WAI AA [WAI] plus a series of detailed descriptors. Other examples include mobileOK and age-based classifications from a named authority.
It must be possible for more than one DR to refer to the same resource or group of resources.
Furthermore, it must be possible for a resource to refer to one or more DRs. It follows that there must be a linking mechanism between content and descriptions.
DRs must be able to point to any resource(s) independently of those resources.
A DR must include assertions about itself using appropriate vocabularies. As a minimum, a DR must have data describing who created it. Good practice would be to declare its period of validity, how to provide feedback about it, who last verified it and when etc.
It must be possible for a DR to refer to other DRs or other sources of data that support the claims and assertions made.
There must be standard vocabularies for assertions about DRs.
DRs, their components and individual assertions should have unique and unambiguous identifiers.
Assertions within DRs should be made using descriptors that themselves have unique identifiers.
It must be possible for DRs to be authenticated.
It must be possible to create and edit DRs without modifying the resources they describe
It must be possible to identify a default DR for a group of resources and provide an override at specific locations within the scope of the DR.
It must be possible to link DRs with specific test results that support the claims made.
It must be possible for a data provider to make its repository of Description Resources available as a bulk download.
It must be possible to express DRs in a machine-readable way.
The machine-readable form of a DR must be defined by a formal grammar.
DRs must provide support for a human readable summary of the claims it contains.
It must be possible to associate DRs with images.
It must be possible to encode user-generated tags in DRs.
The editor acknowledges the contributions of members of the POWDER WG and the WCL-XG in compiling this document. In particular Dan Appelquist, Dave Rooks, Pantelis Nasikas, Kjetil Kjernsmo, Kai-Dietrich Scheppe, Kendall Clark, Jo Rabin, Kevin Smith, Alan Chuter, Zeph Harben, Liddy Nevile and Diana Pentecost.
The following changes have been made since the previous version of this document was published.