!ENTITY draft.day "17"> <!ENTITY draft.month "08"> <!ENTITY draft.monthname "August"> <!ENTITY draft.year "2006"> <!ENTITY iso6.doc.date "&draft.year;-&draft.month;-&draft.day;"> <!ENTITY http-ident "http://www.w3.org/2001/tag/doc/URNsAndRegistries-50"> <!ENTITY nri "myRI"> <!ENTITY nris "myRIs"> <!ENTITY an "a"> <!ENTITY An "A"> ]>

URNs, Namespaces and Registries &http-ident;-&iso6.doc.date; [Editor's Draft] TAG Finding CVS $Id: URNsAndRegistries-50.xml,v 1.21 2006/08/17 19:23:58 dorchard Exp $ &http-ident;-&iso6.doc.date; XML &http-ident; Henry S. Thompson University of Edinburgh [email protected] David Orchard BEA Systems [email protected]

This finding addresses the questions "When should URNs or URIs with novel URI schemes be used to name information resources for the Web?" and "Should registries be provided for such identifiers?". The answers given are "Rarely if ever" and "Probably not". Common arguments in favor of such novel naming schemes are examined, and their properties compared with those of the existing http: URI scheme.

Three case studies are then presented, illustrating how the http: URI scheme can be used to achieve many of the stated requirements for new URI schemes.

HST 2006-03-14 Further to a request from Roy Fielding, I had a brief look at XCAP, seems to be using http: URIs now, although it introduces a new Application UID registry, and uses ietf: URNs for its namespaces. . . If anyone (including Roy) remembers what Roy was particularly concerned at here, please let me know.

This document has been produced by the W3C Technical Architecture Group (TAG). This finding addresses TAG issue URNsAndRegistries-50.

This is the third draft of this finding, with the first section complete and adding three case studies. This finding is an editorial draft, not yet accepted by the TAG.

Additional TAG findings, both accepted and in draft state, may also be available. The TAG expects to incorporate this and other findings into [what?] that will be published according to the process of the W3C Recommendation Track.

HST 2005-03-29 Are we ready to tell the world what will follow AWWW?

Please send comments on this finding to the publicly archived TAG mailing list [email protected] (archive).

Edinburgh et al.: World-Wide Web Consortium, Draft TAG Finding, 2005.

Created in electronic form.

English

Complete section 2 and edit DO's new case studies.
2006-04-03: Further work, including some in response to DO's message of 23 March
2006-03-14: Return to this following discussion at f2f
2005-04-12: add DO as editor
2005-04-11: Fold in DO's comments
2005-04-05: Second draft: More on XRIs
2005-03-29: First internal draft

Introduction

In we find the following recommendations:

"A URI owner SHOULD NOT associate arbitrarily different URIs with the same resource."

"A specification SHOULD reuse an existing URI scheme (rather than create a new one) when it provides the desired properties of identifiers and their relation to resources."

"Agents making use of URIs SHOULD NOT attempt to infer properties of the referenced resource."

"A URI owner SHOULD provide representations of the resource it identifies."

Recently, however, a number of proposals have emerged to create new identification mechanisms for the Web. They propose new URN (sub-)namespaces or URI schemes and provide registries for instances thereof, in order to allow them to be used to identify and retrieve information resources. This would appear to be incompatible with 's simple positive recommendations. In this finding we enumerate the arguments given in favor of these new proposals, which often turn out to be arguments against using http: URIs, and explain why they are mistaken and how the above principles can be understood to point the way constructively to alternative designs which do in fact make use of http: URIs.

Examining the need for new approaches to naming information resources

This section is structured in terms of goals or requirements for resource identification mechanisms which have been offered as justifications for adopting a new approach. They are drawn from a number of recent proposals (, , , , ) abstracting, merging and summarizing them. Although we will examine some of these proposals in specific detail in the three case studies below, in this section we will use the name myRI as a cover term for this general class of proposed alternatives to http:, both those proposing new URI schemes and those proposing new URN sub-schemes. In each case we state a requirement and examine the extent to which the existing http:-based identifier mechanism addresses it.

Persistence

The relation between &nris; and the information resource they identify should persist indefinitely.

Or, more realistically, that individual &nris; should manifest syntactically whether or not they are intended to persist indefinitely.

This goal is difficult to get to grips with, as it appears to mean different things in different contexts:

At its simplest, this is just a wish for an end to 404 Not Found, i.e. that you should always be able to resolve &an; &nri;.

In the Information Science community, 'persistence' is a stronger requirement, namely, that what you get when you resolve &an; &nri; should never change.

http: URIs support persistence as well as it is possible in practice to do so.

As has been frequently observed, achieving either of the numbered types of persistence above is not a technology issue, it's a management issue. It's up to the owners and operators of the mechanisms which implement &nri; resolution to enforce whatever degree of persistence they choose. It follows that there is no difference here between &nri; and http:.

What of the more sophisticated reading, that &an; &nri; should manifest its minter's intentions with respect to persistence? That's just a matter of naming conventions, and perfectly possible using http:. We could, for example, say that all versionable/time-varying resources on our site are named with all lower-case letters, and all persistent/stable/non-varying resources are named with all upper-case letters.
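
A minimal sketch of how a client might read such a convention, assuming the hypothetical site rule just described (all upper-case path letters signal a persistent resource, all lower-case letters a time-varying one). The function name and the example URIs are illustrative assumptions, not part of any specification.

```python
from urllib.parse import urlsplit


def persistence_intent(uri: str) -> str:
    """Classify an http: URI under the hypothetical case convention.

    Only letters in the path are considered; digits and punctuation
    are ignored, and the host name plays no part.
    """
    letters = [c for c in urlsplit(uri).path if c.isalpha()]
    if letters and all(c.isupper() for c in letters):
        return "persistent"
    if letters and all(c.islower() for c in letters):
        return "time-varying"
    # Mixed case or no letters: the convention says nothing.
    return "unspecified"
```

The convention lives entirely in the naming authority's own policy; nothing in the http: scheme itself needs to change to support it.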

Standardized

&nris; should be susceptible to standardization within administrative units

This goal appears to be directed at guaranteeing certain invariants, for example with respect to the structure of identifiers and the availability of the resources they identify. This means they should not be creatable in a distributed or unsupervised fashion.

Again, this is largely a management issue, not a technical one. Whatever invariants are in view can as well be enforced on (sub-parts of) http:-served resource collections as on those identified via &nris;.

Nothing in a specification can stop people from uttering URIs of any kind. Domain names are as good, or as bad, at conveying ownership of a particular form of URI as URN namespaces or URI schemes.

Centralized authorities can be established for parts of domain space as easily as for areas "off the web", and enforcement mechanisms can be as effective. For example, my employers constrain the mechanisms by which web pages are accepted for serving from certain parts of their domain so as to enforce invariants both of path structure and content markup.

Protocol Independence

Access to resources identified by &nris; should not be dependent on any particular protocol.

Exactly what this means is not clear: although it is listed as a requirement in several cases, there is little or no discussion of why it should be a requirement for &nris;.

http: URIs are no more protocol-dependent than any other identification mechanism.

For pure naming, that is, if retrieval is never intended, http: is as good as any &nri; approach, because no protocol at all is involved. If retrieval is anticipated, then any &nri; approach must specify a mapping to one or more protocols. All existing &nri; approaches in practice specify only one such mapping, to the HTTP protocol. So they are in exactly the same position as http: -- if for some reason in the future the HTTP protocol becomes unavailable or inappropriate, both &nris; and http: will have to specify a new mapping.

True protocol independence is difficult to imagine in practice, as many protocols depend on a tight coupling between message formats and client/server application models. Protocols which don't allow servers any escape mechanism are thereby pretty much ruled out as transports for retrieval from &nris; (or http: URIs).

It's appropriate to note here that in cases where the necessary form of client/server interaction for a particular kind of information resource, for example streaming video, cannot be provided by the protocols normally associated with existing URI schemes, new schemes may be appropriate. Detailed discussion of this point can be found in . But none of the &nri; proposals are for resources of this kind.

Location Independence

&nris; should not be locations.

Practical realities and administrative changes will always defeat any attempt to guarantee that the representation of a particular resource will always be stored in exactly the same host/server/filestore/directory/file. Any naming mechanism which equates locations in that sense with names is by construction inadequate. It follows that this goal is a sensible one.

http: URIs are not locations.

Misunderstanding of http: URIs as locations has a long and, in part, justifiable history (they were, after all, originally called Uniform Resource Locators). But it's no longer justifiable either in principle (the RFC for URIs is quite clear on the subject) or in practice (there's lots of software support for server-side management of the relationship between http: URIs and their representations). See for example the classic for a more detailed discussion of these points.

Structured names

&nris; should provide for structuring resource identifiers with shareable tags

This requirement has only been suggested by the authors of . It amounts to a wish to structure resource names using name/value pairs, with the names having some standardized, widely understood meaning. This requirement is related to requirements appealed to in the design of End Point References , .

The query component of http: URIs supports non-hierarchical structured naming.

It is open to any naming authority to establish conventions for the use of the query component of http: URIs under its control. Since the query component is already structured in terms of simple name/value pairs, it is a good fit for the requirement.
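
As an illustration, a naming authority could mint and read back tag-structured names using nothing more than the query component. The sketch below uses Python's standard urllib.parse; the base URI and the tag names (type, year) are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical naming authority and path; any http: URI under the
# authority's control would do.
BASE = "http://example.org/catalogue"


def mint(tags: dict) -> str:
    """Build an http: URI whose query component carries name/value tags."""
    # Sort the pairs so the same tags always produce the same URI string.
    return BASE + "?" + urlencode(sorted(tags.items()))


def tags_of(uri: str) -> dict:
    """Recover the name/value tags from the query component."""
    # parse_qs returns a list of values per name; keep the first of each.
    return {name: values[0] for name, values in parse_qs(urlsplit(uri).query).items()}
```

Sorting the pairs is a design choice for this sketch: it avoids minting two different URIs for the same set of tags.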

Uniform access to metadata

&nris; should provide for access to metadata about a resource as well as to its representations.

Several &nri; proposals establish a constructive relation between the &nri; for a resource and the &nri; for metadata about that resource.

Naming conventions or response headers can provide this already

Naming authorities can impose such constraints on the http: URIs under their control. Alternatively, and particularly where it is appropriate to allow for meta-metadata, etc., the Link: response header may provide equivalent functionality in a more extensible way.
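
Both options can be sketched in a few lines. The ".meta" suffix and the choice of the "describedby" relation below are assumptions made for illustration only; a real naming authority would pick its own convention and relation type.

```python
def metadata_uri(resource_uri: str) -> str:
    """Hypothetical naming convention: metadata lives at '<resource>.meta'."""
    return resource_uri + ".meta"


def link_header(resource_uri: str) -> str:
    """Value of a Link: response header pointing at the metadata.

    The relation type 'describedby' is an assumption of this sketch.
    """
    return '<%s>; rel="describedby"' % metadata_uri(resource_uri)
```

Because the metadata is itself named by an http: URI, the same function applies again to obtain a name for meta-metadata.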

Flexible Authority

&nris; require different approaches to identifying namespace authorities, in some cases simpler and in others richer than that provided by hierarchical domain names administered by IANA and resolved via DNS.

http: URIs can encode arbitrarily complex (or simple) namespace authority expressions.

Complex encodings of dependent and delegated naming authority can be implemented using proxies and redirection. In the other direction, proper management of domain names for http: URIs can produce names which are very little different from the equivalent &nri; (compare e.g. http://lccn.info/2002022641 to info:lccn/2002022641), while gaining all http:'s benefits of scalability and installed base.
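
The closeness of the two forms in the lccn example can be made concrete with a small mapping function. This is only a sketch: it assumes, purely for illustration, that every info: namespace has a corresponding "<namespace>.info" domain, which is true of the example above but is not a general fact.

```python
def info_to_http(info_uri: str) -> str:
    """Map 'info:<namespace>/<identifier>' to 'http://<namespace>.info/<identifier>'.

    The '<namespace>.info' domain pattern is an assumption of this sketch.
    """
    scheme, _, rest = info_uri.partition(":")
    if scheme != "info" or "/" not in rest:
        raise ValueError("expected info:<namespace>/<identifier>")
    namespace, _, identifier = rest.partition("/")
    return "http://%s.info/%s" % (namespace, identifier)
```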

HST 2006-06-06 HST now owns lccn.info and oclcnum.info, will sell to Stuart Weibel for a modest consideration :-)
The value of http: URIs

The http: URI scheme implements a two-part approach to identifying resources. It combines a universal distributed naming scheme for owners of resources with a hierarchical syntax for distinguishing resources which share the same owner. Widely available mechanisms (DNS and web servers, respectively) exist to support the use of http: URIs to not only identify but actually retrieve representations of information resources.

Any requirement for naming resources, particularly if not only naming but also retrieval of representations is in prospect, which admits to a similar decomposition, that is, into a universal owner name and a hierarchical owner-relative name, can almost certainly be satisfied by the http: URI scheme. http: provides substantial benefits, in terms of installed software base, user comprehension, scalability and, if required, security, at very low cost.

Anyone developing an alternative approach, that is, some form of &nri;, should consider carefully whether that approach is either isomorphic to http:, or makes covert appeal to http: for its implementation. In either case, this strongly suggests that the fundamental requirements of the new approach do in fact admit to the two-part description given above, and therefore that http: itself would be a viable, and therefore a preferred, way forward.

The example in section above is illustrative of the benefits that the ubiquity of the installed base of support for http: provides -- within 15 minutes of registering the lccn.info domain, the http://lccn.info/ homepage had been put in place and was available to anyone with a web browser and access to the Web.

Case study: Naming namespaces

In this section we look in detail into some of the background assumptions for the utility of &nris; for one particular purpose, namely for naming namespaces. We will compare the use of http: and of &nris; for this purpose.

Context

The XML Namespaces specification is the context-defining specification for namespace names. It specifies that namespace names are for use in expanded names consisting of a namespace name (or no value) plus a local name. An expanded name may be compared against other expanded names. Very common scenarios are for performing well-formedness checking and for content model validation. The namespace specification, roughly speaking, says that a namespace name cannot be assumed to be dereferenceable. Any software component that is written assuming that any namespace name must be dereferencable is violating the namespace specification. It may be that the namespace owner has guaranteed that they will provide a document at the namespace name, but namespace owners are not required to do so and not all do. As a result of this, generic XML software should not be written to assume dereferencability of namespace names.

Any use of identifiers, from namespace names to isbn numbers to invoices, requires a context. The context will define the use of the identifier and includes social and technical context. A URI on the side of a bus will probably convey the social meaning that it can be typed into a browser. Other contexts for the use of URIs include namespace names, references to documents, and identifiers for things. It is never the case that a URI is simply "found" without a context.

Identification

First we examine the use of an http: URI for a namespace name. We will choose the OASIS WS-RM TC's HTTP Namespace name as an example.

Namespace with http: scheme <![CDATA[]]>

Compare this with the use of a urn: URI for the namespace name. We will use the OASIS UBL rules for namespace names. The UBL rules are roughly that the namespace names for UBL Schemas holding OASIS Standard status must be of the form: urn:oasis:names:specification:ubl:schema:<subtype>:<document-id>. For example, the first namespace name for the first major release of the Invoice document has the form urn:oasis:names:tc:ubl:schema:xsd:Invoice-1.0, such as:

Namespace with urn: scheme <![CDATA[]]>

In all XML namespace software, both approaches work correctly. The software-only interaction pattern is clearly erroneous if it assumes that a namespace name is dereferenceable, and it is unlikely that XML software written today requires this assumption be valid.
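
A short sketch of this point using Python's standard xml.etree.ElementTree: the parser treats both namespace names as opaque strings and builds expanded names from them without any network access. The two one-element instance documents and their element names are hypothetical.

```python
import xml.etree.ElementTree as ET

HTTP_NS = "http://docs.oasis-open.org/ws-rx/wsrm/200602"
URN_NS = "urn:oasis:names:tc:ubl:schema:xsd:Invoice-1.0"

# Hypothetical instance documents, for illustration only.
http_doc = '<a:Example xmlns:a="%s"/>' % HTTP_NS
urn_doc = '<b:Invoice xmlns:b="%s"/>' % URN_NS


def expanded_name(xml_text: str):
    """Return (namespace name, local name) of the document element."""
    # ElementTree reports qualified tags as "{namespace-name}local-name".
    tag = ET.fromstring(xml_text).tag
    if not tag.startswith("{"):
        return None, tag
    namespace, _, local = tag[1:].partition("}")
    return namespace, local
```

Comparison of expanded names is plain string comparison, so the choice of scheme makes no difference to this interaction pattern.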

Persistence of Identifiers

Let us now examine the persistence of the identifiers. The oasis URN namespace, as used in urn:oasis:names:tc:ubl:schema:xsd:Invoice-1.0, is assigned by the OASIS organization and registered with IANA. OASIS has the authority to change its identifying scheme, subject to IANA review. Additionally, the actual names are decided by OASIS. As with all URNs, the persistence of any particular identifier and scheme are up to the registering organization.

An http: URI for OASIS namespace names, such as http://docs.oasis-open.org/ws-rx/wsrm/200602, is assigned by the OASIS organization because it owns the oasis-open.org domain. They do not have to register the complete URI anywhere. OASIS has the authority to change the template on its own, without any review. Additionally, the actual URIs are decided by OASIS. As with all URIs, the persistence of any particular identifier and scheme are up to the owner of the domain. It is possible for the domain to cease being owned by OASIS, through lack of maintenance or even error.

We might imagine a scenario many years down the road where OASIS no longer exists. It would no longer maintain the oasis-open.org domain name, and http: identifiers using that domain would no longer be assigned. Likewise, OASIS would no longer mint any new URNs. In either case, the identifiers are not dereferenced, so all the existing software continues to work.

In URN and http: scheme cases, the persistence of the identifier is accomplished by the organization. The ongoing existence of the organization does not affect the persistence of the identifiers.

Dereferencability

But, if one of these identifiers appears in a document, how will a human find out the meaning? One approach is to examine the context surrounding the identifier, in this case the XML document and the Namespaces specification. The reader will look in the XML Namespaces specification and see what it says about namespaces. There is no benefit to xri: versus http: here, as the work of examining the XML document and the XML Namespaces specification is the same. Alternatively, the reader may try to dereference the namespace name, but it's not dereferenceable, so they get no information.

It is natural for a human reading an XML document with an unknown namespace name to want to understand more about the namespace. This is why recommends providing a document at a namespace name that provides both human and machine readable information. The use of http: namespace names enables 3 separate scenarios:

an identifier can be created in a decentralized manner;

an identifier may be dereferenced by a person via a browser to aid understanding;

an identifier may be dereferenced by a computer and exploited for automatic processing by reason of its identifying schemas, WSDLs, policies, etc.

These are two distinct interaction patterns, without and with human involvement.

In all dereferencable identifier scenarios, an identifier must be usable to generate an authority. There may be interactions with multiple authorities to determine the "final" authority for the identifier. The final authority uses the identifier to produce a document.

In the http: identifiers, the authority is specified immediately after the scheme. The authority system in http: URIs is the internet's DNS and IP systems. One or more DNS authorities produces an IP destination as the final authority. That authority is then sent the remaining part of the URI for dereferencing. In the case of http://docs.oasis-open.org/ws-rx/wsrm/200602, the HTTP interaction is

HTTP GET of namespace name
<![CDATA[
GET /ws-rx/wsrm/200602 HTTP/1.1
Host: docs.oasis-open.org
]]>
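
The split between authority and remainder described above can be sketched with Python's standard urllib.parse. The function name is hypothetical, and the sketch only builds the request text; it does not perform DNS resolution or open a connection.

```python
from urllib.parse import urlsplit


def http_request_for(uri: str) -> str:
    """Build the text of an HTTP/1.1 GET request for an http: URI."""
    parts = urlsplit(uri)
    # The authority (host) goes in the Host header; the rest of the URI
    # becomes the request target sent to that authority.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return "GET %s HTTP/1.1\r\nHost: %s\r\n\r\n" % (target, parts.netloc)
```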
Erroneous appearance of dereferencability of identifiers

A common reason given for needing &nris; for namespace names is that an http: identifier appears to humans as a location and hence dereferencable. The argument that http: URIs are "locations" is based upon incomplete understanding of the use of URIs. A classic scenario is that a human looks at an XML document using a word-processing application, and the application formats the value of an xmlns attribute as a hyperlink, because it is an http: URI, say http://example.org/ns/foo. But as there is no document dereferencable from that URI, when the user clicks on the link an HTTP 404 will be returned. The obvious downside is that the user has wasted some time, typically around 5-10 seconds. There is no harm beyond that in clicking and getting a 404.

Under what circumstances are identifiers viewed as "clickable", that is, what are the contexts? In this document, neither of the xmlns links has shown up as clickable. When these documents were pasted into an email, they were not converted to clickable. The http: link was converted to clickable only when the myns attribute was typed by hand and auto-complete was on. It was the e-mail program's "auto-complete" that saw an http: within a pair of quotes and made it clickable. It also required that rich text or HTML formatting be selected in both creator and receiver/viewer. When viewed in plain text, the link is not clickable. The clickable link arises when a document is typed by hand, with auto-complete turned on, and then viewed with HTML formatting. Neither of these applications is treating the document as XML; rather, they are treating it as HTML. In particular, none of the applications know anything about XML or the xmlns attribute. Thus the context of usage has incorrectly viewed the xmlns attribute as HTML. When this happens, that is, when people are reading and writing sample XML documents using HTML formatting, the worst downside is that a person may waste 5-10 seconds.

Contrasting with this is the approach of using &an; &nri;. &An; &nri; provides an identifier. A human looking at an XML document with &an; &nri; as its namespace name will not be confused about whether it is dereferencable or not. No software will "auto-complete" e.g. an xri:... identifier into a clickable link. The 5-10 seconds of potentially wasted time are avoided.

Summary

Namespace names are just one example of a context of use. Any use of an identifier, or any datatype for that matter, in an XML document has the same issues. A provider of an identifier must specify how the identifier will be used in each specific sub-context of their XML language, whether it is intended as an identifier, a location, or both. Using &an; &nri; instead of an http: URI does not make the software's or the human's job any easier.

This section has shown that http: URIs have a large benefit over URNs when used as namespace names, because the namespace name can optionally be dereferencable, and the only downside is a fairly minimal amount of wasted time when a namespace name appears dereferencable but isn't.

Case Study: XRI

In this section we look in some detail into some of the background assumptions for the utility of &nris; for persistent identifiers and location independence, and we will compare http: with the proposed xri: scheme in this regard.

Simple Document Retrieval Technical Analysis

This section provides an overview of document retrieval with http: versus xri:. For comparison purposes, it shows document retrieval for a given URI and for a given XRI. Consider the worked example from , in which a department of a government agency publishes a document named govdoc.pdf. It assigns the URI http://department.agency.example.org/docs/govdoc.pdf. observes that changing the organizational structure represented in the URI, for example to http://newdept.agency.example.org/docs/govdoc.pdf, or the path structure, for example to http://newdept.agency.example.org/documents/govdoc.pdf, breaks access.

suggests that an xri: URI for the same resource can be designed to be location independent, for example xri://@example.org*agency*department/docs/govdoc.pdf, which deals with delegation by using stars ("*"). Another solution advised by is to use identifiers that have bang ("!") symbols to indicate persistence. An example is xri://@!9990!AF8F!1C3D/!2495. We examine these scenarios in order.

Run-time Resolution

A client makes an HTTP request for the document at http://department.agency.example.org/docs/govdoc.pdf

HTTP GET of document
<![CDATA[
GET /docs/govdoc.pdf HTTP/1.1
Host: department.agency.example.org

response:
200 OK
PDF Document
]]>

specifies that "XRI resolution is a two phase process. The first phase, authority resolution, resolves to the XRI authority responsible for the resource. The second phase, local access, uses URIs and metadata from the authority to interact with the identified resource." specifies that xri://@example.org*agency*department/docs/govdoc.pdf is parsed to an XRI Authority of @, which is queried for example.org, which is queried for *agency, which is queried for *department. specifies that @ represents an authority of type organization and that it establishes a global context for identifiers for which the authority is controlled by an organization or a resource in an organizational context, resulting in http://example.org as the base authority for @example.org. It is possible to do look-ahead as well, so a query of @example.org*agency*department might return resolution for @example.org, @example.org*agency, or @example.org*agency*department. Resolution proceeds until a "/" is reached in the XRI. The "=" represents an authority of type Person and establishes a global context for identifiers for which the authority is controlled by an individual person, resulting in equals.example.org as the authority to which the resolution request is sent. The XRI Authority endpoints are described using XRI Descriptors. There are other special characters, such as "!", "+", "$".
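
The parsing step described above can be sketched as follows. This is not an implementation of the XRI syntax: it assumes only what the two example XRIs in this section show (a one-character global context symbol, then sub-segments introduced by "*" or "!", with authority resolution stopping at the first "/"). The function name is hypothetical.

```python
import re


def split_xri(xri: str):
    """Split an example-style XRI into (global context, sub-segments, local path)."""
    prefix = "xri://"
    if not xri.startswith(prefix):
        raise ValueError("expected an xri:// identifier")
    # Authority resolution proceeds only up to the first "/".
    authority, slash, path = xri[len(prefix):].partition("/")
    if not authority:
        raise ValueError("missing authority")
    root, rest = authority[0], authority[1:]
    # Each sub-segment keeps its leading "*" (re-assignable) or "!" (persistent);
    # the first sub-segment may have no delimiter at all.
    subsegments = re.findall(r"[*!]?[^*!]+", rest)
    return root, subsegments, slash + path
```

Each sub-segment in the returned list corresponds to one query in the chain of authorities; the local path is what remains for the second, local access, phase.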

Note:DaveO can't find where @ is resolved by local authority, how @example.org maps to http://example.org. Would @foo.ca map to http://foo.ca? There is some wording in XRI Resolution that says = examples resolves to http://equals.example.org/xri-resolve as found in the xrid:XRIDescriptor/xrid:Authority/xrid:uRI for this community, but I'm not sure what that means. Somehow the @ and = authorities have to be built in, but I don't know if it's an HTTP GET or a default XRIDescriptor or .. So, I don't know how this bootstrap problem is resolved.

HTTP GET to XRI resolver
<![CDATA[
GET *example.org*agency*department HTTP/1.1
Host: example.org
Accept: application/xrid+xml

response:
200 OK
http://department.agency.example.org
]]>

The XRIDescriptor's URI element specifies that the authority for *agency*department is department.agency.example.org. An HTTP GET request is issued.

HTTP GET to XRIDescriptor's URI
<![CDATA[
GET /docs/govdoc.pdf HTTP/1.1
Host: department.agency.example.org

response:
200 OK
PDF Document
]]>

There is the obvious bootstrap issue in the XRI system. Any XRI client must understand the XRI descriptor format. This is effectively a replacement for DNS, that is mapping names to addresses. Note that it recurses and uses the DNS/HTTP infrastructure in this example. There are at least 2 separate HTTP GET requests to resolve the xri: identifier into a document.

Persistent Dereferencability (location independence)

Another common reason for a new identifier scheme is to come up with an identifier that is location-independent or "movable" from one location to another. The idea is that the document changes location but the identifier should still resolve to the same document. In all cases, there must be some kind of mapping of the identifier to the "new" location if a location is changed. There is a publishing step, where the "new" location is added into the registry for the identifier.

Run-time resolution

HTTP supports movement through various 3xx status codes. Virtually all Web browsers and servers will correctly utilize the 3xx HTTP Status codes.

HTTP GET of document
<![CDATA[
GET /docs/govdoc.pdf HTTP/1.1
Host: department.agency.example.org

response:
301 Moved Permanently
Location: http://newdept.agency.example.org/documents/govdoc.pdf

GET /documents/govdoc.pdf HTTP/1.1
Host: newdept.agency.example.org

response:
200 OK
PDF Document
]]>
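
The client-side behaviour in this exchange can be sketched without touching the network. The two tables below stand in for the old and new servers and are assumptions made for illustration; a real client would issue HTTP requests and read the Location header of each 301 response.

```python
OLD = "http://department.agency.example.org/docs/govdoc.pdf"
NEW = "http://newdept.agency.example.org/documents/govdoc.pdf"

# Simulated server state: what the old server redirects, what the new one serves.
REDIRECTS = {OLD: NEW}
DOCUMENTS = {NEW: "PDF Document"}


def fetch(uri: str, max_hops: int = 5):
    """Follow simulated 301 redirects; return (status, body, URIs visited)."""
    visited = [uri]
    for _ in range(max_hops):
        if uri in DOCUMENTS:
            return 200, DOCUMENTS[uri], visited
        if uri in REDIRECTS:
            # Equivalent to a 301 response with a Location header.
            uri = REDIRECTS[uri]
            visited.append(uri)
            continue
        return 404, None, visited
    raise RuntimeError("too many redirects")
```

The old URI keeps working for as long as the entry in the redirect table is maintained, which is the publishing step referred to above.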

XRI supports movement through modifying the XRIDescriptor's URI element

HTTP GET to XRI resolver
<![CDATA[
GET *example.org*agency*department HTTP/1.1
Host: example.org
Accept: application/xrid+xml

response:
200 OK
http://newdept.agency.example.org
]]>

The XRIDescriptor's URI element specifies that the authority for *agency*department is newdept.agency.example.org. An HTTP GET request is issued

HTTP GET to XRIDescriptor's URI
<![CDATA[
GET /docs/govdoc.pdf HTTP/1.1
Host: newdept.agency.example.org

response:
200 OK
PDF Document
]]>

NOTE: DaveO: I can't see in the XRI Resolver specs how the docs path is changed to documents.

With http: URIs, there is a dependency that the original URI cannot be re-used for some other purpose and that it must remain "viable", that is it can't be terminated. If department.agency.example.org ever disappeared, all the clients would break on that document. With XRI identifiers, there is a dependency upon the @ authority and related resolvers. The "long-term" viability question then is whether XRI resolvers will "last" longer than HTTP Servers on given domain names.

Configuration

There are two steps to making the document available at the new URI. Firstly, the Web server must be configured to do the 301 and new Location (the redirect). In Apache 2.2, this is a configuration line such as Redirect /service http://foo2.example.com/service. Secondly, the document must be made available at http://newdept.agency.example.org/documents/govdoc.pdf.

XRI allows a simpler retrieval once the authority is known by removing the redirect step. The authority maps the identifier to the new document and retrieval of ns/foo is avoided. The change process is simpler as well. The new address is an XML entry instead of a registry must be updated and the new ns/latest/foo must be added to the system.

Now the question is the relative difficulty of updating an HTTP server versus updating an XRI resolver. In either case there will be some kind of submission and approval process. Given the widespread deployment of HTTP and the availability of HTTP administrators, it seems likely that the configuration change (a one-line redirect in Apache) is roughly equivalent to updating an XRI descriptor element at a URI.

In both solutions, the mappings from old identifiers to new identifiers are stored. XRI calls this out as "However such an approach would eventually lead to a spaghetti code of new-to-old XRI mappings. It also has the drawback of preventing reassignment of the identifier "department" for another purpose."

Persistent Identifiers

The persistence of an identifier has operational and expressive characteristics.

Operational policy

Let us return to the persistence of the identifiers without regard to dereferencability. The xri: scheme is managed by the XRI committee within OASIS. OASIS has the authority to change the XRI scheme, probably at the request of the XRI committee. It is possible for the XRI scheme to cease being maintained by the XRI committee, and even for some other committee or organization to take it over. Should another organization choose to create an alternative XRI scheme, then there would probably be a dispute, perhaps with a dispute resolution mechanism. As the XRI specification says "As with URNs, the issue of whether a persistent sub-segment is in fact permanent (never reassigned) is a matter of operational policy for the assigning authority. XRIs can't help with the operational issue ..."

An http:-based naming scheme for documents is managed by the organization that owns the domain in the URI. The organization does not have to register new schemes anywhere. In the case of http://www.oasis-open.org URIs, OASIS has the authority to assign and change the URIs or URI schemes on its own, without any review. It is possible for the domain to cease being owned by OASIS, through lack of maintenance or even error. Alternatively, the XRI community could use an http: based URI, such as http://xri.net. Then the owner of the xri.net domain, presumably the XRI TC, would be responsible for managing the domain. As with all names, the domain could lapse. And as with XRI identifiers, there is the possibility of conflicts and disputes.

With all identifiers, the persistence of any particular identifier and scheme is up to the registering organization and the registration authority. The http: and xri: schemes are equivalent in the operational policies that determine persistence.

Intent

Both schemes can express the intent of persistence. As the XRI specification says, "XRIs can't help with the operational issue, but XRI syntax allows the authority to express its intent". XRI uses the bang ("!") symbol to indicate that the identifier that follows is persistent, and the star ("*") symbol to indicate that the identifier that follows is re-assignable. The XRI specification suggests that "a much better solution would be to assign the resource "govdoc.pdf" an identifier that never needs to change or be reassigned ... such as xri://@!9990!AF8F!1C3D/!2495".
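To illustrate how intent can be read off the syntax, the following minimal Python sketch labels each sub-segment of an XRI-like string as persistent or reassignable according to its leading "!" or "*". It is a deliberate simplification and not an implementation of the XRI Syntax specification: the function name `subsegment_intents` is hypothetical, the "@" symbol is simply skipped, and cross-references and other XRI constructs are ignored.

```python
import re

# One sub-segment: a "!" or "*" delimiter followed by its value.
_SUBSEGMENT = re.compile(r"([!*])([^!*/]+)")

_INTENT = {"!": "persistent", "*": "reassignable"}


def subsegment_intents(xri):
    """Return (value, intent) pairs for each delimited sub-segment."""
    # Drop the scheme prefix if present; the rest is scanned for delimiters.
    body = xri[len("xri://"):] if xri.startswith("xri://") else xri
    return [(value, _INTENT[mark]) for mark, value in _SUBSEGMENT.findall(body)]


if __name__ == "__main__":
    for value, intent in subsegment_intents("xri://@!9990!AF8F!1C3D/!2495"):
        print(value, intent)
```

Applied to the identifier quoted above, every sub-segment is reported as persistent, which matches the intent the specification ascribes to it.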

It is possible for a URI authority to express its intent in the URIs it mints, so the same identification of persistence versus transience can be done using http: URIs. The http: based XRI design will be shown shortly.

XRI and Web architecture effectively give the same guidance: it is a much better solution to assign an identifier that never changes, as in Cool URIs don't change.

Protocol Independence

Protocol independence is a goal of XRI. The earlier examples have shown that HTTP supports redirects, including redirects to non-http: resources: starting from HTTP, a redirect to an ftp: resource is possible. An important but subtle aspect of the Web architecture is that the http: scheme for identifiers does not require the HTTP protocol to be used. This is explored in "Relationship of URI schemes to protocols and operations" (TAG issue schemeProtocols-49).

XRI alternative design

We suggest that XRI should create URIs using the http: scheme, rather than inventing a non-URI based scheme. The thread of documented good practices that leads to this conclusion starts from Architecture of the World Wide Web, Volume 1:

To benefit from and increase the value of the World Wide Web, agents should provide URIs as identifiers for resources.

A specification SHOULD reuse an existing URI scheme (rather than create a new one) when it provides the desired properties of identifiers and their relation to resources.

Intent is part of the potential metadata in URIs discussed in the TAG finding "Metadata in URI":

URI assignment authorities and the Web servers deployed for them may benefit from an orderly mapping from resource metadata into URIs

The XRI community could define constraints on http: URIs containing a particular domain, such as xri.net. The rules for persistence and location independence can be defined. They could start with http://xri.net followed by roughly the current XRI rules, taking into account URI character constraints. For example, the previous XRI persistent identifier could be similar to: http://xri.net/@;9990;AF8F;1C3D/;2495. An alternative common practice for persistent identifiers is to use UUIDs in the URI, e.g. http://example.org/6B29FC40-CA47-1067-B31D-00DD010662DA. Location-independent identifiers can be achieved using HTTP redirects. There are many varieties of constraints upon URIs and uses of HTTP redirects. One generalized framework for mapping XRIs or URNs to http: URIs is described in Converting New URI Schemes or URN Sub-Schemes to HTTP.
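The two naming practices in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rule that "!" becomes ";" is inferred from the single example in the text, the treatment of any other character is left undefined, and the function names `xri_to_http` and `uuid_uri` are hypothetical. The UUID helper uses randomly generated (version 4) UUIDs, which is one choice among several.

```python
import uuid

XRI_PREFIX = "xri://"
HTTP_BASE = "http://xri.net/"  # domain suggested in the text


def xri_to_http(xri):
    """Rewrite an xri:// identifier as an http: URI under xri.net.

    Assumed rule, taken from the one example in the text: "!" becomes ";".
    All other characters are passed through unchanged.
    """
    if not xri.startswith(XRI_PREFIX):
        raise ValueError("not an xri:// identifier: " + xri)
    return HTTP_BASE + xri[len(XRI_PREFIX):].replace("!", ";")


def uuid_uri(base="http://example.org/"):
    """Mint an http: URI whose path is a freshly generated UUID."""
    return base + str(uuid.uuid4()).upper()


if __name__ == "__main__":
    print(xri_to_http("xri://@!9990!AF8F!1C3D/!2495"))
    print(uuid_uri())
```

In both cases the persistence comes from the assigning authority's policy of never reusing the minted path; the code only shows that the syntax can carry that intent within the http: scheme.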

Summary

We have shown that http: identifiers for XRIs can achieve the goals of XRI with substantial benefits. The XRI goals of persistent identifiers and location independence are already available with http: identifiers. The previous analysis identified two concrete benefits of using XRIs: users cannot waste time by erroneously dereferencing namespace names that do not have namespace documents, and an extra HTTP GET request is avoided when documents move. The downsides of the XRI identifier solution are the software and human costs of adding a new identifier scheme and seemingly mandatory increased network costs (our example shows 2 HTTP GETs instead of 1). Given these costs and benefits, deploying a new registrar, resolution mechanism and related software to layer on top of existing Web functionality is not justified.

Our analysis has shown that if the scheme definition for xri: says that it is dereferencable, and specifies a mechanism, then either that mechanism is HTTP, or it will have to provide all the functionality, and thus be heir to all the weaknesses, of HTTP. In either case little benefit has been gained over just using the http: scheme itself. Note that we have not yet compared the authority resolution mechanisms and the dependence upon centralized authority, nor have we compared the distributed authoring of identifiers.

In the myRI identifier scenario, the "location" to be used for knowledge is somewhere in the application or in some property of the myRI such as a URI scheme or URN (sub)scheme. The myRI proposals include means to transform a myRI into a dereferencable address via lookup using a registry server. This in turn requires the use of a dereferencable address for the server, or else all software intended for use with myRIs must have the registry server locations "hard-coded". As far as we can tell, all the myRI proposals expect the result of a server lookup to be an http: URI, and also appear to use an http: URL to identify the location of the registry server.
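The dependency described here can be made concrete with a minimal Python sketch. Everything in it is hypothetical: the registry location, the `id` query parameter, the identifier syntax, and the in-memory table standing in for the registry's data. The point is only that the client needs a dereferencable address for the registry, here hard-coded, before any such identifier can be resolved, and that what comes back is an http: URI.

```python
from urllib.parse import urlencode

# Hypothetical registry location; a client must know this in advance.
REGISTRY_SERVER = "http://registry.example.org/lookup"

# Stand-in for the registry's data; a real registry would be queried over the network.
FAKE_REGISTRY = {"myri:example:govdoc": "http://example.org/docs/govdoc.pdf"}


def lookup_request_uri(my_ri):
    """Build the http: URI a client would GET to ask the registry about my_ri."""
    return REGISTRY_SERVER + "?" + urlencode({"id": my_ri})


def resolve_my_ri(my_ri, registry=FAKE_REGISTRY):
    """Return the http: URI registered for my_ri (no network access here)."""
    return registry[my_ri]


if __name__ == "__main__":
    print(lookup_request_uri("myri:example:govdoc"))
    print(resolve_my_ri("myri:example:govdoc"))
```

Both the request and the result are http: URIs, so in this sketch the new identifier adds a lookup step in front of an http: URI rather than replacing it.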

Case study: new URI scheme with no protocol

A main advantage of http: URIs is the use of DNS to allow decentralized creation of vocabularies. This does bear the cost that humans can be confused by the mixing of location and identifiers. Another possibility is to create and register a scheme that does not have any protocol associated with it but follows all the rest of the http: syntax. This is something like a cross between URNs and http: URIs.

id: scheme

The intent is clear from the instance of the URI: HTTP is not to be used for dereferencing. Various programs will not "auto-complete" such identifiers into clickable links. However, as with URNs, this rules out dereferencing the link to retrieve a document. If a human wants to find out about the specific namespace, how do they do so? Returning to context, a human must understand that myns is an XML namespace and how XML namespaces are used. The use of id: or http: or urn: or xri: has done nothing to shield them from this requirement.

There are very few advantages and significant downsides to this approach. For these reasons, David Orchard never proceeded down the registration path for id:.

References

Berners-Lee, Tim Cool URIs don't change, W3C, 1998. Available online as http://www.w3.org/Provider/Style/URI.

Berners-Lee, T., Fielding, R. and L. Masinter Uniform Resource Identifier (URI): Generic Syntax, IETF, 2005. Available online as http://www.faqs.org/rfcs/rfc3986.html

Best, K and N. Walsh A URN Namespace for OASIS, IETF, 2001. Available online as RFC 3121

Bray, T. et al eds. Namespaces in XML 1.1, W3C, 2004. Available online as http://www.w3.org/TR/xml-names11/

Crawford, Mark UBL Naming and Design rules, OASIS. Available online as http://www.oasis-open.org/committees/download.php/10323/cd-UBL-NDR-1.0Rev1c.pdf

Booth, David Converting New URI Schemes or URN Sub-Schemes to HTTP. Available online as http://dbooth.org/2006/urn2http/

Fielding, R. et al. eds Hypertext Transfer Protocol -- HTTP/1.1, IETF, 1997, section 19.6.1.2. Available online as http://www.ietf.org/rfc/rfc2068.txt

Gudgin, M., Hadley, M. and T. Rogers, eds Web Services Addressing 1.0 - Core ("Endpoint References" section), W3C, 2006. Available online as http://www.w3.org/TR/ws-addr-core/#eprs

Hendrikx, F., Wallis, C. and New Zealand Government, A Uniform Resource Name (URN) Formal Namespace for the New Zealand Government, RFC 4350, IETF, 2006. Available online as http://www.ietf.org/rfc/rfc4350.txt

Jacobs, Ian and Norman Walsh, eds. Architecture of the World Wide Web, Volume 1, W3C, 2004. Available online as http://www.w3.org/TR/webarch/

Mealling, M. ed. The IETF XML Registry, IETF, 2004. Available online at http://ietfreport.isoc.org/idref/rfc3688/

Reed, D. and D. McAlpin eds. An Introduction to XRIs, OASIS, 2005. Available online as http://www.oasis-open.org/apps/group_public/download.php/11857/xri-intro-V2.0-wd-04.pdf

Reed, D. and D. McAlpin, eds. XRI Syntax, OASIS, 2005. Available online as http://docs.oasis-open.org/xri/2.0/specs/xri-syntax-V2.0-cd-02.pdf

Van de Sompel, H. et al eds The "info" URI Scheme for Information Assets with Identifiers in Public Namespaces, IETF, 2006. Available online at http://www.ietf.org/rfc/rfc4452.txt

Wachob, G. ed XRI Resolution, OASIS, 2005. Available online as http://docs.oasis-open.org/xri/xri/V2.0/xri-resolution-V2.0-cd-01.pdf

TAG Request for Change to WS Addressing Core, TAG message to WS Addressing Working Group, 2005. Available online as http://lists.w3.org/Archives/Public/www-tag/2005Oct/0057.html

"Relationship of URI schemes to protocols and operations", TAG issue schemeProtocols-49, available online as http://www.w3.org/2001/tag/issues.html#schemeProtocols-49

"The Disposition of Names in an XML Namespace", TAG issue nameSpaceState-48, available online as http://www.w3.org/TR/namespaceState/

"Metadata in URI", TAG issue metadatainuri-31, available online as http://www.w3.org/2001/tag/doc/metaDataInURI-31