Rule Interchange Format
Working Group Charter

This Working Group is chartered to produce a core rule language plus extensions which together allow rules to be translated between rule languages and thus transferred between rule systems. The Working Group will have to balance the needs of a diverse community — including Business Rules and Semantic Web users — specifying extensions for which it can articulate a consensus design and which are sufficiently motivated by use cases.

This work is divided into two phases, and this charter only provides resources for the first phase (up to two years). Upon completion of the first phase, the Director may extend this charter to cover the second phase. (See Note on Duration.)

1. Mission

The Working Group is to specify a format for rules, so they can be used across diverse systems. This format (or language) will function as an interlingua into which established and new rule languages can be mapped, allowing rules written for one application to be published, shared, and re-used in other applications and other rule engines.

Because of the great variety in rule languages and rule engine technologies, this common format will take the form of a core language to be used along with a set of standard and non-standard extensions. The Working Group is chartered to first establish the extensible core and possibly a set of extensions, and then (in Phase 2) to begin to specify additional extensions based on user requirements. These extensions need not all be combinable into a single unified language.

This mission is part of W3C's larger goal of enabling the sharing of information in forms suited to machine processing, as seen in several application areas presented at the 2005 W3C Workshop on Rule Languages for Interoperability:

1.1. Usage Scenarios

To help motivate and clarify the scope of this working group, here are three cornerstone scenarios, each illustrating a kind of application which should be supported by the rule exchange infrastructure provided by this work.

Finding New Customers

Jackson is trying to find someone: he needs at least one more client before the end of the quarter. He has access over the web to dozens of databases, some public, some licensed, and some maintained within his company. Together, they contain millions of potential candidates, but how is he going to narrow the field to the five or ten leads he should seriously pursue?

He thinks for a minute, then constructs a new query. He clicks on interesting properties, sees their values, and locks in the ones he thinks will act as useful filters. After a few minutes he gets frustrated, because the same concepts seem to have different names in different databases. Worse, the same idea is sometimes expressed in different structures; one database follows the model of having "name of assistant" and "phone number of assistant" properties, while another simply has an "assistant" property, which links to another person. Trying to handle structural variations like this in his query is becoming impossible.

Fortunately, Jackson's system supports a rule language. The query construction interface helps him construct mapping rules between different constructs which seem equivalent to him, letting him infer new information that is customized to his needs, so he can query over a virtual unified database with a structure that seems to him to be simple and straightforward.

In fact, these rules were already being used; the data views Jackson saw were in many cases constructed by rules other people had written. His own rules will be available to his department (because he stored them in department workspace), allowing his co-workers to use the unified view he finds so useful.
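The kind of mapping rule Jackson builds can be sketched as follows. This is only an illustration in Python, not the interchange format itself; the record layouts and every field name (`assistant_name`, `assistant_phone`, `assistant`, `people`) are assumptions invented for the example.

```python
# Hypothetical records from two databases that model "assistant" differently.
db_a = [{"id": "a1", "name": "Kim",
         "assistant_name": "Lee", "assistant_phone": "555-0100"}]
db_b = {
    "records": [{"id": "b1", "name": "Ravi", "assistant": "p7"}],
    "people": {"p7": {"name": "Ana", "phone": "555-0199"}},
}

def unify_flat(rec):
    """Rule: flat assistant_* properties imply a nested assistant."""
    return {"id": rec["id"], "name": rec["name"],
            "assistant": {"name": rec["assistant_name"],
                          "phone": rec["assistant_phone"]}}

def unify_linked(rec, people):
    """Rule: an assistant link to a person implies the same nested shape."""
    person = people[rec["assistant"]]
    return {"id": rec["id"], "name": rec["name"],
            "assistant": {"name": person["name"], "phone": person["phone"]}}

# The "virtual unified database" is the result of applying both rules.
unified = ([unify_flat(r) for r in db_a]
           + [unify_linked(r, db_b["people"]) for r in db_b["records"]])

# One query now works across both sources.
phones = {r["name"]: r["assistant"]["phone"] for r in unified}
```

Once both structures are mapped to one shape, the query no longer needs to know which database a record came from.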

Validating Prescriptions

Bob goes to his new physician, Dr. Rosen, complaining of a painful cough and some difficulty breathing. The diagnosis of pneumonia is straightforward, and Dr. Rosen prepares to prescribe erythromycin. First, he asks Bob if he is taking any medications. Unfortunately, Bob is not entirely forthcoming: he says no, even though he takes pimozide to help manage Tourette's disorder. The omission seems harmless enough, and Bob is uncomfortable with people knowing about this difficult aspect of his medical history.

Fortunately, Bob uses the same pharmacy for both prescriptions, and his pharmacy checks all prescriptions against a merged, multi-source rule base. This rule base includes the fact that erythromycin is a macrolide antibiotic (coming from the erythromycin vendor) and an encoding of the 1996 FDA bulletin that pimozide is contraindicated with macrolides. When the pharmacist enters the prescription, he is informed of the potentially dangerous drug interaction. He talks to Bob, and with Bob's permission contacts Dr. Rosen to plan an alternative therapy.

The same technology could be made available to doctors, to double-check their own knowledge and available references, and to consumers who want to take a greater role in understanding their own health care. The key is the ability to efficiently merge rules from multiple sources, which a common interchange language makes possible.
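As a rough sketch of the merge described in this scenario, the Python below combines facts from three separate sources and applies one interaction rule. The tuple layout and the predicate names (`is_a`, `contraindicated_with_class`, `takes`) are assumptions made for illustration; they are not part of any proposed format.

```python
# Facts from separate, hypothetical sources, as (predicate, subject, object).
vendor_facts = {("is_a", "erythromycin", "macrolide")}
fda_facts = {("contraindicated_with_class", "pimozide", "macrolide")}
pharmacy_facts = {("takes", "bob", "pimozide")}

def interactions(facts, patient, new_drug):
    """Return drugs the patient already takes that conflict with new_drug."""
    classes = {o for (p, s, o) in facts if p == "is_a" and s == new_drug}
    current = {o for (p, s, o) in facts if p == "takes" and s == patient}
    return sorted(
        drug for drug in current
        if any(("contraindicated_with_class", drug, c) in facts
               for c in classes)
    )

# Merging is a plain set union because all sources share one representation.
merged = vendor_facts | fda_facts | pharmacy_facts
warnings_for_bob = interactions(merged, "bob", "erythromycin")
```

No single source contains enough information to raise the warning; it only appears once the three sets are merged.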

Processing Loan Applications

Cory is shopping for a home equity loan. A web search finds a site (loans.example.com) of which Cory has heard and which offers to get him three free quotes. He enters the required information. The application form uses rules that indicate that, since his location is in California, he is required by state law to specify whether his application is for home improvement. This "intelligent form" means that he is less likely to have his application returned for additional information. His application is then dispatched to three lenders. The lenders in turn each add his application to their applicant database where it is subject to matching by their rules.

One lender's system determines a suitable rate and sends Cory an e-mail and paper-mail reply immediately. The second flags the application for review by a loan officer who looks briefly at the data before authorizing the automated offer process to continue. At the third lender, Cory is automatically classified as a highly desirable customer, and a loan officer is flagged to call Cory and personally move the process forward.

The rules in each lender's rule base are in fact based on a combination of their own business rules, rules of their aftermarket loan trading partners, and rules encoding government regulations. Again, this becomes much more practical when based on a common interchange language.

In each case, conventional rules technology is enhanced not only by the usual economies of standardization, but also by the ability to exchange and merge rules from different sources. Particularly in the first scenario, we see the kind of ad hoc data fusion which is the hallmark of the Web, finally being done by machine.

1.2. Compatibility

It is important for the Working Group to reuse and build on existing technologies and standards, even when it makes the design job harder. The greatest challenge in establishing a rule language standard may be the multitude of existing approaches in the marketplace. Interoperation with the most widely deployed technologies will be crucial for obtaining the desired standardization effect.

XML
The Extensible Markup Language (XML) has emerged as the most popular form for data exchange on the web and in many other contexts. XML provides structure, tagging, and in some cases datatype information, but there is no standard mechanism for mapping XML data directly to semantic structures (e.g., relations). Some candidate mapping mechanisms have been proposed (including GRDDL) but are not yet widely adopted. The Working Group, in Phase 2, must address at least part of this challenge in specifying a way for rules to make use of XML data.
RDF
The Resource Description Framework (RDF) allows data to be transferred while keeping its semantic structure. As such, its design has considerable overlap with the condition/fact part of a rule language; both are ways of formally expressing propositions. In order to reduce unnecessary re-invention and incompatibilities, the Working Group must use the RDF Semantics as a starting point in the areas of overlap, justifying and agreeing to any variation.
SPARQL
The SPARQL query language allows rich query of RDF datasets and is likely to reach Recommendation status early in the life of this WG. The Working Group should ensure the rule language is compatible with the use of SPARQL as a language for query of the dataset, that the extension mechanism is compatible with use of the SPARQL protocol for fetching additional datasets, and should aim for compatibility with SPARQL's use of XML datatypes, functions and operators.
OWL
The OWL Web Ontology Language allows users to express certain kinds of knowledge and it is suited to certain efficient kinds of reasoning. Some users at the workshop reported that while OWL was useful, they needed additional expressiveness, preferably in the form of rules. It is important that the Working Group maintain compatibility with OWL, allowing knowledge expressed in OWL and in rules to be easily used together.

2. Phase 1: Extensible Core

The Working Group is chartered to address its mission in two distinct phases. Its mission in the first phase is to produce a W3C Recommendation for a very simple and yet useful and extensible format for rules. In the second phase (below), it will produce Recommendations for extensions which address the broader set of use cases important to the participating communities.

2.1. Phase 1 Deliverables

2.2. Phase 1 Scope

2.2.1. Extensibility

The essential task of the Working Group in Phase 1 is to construct an extensible format for rules. The Working Group must try to keep in mind the various features and usage scenarios for rule languages, to be sure the right kind of extensibility is in place. The deliverables should make it clear how out-of-scope features can be addressed by extensions. Some such features discussed at the workshop and probably of wide interest include:

To help ensure extensibility, the Working Group must be responsive to people expressing concerns about how to handle particular kinds of extensions and areas of use. Comments claiming an inability to extend the language should be addressed with text (typically in the WG documents) which explains either: (1) how the desired extension can be performed, or (2) why the intended functionality of the extension is not necessary for the practical interchange of rules.

2.2.2. Conformance

We do not expect rule engines or other rule processing systems (such as editors) to handle even a large set of the features standardized by this Working Group. There are many viable rule languages and rule engines which do not handle even all the Phase 1 features. (In particular, it is common to implement only function-free Horn Logic (Datalog), which has a finite deductive closure.) The Working Group, therefore, must address conformance carefully. (See QA Framework: Specification Guidelines, 2.1 Specifying Conformance.)

It would be a mistake to encourage vendors to implement all this group's Recommendations at the cost of true end-user utility.

The Working Group may, for instance, require that conformant rule engines behave in particular ways when encountering rulesets using certain kinds of unsupported extensions. Conformant rule processors may be required to refuse to handle rulesets using one kind of extension, while for another kind they may be required to merely issue a warning and produce incomplete results.
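The two behaviors mentioned above (refuse, or warn and continue with incomplete results) might look roughly like this in a processor. The extension names, the `POLICY` table, and the choice to refuse unknown extensions by default are all assumptions for the sake of the sketch; the charter leaves the actual conformance rules to the Working Group.

```python
import warnings

# Hypothetical extension names and policies, chosen only for illustration.
SUPPORTED = {"core"}
POLICY = {"negation": "refuse", "aggregation": "warn"}

class UnsupportedExtension(Exception):
    """Raised when a ruleset uses an extension that must not be ignored."""

def check_ruleset(used_extensions):
    """Return True if results will be complete, False if they may be incomplete."""
    complete = True
    for ext in sorted(set(used_extensions) - SUPPORTED):
        # Extensions missing from POLICY are refused; this default is an assumption.
        if POLICY.get(ext, "refuse") == "refuse":
            raise UnsupportedExtension(ext)
        warnings.warn(f"extension {ext!r} ignored; results may be incomplete")
        complete = False
    return complete
```

A caller can use the return value to label answers as possibly incomplete, while a refusal stops processing before any answers are produced.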

2.2.3. Load-and-Query Rule Engine

The core rule engine functionality is to load zero or more rulesets (or datasets) and then answer zero or more queries against the merged contents. This functionality is largely independent of engine implementation strategies. (In particular, it works with both forward chaining and backward chaining.)
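A minimal sketch of the load-and-query model, written in Python with naive forward chaining over function-free (Datalog-style) rules. The `Engine` class, its method names, and the convention that variables are strings starting with `?` are assumptions for this example; a backward-chaining engine could expose the same two operations.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def match(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact, or return None."""
    if len(pattern) != len(fact):
        return None
    bindings = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return bindings

class Engine:
    def __init__(self):
        self.facts, self.rules = set(), []

    def load(self, facts=(), rules=()):
        """Merge one more dataset or ruleset into the engine."""
        self.facts |= set(facts)
        self.rules += list(rules)

    def _solve(self, goals, bindings):
        if not goals:
            yield bindings
            return
        for fact in list(self.facts):
            b = match(goals[0], fact, bindings)
            if b is not None:
                yield from self._solve(goals[1:], b)

    def _saturate(self):
        """Forward chaining: apply rules until no new fact is derived."""
        changed = True
        while changed:
            changed = False
            for head, body in self.rules:
                for b in list(self._solve(body, {})):
                    new = tuple(b.get(t, t) for t in head)
                    if new not in self.facts:
                        self.facts.add(new)
                        changed = True

    def query(self, goal):
        """Answer a query against the merged contents of everything loaded."""
        self._saturate()
        return sorted(tuple(b.get(t, t) for t in goal)
                      for b in self._solve([goal], {}))

engine = Engine()
engine.load(facts=[("parent", "ann", "bob")])
engine.load(
    facts=[("parent", "bob", "cy")],
    rules=[(("ancestor", "?x", "?y"), [("parent", "?x", "?y")]),
           (("ancestor", "?x", "?z"),
            [("parent", "?x", "?y"), ("ancestor", "?y", "?z")])],
)
answers = engine.query(("ancestor", "ann", "?who"))
```

The two `load` calls stand in for separate sources; the query sees only their merged contents.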

The Working Group must not specify an engine control or query interface (language, protocol, or API) as part of the Phase 1 specifications, although it is expected to make use of some interfaces as part of the test suite and in examples.

Many rules engines support external operations, such as requesting more data or invoking procedures when certain rules fire or conclusions are reached. These functions are essential to many applications, but they can be built around load-and-query engines, and their complexities and dependencies on other specifications make them not as well suited to being part of the core.

Some use cases require rules to be used for data transformation rather than query answering; however, sufficient coverage of such cases may be achieved by querying the difference between the query answer and the original data.

2.2.4. XML Syntax

The primary normative syntax of the language must be an XML syntax. Users are expected to work with tools or rule languages which are transformed to and from this format.

In order to allow interoperability with RDF and object-oriented systems, the syntax must support named arguments (also called "role" or "slot" names), allowing n-ary facts, rules, and queries to be provided through property/value interfaces.
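To illustrate the idea of named arguments, the sketch below views one n-ary fact as a set of property/value pairs attached to a single node, which is roughly how it would appear through an RDF-like interface. The function name, the `type` property, and the blank-node style identifier are assumptions for this example.

```python
def to_property_values(predicate, slots, fact_id):
    """View an n-ary fact with named arguments as (subject, property, value) triples."""
    triples = [(fact_id, "type", predicate)]
    triples += [(fact_id, role, value) for role, value in sorted(slots.items())]
    return triples

# A hypothetical ternary fact: purchase(buyer=jackson, item=report, price=40)
triples = to_property_values(
    "purchase", {"buyer": "jackson", "item": "report", "price": 40},
    fact_id="_:f1",
)
```

Because each argument is identified by its role name rather than its position, the order of the arguments no longer matters to a consumer of the data.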

Note that the natural overlap in expressivity between this language and RDF means this syntax should function as an alternative XML serialization for RDF Graphs (or at least a subset of RDF Graphs). (As noted in the March 2001 charter for RDF Core, it is reasonable to have more than one XML syntax for RDF.) However, this is a side-effect of the approach rather than a deliberate goal and the Working Group should aim to minimize confusion between this and the normative RDF/XML syntax.

2.2.5. Horn Logic

The Phase 1 rule semantics will be essentially Horn Logic, a well-studied sublanguage of First-Order Logic which is the basis of Logic Programming.

Not every rule engine is, or should be, able to process full Horn Logic rules: Horn Logic with function symbols is Turing complete, so query answering is undecidable in general (the deductive closure of a Horn rule set is infinite in the general case). (See conformance.)
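A small example of why the closure can be infinite: the two Horn clauses `nat(0)` and `nat(s(X)) :- nat(X)` derive a new fact in every round of forward chaining. The Python sketch below cuts the process off after a fixed number of rounds; the string encoding of terms and the `max_rounds` cutoff are assumptions for illustration.

```python
def closure(max_rounds):
    """Forward-chain nat(0) and nat(s(X)) :- nat(X) for at most max_rounds rounds."""
    facts = {"0"}
    for _ in range(max_rounds):
        new = {f"s({x})" for x in facts} - facts
        if not new:  # fixpoint reached; this never happens for this rule set
            break
        facts |= new
    return facts
```

Without the function symbol `s`, as in Datalog, only finitely many terms exist and the loop would reach a fixpoint on its own.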

The language must include a way to express facts as well as rules, and also metadata (annotations) about documents, facts, and rules. The WG should consider the benefits of expressing this metadata in RDF, including the ability to query it with SPARQL and analyze it with rules. A notion of "ruleset" may also be supported.

2.2.6. Datatype Support

Datatypes need support in the language, including both a syntax for literals and a set of common functions and operators. Most of the design and selection work here has been done as part of XML Schema and XML Query. See Relationships to Other Efforts.

In Phase 1, the format must support literals and common functions and operators for at least: text strings (xsd:string), 32-bit signed integers (xsd:int), unlimited-size decimal numbers (xsd:decimal), Boolean values (xsd:boolean), and list structures.
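As a sketch of what literal support involves, the Python below maps a lexical form plus one of the required datatype names to a native value. The function name is hypothetical, and lexical checking is simplified: Python's `int` and `Decimal` accept some forms that XML Schema does not.

```python
from decimal import Decimal

INT_MIN, INT_MAX = -2**31, 2**31 - 1  # range of a 32-bit signed integer

def parse_literal(lexical, datatype):
    """Map a lexical form and an XML Schema datatype name to a Python value."""
    if datatype == "xsd:string":
        return lexical
    if datatype == "xsd:int":
        value = int(lexical)
        if not INT_MIN <= value <= INT_MAX:
            raise ValueError(f"{lexical} is outside the xsd:int range")
        return value
    if datatype == "xsd:decimal":
        return Decimal(lexical)  # arbitrary precision, no binary rounding
    if datatype == "xsd:boolean":
        if lexical in ("true", "1"):
            return True
        if lexical in ("false", "0"):
            return False
        raise ValueError(f"{lexical!r} is not an xsd:boolean")
    raise ValueError(f"unsupported datatype {datatype}")
```

Using an exact decimal type rather than floating point matters for rules that compare monetary amounts, such as those in the loan scenario.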

2.3. Phase 1 Major Milestones

First Public Working Draft, Use Cases and Requirements
2006 February
First Public Working Draft, Technical Specification
2006 May
Last Call Working Draft, Technical Specification
2006 October
Recommendation
2007 May

Note on Duration: this charter runs until 30 November 2007, to allow for possible unexpected difficulties. If the Recommendation milestone is reached before then and Working Group membership remains sufficient, it is expected that the Director will extend this charter to cover the second phase.

3. Phase 2: Standard Extensions

Because of the diversity of rule technology and the ongoing innovation in the field, the rule interchange format must be extensible. The Phase 1 mission of the Working Group was to establish the basic extensibility mechanism and produce a usable language. With that core, vendors and advanced users can begin to use the format. For work outside the small set of use cases addressed in Phase 1, however, they will need to find or create non-standard extensions.

During Phase 2, the Working Group is chartered to produce Recommendations for extensions which are strongly motivated by use cases and for which it can articulate a consensus design. These Recommendations may be released separately, grouped into language "levels", as with CSS and DOM, or the Working Group may decide to use a combination of release strategies to maximize effective deployment.

Note that allocation of resources to complete Phase 2 is not guaranteed. (See Note on Duration.)

3.1. Phase 2 Deliverables

As in Phase 1, the deliverables are:

  1. An updated Use Cases and Requirements document
  2. Test Cases
  3. Technical Specification (Recommendation)

The grouping into documents is at the discretion of the Working Group. Several extensions may be grouped into a single Recommendation, published concurrently as a multi-part Recommendation, or extensions may be handled as separate Recommendations. Test Cases and motivating use cases may be included along with a Technical Specification, or kept separate. The Working Group is encouraged to organize the documents in a way which avoids user confusion.

3.2. Phase 2 Scope

This list of extensions is a starting point and fall-back list for the Working Group. The Working Group may alter this list by consensus decision when doing so is motivated by use cases and does not significantly endanger the schedule.

3.2.1. Extensions to the Logic

The general directions for extensions in expressive power lie along two roads: monotonic extensions towards full first-order logic (FOL) and non-monotonic extensions based on minimal-model semantics found in Logic Programming (LP) systems. The Working Group will have to navigate this space and find extensions which best serve users.

Classical Negation
Various fragments of OWL
Full First-Order Logic (FOL)
Scoped Negation-As-Failure (SNAF)
Predicate-Complete Data Sources (Local Closed Worlds)
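To make the non-monotonic direction concrete, here is a rough Python sketch of negation-as-failure evaluated against an explicitly named scope. The catalog data and function name are invented for the example, and the sketch is only one informal reading of scoped negation, not a definition the Working Group has adopted.

```python
def not_listed(product, vendor, scope):
    """Scoped negation-as-failure: true when the fact is absent from the given scope."""
    return ("sells", vendor, product) not in scope

# A hypothetical data source that is treated as complete for "sells".
catalog = {("sells", "acme", "widgets")}

before = not_listed("gears", "acme", catalog)

# Adding a fact to the scope withdraws a conclusion that held earlier.
larger_catalog = catalog | {("sells", "acme", "gears")}
after = not_listed("gears", "acme", larger_catalog)
```

The conclusion changes from true to false as data is added, which is exactly what a monotonic (first-order) extension would never allow.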

3.2.2. Purely Syntactic Extensions

Lloyd-Topor
These syntactic extensions to Horn logic (in some cases extended with negation), provide convenient syntax with no additional expressive power.
HiLog
The HiLog syntactic transformation gives users the appearance and some of the functionality of a higher-order syntax, via an "apply" (aka "holds") predicate.
Reflection (Reification)
Reasoning about rules and using data about rules (rule metadata) is required in many practical applications.
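The HiLog-style transformation can be sketched in a few lines of Python: every atom is rewritten so that its predicate becomes the first argument of a single `apply` predicate, after which a query can range over predicate names with ordinary first-order matching. The tuple encoding and the sample facts are assumptions for illustration.

```python
def encode(atom):
    """Rewrite (pred, arg1, ...) as ("apply", pred, arg1, ...)."""
    return ("apply",) + tuple(atom)

facts = {
    encode(("likes", "ann", "bob")),
    encode(("knows", "ann", "bob")),
    encode(("knows", "bob", "cy")),
}

# "Which relations hold between ann and bob?" is now an ordinary query
# in which the predicate position is just another argument.
relations = sorted(
    pred for (_, pred, a, b) in facts if (a, b) == ("ann", "bob")
)
```

The rewritten facts stay within first-order syntax, which is why this counts as a purely syntactic extension.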

3.2.3. Datatype and Data Structure Support

Additional datatype and data structure support, both for literal values and for functions and operators, is in scope, but should be based on XML Schema and XML Query. See Relationships to Other Efforts.

3.2.4. Data Sources

While it is sufficient for the core to say that the rules and data used by a rule engine are expressed in the specified format and loaded by explicit configuration instructions, in practice valuable data is likely to be provided in other formats and via other mechanisms.

XML data
XML documents might be presented as terms stored in facts, perhaps using the XML Query data model, to allow rules to access the structure and indirectly the content of XML documents. See Relationships to Other Efforts.
Program Data
While data structures in running programs can in theory be exposed to a rule engine via an RDF or XML interface, this may not provide acceptable simplicity or performance. This working group should consider whether a standard binding mechanism can be practically provided to match the program-integration functionality of deployed rule engines. The mechanism must be platform and language neutral, but the Working Group may provide platform-specific instances of a general mechanism, as in the Document Object Model (DOM) Recommendations.
Controlled Data Source Access
In some cases, access to data sources is controlled by the structure of rules, as in N3's log:semantics and log:contents predicates, which cause Web retrieval operations during their evaluation.

3.2.5. Actions

Production rule systems are generally built around the concept that when the conditions in a rule's antecedent are met, the rule fires and one or more actions specified in the consequent are performed. In some cases, the actions are just to add more facts to the knowledge base, and the rules are equivalent to Datalog/Horn rules. In other cases, the actions have the effect of running external procedural code, or modifying the knowledge base (which is equivalent to running external code which attempts to modify the knowledge base).
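The distinction between the two kinds of action can be sketched as follows. This Python loop is a deliberately simplified production system in which each rule fires at most once; the rule names, the facts, and the `log` list standing in for external procedural code are all assumptions for the example.

```python
def fire(rules, facts):
    """Fire each rule at most once, repeating until no further rule can fire."""
    facts, log, fired = set(facts), [], set()
    changed = True
    while changed:
        changed = False
        for name, condition, action in rules:
            if name not in fired and condition(facts):
                fired.add(name)
                action(facts, log)
                changed = True
    return facts, log

rules = [
    # Action with an external side effect (here, just appending to a log).
    ("call", lambda f: ("desirable", "cory") in f,
             lambda f, log: log.append("notify loan officer")),
    # Action that only adds a fact, equivalent to a Datalog/Horn rule.
    ("flag", lambda f: ("score", "cory", "high") in f,
             lambda f, log: f.add(("desirable", "cory"))),
]

final_facts, log = fire(rules, {("score", "cory", "high")})
```

Only the second rule could be expressed in the Phase 1 core; the first depends on something outside the knowledge base.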

3.2.6. Knowledge Base Access

Rule systems typically provide ways to interact with the knowledge base of the running engine. This is generally hard to do in standard, technology-neutral ways, but in two areas it seems sufficiently motivated as well as feasible.

Updates
The data in "long-running" rule engines can become out-of-date, as the world changes. The fact base may need to be updated either by external notifications, or possibly by conclusions reached during rule processing.
Aggregation
Aggregate functions, like findall in Prolog and SUM and AVG in SQL, allow rules to be written which depend on the complete results of querying the knowledge base.
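A short Python sketch of aggregation over a fact base, showing why the result depends on the complete set of matching facts. The `loan` facts and the function name are hypothetical.

```python
def loan_amounts(facts):
    """Collect every matching value, in the spirit of Prolog's findall."""
    return [amount for (pred, _loan_id, amount) in facts if pred == "loan"]

facts = {("loan", "l1", 1000), ("loan", "l2", 2500), ("loan", "l3", 500)}

amounts = loan_amounts(facts)
total = sum(amounts)                # like SUM in SQL
average = total / len(amounts)      # like AVG in SQL

# One more fact changes the aggregate, so a partial answer set is not enough.
total_after_update = sum(loan_amounts(facts | {("loan", "l4", 4000)}))
```

Unlike an ordinary query, where each answer stands on its own, an aggregate is only correct once all answers are known.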

3.3. Phase 2 Major Milestones (Example Plan)

These milestones assume the Working Group decides to call the Phase 1 output "Level 1" and to group the Phase 2 features into two packages, "Level 2" and "Level 3". This is only one possible plan for how the goals of Phase 2 are met, and the Working Group may choose a different one. If the features are not grouped, the milestones will be more complex.

New Working Draft, Use Cases and Requirements (detailing Language Level 2 features)
2007 March
First Public Working Draft, Technical Specification (Language Level 2)
2007 May
Last Call Working Draft, Technical Specification (Language Level 2)
2007 November
Recommendation (Language Level 2)
2008 June
New Public Working Draft, Use Cases and Requirements (detailing Language Level 3 features)
2008 March
First Public Working Draft, Technical Specification (Language Level 3)
2008 May
Last Call Working Draft, Technical Specification (Language Level 3)
2008 November
Recommendation (Language Level 3)
2009 June

4. Relationships to other Efforts

4.1. RuleML

The Rule Markup Initiative has been active since 2000 in developing and promoting rule interchange technology. Versions of the RuleML XML syntax are used in all the rule submissions, and RuleML participants were active at the workshop (which was co-chaired by RuleML co-founder Said Tabet). In short, we expect considerable overlap in participation and to draw on technologies and experience developed as part of RuleML.

4.2. JSR 94: Java Rule Engine API

The JSR 94: Java Rule Engine API effort, part of the Java Community Process, provides a standard way to access rule engines from Java, but does not specify a rule language. At the workshop, JSR 94 Lead Daniel Selman reported that the JSR 94 community was supportive of W3C developing a rule interchange format to complement their work.

4.3. OMG Production Rule Representation (PRR)

Following a September 2003 Request for Proposals on Business Rules Representation, the Object Management Group (OMG) began its Production Rule Representation (PRR) effort. This group is chartered to specify a meta-model for the representation of production rules at the Platform-Independent Model (PIM) level of the Model Driven Architecture (MDA). This work was described in a paper and presentation at the workshop.

There is overlap in scope between the groups, and they share the goal of rule interoperability. We expect a useful division of labor as OMG focuses on the standard metamodel definition and modeling of production rules, while this group focuses on a rule interchange format suitable for the Web. This Working Group should avoid barriers to interoperability between these areas, and must appoint a liaison to work with OMG and its PRR core metamodel to maximize the value of these standards efforts in both groups. The Working Group is encouraged to produce a document like the OWL compatibility document showing how these standards work together.

We expect considerable overlap in membership of the groups.

There may be a useful overlap between this Working Group and OMG's Object Constraint Language (OCL), a part of UML 2.0, especially with OCL extensions being considered for PRR.

4.4. OMG Semantics of Business Vocabulary and Business Rules (SBVR)

The Semantics of Business Vocabulary and Business Rules (SBVR) effort is a response to the OMG's June 2003 Business Semantics of Business Rules RFP. This work was described in a paper and presentation at the workshop. We anticipate some overlap in participants to help bring SBVR's user perspective and use cases to the Working Group.

4.5. ISO Common Logic (CL)

The ISO Common Logic effort aims to produce "a language designed for use in the interchange of knowledge among disparate computer systems. It will be a logically comprehensive language with a declarative semantics, and it will provide for the representation of knowledge about knowledge". (From the 2001 Proposal for a New Work Item.) This work is considered an inheritor of Knowledge Interchange Format (KIF).

There are important overlaps in functionality between the goals and designs of Common Logic and this Working Group. Some of the Phase 2 extensions (especially FOL) bring this interchange format close to Common Logic, and any differences (such as sequence variables) should be identified in Use Cases and Requirements. We expect some overlap in participants.

4.6. W3C RDF Data Access Working Group (DAWG)

The RDF Data Access Working Group is producing a query language and protocol for data seen as binary relations (the RDF model). Via role names, rule data should be viewable this way. Thus SPARQL becomes one path by which a rule engine may query for more data, as well as a way in which a rule engine may be queried for deduced answers.

The SPARQL Functions and Operators specification also shows how XQuery 1.0 and XPath 2.0 Functions and Operators can be applied outside of XQuery and XPath. This model may be useful for this Working Group.

4.7. W3C XML Activity

The outputs of several Working Groups currently running in the XML Activity are relevant to this Working Group. In general, they provide default designs which should be directly used by this Working Group. Any deviation from the default must be strongly motivated by the use cases and raised as an issue with the other Working Group (if it is still active), so that misunderstanding can be avoided and the other group learns about a potential problem with its work.

4.8. W3C Submissions

The Working Group should note and borrow as necessary from relevant W3C submissions:

We expect some authors of these submissions will participate in the WG.

5. Participation

Effective participation is expected to consume one workday per week for each Working Group participant and two days per week for editors.

Members may appoint more than two representatives in this Working Group. If this happens, the Chair should establish rules as necessary to maintain balance during discussion and decision-making.

5.1. Joining

Please see the instructions for joining this Working Group.

5.2. Communication Outside of Meetings

The Working Group's technical discussions will occur on the public mailing list [email protected] or a discussion Web site (such as a wiki). All technical documents and decisions will be made available to the public through the Working Group's process. There may be a Member-confidential mailing list used for administrative purposes.

The Working Group will use its home page to record the history of the group and to provide access to the archives, meeting minutes, updated schedule of deliverables, membership list, the issues list, and relevant documents and resources. The page will be available to the public and will be maintained by the Chair in collaboration with the W3C staff contact.

5.3. Distributed (Telephone) Meetings

A one to two hour Working Group phone conference will be held every week. When necessary to meet agreed-upon deadlines, phone conferences may be held twice a week.

Meeting records should be made available within three days of each telephone meeting. Meeting records must be made publicly available except for non-technical issues that do not directly affect the output of the Working Group. The Chair decides which issues are not made public.

5.4. Face-to-Face Meetings

Face-to-face meetings will be held every two to four months.

The first face-to-face meeting will be held 8-9 December 2005 in the San Francisco Bay Area, to allow a combined trip for members of PRR.

The second face-to-face meeting is planned for the Technical Plenary.

5.5. Decision Policy

As explained in the Process Document (section 3.3), the Working Group will seek to make decisions when there is consensus. When the Chair puts a question and observes dissent, after due consideration of different opinions the Chair should record a decision (possibly after a formal vote) and any objections, and move on.

When the Chair conducts a formal vote to reach a decision on a substantive technical issue, eligible voters may vote on a proposal one of three ways: for a proposal, against a proposal, or abstain. For the proposal to pass, there must be more votes for the proposal than against. In case of a tie, the Chair will decide the outcome of the proposal.

5.6. Resources

To be successful, we expect the Working Group to have at least 10 participating Member organizations and invited experts for its duration. We also expect a large public review group that will participate in the mailing list discussions.

The W3C Team expects to provide one or more staff contacts dedicating a combined 0.5 FTE to this Working Group.

5.7. Intellectual Property Rights

This Working Group operates under the W3C Patent Policy (5 Feb 2004 Version). To promote the widest adoption of Web standards, W3C seeks to issue Recommendations that can be implemented, according to this policy, on a Royalty-Free basis.


This document was prepared with funding from DARPA, as part of the DAML program, under MIT/AFRL cooperative agreement number F30602-00-2-0593.
Sandro Hawke, W3C (editor)
$Id: charter.html,v 1.7 2005/11/07 03:53:51 sandro Exp $