This Working Group is chartered to produce a core rule language plus extensions which together allow rules to be translated between rule languages and thus transferred between rule systems. The Working Group will have to balance the needs of a diverse community — including Business Rules and Semantic Web users — specifying extensions for which it can articulate a consensus design and which are sufficiently motivated by use cases.
This work is divided into two phases, and this charter only provides resources for the first phase (up to two years). Upon completion of the first phase, the Director may extend this charter to cover the second phase. (See Note on Duration.)
The Working Group is to specify a format for rules, so they can be used across diverse systems. This format (or language) will function as an interlingua into which established and new rule languages can be mapped, allowing rules written for one application to be published, shared, and re-used in other applications and other rule engines.
Because of the great variety in rule languages and rule engine technologies, this common format will take the form of a core language to be used along with a set of standard and non-standard extensions. The Working Group is chartered to first establish the extensible core and possibly a set of extensions, and then (in Phase 2) to begin to specify additional extensions based on user requirements. These extensions need not all be combinable into a single unified language.
This mission is part of W3C's larger goal of enabling the sharing of information in forms suited to machine processing, as seen in several application areas presented at the 2005 W3C Workshop on Rule Languages for Interoperability:
To help motivate and clarify the scope of this working group, here are three cornerstone scenarios, each illustrating a kind of application which should be supported by the rule exchange infrastructure provided by this work.
Jackson is trying to find someone: he needs at least one more client before the end of the quarter. He has access over the web to dozens of databases, some public, some licensed, and some maintained within his company. Together, they contain millions of potential candidates, but how is he going to narrow the field to the five or ten leads he should seriously pursue?
He thinks for a minute, then constructs a new query. He clicks on interesting properties, sees their values, and locks in the ones he thinks will act as useful filters. After a few minutes he gets frustrated, because the same concepts seem to have different names in different databases. Worse, the same idea is sometimes expressed in different structures; one database follows the model of having "name of assistant" and "phone number of assistant" properties, while another simply has an "assistant" property, which links to another person. Trying to handle structural variations like this in his query is becoming impossible.
Fortunately, Jackson's system supports a rule language. The query construction interface helps him construct mapping rules between different constructs which seem equivalent to him, letting him infer new information that is customized to his needs, so he can query over a virtual unified database with a structure that seems to him to be simple and straightforward.
In fact, these rules were already being used; the data views Jackson saw were in many cases constructed by rules other people had written. His own rules will be available to his department (because he stored them in department workspace), allowing his co-workers to use the unified view he finds so useful.
Bob goes to his new physician, Dr. Rosen, complaining of a painful cough and some difficulty breathing. The diagnosis of pneumonia is straightforward, and Dr. Rosen prepares to prescribe erythromycin. First, he asks Bob if he is taking any medications. Unfortunately, Bob is not entirely forthcoming: he says no, even though he takes pimozide to help manage Tourette's disorder. The omission seems harmless enough, and Bob is uncomfortable with people knowing about this difficult aspect of his medical history.
Fortunately, Bob uses the same pharmacy for both prescriptions, and his pharmacy checks all prescriptions against a merged, multi-source rule base. This rule base includes the fact that erythromycin is a macrolide antibiotic (coming from the erythromycin vendor) and an encoding of the 1996 FDA bulletin that pimozide is contraindicated with macrolides. When the pharmacist enters the prescription, he is informed of the potentially dangerous drug interaction. He talks to Bob, and with Bob's permission contacts Dr. Rosen to plan an alternative therapy.
The same technology could be made available to doctors, to double check their own knowledge and available references, and to consumers who want to take a greater role in understanding their own health care. The key is the ability to efficiently merge rules from multiple sources because we have an interchange language.
Cory is shopping for a home equity loan. A web search finds a site (loans.example.com) of which Cory has heard and which offers to get him three free quotes. He enters the required information. The application form uses rules that indicate that since his location is in California, he is required by state law to specify whether his application is for home improvement. This "intelligent form" means that he is less likely to have his application returned for additional information. His application is then dispatched to three lenders. The lenders in turn each add his application to their applicant database where it is subject to matching by their rules.
One lender's system determines a suitable rate and sends Cory an e-mail and paper-mail reply immediately. The second flags the application for review by a loan officer who looks briefly at the data before authorizing the automated offer process to continue. At the third lender, Cory is automatically classified as a highly desirable customer, and a loan officer is flagged to call Cory and personally move the process forward.
The rules in each lender's rule base are in fact based on a combination of their own business rules, rules of their aftermarket loan trading partners, and rules encoding government regulations. Again, this becomes much more practical when based on a common interchange language.
In each case, conventional rules technology is enhanced not only by the usual economies of standardization, but also by the ability to exchange and merge rules from different sources. Particularly in the first scenario, we see the kind of ad hoc data fusion which is the hallmark of the Web, finally being done by machine.
It is important for the Working Group to reuse and build on existing technologies and standards, even when it makes the design job harder. The greatest challenge in establishing a rule language standard may be the multitude of existing approaches in the marketplace. Interoperation with the most widely deployed technologies will be crucial for obtaining the desired standardization effect.
The Working Group is chartered to address its mission in two distinct phases. Its mission in the first phase is to produce a W3C Recommendation for a very simple and yet useful and extensible format for rules. In the second phase (below), it will produce Recommendations for extensions which address the broader set of use cases important to the participating communities.
The essential task of the Working Group in Phase 1 is to construct an extensible format for rules. The Working Group must try to keep in mind the various features and usage scenarios for rule languages, to be sure the right kind of extensibility is in place. The deliverables should make it clear how out-of-scope features can be addressed by extensions. Some such features discussed at the workshop and probably of wide interest include:
To help ensure extensibility, the Working Group must be responsive to people expressing concerns about how to handle particular kinds of extensions and areas of use. Comments claiming an inability to extend the language should be addressed with text (typically in the WG documents) which explains either: (1) how the desired extension can be performed, or (2) why the intended functionality of the extension is not necessary for the practical interchange of rules.
We do not expect rule engines or other rule processing systems (such as editors) to handle even a large set of the features standardized by this Working Group. There are many viable rule languages and rule engines which do not handle even all the Phase 1 features. (In particular, it is common to implement only function-free Horn Logic (Datalog), which has a finite deductive closure.) The Working Group, therefore, must address conformance carefully. (See QA Framework: Specification Guidelines, 2.1 Specifying Conformance.)
It would be a mistake to encourage vendors to implement all this group's Recommendations at the cost of true end-user utility.
The Working Group may, for instance, require that conformant rule engines behave in particular ways when encountering rulesets using certain kinds of unsupported extensions. Conformant rule processors may be required to refuse to handle rulesets using one kind of extension, while for another kind they may be required to merely issue a warning and produce incomplete results.
The core rule engine functionality is to load zero or more rulesets (or datasets) and then answer zero or more queries against the merged contents. This functionality is largely independent of engine implementation strategies. (In particular, it works with both forward chaining and backward chaining.)
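The load-then-query model can be sketched as a naive forward-chaining engine over function-free rules. This is a minimal illustration, not a proposed design: the `Engine` class, its `load` and `query` methods, the tuple encoding of atoms, and the `?x` variable convention are all assumptions made for this example.

```python
def is_var(term):
    """Variables are strings starting with '?'; everything else is a constant."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact, or return None."""
    if len(pattern) != len(fact):
        return None
    bindings = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return bindings


def solve(body, facts, bindings=None):
    """Yield every set of bindings that satisfies all atoms in body."""
    bindings = {} if bindings is None else bindings
    if not body:
        yield bindings
        return
    for fact in facts:
        extended = match(body[0], fact, bindings)
        if extended is not None:
            yield from solve(body[1:], facts, extended)


def substitute(atom, bindings):
    """Replace bound variables in atom with their values."""
    return tuple(bindings.get(t, t) for t in atom)


class Engine:
    def __init__(self):
        self.facts = set()
        self.rules = []  # list of (head, body) pairs

    def load(self, facts=(), rules=()):
        """Merge another dataset or ruleset into the engine."""
        self.facts.update(facts)
        self.rules.extend(rules)

    def close(self):
        """Forward-chain until no rule produces a new fact."""
        changed = True
        while changed:
            changed = False
            for head, body in self.rules:
                # Materialize the matches first so the fact set is not
                # modified while it is being iterated.
                for b in list(solve(body, self.facts)):
                    new = substitute(head, b)
                    if new not in self.facts:
                        self.facts.add(new)
                        changed = True

    def query(self, pattern):
        """Return all facts in the closure that match pattern, sorted."""
        self.close()
        return sorted(substitute(pattern, b) for b in solve([pattern], self.facts))


engine = Engine()
engine.load(facts={("parent", "ann", "bob"), ("parent", "bob", "cy")})
engine.load(rules=[
    (("ancestor", "?x", "?y"), [("parent", "?x", "?y")]),
    (("ancestor", "?x", "?z"), [("parent", "?x", "?y"), ("ancestor", "?y", "?z")]),
])
answers = engine.query(("ancestor", "ann", "?who"))
print(answers)
```

A backward-chaining engine could expose the same `load` and `query` operations and return the same answers; only the internals of `query` would differ, which is why the functionality is described as independent of implementation strategy.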
The Working Group must not specify an engine control or query interface (language, protocol, or API) as part of the Phase 1 specifications, although it is expected to make use of some interfaces as part of the test suite and in examples.
Many rules engines support external operations, such as requesting more data or invoking procedures when certain rules fire or conclusions are reached. These functions are essential to many applications, but they can be built around load-and-query engines, and their complexities and dependencies on other specifications make them not as well suited to being part of the core.
Some use cases require rules to be used for data transformation rather than query answering; however, sufficient coverage of such cases may be achieved by querying the difference between the query answer and the original data.
The primary normative syntax of the language must be an XML syntax. Users are expected to work with tools or rule languages which are transformed to and from this format.
In order to allow interoperability with RDF and object-oriented systems, the syntax must support named arguments (also called "role" or "slot" names), allowing n-ary facts, rules, and queries to be provided through property/value interfaces.
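One way to picture the property/value view of an n-ary fact is the following sketch. The function names, the fact identifier `app1`, the reserved `type` property, and the example slot names are hypothetical and chosen only for illustration; the actual syntax is for the Working Group to define.

```python
def to_triples(fact_id, relation, slots):
    """View one n-ary fact with named arguments as (subject, property, value) triples."""
    triples = [(fact_id, "type", relation)]
    triples.extend((fact_id, role, value) for role, value in sorted(slots.items()))
    return triples


def from_triples(triples):
    """Rebuild (fact_id, relation, slots) from the triples of a single fact."""
    fact_id = triples[0][0]
    relation = next(v for (_, p, v) in triples if p == "type")
    # Assumption: no slot is itself named "type".
    slots = {p: v for (_, p, v) in triples if p != "type"}
    return fact_id, relation, slots


# A three-argument fact, with each argument identified by its role name.
loan = to_triples(
    "app1",
    "LoanApplication",
    {"applicant": "Cory", "state": "CA", "purpose": "home improvement"},
)
for triple in loan:
    print(triple)
```

Because each argument is identified by name rather than by position, the round trip through triples loses no information, and a system that only understands binary property/value data can still consume the fact.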
Note that the natural overlap in expressivity between this language and RDF means this syntax should function as an alternative XML serialization for RDF Graphs (or at least a subset of RDF Graphs). (As noted in the March 2001 charter for RDF Core, it is reasonable to have more than one XML syntax for RDF.) However, this is a side-effect of the approach rather than a deliberate goal and the Working Group should aim to minimize confusion between this and the normative RDF/XML syntax.
The Phase 1 rule semantics will be essentially Horn Logic, a well-studied sublanguage of First-Order Logic which is the basis of Logic Programming.
Not every rule engine is or should be able to process full Horn Logic rules; they are Turing complete, hence undecidable (the deductive closure of a Horn rule set is infinite in the general case). (See conformance.)
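The effect of function symbols on the deductive closure can be seen with a single fact, nat(zero), and a single Horn rule, nat(s(X)) :- nat(X). The sketch below applies that rule for a bounded number of rounds; the nested-tuple encoding of terms and the function names are assumptions made for this example.

```python
def step(facts):
    """Apply the single rule  nat(s(X)) :- nat(X)  once to every known fact."""
    derived = {("nat", ("s", arg)) for (pred, arg) in facts if pred == "nat"}
    return facts | derived


def closure_sizes(rounds):
    """Record how many facts are known after each round of forward chaining."""
    facts = {("nat", "zero")}
    sizes = [len(facts)]
    for _ in range(rounds):
        facts = step(facts)
        sizes.append(len(facts))
    return sizes


print(closure_sizes(5))  # prints [1, 2, 3, 4, 5, 6]
```

Every round adds a new, deeper term, so naive forward chaining never reaches a fixed point. Without the function symbol `s`, a rule can only recombine the finitely many constants already present, which is why the function-free (Datalog) subset has a finite closure.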
The language must include a way to express facts as well as rules, and also metadata (annotations) about documents, facts, and rules. The WG should consider the benefits of expressing this metadata in RDF, including the ability to query it with SPARQL and analyze it with rules. A notion of "ruleset" may also be supported.
Datatypes need support in the language, including both a syntax for literals and a set of common functions and operators. Most of the design and selection work here has been done as part of XML Schema and XML Query. See Relationships to Other Efforts.
In Phase 1, the format must support literals and common functions and operators for at least: text strings (xsd:string), 32-bit signed integers (xsd:int), unlimited-size decimal numbers (xsd:decimal), Boolean values (xsd:boolean), and list structures.
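A rough sketch of how an implementation might map these literals onto native values follows. The `parse_literal` function and the use of prefixed names as plain strings are assumptions for this example. Python's own parsers also accept some lexical forms that XML Schema does not (for instance exponent notation for decimals), so a real implementation would need stricter validation.

```python
from decimal import Decimal

# Value range of a 32-bit signed integer (xsd:int).
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def parse_literal(lexical, datatype):
    """Convert a lexical form to a Python value for the Phase 1 datatypes."""
    if datatype == "xsd:string":
        return lexical
    if datatype == "xsd:int":
        value = int(lexical)
        if not INT_MIN <= value <= INT_MAX:
            raise ValueError(f"{lexical} is outside the 32-bit signed range")
        return value
    if datatype == "xsd:decimal":
        # Decimal keeps arbitrary precision, unlike a binary float.
        return Decimal(lexical)
    if datatype == "xsd:boolean":
        if lexical in ("true", "1"):
            return True
        if lexical in ("false", "0"):
            return False
        raise ValueError(f"{lexical!r} is not an xsd:boolean")
    raise ValueError(f"unsupported datatype {datatype}")


# A list structure is modelled here as a Python list of parsed literals.
values = [
    parse_literal(lexical, datatype)
    for lexical, datatype in [
        ("Cory", "xsd:string"),
        ("42", "xsd:int"),
        ("0.1", "xsd:decimal"),
        ("true", "xsd:boolean"),
    ]
]
print(values)
```

Using an arbitrary-precision decimal type rather than a binary float matters for operators: with `Decimal`, 0.1 + 0.2 compares equal to 0.3, which is what users of an unlimited-size decimal type would expect.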
Note on Duration: this charter runs until 30 November 2007, to allow for possible unexpected difficulties. If the Recommendation milestone is reached before then and Working Group membership remains sufficient, it is expected that the Director will extend this charter to cover the second phase.
Because of the diversity of rule technology and the ongoing innovation in the field, the rule interchange format must be extensible. The Phase 1 mission of the Working Group was to establish the basic extensibility mechanism and produce a usable language. With that core, vendors and advanced users can begin to use the format. For work outside the small set of use cases addressed in Phase 1, however, they will need to find or create non-standard extensions.
During Phase 2, the Working Group is chartered to produce Recommendations for extensions which are strongly motivated by use cases and for which it can articulate a consensus design. These Recommendations may be released separately, grouped into language "levels", as with CSS and DOM, or the Working Group may decide to use a combination of release strategies to maximize effective deployment.
Note that allocation of resources to complete Phase 2 is not guaranteed. (See Note on Duration.)
As in Phase 1, the deliverables are:
The grouping into documents is at the discretion of the Working Group. Several extensions may be grouped into a single Recommendation, published concurrently as a multi-part Recommendation, or extensions may be handled as separate Recommendations. Test Cases and motivating use cases may be included along with a Technical Specification, or kept separate. The Working Group is encouraged to organize the documents in a way which avoids user confusion.
This list of extensions is a starting point and fall-back list for the Working Group. The Working Group may alter this list by consensus decision when doing so is motivated by use cases and does not significantly endanger the schedule.
The general directions for extensions in expressive power lie along two roads: monotonic extensions towards full first-order logic (FOL) and non-monotonic extensions based on minimal-model semantics found in Logic Programming (LP) systems. The Working Group will have to navigate this space and find extensions which best serve users.
Additional datatype and data structure support, both for literal values and for functions and operators, is in scope, but should be based on XML Schema and XML Query. See Relationships to Other Efforts.
While it is sufficient for the core to say that the rules and data used by a rule engine are expressed in the specified format and loaded by explicit configuration instructions, in practice valuable data is likely to be provided in other formats and via other mechanisms.
Production rule systems are generally built around the concept that when the conditions in a rule's antecedent are met, the rule fires and one or more actions specified in the consequent are performed. In some cases, the actions are just to add more facts to the knowledge base, and the rules are equivalent to Datalog/Horn rules. In other cases, the actions have the effect of running external procedural code, or modifying the knowledge base (which is equivalent to running external code which attempts to modify the knowledge base).
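The distinction between the two kinds of action can be sketched as follows. The `ProductionRule` class, the `run` loop, the `credit_score` fact, and the `outbox` list standing in for an external system are all hypothetical. Firing each rule at most once is a simplification of the conflict-resolution strategies real engines use.

```python
class ProductionRule:
    def __init__(self, name, condition, action):
        self.name = name
        self.condition = condition  # function: facts -> bool
        self.action = action        # function: facts -> None (may have side effects)


def run(rules, facts):
    """Fire each rule at most once, repeating until no further rule can fire."""
    fired = []
    progress = True
    while progress:
        progress = False
        for rule in rules:
            if rule.name not in fired and rule.condition(facts):
                rule.action(facts)
                fired.append(rule.name)
                progress = True
    return fired


outbox = []  # stands in for an external system, such as a loan officer's call queue

rules = [
    ProductionRule(
        "flag-call",
        lambda f: ("desirable", "cory") in f,
        lambda f: outbox.append("call cory"),    # runs external procedural code
    ),
    ProductionRule(
        "classify",
        lambda f: ("credit_score", "cory", 790) in f,
        lambda f: f.add(("desirable", "cory")),  # only adds a fact
    ),
]

facts = {("credit_score", "cory", 790)}
fired = run(rules, facts)
print(fired, outbox)
```

The `classify` rule could be written equally well as a Datalog/Horn rule, since its only effect is a new fact. The `flag-call` rule has an effect outside the knowledge base, and it is this kind of action that falls outside pure Horn semantics.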
Rule systems typically provide ways to interact with the knowledge base of the running engine. This is generally hard to do in standard, technology-neutral ways, but in two areas it seems sufficiently motivated as well as feasible.
These milestones assume the Working Group decides to call the Phase 1 output "Level 1" and to group the Phase 2 features into two packages, "Level 2" and "Level 3". This is only one possible plan for how the goals of Phase 2 are met, and the Working Group may choose a different one. If the features are not grouped, the milestones will be more complex.
The Rule Markup Initiative has been active since 2000 in developing and promoting rule interchange technology. Versions of the RuleML XML syntax are used in all the rule submissions, and RuleML participants were active at the workshop (which was co-chaired by RuleML co-founder Said Tabet). In short we expect considerable overlap in participation and to draw on technologies and experience developed as part of RuleML.
The JSR 94: Java Rule Engine API effort, part of the Java Community Process, provides a standard way to access rule engines from Java, but does not specify a rule language. At the workshop, JSR 94 Lead Daniel Selman reported that the JSR 94 community was supportive of W3C developing a rule interchange format to complement their work.
Following a September 2003 Request for Proposals on Business Rules Representation, the Object Management Group (OMG) began its Production Rule Representation (PRR) effort. This group is chartered to specify a meta-model for the representation of production rules at the Platform-Independent Model (PIM) level of the Model Driven Architecture (MDA). This work was described in a paper and presentation at the workshop.
There is overlap in scope between the groups, and they share the goal of rule interoperability. We expect a useful division of labor as OMG focuses on the standard metamodel definition and modeling of production rules, while this group focuses on a rule interchange format suitable for the Web. This Working Group should avoid barriers to interoperability between these areas, and must appoint a liaison to work with OMG and its PRR core metamodel to maximize the value of these standards efforts in both groups. The Working Group is encouraged to produce a document like the OWL compatibility document showing how these standards work together.
We expect considerable overlap in membership of the groups.
There may be a useful overlap between this Working Group and OMG's Object Constraint Language (OCL), a part of UML 2.0, especially with OCL extensions being considered for PRR.
The Semantics of Business Vocabulary and Business Rules (SBVR) effort is a response to the OMG's June 2003 Business Semantics of Business Rules RFP. This work was described in a paper and presentation at the workshop. We anticipate some overlap in participants to help bring SBVR's user perspective and use cases to the Working Group.
The ISO Common Logic effort aims to produce "a language designed for use in the interchange of knowledge among disparate computer systems. It will be a logically comprehensive language with a declarative semantics, and it will provide for the representation of knowledge about knowledge". (From the 2001 Proposal for a New Work Item.) This work is considered an inheritor of Knowledge Interchange Format (KIF).
There are important overlaps in functionality between the goals and designs of Common Logic and this Working Group. Some of the Phase 2 extensions (especially FOL) bring this interchange format close to Common Logic, and any differences (such as sequence variables) should be identified in Use Cases and Requirements. We expect some overlap in participants.
The RDF Data Access Working Group is producing a query language and protocol for data seen as binary relations (the RDF model). Via role names, rule data should be viewable this way. Thus SPARQL becomes one path by which a rule engine may query for more data, as well as a way in which a rule engine may be queried for deduced answers.
The SPARQL Functions and Operators specification also shows how XQuery 1.0 and XPath 2.0 Functions and Operators can be applied outside of XQuery and XPath. This model may be useful for this Working Group.
The outputs of several Working Groups currently running in the XML Activity are relevant to this Working Group. In general, they provide default designs which should be directly used by this Working Group. Any deviation from the default must be strongly motivated by the use cases and raised as an issue with the other Working Group (if it is still active), so that misunderstanding can be avoided and the other group learns about a potential problem with its work.
The XML Schema Working Group provides a way to define XML grammars, which may be useful in defining the rule format syntax, and a model for data types and data structures.
A joint task force of the XML Query Working Group and the XSL Working Group is producing the XQuery 1.0 and XPath 2.0 Functions and Operators, which provides a default list of datatype and data structure operations.
The XML Query Working Group provides a model for logical conditions (query pattern matches) about XML data, a Phase 2 feature.
The Working Group should note and borrow as necessary from relevant W3C submissions:
We expect some authors of these submissions will participate in the WG.
Effective participation is expected to consume one workday per week for each Working Group participant and two days per week for editors.
Members may appoint more than two representatives in this Working Group. If this happens, the Chair should establish rules as necessary to maintain balance during discussion and decision-making.
Please see the instructions for joining this Working Group.
The Working Group's technical discussions will occur on the public mailing list [email protected] or a discussion Web site (such as a wiki). All technical documents and decisions will be made available to the public through the Working Group's process. There may be a Member-confidential mailing list used for administrative purposes.
The Working Group will use its home page to record the history of the group. The page will provide access to the archives, meeting minutes, updated schedule of deliverables, membership list, the issues list, and relevant documents and resources. It will be available to the public and will be maintained by the Chair in collaboration with the W3C staff contact.
A one to two hour Working Group phone conference will be held every week. When necessary to meet agreed-upon deadlines, phone conferences may be held twice a week.
Meeting records should be made available within three days of each telephone meeting. Meeting records must be made publicly available except for non-technical issues that do not directly affect the output of the Working Group. The Chair decides which issues are not made public.
Face-to-face meetings will be held every two to four months.
The first face-to-face meeting will be held 8-9 December 2005 in the San Francisco Bay Area, to allow a combined trip for members of PRR.
The second face-to-face meeting is planned for the Technical Plenary.
As explained in the Process Document (section 3.3), the Working Group will seek to make decisions when there is consensus. When the Chair puts a question and observes dissent, after due consideration of different opinions the Chair should record a decision (possibly after a formal vote) and any objections, and move on.
When the Chair conducts a formal vote to reach a decision on a substantive technical issue, eligible voters may vote on a proposal in one of three ways: for the proposal, against the proposal, or abstain. For the proposal to pass, there must be more votes for the proposal than against. In case of a tie, the Chair will decide the outcome of the proposal.
To be successful, we expect the Working Group to have at least 10 participating Member organizations and invited experts for its duration. We also expect a large public review group that will participate in the mailing list discussions.
The W3C Team expects to provide one or more staff contacts dedicating a combined 0.5 FTE to this Working Group.
This Working Group operates under the W3C Patent Policy (5 Feb 2004 Version). To promote the widest adoption of Web standards, W3C seeks to issue Recommendations that can be implemented, according to this policy, on a Royalty-Free basis.