Copyright © 2004 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
This document describes the DOM capabilities needed to support a heterogeneous multimodal environment and the current state of DOM interfaces supporting those capabilities. These DOM interfaces are used between modality components and their host environment in the W3C Multimodal Interaction Framework as proposed by the W3C Multimodal Interaction Activity.
The Multimodal Interaction Framework separates multimodal systems into a set of functional units, including Input and Output components, an Interaction Manager, Session Components, System and Environment, and Application Functions. In order for those functional components to interact with each other to form an application interpreter, the browser implementation must allow for communication and coordination between those components. This document identifies the DOM APIs used to communicate and coordinate at the browser implementation level. Multimodal browsers can be stand-alone or distributed systems.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
Publication as a Working Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document describes the DOM capabilities required between modality components and their host environment, as a basis for a common component model to support multimodal applications.
This document is a capabilities assessment of DOM1, DOM2, and DOM3. DOM1 edition 2 reached Recommendation status in 2000 with limited support for XML. DOM2 supersedes DOM1 and reached Recommendation status in January 2003 as a series of modules designed to support XML. DOM3 builds upon DOM2. At the time of publication, DOM3 Core, DOM3 Load and Save, and DOM3 Validation have all reached W3C Recommendation status. DOM3 XPath, Views and Formatting, and Events have been published as Notes. Vendor support for DOM levels in mobile and desktop devices varies significantly.
This requirements and capabilities assessment is not a specification of the actual interfaces that would be used in a multimodal context. An actual specification of interfaces must address the following unanswered questions in order to provide an open standard upon which portable multimodal documents could be built: which level of DOM should be the basis for that interface, and what common events and properties must be supported by the modality component and host environment.
This document has been produced as part of the W3C Multimodal Interaction Activity, following the procedures set out for the W3C Process. The authors of this document are members of the Multimodal Interaction Working Group (W3C Members only).
Patent disclosures relevant to this specification may be found on the Working Group's patent disclosure page in conformance with W3C policy.
This document is for public review, and comments and discussion are welcomed on the (archived) public mailing list <[email protected]>.
This document defines the capabilities and requirements of a common DOM-based API between modality components and their host environment. The document further describes the current level of support for these capabilities in DOM 1, DOM 2, and DOM 3. This document is not a formal API definition. A formal API definition would need to address a single level of DOM and further specify common interfaces and events.
Modality components perform interface tasks pertinent to their particular interface modality (e.g., voice, pen, visual display). These tasks range from simple input/output, such as pen stroke collection and audio playback, to complex transactions such as dialogs and card stacks.
The host environment allows multiple modality components to share data such as a semantic result and the confidence value associated with that result. The host environment also coordinates the activities of modality components in a multimodal application; for instance, activation, deactivation, display/prompting.
A multimodal application is comprised of many modality components interacting with the user, all coordinated by the host environment.
A common DOM-based API between modality components and their host environment allows for the creation of new modality components that will work with existing host environments. This common API would also allow for the creation of new host environments for various architectures and devices.
This DOM-based API requirements and capabilities analysis has been performed with the multimodal browser developer in mind. The objective is to enable portable multimodal content. These APIs may be (and in most cases will be) entirely hidden from the application developer. These APIs provide the foundation that allows for higher-level language constructs specific to the modality component or host environment.
These are the general features required of a Host Environment browser implementation to support the integration of a Modality Component. Features relating to DOM support are assumed to be implemented by the DOM Implementation component of the browser.
The intent is that this DOM framework should work in a multi-document scenario. An interface definition standard would need to specify this further.
Enables the browser to know when a component (or a proxy to a remote component) is present on the page. In response, the host enables the component to register itself and attach to the host environment.
Enables the browser to know when a component (or a proxy to a remote component) is removed from the page. In response, the host enables the component to unregister itself.
Enables the browser to be aware of DOM errors raised by the component.
DOM Level 1 | DOMException exception |
DOM Level 2 | DOMException exception |
DOM Level 3 | DOMException exception |
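As an illustration of how a host might observe DOM errors raised while manipulating a component, the following is a minimal sketch that uses Python's standard `xml.dom.minidom` module as a stand-in DOM implementation. The choice of Python, the element names, and the `remove_component` helper are assumptions made for this example and are not part of this assessment.

```python
import xml.dom
from xml.dom import minidom


def remove_component(host_parent, component_node):
    """Detach a component node; return None on success or the DOM error code."""
    try:
        host_parent.removeChild(component_node)
        return None
    except xml.dom.DOMException as exc:
        # exc.code carries the numeric DOMException code defined by the DOM.
        return exc.code


# Hypothetical page with two modality components.
doc = minidom.parseString("<page><voice/><pen/></page>")
page = doc.documentElement
voice = page.firstChild
stray = doc.createElement("ink")  # created but never attached to the page

first = remove_component(page, voice)   # succeeds
second = remove_component(page, stray)  # not a child of the page: NOT_FOUND_ERR
```

Returning the code rather than re-raising is only one possible design; a host could equally log the exception or forward it to an error handler.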
For components which specify remote resources via URLs, this enables the browser to resolve references which are relative to the current page.
DOM Level 1 | N/A |
DOM Level 2 | HTMLDocument.URL (DOM2 HTML) |
DOM Level 3 | Document.documentURI |
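The resolution step itself can be sketched as follows, assuming the host has already read the page location from the DOM (for example from the DOM Level 3 `documentURI` attribute). The URLs and the `resolve_resource` helper are hypothetical, and Python's `urllib.parse.urljoin` is used here only as a convenient standard resolver.

```python
from urllib.parse import urljoin


def resolve_resource(document_uri, reference):
    """Resolve a component's resource reference against the page location."""
    return urljoin(document_uri, reference)


# Hypothetical page location, as the host would obtain it from the DOM.
page_uri = "http://example.org/apps/travel/index.html"

# A relative grammar reference is resolved against the page.
grammar_url = resolve_resource(page_uri, "grammars/cities.grxml")

# An absolute reference is returned unchanged.
absolute_url = resolve_resource(page_uri, "http://example.com/tts/voice.ssml")
```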
Enables access to the interfaces described in the Document section below.
DOM Level 1 | Document interface |
DOM Level 2 | Document interface |
DOM Level 3 | Document interface |
These are the interfaces which are necessary for component integration into a document supporting the Document interface.
This subsection describes the interfaces necessary for the declaration and instantiation of modality components.
Document interfaces for identification (finding the namespace declaration and prefix) and for determining the parent element tree (for scoped namespaces). Enables the host environment to identify a component by its namespace.
DOM Level 1 | (navigation and identification primitives, no native namespace support) |
DOM Level 2 | DOM 1 + Node.namespaceURI |
DOM Level 3 | DOM 2 + further native namespace interfaces |
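A minimal sketch of namespace-based identification with the DOM Level 2 namespace-aware calls is shown below, again using Python's `xml.dom.minidom` as a stand-in DOM implementation. The component namespace URI, the `v:` prefix, and the element names are hypothetical.

```python
from xml.dom import minidom

VOICE_NS = "http://example.org/2004/voice"  # hypothetical component namespace

PAGE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:v="http://example.org/2004/voice">
  <body>
    <v:prompt>Where to?</v:prompt>
    <p>Destination</p>
    <v:listen/>
  </body>
</html>"""


def find_components(document, namespace_uri):
    """Return every element whose namespaceURI matches the component namespace."""
    return list(document.getElementsByTagNameNS(namespace_uri, "*"))


doc = minidom.parseString(PAGE)
components = find_components(doc, VOICE_NS)

# Each match exposes the prefix and local name used on the page.
names = [(c.prefix, c.localName) for c in components]
```

Matching on `namespaceURI` rather than on the prefix keeps the lookup correct when an author binds a different prefix to the same namespace.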
Parent traversal interfaces. Enables the host environment to know in which elements of the document the component is placed, and to implement any location-dependent semantics accordingly. This includes scoped properties which may be inherited (e.g. xml:lang).
DOM Level 1 | Node.parentNode, etc. |
DOM Level 2 | Node.parentNode, etc. |
DOM Level 3 | Node.parentNode, etc. |
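The inheritance of a scoped property such as xml:lang can be sketched as a walk up the `parentNode` chain. The example uses Python's `xml.dom.minidom` as a stand-in DOM implementation; the `inherited_lang` helper and the element names are assumptions made for illustration.

```python
from xml.dom import Node, minidom

XML_NS = "http://www.w3.org/XML/1998/namespace"


def inherited_lang(node):
    """Return the nearest xml:lang value on the node or one of its ancestors."""
    current = node
    # Stop at the Document node, which carries no attributes.
    while current is not None and current.nodeType == Node.ELEMENT_NODE:
        if current.hasAttributeNS(XML_NS, "lang"):
            return current.getAttributeNS(XML_NS, "lang")
        current = current.parentNode
    return None


doc = minidom.parseString(
    '<page xml:lang="en"><section xml:lang="fr"><prompt/></section><listen/></page>'
)
prompt = doc.getElementsByTagName("prompt")[0]
listen = doc.getElementsByTagName("listen")[0]

prompt_lang = inherited_lang(prompt)  # taken from the enclosing <section>
listen_lang = inherited_lang(listen)  # taken from the root <page>
```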
Attribute of type ID on component node. Enables the mapping of a given component on the page to the browser's instantiation of that component.
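One way a host might maintain this mapping is a registry keyed by the ID value, sketched below with Python's `xml.dom.minidom` as a stand-in DOM implementation. The `ComponentInstance` class, the `build_registry` helper, and the element names are hypothetical; the sketch reads a plain `id` attribute and does not depend on DTD-declared ID typing.

```python
from xml.dom import minidom


class ComponentInstance:
    """Hypothetical runtime object the browser creates for a component node."""

    def __init__(self, node):
        self.node = node
        self.started = False


def build_registry(document, tag_names):
    """Map each component's ID value to the browser-side instance for it."""
    registry = {}
    for tag in tag_names:
        for node in document.getElementsByTagName(tag):
            component_id = node.getAttribute("id")
            if component_id:  # components without an ID cannot be addressed
                registry[component_id] = ComponentInstance(node)
    return registry


doc = minidom.parseString(
    '<page><listen id="city"/><prompt id="greet"/><listen/></page>'
)
registry = build_registry(doc, ["listen", "prompt"])
```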
This subsection describes the interfaces that a modality component may choose to enable to allow the host environment to directly manipulate the component.
The component exposes its public interfaces within the framework of the DOM of the current page. Access to the properties, methods and events of the modality component is then available to the flow control mechanisms of the host environment (script, SMIL, etc.). In general, these interfaces will be modality-specific, and defined for each component according to its function; however, certain common interfaces may be implemented which are standard across different modality components (e.g. Start and Stop methods, result events).
Value setting and rewrite interfaces. Enables the host environment to manipulate properties without modifying document tree structure.
DOM Level 1 | Document.createAttribute(), Element.setAttribute(), Node.nodeValue |
DOM Level 2 | Document.createAttribute(), Document.createAttributeNS(), Element.setAttribute(), Element.setAttributeNS(), Node.nodeValue |
DOM Level 3 | Document.createAttribute(), Document.createAttributeNS(), Element.setAttribute(), Element.setAttributeNS(), Node.nodeValue |
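The calls listed above can be exercised as in the following sketch, which sets component properties without adding or removing elements. It uses Python's `xml.dom.minidom` as a stand-in DOM implementation; the attribute names and the `sp:` namespace are hypothetical.

```python
from xml.dom import minidom

SPEECH_NS = "http://example.org/2004/speech"  # hypothetical property namespace

doc = minidom.parseString("<page><listen>idle</listen></page>")
listen = doc.getElementsByTagName("listen")[0]

# DOM Level 1 style: set a plain attribute by name.
listen.setAttribute("timeout", "5s")

# DOM Level 2 style: set a namespace-qualified attribute.
listen.setAttributeNS(SPEECH_NS, "sp:confidence", "0.6")

# Create an attribute node explicitly, assign its value, then attach it.
mode = doc.createAttribute("mode")
mode.nodeValue = "dictation"
listen.setAttributeNode(mode)

# Rewrite the value of an existing text node in place.
listen.firstChild.nodeValue = "active"
```

In each case the element tree keeps the same shape; only attribute and text values change.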
Enables the host environment to manipulate the internal content and/or structure of the component. Examples of this may include manipulating an EMMA result before binding it into the host environment data model, and updating an inline grammar before beginning recognition.
DOM Level 1 | Document.createElement(), Node.replaceChild() |
DOM Level 2 | Document.createElement(), Document.createElementNS(), Node.replaceChild() |
DOM Level 3 | Document.createElement(), Document.createElementNS(), Node.replaceChild() |
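Updating an inline grammar before recognition begins might look like the following sketch, which builds a replacement subtree with `createElement()` and swaps it in with `replaceChild()`. Python's `xml.dom.minidom` serves as a stand-in DOM implementation, and the `grammar` and `item` element names and the `build_grammar` helper are hypothetical.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<listen><grammar><item>Boston</item></grammar></listen>"
)
listen = doc.documentElement
old_grammar = listen.getElementsByTagName("grammar")[0]


def build_grammar(document, phrases):
    """Create a new inline grammar element holding one item per phrase."""
    grammar = document.createElement("grammar")
    for phrase in phrases:
        item = document.createElement("item")
        item.appendChild(document.createTextNode(phrase))
        grammar.appendChild(item)
    return grammar


new_grammar = build_grammar(doc, ["Paris", "Tokyo"])

# replaceChild takes the new child first and the child being replaced second.
listen.replaceChild(new_grammar, old_grammar)

items = [i.firstChild.nodeValue for i in listen.getElementsByTagName("item")]
```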
This subsection describes the interfaces necessary for data binding from the Modality Component into the Host Environment. In the following requirements, the data to be bound is called the 'result', and the location to which the data is to be bound is called the 'target'.
This enables the selection of the binding target. Since specification of target node may differ among components (for example, some may use IDs, others XPath, etc.), different means of selecting the target node may be required for different components.
DOM Level 1 | Document.getElementById() (DOM 1 HTML) |
DOM Level 2 | Document.getElementById(), XPath query |
DOM Level 3 | Document.getElementById(), XPath |
Value setting/rewriting interfaces (no restructuring). Enables the writing of the result data into the target node.
DOM Level 1 | Document.createAttribute(), Element.setAttribute(), Node.nodeValue |
DOM Level 2 | Document.createAttribute(), Document.createAttributeNS(), Element.setAttribute(), Element.setAttributeNS(), Node.nodeValue |
DOM Level 3 | Document.createAttribute(), Document.createAttributeNS(), Element.setAttribute(), Element.setAttributeNS(), Node.nodeValue |
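Selecting a target by ID and writing a result into it can be sketched as follows, using Python's `xml.dom.minidom` as a stand-in DOM implementation. The `find_by_id` and `bind_result` helpers and the data model are hypothetical. Because minidom only recognises ID-typed attributes when they are declared, `find_by_id` compares the `id` attribute directly as a stand-in for `Document.getElementById()`.

```python
from xml.dom import Node, minidom


def find_by_id(document, target_id):
    """Stand-in for Document.getElementById() based on a plain 'id' attribute."""
    for element in document.getElementsByTagName("*"):
        if element.getAttribute("id") == target_id:
            return element
    return None


def bind_result(document, target_id, value, attribute=None):
    """Write a result into the target without changing the element structure."""
    target = find_by_id(document, target_id)
    if target is None:
        return False
    if attribute is not None:
        target.setAttribute(attribute, value)
        return True
    text = target.firstChild
    if text is None or text.nodeType != Node.TEXT_NODE:
        return False  # no text node to rewrite; binding would need restructuring
    text.nodeValue = value
    return True


host = minidom.parseString(
    '<model><city id="dest">unknown</city><trip id="leg" class="none"/></model>'
)
city_bound = bind_result(host, "dest", "Boston")
class_bound = bind_result(host, "leg", "business", attribute="class")
empty_bound = bind_result(host, "leg", "Boston")  # <trip> has no text node
```

Refusing to bind when the target has no text node keeps the sketch within value rewriting; a host that permits restructuring could instead append a new text node.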
This subsection describes the interfaces necessary for components to access data from the Host Environment. A component may use host environment data in its behavior, for example where input or output presentation components must act on the basis of host environment data values. In this case the 'target' is the node in the host environment from which data is to be read.
DOM Level 1 | Node.nodeValue |
DOM Level 2 | Node.nodeValue |
DOM Level 3 | Node.nodeValue |
This subsection describes the interfaces necessary for components to call methods on objects in the Host Environment, for the purposes of flow control.
DOM Level 1 | Document.getElementById() (DOM 1 HTML) |
DOM Level 2 | Document.getElementById() |
DOM Level 3 | Document.getElementById(), DOM Level 3 XPath specification |
These are the interfaces necessary for events to be passed from the modality component to the host environment.
Enables the modality component to take advantage of bubbling and event start/stop mechanisms.
DOM Level 1 | (not specified) |
DOM Level 2 | ability to attach event handlers at various phases, ability to define default event handlers, standardized bubbling mechanism |
DOM Level 3 | ability to attach event handlers at various phases, ability to define default event handlers, standardized bubbling mechanism, adds stopPropagation() and stopImmediatePropagation() |
Enables modality component to generate events.
DOM Level 1 | (not specified) |
DOM Level 2 | DocumentEvent interface |
DOM Level 3 | DocumentEvent interface |
Enables the modality component to register handlers.
DOM Level 1 | (not specified) |
DOM Level 2 | Node.EventListener interface |
DOM Level 3 | Node.EventListener, namespace and event category support through addEventListenerNS() |
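The event requirements above (propagation, creation, and handler registration) can be pictured with the following self-contained sketch of a capture, target, and bubble dispatch. It is not a DOM implementation: Python's `xml.dom.minidom` does not provide DOM Events, so the `Event` and `Target` classes, their method names, and the `result` event type are hypothetical models written only to illustrate the flow.

```python
class Event:
    """Hypothetical event object created by a modality component."""

    def __init__(self, event_type, bubbles=True):
        self.type = event_type
        self.bubbles = bubbles
        self.target = None
        self.current_target = None
        self.stopped = False

    def stop_propagation(self):
        self.stopped = True


class Target:
    """Hypothetical node that can register listeners and dispatch events."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self._listeners = []  # (event_type, listener, use_capture)

    def add_event_listener(self, event_type, listener, use_capture=False):
        self._listeners.append((event_type, listener, use_capture))

    def _invoke(self, event, capture_phase):
        event.current_target = self
        for event_type, listener, use_capture in list(self._listeners):
            if event_type == event.type and use_capture == capture_phase:
                listener(event)

    def dispatch_event(self, event):
        event.target = self
        ancestors = []
        node = self.parent
        while node is not None:
            ancestors.append(node)
            node = node.parent
        # Capture phase: from the root down to the target's parent.
        for node in reversed(ancestors):
            node._invoke(event, capture_phase=True)
            if event.stopped:
                return
        # Target phase: non-capturing listeners on the target itself.
        self._invoke(event, capture_phase=False)
        # Bubble phase: from the target's parent back up to the root.
        if event.bubbles:
            for node in ancestors:
                if event.stopped:
                    return
                node._invoke(event, capture_phase=False)


log = []
page = Target("page")
form = Target("form", parent=page)
listen = Target("listen", parent=form)


def on_form(event):
    log.append("bubble@form")
    event.stop_propagation()  # the page-level bubble listener will not run


page.add_event_listener("result", lambda e: log.append("capture@page"), use_capture=True)
listen.add_event_listener("result", lambda e: log.append("target@listen"))
form.add_event_listener("result", on_form)
page.add_event_listener("result", lambda e: log.append("bubble@page"))

listen.dispatch_event(Event("result"))
```

After dispatch, `log` holds the capture listener on the page, the listener on the component, and the bubbling listener on the form, in that order; the bubbling listener on the page is skipped because propagation was stopped.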
The DOM interfaces described in this document specify the low-level component interaction layer. Application authors will develop multimodal applications in markup or scripting languages which are built upon these DOM primitives. These markup or scripting languages provide at least three advantages to the application author: simplified programming constructs, for instance, time-sequenced interactions could be specified through markup that is then translated by the browser into these DOM manipulations; improved sandboxing, for example preventing one component from modifying another; and constraint preservation, such as enforcing the data type of a particular field. The examples below illustrate possible application-layer constructs, built upon the DOM interfaces, that could be used to develop multimodal applications.
XML Events 1.0 defines an XML module for authoring event bindings that conform to the DOM2 event propagation model. It is a W3C Recommendation that is designed for use with XHTML and other XML-based markup languages and provides the XML author with the same level of access to attaching events and event handlers as provided by the interfaces enumerated in section 5.
The XForms 1.0 REC defines an XForms model that holds an XML instance where values collected from the user are stored and retrieved. For environments that include the XForms module, modality components can get and set values from/to this instance. Setting such values will automatically synchronize these values across different modalities. Values in the XForms instance can be manipulated using standard XML DOM calls after a reference to the XForms instance data has first been retrieved via the getInstanceDocument() function, defined in section 7.3.1 of the XForms 1.0 specification.
Synchronized Multimedia Integration Language (SMIL 2.0) is an XML-based language that allows authors to write interactive multimedia presentations. Through its timing and synchronization support, SMIL 2.0 enables the description of the temporal behavior of a multimedia or multimodal application. The DOM events and methods of a given component can be mapped to timing behaviors defined in SMIL's timing and synchronization module (beginning, duration, ending, etc.), and the application author may thereby control the activation of multimodal input and output components through SMIL markup.
Many host environments support scripting for application authors, for example XHTML 1.1 Scripting Module, SVG 1.1 Scripting Module. Scripting modules typically allow access to many public interfaces of the host document and its component nodes, enabling direct programmatic control of the properties, methods and events of multimodal components through languages such as ECMAScript. Scripting hosts may provide direct language bindings of the interfaces specified in this document, and/or abstractions and programming conveniences on top of these interfaces.
The following table is an XSLT transformation of the sections and DOM interfaces described above. This is purely for convenience of reference.
Function | DOM Level 1 | DOM Level 2 | DOM Level 3 |
---|---|---|---|
2.3 General DOM Errors | DOMException exception | DOMException exception | DOMException exception |
2.4 URL of the Current Page | N/A | HTMLDocument.URL (DOM2 HTML) | Document.documentURI |
2.5 Access to Current Page as Document Object | Document interface | Document interface | Document interface |
3.1 Component Declaration and Instantiation | (navigation and identification primitives, no native namespace support) | DOM 1 + Node.namespaceURI | DOM 2 + further native namespace interfaces |
3.1 Component Declaration and Instantiation | Node.parentNode, etc. | Node.parentNode, etc. | Node.parentNode, etc. |
3.2 Access to Component | Document.createAttribute(), Element.setAttribute(), Node.nodeValue | Document.createAttribute(), Document.createAttributeNS(), Element.setAttribute(), Element.setAttributeNS(), Node.nodeValue | Document.createAttribute(), Document.createAttributeNS(), Element.setAttribute(), Element.setAttributeNS(), Node.nodeValue |
3.2 Access to Component | Document.createElement(), Node.replaceChild() | Document.createElement(), Document.createElementNS(), Node.replaceChild() | Document.createElement(), Document.createElementNS(), Node.replaceChild() |
3.3 Data Binding into Host Environment | Document.getElementById() (DOM 1 HTML) | Document.getElementById(), XPath query | Document.getElementById(), XPath |
3.3 Data Binding into Host Environment | Document.createAttribute(), Element.setAttribute(), Node.nodeValue | Document.createAttribute(), Document.createAttributeNS(), Element.setAttribute(), Element.setAttributeNS(), Node.nodeValue | Document.createAttribute(), Document.createAttributeNS(), Element.setAttribute(), Element.setAttributeNS(), Node.nodeValue |
3.3 Data Binding into Host Environment | Node.nodeValue | Node.nodeValue | Node.nodeValue |
4.1 Navigation to Target Node | Document.getElementById() (DOM 1 HTML) | Document.getElementById() | Document.getElementById(), DOM Level 3 XPath specification |
5.1 Event Propagation | (not specified) | ability to attach event handlers at various phases, ability to define default event handlers, standardized bubbling mechanism | ability to attach event handlers at various phases, ability to define default event handlers, standardized bubbling mechanism, adds stopPropagation() and stopImmediatePropagation() |
5.2 Event Creation | (not specified) | DocumentEvent interface | DocumentEvent interface |
5.3 Event Handler Registration | (not specified) | Node.EventListener interface | Node.EventListener, namespace and event category support through addEventListenerNS() |