W3C

Modality Component to Host Environment DOM Requirements and Capabilities Assessment

W3C Working Group Note 10 May 2004

This version:
http://www.w3.org/TR/2004/NOTE-modality-interface-20040510/
Latest version:
http://www.w3.org/TR/modality-interface/
Editor:
Brad Porter, Tellme Networks (editor)
Contributors:
Jonny Axelsson, Opera
Stephen Potter, Microsoft
TV Raman, IBM
Dave Raggett, W3C/Canon

Abstract

This document describes the DOM capabilities needed to support a heterogeneous multimodal environment and the current state of DOM interfaces supporting those capabilities. These DOM interfaces are used between modality components and their host environment in the W3C Multimodal Interaction Framework as proposed by the W3C Multimodal Interaction Activity.

The Multimodal Interaction Framework separates multimodal systems into a set of functional units, including Input and Output components, an Interaction Manager, Session Components, System and Environment, and Application Functions. In order for those functional components to interact with each other to form an application interpreter, the browser implementation must allow for communication and coordination between those components. This document identifies the DOM APIs used to communicate and coordinate at the browser implementation level. Multimodal browsers can be stand-alone or distributed systems.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

Publication as a Working Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document describes the DOM capabilities required between modality components and their host environment, as a basis for a common component model to support multimodal applications.

This document is a capabilities assessment of DOM1, DOM2, and DOM3. DOM1 edition 2 reached Recommendation status in 2000 with limited support for XML. DOM2 supersedes DOM1 and reached Recommendation status in January 2003 as a series of modules designed to support XML. DOM3 builds upon DOM2. At the time of publication, DOM3 Core, DOM3 Load and Save, and DOM3 Validation have all reached W3C Recommendation status. DOM3 XPath, Views and Formatting, and Events have been published as Notes. Vendor support for DOM levels in mobile and desktop devices varies significantly.

This requirements and capabilities assessment is not a specification of the actual interfaces that would be used in a multimodal context. An actual specification of interfaces must address the following unanswered questions in order to provide an open standard upon which portable multimodal documents could be built: which level of DOM should be the basis for that interface, and what common events and properties must be supported by the modality component and host environment.

This document has been produced as part of the W3C Multimodal Interaction Activity, following the procedures set out for the W3C Process. The authors of this document are members of the Multimodal Interaction Working Group (W3C Members only).

Patent disclosures relevant to this specification may be found on the Working Group's patent disclosure page in conformance with W3C policy.

This document is for public review, and comments and discussion are welcomed on the (archived) public mailing list <[email protected]>.


1 Overview

This document defines the capabilities and requirements of a common DOM-based API between modality components and their host environment. The document further describes the current level of support for these capabilities in DOM 1, DOM 2, and DOM 3. This document is not a formal API definition. A formal API definition would need to address a single level of DOM and further specify common interfaces and events.

Modality components perform interface tasks pertinent to their particular interface modality (e.g. voice, pen, visual display). These tasks range from simple input/output, such as pen stroke collection and audio playback, to complex transactions such as dialogs and card stacks.

The host environment allows multiple modality components to share data such as a semantic result and the confidence value associated with that result. The host environment also coordinates the activities of modality components in a multimodal application; for instance, activation, deactivation, display/prompting.

A multimodal application consists of many modality components interacting with the user, all coordinated by the host environment.

A common DOM-based API between modality components and their host environment allows for the creation of new modality components that will work with existing host environments. This common API would also allow for the creation of new host environments for various architectures and devices.

This DOM-based API requirements and capabilities analysis has been performed with the multimodal browser developer in mind. The objective is to enable portable multimodal content. These APIs may be (and in most cases will be) entirely hidden from the application developer. They provide the foundation that allows for higher-level language constructs specific to the modality component or host environment.

2 General DOM Implementation Features

These are the general features required of a Host Environment browser implementation to support the integration of a Modality Component. Features relating to DOM support are assumed to be implemented by the DOM Implementation component of the browser.

The intent is that this DOM framework should also work in a multi-document scenario. An interface definition standard would need to specify that scenario further.

2.1 Component Parsed/Loaded Notification

Enables the browser to know when a component (or a proxy to a remote component) is present on the page. In response, the host enables the component to register itself and attach to the host environment.

2.2 Component Unloaded Notification

Enables the browser to know when a component (or a proxy to a remote component) is removed from the page. In response, the host enables the component to unregister itself.

2.3 General DOM Errors

Enables the browser to be aware of DOM errors raised by the component.

DOM Level 1 DOMException exception
DOM Level 2 DOMException exception
DOM Level 3 DOMException exception
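
As an illustration only, the following sketch shows how a host might observe a DOM error raised while a component manipulates the tree. It assumes Python's xml.dom.minidom as a stand-in DOM implementation; the element names and the helper try_remove are hypothetical and not part of any multimodal specification.

```python
import xml.dom
from xml.dom import minidom


def try_remove(parent, child):
    """Return None on success, or the DOMException code if the DOM refuses."""
    try:
        parent.removeChild(child)
        return None
    except xml.dom.DOMException as exc:
        # Every DOM error derives from DOMException and carries a numeric code.
        return exc.code


# Hypothetical page with two sibling components.
doc = minidom.parseString("<page><prompt/><field/></page>")
prompt = doc.getElementsByTagName("prompt")[0]
field = doc.getElementsByTagName("field")[0]

# <field> is a sibling of <prompt>, not a child, so this removal fails
# and the host sees the NOT_FOUND_ERR code instead of an unhandled error.
error_code = try_remove(prompt, field)
```

The same exception class and code constants are defined at all three DOM levels, so a host written this way does not depend on the level chosen.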

2.4 URL of the Current Page

For components which specify remote resources via URLs, this enables the browser to resolve references which are relative to the current page.

DOM Level 1 N/A
DOM Level 2 HTMLDocument.URL (DOM2 HTML)
DOM Level 3 Document.documentURI
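
A minimal sketch of the resolution step, assuming the page URL has already been obtained from the DOM (for example from the DOM Level 3 documentURI attribute). The URL, the resource paths and the helper resolve are hypothetical.

```python
from urllib.parse import urljoin

# Hypothetical value; a real host would read it from the document object.
page_uri = "http://example.org/apps/travel/booking.html"


def resolve(reference, base=page_uri):
    """Resolve a component's resource reference against the page URL."""
    return urljoin(base, reference)


# A relative grammar reference declared by a voice component.
grammar_url = resolve("grammars/cities.grxml")
```

Absolute references pass through unchanged, so a component can mix local and remote resources without special handling.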

2.5 Access to Current Page as Document Object

Enables access to the interfaces described in the Document section below.

DOM Level 1 Document interface
DOM Level 2 Document interface
DOM Level 3 Document interface

3 Document Interfaces

These are the interfaces which are necessary for component integration into a document supporting the Document interface.

3.1 Component Declaration and Instantiation

This subsection describes the interfaces necessary for the declaration and instantiation of modality components.

3.1.1 Namespace Management

Document interfaces for identifying namespaces (finding a namespace declaration and its prefix) and for determining the parent element tree (for scoped namespaces). Enables the host environment to identify a component by its namespace.

DOM Level 1 (navigation and identification primitives, no native namespace support)
DOM Level 2 DOM 1 + Node.namespaceURI
DOM Level 3 (DOM 2 + further native namespace interfaces)
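
The following sketch suggests how a host could locate components by namespace using the DOM Level 2 Node.namespaceURI attribute. It assumes Python's xml.dom.minidom as the DOM implementation; the voice namespace URI, the element names and the helper find_components are hypothetical.

```python
from xml.dom import minidom

# Hypothetical namespace URI for a voice modality component.
VOICE_NS = "http://example.org/2004/voice"

PAGE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:v="http://example.org/2004/voice">
  <body>
    <v:listen id="city"/>
    <p>Where to?</p>
    <v:prompt id="ask"/>
  </body>
</html>"""


def find_components(node, namespace):
    """Collect elements whose namespaceURI matches, in document order."""
    found = []
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            if child.namespaceURI == namespace:
                found.append(child)
            found.extend(find_components(child, namespace))
    return found


doc = minidom.parseString(PAGE)
components = find_components(doc, VOICE_NS)
names = [(c.prefix, c.localName) for c in components]
```

Matching on the namespace URI rather than on the prefix keeps the lookup correct when a page author binds the same namespace to a different prefix.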

3.1.2 Location of Component in Document

Parent traversal interfaces. Enables the host environment to know in which elements of the document the component is placed, and to implement any location-dependent semantics accordingly. This includes scoped properties which may be inherited (e.g. xml:lang).

DOM Level 1 Node.parentNode, etc.
DOM Level 2 Node.parentNode, etc.
DOM Level 3 Node.parentNode, etc.
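
As a sketch of a location-dependent property, the helper below walks Node.parentNode links to find the xml:lang value in scope for a component. Python's xml.dom.minidom is assumed as the DOM implementation; the element names and the helper inherited_lang are hypothetical.

```python
from xml.dom import minidom


def inherited_lang(node):
    """Walk parentNode links until an element declares xml:lang."""
    while node is not None:
        if node.nodeType == node.ELEMENT_NODE and node.hasAttribute("xml:lang"):
            return node.getAttribute("xml:lang")
        node = node.parentNode
    return None  # no ancestor declares a language


doc = minidom.parseString(
    '<body xml:lang="en-US">'
    '<section xml:lang="fr"><listen/></section>'
    "<prompt/>"
    "</body>"
)
listen = doc.getElementsByTagName("listen")[0]
prompt = doc.getElementsByTagName("prompt")[0]

listen_lang = inherited_lang(listen)  # nearest declaration is on <section>
prompt_lang = inherited_lang(prompt)  # falls back to <body>
```

Only parentNode and attribute access are used, so the same traversal is available at every DOM level listed above.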

3.1.3 Component Self-Identification

Attribute of type ID on component node. Enables the mapping of a given component on the page to the browser's instantiation of that component.

3.2 Access to Component

This subsection describes the interfaces that a modality component may choose to enable to allow the host environment to directly manipulate the component.

3.2.1 Read Interfaces

The component exposes its public interfaces within the framework of the DOM of the current page. Access to the properties, methods and events of the modality component is then available to the flow control mechanisms of the host environment (script, SMIL, etc.). In general, these interfaces will be modality-specific and defined for each component according to its function. However, certain common interfaces may be implemented which are standard across different modality components (e.g. Start and Stop methods, result events).

3.2.2 Property Write Interfaces

Value setting and rewrite interfaces. Enables the host environment to manipulate properties without modifying document tree structure.

DOM Level 1 Document.createAttribute(), Element.setAttribute(), Node.nodeValue
DOM Level 2 Document.createAttribute(), Document.createAttributeNS(), Element.setAttribute(), Element.setAttributeNS(), Node.nodeValue
DOM Level 3 Document.createAttribute(), Document.createAttributeNS(), Element.setAttribute(), Element.setAttributeNS(), Node.nodeValue
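
A brief sketch of property writes that leave the element tree untouched, assuming Python's xml.dom.minidom as the DOM implementation. The listen element and its timeout and mode properties are hypothetical.

```python
from xml.dom import minidom

# Hypothetical component with one existing property.
doc = minidom.parseString('<listen timeout="5s"/>')
component = doc.documentElement

# Rewrite an existing property in place.
component.setAttribute("timeout", "10s")

# Create a new property: build the attribute node, set its value through
# Node.nodeValue, then attach it to the component element.
mode = doc.createAttribute("mode")
mode.nodeValue = "multiple"
component.setAttributeNode(mode)

timeout_value = component.getAttribute("timeout")
mode_value = component.getAttribute("mode")
child_count = len(component.childNodes)  # attributes are not child nodes
```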

3.2.3 XML Structure Modification Write Interfaces

Enables the host environment to manipulate the internal content and/or structure of the component. Examples of this may include manipulating an EMMA result before binding it into the host environment data model, and updating an inline grammar before beginning recognition.

DOM Level 1 Document.createElement(), Node.replaceChild()
DOM Level 2 Document.createElement(), Document.createElementNS(), Node.replaceChild()
DOM Level 3 Document.createElement(), Document.createElementNS(), Node.replaceChild()
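
The inline grammar case can be sketched as follows, assuming Python's xml.dom.minidom as the DOM implementation. The listen, grammar and item element names are simplified placeholders rather than a real grammar format, and replace_grammar is a hypothetical helper.

```python
from xml.dom import minidom

PAGE = "<listen><grammar><item>Boston</item></grammar></listen>"


def replace_grammar(doc, listen, cities):
    """Build a new <grammar> subtree and swap it in with replaceChild()."""
    new_grammar = doc.createElement("grammar")
    for city in cities:
        item = doc.createElement("item")
        item.appendChild(doc.createTextNode(city))
        new_grammar.appendChild(item)
    old_grammar = listen.getElementsByTagName("grammar")[0]
    # replaceChild takes the new node first, then the node being replaced.
    listen.replaceChild(new_grammar, old_grammar)
    return old_grammar


doc = minidom.parseString(PAGE)
listen = doc.documentElement
replace_grammar(doc, listen, ["Paris", "Tokyo"])
updated = listen.toxml()
```

Building the replacement subtree off-document and swapping it in with a single call means the component never observes a half-updated grammar.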

3.3 Data Binding into Host Environment

This subsection describes the interfaces necessary for data binding from the Modality Component into the Host Environment. In the following requirements, the data to be bound is called the 'result', and the location to which the data is to be bound is called the 'target'.

3.3.1 Navigation to Target Node

This enables the selection of the binding target. Since the way a target node is specified may differ among components (for example, some may use IDs, others XPath), different means of selecting the target node may be required for different components.

DOM Level 1 Document.getElementById() (DOM 1 HTML)
DOM Level 2 Document.getElementById(), XPath query
DOM Level 3 Document.getElementById(), XPath
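
The ID-based form of target selection might look like the sketch below, which assumes Python's xml.dom.minidom as the DOM implementation. Because no DTD or schema is loaded here, the sketch marks the id attributes as type ID itself; the form and field names are hypothetical. XPath selection is not shown.

```python
from xml.dom import minidom

PAGE = '<form><field id="city"/><field id="date"/></form>'

doc = minidom.parseString(PAGE)

# Without a DTD or schema, minidom does not know which attributes are IDs,
# so each one is declared explicitly before the lookup.
for field in doc.getElementsByTagName("field"):
    field.setIdAttribute("id")

target = doc.getElementById("date")      # binding target for a date result
missing = doc.getElementById("airline")  # no such target on this page
```

A component that receives None for its target would typically report an error to the host rather than create the node, since target navigation is read-only.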

3.3.2 Write Access to Target Node

Value setting/rewriting interfaces (no restructuring). Enables the writing of the result data into the target node.

DOM Level 1 Document.createAttribute(), Element.setAttribute(), Node.nodeValue
DOM Level 2 Document.createAttribute(), Document.createAttributeNS(), Element.setAttribute(), Element.setAttributeNS(), Node.nodeValue
DOM Level 3 Document.createAttribute(), Document.createAttributeNS(), Element.setAttribute(), Element.setAttributeNS(), Node.nodeValue

3.3.3 Reading Values from Host Environment Data

This subsection describes the interfaces necessary for components to read data from the Host Environment. A component may use host environment data to drive its behavior; for example, input or output presentation components may need to act on the basis of host environment data values. In this case the 'target' is the node in the host environment from which data is to be read.

DOM Level 1 Node.nodeValue
DOM Level 2 Node.nodeValue
DOM Level 3 Node.nodeValue
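
The write and read halves of data binding described in 3.3.2 and 3.3.3 can be sketched together. Python's xml.dom.minidom is assumed as the DOM implementation; the trip and city elements, the confidence attribute and both helper functions are hypothetical.

```python
from xml.dom import minidom

# Hypothetical host environment data model with one target node.
doc = minidom.parseString('<trip><city confidence="0.00">unset</city></trip>')
target = doc.getElementsByTagName("city")[0]


def bind_result(target, value, confidence):
    """Write a result into the target without changing the tree structure."""
    target.firstChild.nodeValue = value  # overwrite the existing text node
    target.setAttribute("confidence", f"{confidence:.2f}")


def read_value(target):
    """Read the current value of a host environment node."""
    return target.firstChild.nodeValue


bind_result(target, "Paris", 0.87)      # e.g. a speech recognition result
city = read_value(target)               # e.g. read back by an output component
confidence = target.getAttribute("confidence")
```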

4 Method Calling on Host Environment Flow Control

This section describes the interfaces necessary for components to call methods on objects in the Host Environment, for the purposes of flow control.

4.1 Navigation to Target Node

DOM Level 1 Document.getElementById() (DOM 1 HTML)
DOM Level 2 Document.getElementById()
DOM Level 3 Document.getElementById(), DOM Level 3 XPath specification

5 Events

These are the interfaces necessary for events to be passed from the modality component to the host environment.

5.1 Event Propagation

Enables the modality component to take advantage of bubbling and of the mechanisms for starting and stopping event propagation.

DOM Level 1 (not specified)
DOM Level 2 ability to attach event handlers at various phases, ability to define default event handlers, standardized bubbling mechanism
DOM Level 3 ability to attach event handlers at various phases, ability to define default event handlers, standardized bubbling mechanism; adds stopImmediatePropagation() alongside the DOM Level 2 stopPropagation()

5.2 Event Creation

Enables the modality component to generate events.

DOM Level 1 (not specified)
DOM Level 2 DocumentEvent interface
DOM Level 3 DocumentEvent interface

5.3 Event Handler Registration

Enables the modality component to register handlers.

DOM Level 1 (not specified)
DOM Level 2 Node.EventListener interface
DOM Level 3 Node.EventListener, namespace and event category support through addEventListenerNS()
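
The three capabilities in this section (propagation, creation and registration) can be illustrated with a toy model. The sketch below is not a DOM implementation: the Event and Target classes are hypothetical simplifications that borrow the DOM Level 2 method names and mimic the capture, target and bubble phases. The "result" event type and the node names are also assumptions.

```python
class Event:
    """Toy event carrying a type and a propagation flag."""

    def __init__(self, event_type):
        self.type = event_type
        self.stopped = False

    def stopPropagation(self):
        self.stopped = True


class Target:
    """Toy node that supports listener registration and dispatch."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.listeners = []  # (event type, callback, use_capture)

    def addEventListener(self, event_type, callback, use_capture=False):
        self.listeners.append((event_type, callback, use_capture))

    def _run(self, event, capture_phase):
        for event_type, callback, use_capture in self.listeners:
            if event_type == event.type and use_capture == capture_phase:
                callback(event, self)

    def dispatchEvent(self, event):
        path = []  # ancestors, ordered from the root down to the parent
        node = self.parent
        while node is not None:
            path.insert(0, node)
            node = node.parent
        for node in path:  # capture phase: root towards target
            if event.stopped:
                return
            node._run(event, True)
        if event.stopped:
            return
        self._run(event, False)  # at target: non-capturing listeners only
        for node in reversed(path):  # bubble phase: target towards root
            if event.stopped:
                return
            node._run(event, False)


log = []
page = Target("page")
form = Target("form", parent=page)
listen = Target("listen", parent=form)


def record(event, current):
    log.append(current.name)


def record_and_stop(event, current):
    log.append(current.name)
    event.stopPropagation()


page.addEventListener("result", record, use_capture=True)
listen.addEventListener("result", record)
form.addEventListener("result", record_and_stop)
page.addEventListener("result", lambda e, c: log.append("page-bubble"))

# The component creates a result event and dispatches it on itself.
listen.dispatchEvent(Event("result"))
```

In this run the capturing listener on the page fires first, then the component's own listener, then the form's bubbling listener, which stops propagation before the page's bubbling listener is reached.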

6 Markup-level Abstractions

The DOM interfaces described in this document specify the low-level component interaction layer. Application authors will develop multimodal applications in markup or scripting languages which are built upon these DOM primitives. These markup or scripting languages provide at least three advantages to the application author: simplified programming constructs (for instance, time-sequenced interactions could be specified through markup that the browser then translates into these DOM manipulations); improved sandboxing (for example, preventing one component from modifying another); and constraint preservation (such as enforcing the data type of a particular field). The examples below are possible application-layer constructs, built upon the DOM interfaces, that could be used to develop multimodal applications.

6.1 XML Events

XML Events 1.0 defines an XML module for authoring event bindings conformant to the DOM2 event propagation. It is a W3C Recommendation that is designed for use with XHTML and other XML-based markup languages and provides the XML author with the same level of access to attaching events and event handlers as provided by the interfaces enumerated in section 5.

6.2 XForms

The XForms 1.0 Recommendation defines an XForms model that holds an XML instance where values collected from the user are stored and retrieved. For environments that include the XForms module, modality components can get and set values from/to this instance. Setting such values will automatically synchronize these values across different modalities. Values in the XForms instance can be manipulated using standard XML DOM calls once a reference to the XForms instance data has been retrieved via the function getInstanceDocument(), defined in section 7.3.1 of the XForms 1.0 specification.

6.3 SMIL 2.0

Synchronized Multimedia Integration Language (SMIL 2.0) is an XML-based language that allows authors to write interactive multimedia presentations. Through its timing and synchronization support, SMIL 2.0 enables the description of the temporal behavior of a multimedia or multimodal application. The DOM events and methods of a given component can be mapped to timing behaviors defined in SMIL's timing and synchronization module (beginning, duration, ending, etc.), and the application author may thereby control the activation of multimodal input and output components through SMIL markup.

6.4 Scripting modules

Many host environments support scripting for application authors, for example XHTML 1.1 Scripting Module, SVG 1.1 Scripting Module. Scripting modules typically allow access to many public interfaces of the host document and its component nodes, enabling direct programmatic control of the properties, methods and events of multimodal components through languages such as ECMAScript. Scripting hosts may provide direct language bindings of the interfaces specified in this document, and/or abstractions and programming conveniences on top of these interfaces.

Appendix A

The following table is an XSLT transformation of the sections and DOM interfaces described above. This is purely for convenience of reference.

Function | DOM Level 1 | DOM Level 2 | DOM Level 3
2.3 General DOM Errors | DOMException exception | DOMException exception | DOMException exception
2.4 URL of the Current Page | N/A | HTMLDocument.URL (DOM2 HTML) | Document.documentURI
2.5 Access to Current Page as Document Object | Document interface | Document interface | Document interface
3.1 Component Declaration and Instantiation | (navigation and identification primitives, no native namespace support) | DOM 1 + Node.namespaceURI | (DOM 2 + further native namespace interfaces)
3.1 Component Declaration and Instantiation | Node.parentNode, etc. | Node.parentNode, etc. | Node.parentNode, etc.
3.2 Access to Component | Document.createAttribute(), Element.setAttribute(), Node.nodeValue | Document.createAttribute(), Document.createAttributeNS(), Element.setAttribute(), Element.setAttributeNS(), Node.nodeValue | Document.createAttribute(), Document.createAttributeNS(), Element.setAttribute(), Element.setAttributeNS(), Node.nodeValue
3.2 Access to Component | Document.createElement(), Node.replaceChild() | Document.createElement(), Document.createElementNS(), Node.replaceChild() | Document.createElement(), Document.createElementNS(), Node.replaceChild()
3.3 Data Binding into Host Environment | Document.getElementById() (DOM 1 HTML) | Document.getElementById(), XPath query | Document.getElementById(), XPath
3.3 Data Binding into Host Environment | Document.createAttribute(), Element.setAttribute(), Node.nodeValue | Document.createAttribute(), Document.createAttributeNS(), Element.setAttribute(), Element.setAttributeNS(), Node.nodeValue | Document.createAttribute(), Document.createAttributeNS(), Element.setAttribute(), Element.setAttributeNS(), Node.nodeValue
3.3 Data Binding into Host Environment | Node.nodeValue | Node.nodeValue | Node.nodeValue
4.1 Navigation to Target Node | Document.getElementById() (DOM 1 HTML) | Document.getElementById() | Document.getElementById(), DOM Level 3 XPath specification
5.1 Event Propagation | (not specified) | ability to attach event handlers at various phases, ability to define default event handlers, standardized bubbling mechanism | ability to attach event handlers at various phases, ability to define default event handlers, standardized bubbling mechanism; adds stopImmediatePropagation() alongside the DOM Level 2 stopPropagation()
5.2 Event Creation | (not specified) | DocumentEvent interface | DocumentEvent interface
5.3 Event Handler Registration | (not specified) | Node.EventListener interface | Node.EventListener, namespace and event category support through addEventListenerNS()