27 January 2004

1. Validation

Editors:
Ben Chang, Oracle
Joe Kesselman, IBM (until September 2001)
Rezaur Rahman, Intel Corporation (until July 2001)

1.1 Overview

This chapter describes the optional DOM Level 3 Validation feature. This module provides Application Programming Interfaces (APIs) to guide the construction and editing of XML documents. Guided editing relies on queries that combine questions such as "what does the schema allow me to insert or delete here?" and "if I insert or delete here, will the document still be valid?"

To aid users in the editing and creation of XML documents, other queries may expose different levels of detail, e.g., all the possible children, those which would be valid given what precedes this point, or lists of defined symbols of a given kind. Some of these queries would prompt checks and warn users if they are about to conflict with or overwrite such data.

Finally, users would like to validate an edited or newly constructed document before serializing it or passing it to other users. They may edit, come up with an invalid document, then edit again to arrive at a valid document. During this process, these APIs allow the user to check the validity of the document or subtree on demand. If necessary, these APIs can also require that the document or subtree remain valid during this editing process via the DocumentEditVAL.continuousValidityChecking flag.

A DOM application can use the hasFeature(feature, version) method of the DOMImplementation interface, with parameter values "Validation" and "3.0" respectively, to determine whether these interfaces are supported by the implementation. This implementation depends on [DOM Level 2 Core] and the [DOM Level 3 Core] DOMConfiguration interface.
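As a minimal sketch of the feature test described above, the following Python snippet calls hasFeature on the DOMImplementation object from the standard library's xml.dom.minidom. minidom is used only because it is a readily available DOM implementation; it is not expected to provide the Validation module, so the check should normally report that the feature is missing. The helper name supports_validation is an assumption made for illustration.

```python
from xml.dom.minidom import getDOMImplementation


def supports_validation(implementation) -> bool:
    """Return True if the implementation reports the "Validation" 3.0 feature."""
    return bool(implementation.hasFeature("Validation", "3.0"))


impl = getDOMImplementation()
if supports_validation(impl):
    print("DOM Level 3 Validation interfaces are available")
else:
    print("DOM Level 3 Validation is not supported; guided editing is unavailable")
```

An application would typically run this check once at start-up and disable its guided-editing features when it fails.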

This chapter focuses on the editing aspects used in the XML document editing world and usage of such information. The appendix describes in detail all the possible outcomes of the validation operations on the different node types.

1.2 Exceptions

This section describes the "VAL-DOC-EDIT" exceptions.

Exception ExceptionVAL

Some Validation operations may throw an ExceptionVAL as described in their descriptions.


IDL Definition
exception ExceptionVAL {
  unsigned short   code;
};
// ExceptionVALCode
const unsigned short      NO_SCHEMA_AVAILABLE_ERR        = 71;

Definition group ExceptionVALCode

An integer indicating the type of error generated.

Defined Constants
NO_SCHEMA_AVAILABLE_ERR
This error occurs when the operation cannot complete due to an unavailable schema.
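The exception above can be mirrored in a host language in a few lines. The following is a sketch only, not an official binding: the class layout and the require_schema helper are assumptions made for illustration, while the constant value 71 is taken from the IDL definition.

```python
NO_SCHEMA_AVAILABLE_ERR = 71  # ExceptionVALCode value from the IDL definition


class ExceptionVAL(Exception):
    """Python stand-in for ExceptionVAL; carries the numeric error code."""

    def __init__(self, code: int, message: str = "") -> None:
        super().__init__(message)
        self.code = code


def require_schema(schema):
    """Hypothetical helper: fail with NO_SCHEMA_AVAILABLE_ERR when no schema is given."""
    if schema is None:
        raise ExceptionVAL(NO_SCHEMA_AVAILABLE_ERR, "no schema available")
    return schema


try:
    require_schema(None)
except ExceptionVAL as exc:
    print("validation operation failed with code", exc.code)
```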

1.3 Document Editing Interfaces

This section contains "Document Editing" methods as described in the DocumentEditVAL, NodeEditVAL, ElementEditVAL, and CharacterDataEditVAL interfaces. References to new [DOM Level 3 Core] interfaces such as DOMStringList and NameList also exist. With the latter interface, if the schema is a DTD, the element information item names are simply local names; if the schema is a W3C XML schema, the names are qualified names, which may contain namespace prefixes.

Interface DocumentEditVAL

This interface extends the NodeEditVAL interface with additional methods for document editing. An object implementing this interface must also implement the Document interface.


IDL Definition
interface DocumentEditVAL : NodeEditVAL {
           attribute boolean         continuousValidityChecking;
                                        // raises(DOMException, 
                                        //        ExceptionVAL, 
                                        //        DOMException) on setting

  readonly attribute DOMConfiguration domConfig;
  NameList           getDefinedElements(in DOMString namespaceURI);
  unsigned short     validateDocument();
};

Attributes
continuousValidityChecking of type boolean
An attribute specifying whether the validity of the document is continuously enforced. When the attribute is set to true, the implementation may raise certain exceptions, depending on the situation (see the following). This attribute is false by default.
Exceptions on setting

DOMException

NOT_SUPPORTED_ERR: Raised if the implementation does not support setting this attribute to true.

ExceptionVAL

NO_SCHEMA_AVAILABLE_ERR: Raised if this attribute is set to true and a schema is unavailable.

DOMException

VALIDATION_ERR: Raised if an operation makes this document not compliant with the VAL_INCOMPLETE validity type or the document is invalid, and this attribute is set to true.

domConfig of type DOMConfiguration, readonly
This allows the setting of the error handler, as described in the [DOM Level 3 Core] DOMConfiguration interface. An object implementing both this DocumentEditVAL interface and the [DOM Level 3 Core] Document interface, which also has a domConfig attribute, needs to implement this attribute only once.
Methods
getDefinedElements
Returns a list of all element information item names of global declarations belonging to the specified namespace.
Parameters
namespaceURI of type DOMString
The namespace URI of the namespace. For DTDs, this is null.
Return Value

NameList

List of all element information item names belonging to the specified namespace or null if no schema is available.

No Exceptions
validateDocument
Validates the document against the schema, e.g., a DTD, a W3C XML schema, or another schema language. Any attempt to modify any part of the document while validating results in implementation-dependent behavior. In addition, the validation operation itself cannot modify the document, e.g., for default attributes. This method makes use of the error handler, as described in the [DOM Level 3 Core] DOMConfiguration interface, with all errors being SEVERITY_ERROR as defined in the DOMError interface.
Return Value

unsigned short

A validation state constant.

No Parameters
No Exceptions
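The "exceptions on setting" behavior of continuousValidityChecking can be illustrated with a toy model. This is a sketch under stated assumptions, not a DOM implementation: ToyDocumentEditVAL, its constructor arguments, and the ExceptionVAL class are hypothetical, and only NotSupportedErr comes from Python's xml.dom package. The VALIDATION_ERR case, which concerns later edits, is not modeled here.

```python
from xml.dom import NotSupportedErr

NO_SCHEMA_AVAILABLE_ERR = 71


class ExceptionVAL(Exception):
    """Hypothetical Python stand-in for the ExceptionVAL exception."""

    def __init__(self, code):
        super().__init__(f"ExceptionVAL code {code}")
        self.code = code


class ToyDocumentEditVAL:
    """Toy model of the continuousValidityChecking attribute only."""

    def __init__(self, schema=None, supports_continuous=True):
        self._schema = schema
        self._supports_continuous = supports_continuous
        self._continuous = False  # the attribute is false by default

    @property
    def continuous_validity_checking(self):
        return self._continuous

    @continuous_validity_checking.setter
    def continuous_validity_checking(self, value):
        if value and not self._supports_continuous:
            # Corresponds to DOMException NOT_SUPPORTED_ERR.
            raise NotSupportedErr("continuous validity checking is not supported")
        if value and self._schema is None:
            # Corresponds to ExceptionVAL NO_SCHEMA_AVAILABLE_ERR.
            raise ExceptionVAL(NO_SCHEMA_AVAILABLE_ERR)
        self._continuous = bool(value)


doc = ToyDocumentEditVAL(schema=None)
try:
    doc.continuous_validity_checking = True
except ExceptionVAL as exc:
    print("cannot enable continuous checking, code", exc.code)

doc_with_schema = ToyDocumentEditVAL(schema="toy-schema")
doc_with_schema.continuous_validity_checking = True
```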
Interface NodeEditVAL

This interface is similar to the [DOM Level 3 Core] Node interface, with methods for guided document editing.


IDL Definition
interface NodeEditVAL {

  // validationType
  const unsigned short      VAL_WF                         = 1;
  const unsigned short      VAL_NS_WF                      = 2;
  const unsigned short      VAL_INCOMPLETE                 = 3;
  const unsigned short      VAL_SCHEMA                     = 4;


  // validationState
  const unsigned short      VAL_TRUE                       = 5;
  const unsigned short      VAL_FALSE                      = 6;
  const unsigned short      VAL_UNKNOWN                    = 7;

  readonly attribute DOMString       defaultValue;
  readonly attribute DOMStringList   enumeratedValues;
  unsigned short     canInsertBefore(in Node newChild, 
                                     in Node refChild);
  unsigned short     canRemoveChild(in Node oldChild);
  unsigned short     canReplaceChild(in Node newChild, 
                                     in Node oldChild);
  unsigned short     canAppendChild(in Node newChild);
  unsigned short     nodeValidity(in unsigned short valType);
};

Definition group validationType

An integer indicating the validation type. Other specifications can define stricter validation types/constants by extending the NodeEditVAL interface.

Defined Constants
VAL_INCOMPLETE
Check if the node's immediate children are those expected by the content model. This node's trailing required children could be missing. It includes VAL_NS_WF.
VAL_NS_WF
Check if the node is namespace well-formed.
VAL_SCHEMA
Check if the node's entire subtree is as expected by the content model. It includes VAL_NS_WF.
VAL_WF
Check if the node is well-formed.
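The difference between VAL_INCOMPLETE and VAL_SCHEMA can be made concrete with a toy content model. The sketch below assumes a content model that is a fixed sequence of required child element names and checks a single level only, whereas VAL_SCHEMA in the text covers the entire subtree. BOOK_MODEL and toy_node_validity are hypothetical names, and namespace well-formedness is not checked.

```python
VAL_INCOMPLETE = 3
VAL_SCHEMA = 4
VAL_TRUE = 5
VAL_FALSE = 6

# Hypothetical content model: <book> requires title, author, chapter in this order.
BOOK_MODEL = ["title", "author", "chapter"]


def toy_node_validity(children, model, val_type):
    """Check a list of child names against a fixed-sequence content model."""
    if val_type == VAL_INCOMPLETE:
        # Children must match the start of the model; trailing required
        # children may still be missing.
        is_prefix = children == model[: len(children)]
        return VAL_TRUE if is_prefix else VAL_FALSE
    if val_type == VAL_SCHEMA:
        # Every required child must be present (one level only in this toy).
        return VAL_TRUE if children == model else VAL_FALSE
    raise ValueError("validation type not covered by this sketch")


print(toy_node_validity(["title"], BOOK_MODEL, VAL_INCOMPLETE))
print(toy_node_validity(["title"], BOOK_MODEL, VAL_SCHEMA))
```

A half-built book containing only a title passes the VAL_INCOMPLETE check but fails the VAL_SCHEMA check, which is the situation an editor is in while a document is still being constructed.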
Definition group validationState

An integer indicating the validation state, or whether the operation can or cannot be done.

Defined Constants
VAL_FALSE
False if the node is invalid with regard to the operation, or if the operation cannot be done.
VAL_TRUE
True if the node is valid with regard to the operation, or if the operation can be done.
VAL_UNKNOWN
The validity of the node is unknown.
Attributes
defaultValue of type DOMString, readonly
The default value specified in an attribute or an element declaration or null if unspecified. If the schema is a W3C XML schema, this is the canonical lexical representation of the default value.
enumeratedValues of type DOMStringList, readonly
A DOMStringList, as described in [DOM Level 3 Core], of distinct values for an attribute or an element declaration or null if unspecified. If the schema is a W3C XML schema, this is a list of strings which are lexical representations corresponding to the values in the [value] property of the enumeration component for the type of the attribute or element. It is recommended that the canonical lexical representations of the values be used.
Methods
canAppendChild
Determines whether the Node.appendChild operation would make this document not compliant with the VAL_INCOMPLETE validity type.
Parameters
newChild of type Node
Node to be appended.
Return Value

unsigned short

A validation state constant.

No Exceptions
canInsertBefore
Determines whether the Node.insertBefore operation would make this document not compliant with the VAL_INCOMPLETE validity type.
Parameters
newChild of type Node
Node to be inserted.
refChild of type Node
Reference Node.
Return Value

unsigned short

A validation state constant.

No Exceptions
canRemoveChild
Determines whether the Node.removeChild operation would make this document not compliant with the VAL_INCOMPLETE validity type.
Parameters
oldChild of type Node
Node to be removed.
Return Value

unsigned short

A validation state constant.

No Exceptions
canReplaceChild
Determines whether the Node.replaceChild operation would make this document not compliant with the VAL_INCOMPLETE validity type.
Parameters
newChild of type Node
New Node.
oldChild of type Node
Node to be replaced.
Return Value

unsigned short

A validation state constant.

No Exceptions
nodeValidity
Determines if the node is valid relative to the validation type specified in valType. This operation doesn't normalize before checking if it is valid. To do so, one would need to explicitly call a normalize method. The difference between this method and the DocumentEditVAL.validateDocument method is that the latter method only checks to determine whether the entire document is valid.
Parameters
valType of type unsigned short
Flag to indicate the validation type checking to be done.
Return Value

unsigned short

A validation state constant.

No Exceptions
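The can* methods of this interface support an "ask before you mutate" editing pattern. The following sketch shows that pattern with a toy node class; ToyNode, its allowed_children set, and guided_append are hypothetical. Returning VAL_UNKNOWN when no schema information is present is an assumption of this sketch, as is the policy of mutating only on VAL_TRUE.

```python
VAL_TRUE, VAL_FALSE, VAL_UNKNOWN = 5, 6, 7


class ToyNode:
    """Toy node with an unordered set of permitted child names."""

    def __init__(self, name, allowed_children=None):
        self.name = name
        self.children = []
        # None models "no schema information available" for this node.
        self.allowed_children = allowed_children

    def can_append_child(self, new_child):
        if self.allowed_children is None:
            return VAL_UNKNOWN
        return VAL_TRUE if new_child.name in self.allowed_children else VAL_FALSE

    def append_child(self, new_child):
        self.children.append(new_child)


def guided_append(parent, child):
    """Query first, then append only when the answer is VAL_TRUE."""
    state = parent.can_append_child(child)
    if state == VAL_TRUE:
        parent.append_child(child)
    return state


section = ToyNode("section", allowed_children={"title", "para"})
print(guided_append(section, ToyNode("para")))
print(guided_append(section, ToyNode("table")))
```

An editor could use the returned state to decide whether to apply the change silently, reject it, or ask the user to confirm.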
Interface ElementEditVAL

This interface extends the NodeEditVAL interface with additional methods for guided document editing. An object implementing this interface must also implement the Element interface.

This interface also has attributes, each a NameList of elements or attributes which can appear in the specified context. Some schema languages, e.g., W3C XML schema, define wildcards which provide for validation of attribute and element information items dependent on their namespace names but independent of their local names.

To expose wildcards, the NameList returns the values that represent the namespace constraint:

  • {namespaceURI, name} is {null, ##any} if any;
  • {namespaceURI, name} is {namespace_a, ##other} if not and a namespace name (namespace_a);
  • {namespaceURI, name} is {null, ##other} if not and absent;
  • Pairs of {namespaceURI, name} with values {a_namespaceURI | null, null} if a set whose members are either namespace names or absent.
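The four encodings listed above can be sketched as a small function that turns a namespace constraint into {namespaceURI, name} pairs. The input shapes used here (the string "any", a ("not", namespace) tuple, or a set of namespace names) are hypothetical stand-ins for whatever representation a schema processor uses; None plays the role of null or absent.

```python
def wildcard_pairs(constraint):
    """Encode a namespace constraint as a list of (namespaceURI, name) pairs."""
    if constraint == "any":
        return [(None, "##any")]
    if isinstance(constraint, tuple) and constraint[0] == "not":
        # ("not", ns) excludes one namespace; ("not", None) excludes "absent".
        return [(constraint[1], "##other")]
    # Otherwise: a set whose members are namespace names or None (absent).
    # Sorting only makes the output order deterministic for this sketch.
    return sorted(((ns, None) for ns in constraint), key=lambda pair: pair[0] or "")


print(wildcard_pairs("any"))
print(wildcard_pairs(("not", "urn:example:a")))
print(wildcard_pairs({"urn:example:a", None}))
```

A caller can recognize a wildcard entry by a name that is null or begins with "##", and treat every other entry as an ordinary element or attribute name.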

IDL Definition
interface ElementEditVAL : NodeEditVAL {

  // ContentTypeVAL
  const unsigned short      VAL_EMPTY_CONTENTTYPE          = 1;
  const unsigned short      VAL_ANY_CONTENTTYPE            = 2;
  const unsigned short      VAL_MIXED_CONTENTTYPE          = 3;
  const unsigned short      VAL_ELEMENTS_CONTENTTYPE       = 4;
  const unsigned short      VAL_SIMPLE_CONTENTTYPE         = 5;

  readonly attribute NameList        allowedChildren;
  readonly attribute NameList        allowedFirstChildren;
  readonly attribute NameList        allowedParents;
  readonly attribute NameList        allowedNextSiblings;
  readonly attribute NameList        allowedPreviousSiblings;
  readonly attribute NameList        allowedAttributes;
  readonly attribute NameList        requiredAttributes;
  readonly attribute unsigned short  contentType;
  unsigned short     canSetTextContent(in DOMString possibleTextContent);
  unsigned short     canSetAttribute(in DOMString attrname, 
                                     in DOMString attrval);
  unsigned short     canSetAttributeNode(in Attr attrNode);
  unsigned short     canSetAttributeNS(in DOMString namespaceURI, 
                                       in DOMString qualifiedName, 
                                       in DOMString value);
  unsigned short     canRemoveAttribute(in DOMString attrname);
  unsigned short     canRemoveAttributeNS(in DOMString namespaceURI, 
                                          in DOMString localName);
  unsigned short     canRemoveAttributeNode(in Node attrNode);
  unsigned short     isElementDefined(in DOMString name);
  unsigned short     isElementDefinedNS(in DOMString namespaceURI, 
                                        in DOMString name);
};

Definition group ContentTypeVAL

An integer indicating the content type of an element.

Defined Constants
VAL_ANY_CONTENTTYPE
The content model contains unordered child information item(s), i.e., element, processing instruction, unexpanded entity reference, character, and comment information items as defined in the XML Information Set. If the schema is a DTD, this corresponds to the ANY content model.
VAL_ELEMENTS_CONTENTTYPE
The content model contains a sequence of element information items optionally separated by whitespace. If the schema is a DTD, this is the element content content model; and if the schema is a W3C XML schema, this is the element-only content type.
VAL_EMPTY_CONTENTTYPE
The content model does not allow any content. If the schema is a W3C XML schema, this corresponds to the empty content type; and if the schema is a DTD, this corresponds to the EMPTY content model.
VAL_MIXED_CONTENTTYPE
The content model contains a sequence of ordered element information items optionally interspersed with character data. If the schema is a W3C XML schema, this corresponds to the mixed content type.
VAL_SIMPLE_CONTENTTYPE
The content model contains character information items. If the schema is a W3C XML schema, then the element has a content type of VAL_SIMPLE_CONTENTTYPE if the type of the element is a simple type definition, or the type of the element is a complexType whose {content type} is a simple type definition.
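For DTDs, the content type constants can be derived from the content specification of an element declaration. The sketch below is illustrative only: dtd_content_type is a hypothetical helper, it inspects the declaration text rather than a parsed schema, and reporting DTD mixed content such as (#PCDATA | em)* as VAL_MIXED_CONTENTTYPE is an assumption, since the text above ties that constant explicitly only to W3C XML schema.

```python
VAL_EMPTY_CONTENTTYPE = 1
VAL_ANY_CONTENTTYPE = 2
VAL_MIXED_CONTENTTYPE = 3
VAL_ELEMENTS_CONTENTTYPE = 4


def dtd_content_type(contentspec: str) -> int:
    """Map the content specification of a DTD element declaration to a constant."""
    spec = contentspec.strip()
    if spec == "EMPTY":
        return VAL_EMPTY_CONTENTTYPE
    if spec == "ANY":
        return VAL_ANY_CONTENTTYPE
    if spec.startswith("(") and spec[1:].lstrip().startswith("#PCDATA"):
        # Assumption: DTD mixed content is reported as the mixed content type.
        return VAL_MIXED_CONTENTTYPE
    # Anything else is treated as DTD element content.
    return VAL_ELEMENTS_CONTENTTYPE


for spec in ["EMPTY", "ANY", "(#PCDATA | em)*", "(title, author+)"]:
    print(spec, "->", dtd_content_type(spec))
```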
Attributes
allowedAttributes of type NameList, readonly
A NameList, as described in [DOM Level 3 Core], of all possible attribute information items or wildcards that can appear as attributes of this element, or null if this element has no context or schema. Duplicate pairs of {namespaceURI, name} are eliminated.
allowedChildren of type NameList, readonly
A NameList, as described in [DOM Level 3 Core], of all possible element information items or wildcards that can appear as children of this element, or null if this element has no context or schema. Duplicate pairs of {namespaceURI, name} are eliminated.
allowedFirstChildren of type NameList, readonly
A NameList, as described in [DOM Level 3 Core], of all possible element information items or wildcards that can appear as a first child of this element, or null if this element has no context or schema. Duplicate pairs of {namespaceURI, name} are eliminated.
allowedNextSiblings of type NameList, readonly
A NameList, as described in [DOM Level 3 Core], of all element information items or wildcards that can be inserted as a next sibling of this element, or null if this element has no context or schema. Duplicate pairs of {namespaceURI, name} are eliminated.
allowedParents of type NameList, readonly
A NameList, as described in [DOM Level 3 Core], of all possible element information items that can appear as a parent of this element, or null if this element has no context or schema.
allowedPreviousSiblings of type NameList, readonly
A NameList, as described in [DOM Level 3 Core], of all element information items or wildcards that can be inserted as a previous sibling of this element, or null if this element has no context or schema.
contentType of type unsigned short, readonly
The content type of an element as defined above.
requiredAttributes of type NameList, readonly
A NameList, as described in [DOM Level 3 Core], of required attribute information items that must appear on this element, or null if this element has no context or schema.
Methods
canRemoveAttribute
Verifies if an attribute by the given name can be removed.
Parameters
attrname of type DOMString
Name of attribute.
Return Value

unsigned short

A validation state constant.

No Exceptions
canRemoveAttributeNS
Verifies if an attribute by the given local name and namespace can be removed.
Parameters
namespaceURI of type DOMString
The namespace URI of the attribute to remove.
localName of type DOMString
Local name of the attribute to be removed.
Return Value

unsigned short

A validation state constant.

No Exceptions
canRemoveAttributeNode
Determines if an attribute node can be removed.
Parameters
attrNode of type Node
The Attr node to remove from the attribute list.
Return Value

unsigned short

A validation state constant.

No Exceptions
canSetAttribute
Determines if the value for the specified attribute can be set.
Parameters
attrname of type DOMString
Name of attribute.
attrval of type DOMString
Value to be assigned to the attribute.
Return Value

unsigned short

A validation state constant.

No Exceptions
canSetAttributeNS
Determines if the attribute with the given namespace and qualified name can be created if not already present in the attribute list of the element. If an attribute with the same qualified name and namespaceURI is already present in the element's attribute list, it tests whether the value of the attribute and its prefix can be set to the new value.
Parameters
namespaceURI of type DOMString
namespaceURI of namespace.
qualifiedName of type DOMString
Qualified name of attribute.
value of type DOMString
Value to be assigned to the attribute.
Return Value

unsigned short

A validation state constant.

No Exceptions
canSetAttributeNode
Determines if an attribute node can be added.
Parameters
attrNode of type Attr
Node in which the attribute can possibly be set.
Return Value

unsigned short

A validation state constant.

No Exceptions
canSetTextContent
Determines if the text content of this node and its descendants can be set to the string passed in.
Parameters
possibleTextContent of type DOMString
Possible text content string.
Return Value

unsigned short

A validation state constant.

No Exceptions
isElementDefined
Determines if name is defined in the schema. This only applies to global declarations. This method is for non-namespace aware schemas.
Parameters
name of type DOMString
Name of element.
Return Value

unsigned short

A validation state constant.

No Exceptions
isElementDefinedNS
Determines if name in this namespace is defined in the current context. Thus this applies not only to global declarations but, depending on the content, may also apply to local definitions. This method is for namespace-aware schemas.
Parameters
namespaceURI of type DOMString
namespaceURI of namespace.
name of type DOMString
Name of element.
Return Value

unsigned short

A validation state constant.

No Exceptions
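The attribute-related queries of this interface can be illustrated with a toy element that knows its allowed and required attribute names. Everything in this sketch is hypothetical (ToyElementEditVAL and its constructor arguments). It checks names only, whereas a real implementation would also check the proposed value against the declared type. Returning VAL_UNKNOWN when no schema information is present is an assumption.

```python
VAL_TRUE, VAL_FALSE, VAL_UNKNOWN = 5, 6, 7


class ToyElementEditVAL:
    """Toy element holding attribute values plus schema-derived name lists."""

    def __init__(self, attributes, allowed=None, required=None):
        self.attributes = dict(attributes)
        self.allowed_attributes = allowed    # None models "no schema"
        self.required_attributes = required  # None models "no schema"

    def can_set_attribute(self, attrname, attrval):
        if self.allowed_attributes is None:
            return VAL_UNKNOWN
        # Only the name is checked here; attrval is ignored in this sketch.
        return VAL_TRUE if attrname in self.allowed_attributes else VAL_FALSE

    def can_remove_attribute(self, attrname):
        if self.required_attributes is None:
            return VAL_UNKNOWN
        # Removing a required attribute would leave the element invalid.
        return VAL_FALSE if attrname in self.required_attributes else VAL_TRUE


img = ToyElementEditVAL({"src": "a.png", "alt": "logo"},
                        allowed=["src", "alt"], required=["src"])
print(img.can_remove_attribute("alt"))
print(img.can_remove_attribute("src"))
print(img.can_set_attribute("width", "10"))
```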
Interface CharacterDataEditVAL

This interface extends the NodeEditVAL interface with additional methods for document editing. An object implementing this interface must also implement the CharacterData interface. When validating CharacterData nodes, the NodeEditVAL.nodeValidity operation must find the nearest parent node in order to do this; if no parent node is found, VAL_UNKNOWN is returned. In addition, when VAL_INCOMPLETE is passed in as an argument to the NodeEditVAL.nodeValidity operation to operate on such nodes, the operation considers all the text and not just some of it.


IDL Definition
interface CharacterDataEditVAL : NodeEditVAL {
  unsigned short     isWhitespaceOnly();
  unsigned short     canSetData(in DOMString arg);
  unsigned short     canAppendData(in DOMString arg);
  unsigned short     canReplaceData(in unsigned long offset, 
                                    in unsigned long count, 
                                    in DOMString arg)
                                        raises(DOMException);
  unsigned short     canInsertData(in unsigned long offset, 
                                   in DOMString arg)
                                        raises(DOMException);
  unsigned short     canDeleteData(in unsigned long offset, 
                                   in unsigned long count)
                                        raises(DOMException);
};

Methods
canAppendData
Determines if character data can be appended.
Parameters
arg of type DOMString
Data to be appended.
Return Value

unsigned short

A validation state constant.

No Exceptions
canDeleteData
Determines if character data can be deleted.
Parameters
offset of type unsigned long
Offset.
count of type unsigned long
Number of 16-bit units to delete.
Return Value

unsigned short

A validation state constant.

Exceptions

DOMException

INDEX_SIZE_ERR: Raised if the specified offset is negative or greater than the number of 16-bit units in data, or if the specified count is negative.

canInsertData
Determines if character data can be inserted.
Parameters
offset of type unsigned long
Offset.
arg of type DOMString
Data to be inserted.
Return Value

unsigned short

A validation state constant.

Exceptions

DOMException

INDEX_SIZE_ERR: Raised if the specified offset is negative or greater than the number of 16-bit units in data.

canReplaceData
Determines if character data can be replaced.
Parameters
offset of type unsigned long
Offset.
count of type unsigned long
Number of 16-bit units to replace.
arg of type DOMString
Data with which the specified range is to be replaced.
Return Value

unsigned short

A validation state constant.

Exceptions

DOMException

INDEX_SIZE_ERR: Raised if the specified offset is negative or greater than the number of 16-bit units in data, or if the specified count is negative.

canSetData
Determines if character data can be set.
Parameters
arg of type DOMString
Argument to be set.
Return Value

unsigned short

A validation state constant.

No Exceptions
isWhitespaceOnly
Determines if character data is only whitespace.
Return Value

unsigned short

A validation state constant.

No Parameters
No Exceptions
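The character data queries can be sketched with plain strings. In the sketch below, the regular expression passed as pattern is a hypothetical stand-in for whatever constraint the schema places on the text, and both function names are assumptions. Python strings are indexed by code point, so the offsets here match DOM 16-bit units only for text within the Basic Multilingual Plane. IndexSizeErr is the exception class for INDEX_SIZE_ERR in Python's xml.dom package.

```python
import re
from xml.dom import IndexSizeErr

VAL_TRUE, VAL_FALSE = 5, 6
XML_WHITESPACE = " \t\r\n"  # space, tab, carriage return, line feed


def is_whitespace_only(data: str) -> int:
    """Report whether every character of data is XML whitespace."""
    return VAL_TRUE if all(ch in XML_WHITESPACE for ch in data) else VAL_FALSE


def can_delete_data(data: str, offset: int, count: int, pattern: str) -> int:
    """Check whether deleting count units at offset keeps the text valid."""
    if offset < 0 or count < 0 or offset > len(data):
        raise IndexSizeErr("offset or count out of range")
    # Compute the text that would result, without modifying anything.
    result = data[:offset] + data[offset + count:]
    return VAL_TRUE if re.fullmatch(pattern, result) else VAL_FALSE


print(is_whitespace_only(" \n\t"))
print(can_delete_data("12345", 0, 2, r"\d+"))
print(can_delete_data("12345", 0, 5, r"\d+"))
```

The query computes the prospective result and validates that, leaving the original data untouched, which mirrors how the can* methods answer a question without performing the edit.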