MathML is an application of [XML], Extensible Markup Language, and as such it is governed by the rules of XML syntax. XML syntax is a notation for rooted labeled planar trees. Planarity means that the children of a node may be viewed as given a natural order and MathML depends on this order.
The basic ‘syntax’ of MathML is thus defined by XML. Upon this, we layer a ‘grammar’, being the rules for allowed elements, the order in which they can appear, and how they may be contained within each other, as well as additional syntactic rules for the values of attributes. These rules are defined by this specification, and formalized by a RelaxNG schema [RELAX-NG]. The RelaxNG Schema is normative, but a DTD (Document Type Definition) and an XML Schema [XMLSchemas] are provided for continuity (they were normative for MathML2). See Appendix A Parsing MathML.
As an XML vocabulary, MathML's character set must consist of legal characters as specified by Unicode [Unicode]. The use of Unicode characters for mathematics is discussed in Chapter 7 Characters, Entities and Fonts.
The following sections discuss the general aspects of the MathML grammar as well as describe the syntaxes used for attribute values.
An XML namespace [Namespaces] is a collection of names identified by a URI. The URI for the MathML namespace is:
http://www.w3.org/1998/Math/MathML
To declare a namespace, one uses an xmlns attribute, or an attribute with an xmlns prefix. When the xmlns attribute is used alone, it sets the default namespace for the element on which it appears, and for any child elements. For example:
<math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow>...</mrow> </math>
When the xmlns attribute is used as a prefix, it declares a prefix which can then be used to explicitly associate other elements and attributes with a particular namespace.
When embedding MathML within XHTML, one might use:
<body xmlns:m="http://www.w3.org/1998/Math/MathML"> ... <m:math><m:mrow>...</m:mrow></m:math> ... </body>
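The two declaration styles above name the same elements once namespaces are resolved. The following is a minimal sketch using Python's standard xml.etree.ElementTree parser; the sample markup and the helper name qualified_names are illustrative assumptions, not part of MathML.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Default-namespace form: xmlns is set on the math element itself.
default_form = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mrow><mi>x</mi></mrow></math>"
)

# Prefixed form: the m: prefix is declared on an enclosing element.
prefixed_form = (
    '<body xmlns:m="http://www.w3.org/1998/Math/MathML">'
    "<m:math><m:mrow><m:mi>x</m:mi></m:mrow></m:math></body>"
)

def qualified_names(root):
    """Return the namespace-qualified tag of root and all its descendants."""
    return [el.tag for el in root.iter()]

math_a = ET.fromstring(default_form)
math_b = ET.fromstring(prefixed_form)[0]  # the m:math child of body

names_a = qualified_names(math_a)
names_b = qualified_names(math_b)
```

Both lists hold the same names in the form {namespace-URI}local-name, which is how ElementTree reports namespaced elements regardless of the prefix used in the source.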
Most MathML elements act as ‘containers’; such an element's children are not distinguished from each other except as individual members of the list of children. Commonly there is no limit imposed on the number of children an element may have. This is the case for most presentation elements and some content elements such as set.
But many MathML elements require a specific number of children, or attach a particular meaning to children in certain positions. Such elements are best considered to represent constructors of mathematical objects, and hence thought of as functions of their children. Therefore children of such a MathML element will often be referred to as its arguments instead of merely as children. Examples of this can be found, say, in Section 3.1.3 Required Arguments.
There are presentation elements that conceptually accept only a single argument, but which for convenience have been written to accept any number of children; then we infer an mrow containing those children which acts as the argument to the element in question; see Section 3.1.3.1 Inferred <mrow>s.
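As a rough illustration of the inferred mrow, the sketch below builds the single argument of an msqrt from its children. The function name argument_of and the one-element set INFERS_MROW are hypothetical; the authoritative list of elements that infer an mrow is the one in Section 3.1.3.1, and namespaces are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Illustrative subset only (an assumption of this sketch); the full list
# of elements that take an inferred mrow is given in Section 3.1.3.1.
INFERS_MROW = {"msqrt"}

def argument_of(element):
    """Return an mrow holding the element's children, in document order.

    The mrow is newly built and stands in for the single argument that
    the element conceptually accepts.
    """
    if element.tag not in INFERS_MROW:
        raise ValueError(f"{element.tag} does not infer an mrow in this sketch")
    mrow = ET.Element("mrow")
    mrow.extend(list(element))
    return mrow

msqrt = ET.fromstring("<msqrt><mi>a</mi><mo>+</mo><mi>b</mi></msqrt>")
arg = argument_of(msqrt)
child_tags = [child.tag for child in arg]
```

Here the three children of msqrt become the children of one mrow, so the square root is treated as having exactly one argument.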
In the detailed discussions of element syntax given with each element throughout the MathML specification, the correspondence of children with arguments, the number of arguments required and their order, as well as other constraints on the content, are specified. This information is also tabulated for the presentation elements in Section 3.1.3 Required Arguments.
MathML presentation elements only recommend (i.e., do not require) specific ways of rendering; this is in order to allow for medium-dependent rendering and for individual preferences of style.
Nevertheless, some parts of this specification describe these recommended visual rendering rules in detail; in those descriptions it is often assumed that the model of rendering used supports the concepts of a well-defined 'current rendering environment' which, in particular, specifies a 'current font', a 'current display' (for pixel size) and a 'current baseline'. The 'current font' provides certain metric properties and an encoding of glyphs.
MathML elements take attributes with values that further specialize the meaning or effect of the element. Attribute names are shown in a monospaced font throughout this document. The meanings of attributes and their allowed values are described within the specification of each element. The syntax notation explained in this section is used in specifying allowed values.
Except when explicitly forbidden by the specification for an attribute, MathML attribute values may contain any legal characters specified by the XML recommendation. See Chapter 7 Characters, Entities and Fonts for further clarification.
To describe the MathML-specific syntax of attribute values, the following conventions and notations are used for most attributes in the present document. We use below the notation beginning with U+ that is recommended by Unicode for referring to Unicode characters [see [Unicode], page xxviii].
Notation | What it matches |
---|---|
decimal-digit | a decimal digit from the range U+0030 to U+0039 |
hexadecimal-digit | a hexadecimal (base 16) digit from the ranges U+0030 to U+0039, U+0041 to U+0046 and U+0061 to U+0066 |
unsigned-integer | a string of decimal-digits, representing a non-negative integer |
positive-integer | a string of decimal-digits, but not consisting solely of "0"s (U+0030), representing a positive integer |
integer | an optional "-" (U+002D), followed by a string of decimal digits, and representing an integer |
unsigned-number | a string of decimal digits with up to one decimal point (U+002E), representing a non-negative terminating decimal number (a type of rational number) |
number | an optional prefix of "-" (U+002D), followed by an unsigned number, representing a terminating decimal number (a type of rational number) |
character | a single non-whitespace character |
string | an arbitrary, nonempty and finite, string of characters |
length | a length, as explained below, Section 2.1.5.2 Length Valued Attributes |
unit | a unit, typically used as part of a length, as explained below, Section 2.1.5.2 Length Valued Attributes |
namedlength | a named length, as explained below, Section 2.1.5.2 Length Valued Attributes |
color | a color, as explained below, Section 2.1.5.3 Color Valued Attributes |
id | an identifier, unique within the document; must satisfy the NAME syntax of the XML recommendation [XML] |
idref | an identifier referring to another element within the document; must satisfy the NAME syntax of the XML recommendation [XML] |
URI | a Uniform Resource Identifier [RFC3986]. Note that the attribute value is typed in the schema as anyURI which allows any sequence of XML characters. Systems needing to use this string as a URI must encode the bytes of the UTF-8 encoding of any characters not allowed in URI using %HH encoding where HH are the byte value in hexadecimal. This ensures that such an attribute value may be interpreted as an IRI, or more generally a LEIRI, see [IRI]. |
italicized word | values as explained in the text for each attribute; see Section 2.1.5.4 Default values of attributes |
"literal" | quoted symbol, literally present in the attribute value (e.g. "+" or '+') |
The ‘types’ described above, except for string, may be combined into composite patterns using the following operators. The whole attribute value must be delimited by single (') or double (") quotation marks in the marked up document. Note that double quotation marks are often used in this specification to mark up literal expressions; an example is the "-" in line 5 of the table above.
In the table below a form f means an instance of a type described in the table above. The combining operators are shown in order of precedence from highest to lowest:
Notation | What it matches |
---|---|
( f ) | same as f |
f? | an optional instance of f |
f * | zero or more instances of f, with separating whitespace characters |
f + | one or more instances of f, with separating whitespace characters |
f1 f2 ... fn | one instance of each form fi, in sequence, with no separating whitespace |
f1, f2, ..., fn | one instance of each form fi, in sequence, with separating whitespace characters (but no commas) |
f1 | f2 | ... | fn | any one of the specified forms fi |
The notation we have chosen here is in the style of the syntactical notation of the RelaxNG used for MathML's basic schema, Appendix A Parsing MathML.
Since some applications are inconsistent about normalization of whitespace, for maximum interoperability it is advisable to use only a single whitespace character for separating parts of a value. Moreover, leading and trailing whitespace in attribute values should be avoided.
For most numerical attributes, only those in a subset of the expressible values are sensible; values outside this subset are not errors, unless otherwise specified, but rather are rounded up or down (at the discretion of the renderer) to the closest value within the allowed subset. The set of allowed values may depend on the renderer, and is not specified by MathML.
If a numerical value within an attribute value syntax description is declared to allow a minus sign ('-'), e.g., number or integer, it is not a syntax error when one is provided in cases where a negative value is not sensible. Instead, the value should be handled by the processing application as described in the preceding paragraph. An explicit plus sign ('+') is not allowed as part of a numerical value except when it is specifically listed in the syntax (as a quoted '+' or "+"), and its presence can change the meaning of the attribute value (as documented with each attribute which permits it).
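The numeric notations from the table of types can be approximated with regular expressions. This is a sketch, not a normative grammar: the pattern names and the matches helper are illustrative, and the sketch assumes that an unsigned-number needs at least one digit, so a lone "." is rejected.

```python
import re

# Decimal digits are restricted to U+0030..U+0039, so [0-9] is used
# instead of \d, which in Python also matches other Unicode digits.
UNSIGNED_INTEGER = re.compile(r"[0-9]+")
POSITIVE_INTEGER = re.compile(r"[0-9]*[1-9][0-9]*")  # not solely "0"s
INTEGER = re.compile(r"-?[0-9]+")
# Digits with up to one decimal point (assumption: at least one digit).
UNSIGNED_NUMBER = re.compile(r"[0-9]+\.?[0-9]*|\.[0-9]+")
NUMBER = re.compile(r"-?(?:[0-9]+\.?[0-9]*|\.[0-9]+)")

def matches(pattern, value):
    """True if the whole attribute value matches the given pattern."""
    return pattern.fullmatch(value) is not None
```

With these patterns, matches(INTEGER, "-12") holds while matches(INTEGER, "+12") does not, which mirrors the rule that an explicit plus sign is not part of a numerical value.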
Most presentation elements have attributes that accept values representing lengths to be used for size, spacing or similar properties. The syntax of a length is specified as
Type | Syntax |
---|---|
length | number | number unit | namedspace |
There should be no space between the number and the unit of a length. The possible units and namedspaces, along with their interpretations, are shown below. Note that although the units and their meanings are taken from CSS, the syntax of lengths is not identical. A few MathML elements have length attributes that accept additional keywords; these are termed pseudo-units and specified in the description of those particular elements; see, for instance, Section 3.3.6 Adjust Space Around Content <mpadded>.
A trailing "%" represents a percent of the default value. The default value, or how it is obtained, is listed in the table of attributes for each element. (See also Section 2.1.5.4 Default values of attributes.) A number without a unit is interpreted as a multiple of the default value. This form is primarily for backward compatibility and should be avoided, preferring explicit units for clarity.
In some cases, the range of acceptable values for a particular attribute may be restricted; implementations are free to round up or down to the closest allowable value.
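The percent and unitless forms are both relative to an attribute's default value. The arithmetic can be sketched as follows; resolve_relative is a hypothetical helper, the default is passed in as a plain number in whatever unit the caller uses, and no syntax validation is attempted.

```python
def resolve_relative(value, default):
    """Resolve a "%"-suffixed or unitless length against a default.

    The result is expressed in the same unit as `default`. This sketch
    relies on float() and does not check the MathML number syntax.
    """
    if value.endswith("%"):
        return float(value[:-1]) / 100.0 * default
    return float(value) * default

# With a default of 2.0 (in some unit), "150%" and "1.5" agree.
from_percent = resolve_relative("150%", 2.0)
from_multiple = resolve_relative("1.5", 2.0)
```

Both calls yield 3.0, showing that "150%" and the unitless "1.5" denote the same length for a given default.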
The possible units in MathML are:
Unit | Description |
---|---|
em | an em (font-relative unit traditionally used for horizontal lengths) |
ex | an ex (font-relative unit traditionally used for vertical lengths) |
px | pixels, or size of a pixel in the current display |
in | inches (1 inch = 2.54 centimeters) |
cm | centimeters |
mm | millimeters |
pt | points (1 point = 1/72 inch) |
pc | picas (1 pica = 12 points) |
% | percentage of the default value |
Some additional aspects of units are discussed further below, in Section 2.1.5.2.1 Additional notes about units.
The following constants, namedspaces, may also be used where a length is needed; they are typically used for spacing or padding between tokens. Recommended default values for these constants are shown; the actual spacing used is implementation specific.
namedspace | Recommended default |
---|---|
veryverythinmathspace | 1/18em |
verythinmathspace | 2/18em |
thinmathspace | 3/18em |
mediummathspace | 4/18em |
thickmathspace | 5/18em |
verythickmathspace | 6/18em |
veryverythickmathspace | 7/18em |
negativeveryverythinmathspace | -1/18em |
negativeverythinmathspace | -2/18em |
negativethinmathspace | -3/18em |
negativemediummathspace | -4/18em |
negativethickmathspace | -5/18em |
negativeverythickmathspace | -6/18em |
negativeveryverythickmathspace | -7/18em |
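The length syntax (a number, a number followed directly by a unit, or a namedspace) can be sketched as a small parser. The names parse_length and NAMEDSPACES are illustrative; namedspaces are resolved with the recommended defaults from the table above, although the actual spacing is implementation specific, and pseudo-units accepted by individual elements are not handled.

```python
import re
from fractions import Fraction

# Recommended defaults in em: k/18 for k = 1..7, and the negated values
# for the corresponding "negative..." names.
_STEMS = ["veryverythin", "verythin", "thin", "medium",
          "thick", "verythick", "veryverythick"]
NAMEDSPACES = {}
for k, stem in enumerate(_STEMS, start=1):
    NAMEDSPACES[stem + "mathspace"] = Fraction(k, 18)
    NAMEDSPACES["negative" + stem + "mathspace"] = Fraction(-k, 18)

# A number followed, with no intervening space, by an optional unit.
_LENGTH = re.compile(
    r"(-?(?:[0-9]+\.?[0-9]*|\.[0-9]+))(em|ex|px|in|cm|mm|pt|pc|%)?"
)

def parse_length(value):
    """Return (number, unit); unit is None for a bare number.

    Namedspaces are returned as (Fraction, "em").
    """
    if value in NAMEDSPACES:
        return NAMEDSPACES[value], "em"
    m = _LENGTH.fullmatch(value)
    if m is None:
        raise ValueError(f"not a MathML length: {value!r}")
    return Fraction(m.group(1)), m.group(2)
```

For instance, parse_length("2.5em") gives the pair (Fraction(5, 2), "em"), while "2 em" is rejected because of the space between number and unit.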
Lengths are only used in MathML for presentation, and presentation will ultimately involve rendering in or on some medium. For visual media, the display context is assumed to have certain properties available to the rendering agent. A px corresponds to a pixel on the display, to the extent that is meaningful. The resolution of the display device will affect the correspondence of pixels to the units in, cm, mm, pt and pc.
Moreover, the display context will also provide a default for the font size; the parameters of this font determine the initial values used to interpret the units em and ex, and thus indirectly the sizes of namedspaces. Since these units track the display context, and in particular, the user's preferences for display, the relative units em and ex are generally to be preferred over absolute units such as px or cm.
Two additional aspects of relative units must be clarified, however. First, some elements, such as those described in Section 3.4 Script and Limit Schemata or mfrac, implicitly switch to smaller font sizes for some of their arguments. Similarly, mstyle can be used to explicitly change the current font size. In such cases, the effective values of an em or ex inside those contexts will be different than outside. The second point is that the effective value of an em or ex used for an attribute value can be affected by changes to the current font size. Thus, attributes that affect the current font size, such as mathsize and scriptlevel, must be processed before evaluating other length valued attributes.
If, and how, lengths might affect non-visual media is implementation specific.
The color, or background color, of presentation elements may be specified as a color using the following syntax:
Type | Syntax |
---|---|
color | #RGB | #RRGGBB | html-color-name |
A color is specified either by "#" followed by hexadecimal values for the red, green, and blue components, with no intervening whitespace, or by an html-color-name. The color components can be either 1-digit or 2-digit, but must all have the same number of digits; each component ranges from 0 (component not present) to FF (component fully present). Note that, for example, by the digit-doubling rule specified under Colors in [CSS21], #123 is a short form for #112233.
Color values can also be specified as an html-color-name, one of the color-name keywords defined in [HTML4] ("aqua", "black", "blue", "fuchsia", "gray", "green", "lime", "maroon", "navy", "olive", "purple", "red", "silver", "teal", "white", and "yellow"). Note that the color name keywords are not case-sensitive, unlike most keywords in MathML attribute values, for compatibility with CSS and HTML.
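The color syntax, including digit doubling and case-insensitive names, can be sketched as a normalizer. The function name normalize_color and the choice of lowercase output are assumptions of this sketch; the "transparent" keyword for backgrounds is not handled.

```python
import re

# The color-name keywords listed in the text above.
HTML_COLOR_NAMES = {
    "aqua", "black", "blue", "fuchsia", "gray", "green", "lime", "maroon",
    "navy", "olive", "purple", "red", "silver", "teal", "white", "yellow",
}

# "#" followed by three or six hexadecimal digits, nothing else.
_HEX = re.compile(r"#([0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})")

def normalize_color(value):
    """Return '#rrggbb' for hex forms, or the lowercased color name."""
    m = _HEX.fullmatch(value)
    if m:
        digits = m.group(1)
        if len(digits) == 3:
            # Digit doubling: each 1-digit component is repeated.
            digits = "".join(d * 2 for d in digits)
        return "#" + digits.lower()
    name = value.lower()
    if name in HTML_COLOR_NAMES:
        return name
    raise ValueError(f"not a MathML color: {value!r}")
```

Under this sketch normalize_color("#123") returns "#112233", and "Red" is accepted because color names are not case-sensitive.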
When a color is applied to an element, it is the color in which the content of tokens is rendered. Additionally, when inherited from a surrounding element or from the environment in which the complete MathML expression is embedded, it controls the color of all other drawing due to MathML elements, including the lines or radical signs that can be drawn in rendering mfrac, mtable, or msqrt.
When used to specify a background color, the keyword "transparent"
is also allowed.
The recommended MathML visual rendering rules do not define the precise extent of the region whose background is affected by using the background attribute on an element, except that, when the element's content does not have negative dimensions and its drawing region is not overlapped by other drawing due to surrounding negative spacing, this region should lie behind all the drawing done to render the content of the element, but should not lie behind any of the drawing done to render surrounding expressions. The effect of overlap of drawing regions caused by negative spacing on the extent of the region affected by the background attribute is not defined by these rules.
Default values for MathML attributes are, in general, given along with the detailed descriptions of specific elements in the text. Default values shown in plain text in the tables of attributes for an element are literal, but when italicized are descriptions of how default values can be computed.
Default values described as inherited are taken from the rendering environment, as described in Section 3.3.4 Style Change <mstyle>, or in some cases (which are described individually) taken from the values of other attributes of surrounding elements, or from certain parts of those values. The value used will always be one which could have been specified explicitly, had it been known; it will never depend on the content or attributes of the same element, only on its environment. (What it means when used may, however, depend on those attributes or the content.)
Default values described as automatic should be computed by a MathML renderer in a way which will produce a high-quality rendering; how to do this is not usually specified by the MathML specification. The value computed will always be one which could have been specified explicitly, had it been known, but it will usually depend on the element content and possibly on the context in which the element is rendered.
Other italicized descriptions of default values which appear in the tables of attributes are explained individually for each attribute.
The single or double quotes which are required around attribute values in an XML start tag are not shown in the tables of attribute value syntax for each element, but are around attribute values in examples in the text, so that the pieces of code shown are correct.
Note that, in general, there is no mechanism in MathML to simulate the effect of not specifying attributes which are inherited or automatic. Giving the words "inherited" or "automatic" explicitly will not work, and is not generally allowed. Furthermore, the mstyle element (Section 3.3.4 Style Change <mstyle>) can even be used to change the default values of presentation attributes for its children.
Note also that these defaults describe the behavior of MathML applications when an attribute is not supplied; they do not indicate a value that will be filled in by an XML parser, as is sometimes mandated by DTD-based specifications.
In addition to the attributes described specifically for each element, the attributes in the following table are allowed on every MathML element. Also allowed are attributes from the xml namespace, such as xml:lang, and attributes from namespaces other than MathML, which are ignored by default.
Name | values | default |
---|---|---|
id | id | none |
Establishes a unique identifier associated with the element to support linking, cross-references and parallel markup. See xref and Section 5.4 Parallel Markup. | ||
xref | idref | none |
References another element within the document. See id and Section 5.4 Parallel Markup. | ||
class | string | none |
Associates the element with a set of style classes for use with [XSLT] and [CSS21]. Typically this would be a space separated sequence of words, but this is not specified by MathML. See Section 6.5 Using CSS with MathML for discussion of the interaction of MathML and CSS. | ||
style | string | none |
Associates style information with the element for use with [XSLT] and [CSS21]. This typically would be an inline CSS style, but this is not specified by MathML. See Section 6.5 Using CSS with MathML for discussion of the interaction of MathML and CSS. | ||
href | URI | none |
Can be used to establish the element as a hyperlink to the specified URI. | ||
Note that MathML 2 had no direct support for linking, and instead followed the W3C Recommendation "XML Linking Language" [XLink] in defining links using the xlink:href attribute. This has changed, and MathML 3 now uses an href attribute. However, particular compound document formats may specify the use of XML linking with MathML elements, so user agents that support XML linking should continue to support the use of the xlink:href attribute with MathML 3 as well.
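A user agent that supports both linking styles might look up a link target along the following lines. This is a sketch under stated assumptions: link_target is a hypothetical helper, preferring href over xlink:href when both are present is a choice made here rather than something the text prescribes, and the example.org URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# ElementTree spells a namespaced attribute as {namespace-URI}local-name;
# this is the XLink namespace.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

def link_target(element):
    """Return the element's link target, or None if it has none.

    The MathML 3 href attribute is consulted first (an assumption of
    this sketch), then the older xlink:href attribute.
    """
    target = element.get("href")
    if target is None:
        target = element.get(XLINK_HREF)
    return target

mathml3_style = ET.fromstring(
    '<mi xmlns="http://www.w3.org/1998/Math/MathML" '
    'href="http://example.org/x">x</mi>'
)
mathml2_style = ET.fromstring(
    '<mi xmlns="http://www.w3.org/1998/Math/MathML" '
    'xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xlink:href="http://example.org/y">y</mi>'
)
plain = ET.fromstring('<mi xmlns="http://www.w3.org/1998/Math/MathML">z</mi>')
```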
See also Section 3.2.2 Mathematics style attributes common to token elements for a list of MathML attributes which can be used on most presentation token elements.
The attribute other is deprecated (Section 2.3.3 Attributes for unspecified data) in favor of the use of attributes from other namespaces.
Name | values | default |
---|---|---|
other | string | none |
DEPRECATED but in MathML 1.0. |
In MathML, as in XML, "whitespace" means simple spaces, tabs, newlines, or carriage returns, i.e., characters with hexadecimal Unicode codes U+0020, U+0009, U+000A, or U+000D, respectively; see also the discussion of whitespace in Section 2.3 of [XML].
MathML ignores whitespace occurring outside token elements. Non-whitespace characters are not allowed there. Whitespace occurring within the content of token elements, except for <cs>, is normalized as follows. All whitespace at the beginning and end of the content is removed, and whitespace internal to content of the element is collapsed canonically, i.e., each sequence of 1 or more whitespace characters is replaced with one space character (U+0020, sometimes called a blank character).
For example, <mo> ( </mo> is equivalent to <mo>(</mo>, and <mtext> Theorem 1: </mtext> is equivalent to <mtext>Theorem 1:</mtext> or <mtext>Theorem&#x20;1:</mtext>.
Authors wishing to encode white space characters at the start or end of the content of a token, or in sequences other than a single space, without having them ignored, must use a no-break space (U+00A0) or other non-marking characters that are not trimmed. For example, compare the above use of an mtext element with
<mtext>  <!--NO-BREAK SPACE-->Theorem  <!--NO-BREAK SPACE-->1: </mtext>
When the first example is rendered, there is nothing before "Theorem", one Unicode space character between "Theorem" and "1:", and nothing after "1:". In the second example, a single space character is to be rendered before "Theorem"; two spaces, one a Unicode space character and one a Unicode no-break space character, are to be rendered before "1:"; and there is nothing after the "1:".
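The trimming and collapsing rule can be sketched in a few lines. The function name normalize_token_content is illustrative. Note that the pattern lists exactly the four XML whitespace characters; Python's str.split() and str.strip() without arguments would also treat U+00A0 as whitespace, which is not what the rule above describes.

```python
import re

# Exactly the four whitespace characters: U+0020, U+0009, U+000A, U+000D.
_WHITESPACE_RUN = re.compile(r"[ \t\n\r]+")

def normalize_token_content(text):
    """Collapse each whitespace run to one space, then trim both ends.

    A no-break space (U+00A0) is not in the pattern, so it survives.
    """
    collapsed = _WHITESPACE_RUN.sub(" ", text)
    return collapsed.strip(" ")

first = normalize_token_content("\n  Theorem \n 1:\n")
second = normalize_token_content("\u00a0Theorem \u00a01: ")
```

The first result is "Theorem 1:" with a single space; the second keeps both no-break spaces, so one appears before "Theorem" and a space plus a no-break space appear before "1:".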
Note that the value of the xml:space attribute is not relevant in this situation since XML processors pass whitespace in tokens to a MathML processor; it is the requirements of MathML processing which specify that whitespace is trimmed and collapsed.
For whitespace occurring outside the content of the token elements mi, mn, mo, ms, mtext, ci, cn, cs, csymbol and annotation, an mspace element should be used, as opposed to an mtext element containing only whitespace entities.
math Element
MathML specifies a single top-level or root math element, which encapsulates each instance of MathML markup within a document. All other MathML content must be contained in a math element; in other words, every valid MathML expression is wrapped in outer <math> tags. The math element must always be the outermost element in a MathML expression; it is an error for one math element to contain another. These considerations also apply when sub-expressions are passed between applications, such as for cut-and-paste operations; see Section 6.3 Transferring MathML.
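The rule that one math element must not contain another can be checked mechanically. The following sketch uses xml.etree.ElementTree; nested_math_elements is a hypothetical helper and the sample markup is illustrative only.

```python
import xml.etree.ElementTree as ET

MATH = "{http://www.w3.org/1998/Math/MathML}math"

def nested_math_elements(math):
    """Return every math element found inside the given math element."""
    # iter(tag) also yields the root when it matches, so exclude it.
    return [el for el in math.iter(MATH) if el is not math]

well_formed = ET.fromstring(
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mrow><mi>x</mi></mrow></math>"
)
erroneous = ET.fromstring(
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mrow><math><mi>x</mi></math></mrow></math>"
)
```

An empty result means the expression respects the rule; a non-empty result identifies the offending inner math elements.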
The math element can contain an arbitrary number of child elements. They render by default as if they were contained in an mrow element.
The math element accepts any of the attributes that can be set on mstyle (Section 3.3.4 Style Change <mstyle>), including the common attributes specified in Section 2.1.6 Attributes Shared by all MathML Elements. In particular, it accepts the dir attribute for setting the overall directionality; the math element is usually the most useful place to specify the directionality (see Section 3.1.5 Directionality for further discussion).
Note that the dir attribute defaults to "ltr" on the math element (but inherits on all other elements which accept the dir attribute); this provides for backward compatibility with MathML 2.0 which had no notion of directionality.
Also, it accepts the mathbackground attribute in the same sense as mstyle and other presentation elements to set the background color of the bounding box, rather than specifying a default for the attribute (see Section 3.1.10 Mathematics style attributes common to presentation elements).
In addition to those attributes, the math element accepts:
Name | values | default |
---|---|---|
display | "block" | "inline" | inline |
specifies whether the enclosed MathML expression should be rendered as a separate vertical block (in display style) or inline, aligned with adjacent text. When display="block", displaystyle is initialized to "true", whereas when display="inline", displaystyle is initialized to "false"; in both cases scriptlevel is initialized to 0 (see Section 3.1.6 Displaystyle and Scriptlevel). Moreover, when the math element is embedded in a larger document, a block math element should be treated as a block element as appropriate for the document type (typically as a new vertical block), whereas an inline math element should be treated as inline (typically exactly as if it were a sequence of words in normal text). In particular, this applies to spacing and linebreaking: for instance, there should not be spaces or line breaks inserted between inline math and any immediately following punctuation. When the display attribute is missing, a rendering agent is free to initialize as appropriate to the context. | ||
maxwidth | length | available width |
specifies the maximum width to be used for linebreaking. The default is the maximum width available in the surrounding environment. If that value cannot be determined, the renderer should assume an infinite rendering width. | ||
overflow | "linebreak" | "scroll" | "elide" | "truncate" | "scale" | linebreak |
specifies the preferred handling in cases where an expression is too long to fit in the allowed width. See the discussion below. | ||
altimg | URI | none |
provides a URI referring to an image to display as a fall-back for user agents that do not support embedded MathML. | ||
altimg-width | length | width of altimg |
specifies the width to display altimg, scaling the image if necessary; see altimg-height. | ||
altimg-height | length | height of altimg |
specifies the height to display altimg, scaling the image if necessary; if only one of the attributes altimg-width and altimg-height is given, the scaling should preserve the image's aspect ratio; if neither attribute is given, the image should be shown at its natural size. | ||
altimg-valign | length | "top" | "middle" | "bottom" | 0ex |
specifies the vertical alignment of the image with respect to adjacent inline material. A positive value of altimg-valign shifts the bottom of the image above the current baseline, while a negative value lowers it. The keyword "top" aligns the top of the image with the top of adjacent inline material; "middle" aligns the middle of the image to the middle of adjacent material; "bottom" aligns the bottom of the image to the bottom of adjacent material (not necessarily the baseline). This attribute only has effect when display="inline". By default, the bottom of the image aligns to the baseline. | ||
alttext | string | none |
provides a textual alternative as a fall-back for user agents that do not support embedded MathML or images. | ||
cdgroup | URI | none |
specifies a CD group file that acts as a catalogue of CD bases for locating OpenMath content dictionaries of csymbol, annotation, and annotation-xml elements in this math element; see Section 4.2.3 Content Symbols <csymbol>. When no cdgroup attribute is explicitly specified, the document format embedding this math element may provide a method for determining CD bases. Otherwise the system must determine a CD base; in the absence of specific information http://www.openmath.org/cd is assumed as the CD base for all csymbol, annotation, and annotation-xml elements. This is the CD base for the collection of standard CDs maintained by the OpenMath Society. | ||
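Two of the rules in the table above lend themselves to a short sketch: the initialization of displaystyle and scriptlevel from display, and the sizing of the altimg fallback image. The function names are hypothetical, the "inline" fallback follows the table's default even though a rendering agent may choose otherwise when display is missing, and image dimensions are plain numbers in whatever unit the caller uses.

```python
def initial_style(display="inline"):
    """Initial displaystyle and scriptlevel for a math element."""
    if display not in ("block", "inline"):
        raise ValueError(f"unknown display value: {display!r}")
    # displaystyle is true only for block display; scriptlevel is 0 in both cases.
    return {"displaystyle": display == "block", "scriptlevel": 0}

def altimg_size(natural_width, natural_height, width=None, height=None):
    """Displayed (width, height) of the fallback image.

    `width` and `height` stand for altimg-width and altimg-height.
    """
    if width is None and height is None:
        # Neither attribute given: natural size.
        return natural_width, natural_height
    if height is None:
        # Only the width given: preserve the aspect ratio.
        return width, natural_height * width / natural_width
    if width is None:
        # Only the height given: preserve the aspect ratio.
        return natural_width * height / natural_height, height
    return width, height

block_style = initial_style("block")
half_size = altimg_size(200, 100, width=100)
```

For a 200 by 100 image with only a width of 100 requested, the height is scaled to 50 so that the aspect ratio is preserved.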
In cases where size negotiation is not possible or fails (for example in the case of an expression that is too long to fit in the allowed width), the overflow attribute is provided to suggest a processing method to the renderer. Allowed values are:
Allowed values are:
Value | Meaning |
---|---|
"linebreak" | The expression will be broken across several lines. See Section 3.1.7 Linebreaking of Expressions for further discussion. |
"scroll" | The window provides a viewport into the larger complete display of the mathematical expression. Horizontal or vertical scroll bars are added to the window as necessary to allow the viewport to be moved to a different position. |
"elide" | The display is abbreviated by removing enough of it so that the remainder fits into the window. For example, a large polynomial might have the first and last terms displayed with "+ ... +" between them. Advanced renderers may provide a facility to zoom in on elided areas. |
"truncate" | The display is abbreviated by simply truncating it at the right and bottom borders. It is recommended that some indication of truncation is made to the viewer. |
"scale" | The fonts used to display the mathematical expression are chosen so that the full expression fits in the window. Note that this only happens if the expression is too large. In the case of a window larger than necessary, the expression is shown at its normal size within the larger window. |
The following attributes of math are deprecated:
Name | values | default |
---|---|---|
macros | URI * | none |
intended to provide a way of pointing to external macro definition files. Macros are not part of the MathML specification. | ||
mode | "display" | "inline" | inline |
specified whether the enclosed MathML expression should be rendered in a display style or an inline style. This attribute is deprecated in favor of the display attribute. | ||
Information nowadays is commonly generated, processed and rendered by software tools. The exponential growth of the Web is fueling the development of advanced systems for automatically searching, categorizing, and interconnecting information. In addition, there are increasing numbers of Web services, some of which offer technically based materials and activities. Thus, although MathML can be written by hand and read by humans, whether machine-aided or just with much concentration, the future of MathML is largely tied to the ability to process it with software tools.
There are many different kinds of MathML processors: editors for authoring MathML expressions, translators for converting to and from other encodings, validators for checking MathML expressions, computation engines that evaluate, manipulate, or compare MathML expressions, and rendering engines that produce visual, aural, or tactile representations of mathematical notation. What it means to support MathML varies widely between applications. For example, the issues that arise with a validating parser are very different from those for an equation editor.
This section gives guidelines that describe different types of MathML support and make clear the extent of MathML support in a given application. Developers, users, and reviewers are encouraged to use these guidelines in characterizing products. The intention behind these guidelines is to facilitate reuse by and interoperability of MathML applications by accurately setting out their capabilities in quantifiable terms.
The W3C Math Working Group maintains MathML Compliance Guidelines. Consult this document for future updates on conformance activities and resources.
A valid MathML expression is an XML construct determined by the MathML RelaxNG Schema together with the additional requirements given in this specification.
We shall use the phrase "a MathML processor" to mean any application that can accept or produce a valid MathML expression. A MathML processor that both accepts and produces valid MathML expressions may be able to "round-trip" MathML. Perhaps the simplest example of an application that might round-trip a MathML expression would be an editor that writes it to a new file without modifications.
Three forms of MathML conformance are specified:
A MathML-input-conformant processor must accept all valid MathML expressions; it should appropriately translate all MathML expressions into application-specific form allowing native application operations to be performed.
A MathML-output-conformant processor must generate valid MathML, appropriately representing all application-specific data.
A MathML-round-trip-conformant processor must preserve MathML equivalence. Two MathML expressions are "equivalent" if and only if both expressions have the same interpretation (as stated by the MathML Schema and specification) under any relevant circumstances, by any MathML processor. Equivalence on an element-by-element basis is discussed elsewhere in this document.
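As a rough illustration of the round-trip notion, the following Python sketch compares an expression before and after processing. It assumes that structural identity (same element names, attributes, trimmed text, and child order) is an acceptable stand-in for equivalence; this is a sufficient condition only, since two expressions may be equivalent without being structurally identical. The names same_tree, round_trips, and identity_editor are hypothetical and not part of MathML.

```python
import xml.etree.ElementTree as ET


def _norm(text):
    # Treat missing text and surrounding whitespace as insignificant.
    return (text or "").strip()


def same_tree(a, b):
    """Return True if two elements match in tag, attributes, text and children."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if _norm(a.text) != _norm(b.text) or _norm(a.tail) != _norm(b.tail):
        return False
    if len(a) != len(b):
        return False
    # Child order matters: MathML depends on the order of children.
    return all(same_tree(x, y) for x, y in zip(a, b))


def round_trips(source, process):
    """Check that process(source) yields a structurally identical expression."""
    return same_tree(ET.fromstring(source), ET.fromstring(process(source)))


def identity_editor(source):
    # Hypothetical processor: parses the input and writes it back unchanged.
    return ET.tostring(ET.fromstring(source), encoding="unicode")
```

Because ElementTree stores element names in namespace-qualified form, the comparison is not affected by a processor choosing a different namespace prefix on output.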
Beyond the above definitions, the MathML specification makes no demands of individual processors. In order to guide developers, the MathML specification includes advisory material; for example, there are many recommended rendering rules throughout Chapter 3 Presentation Markup. However, in general, developers are given wide latitude to interpret what kind of MathML implementation is meaningful for their own particular application.
To clarify the difference between conformance and interpretation of what is meaningful, consider some examples:
In order to be MathML-input-conformant, a validating parser needs only to accept expressions, and return "true" for expressions that are valid MathML. In particular, it need not render or interpret the MathML expressions at all.
A MathML computer-algebra interface based on content markup might choose to ignore all presentation markup. Provided the interface accepts all valid MathML expressions including those containing presentation markup, it would be technically correct to characterize the application as MathML-input-conformant.
An equation editor might have an internal data representation that makes it easy to export some equations as MathML but not others. If the editor exports the simple equations as valid MathML, and merely displays an error message to the effect that conversion failed for the others, it is still technically MathML-output-conformant.
As the previous examples show, to be useful, the concept of MathML conformance frequently involves a judgment about what parts of the language are meaningfully implemented, as opposed to parts that are merely processed in a technically correct way with respect to the definitions of conformance. This requires some mechanism for giving a quantitative statement about which parts of MathML are meaningfully implemented by a given application. To this end, the W3C Math Working Group has provided a test suite.
The test suite consists of a large number of MathML expressions categorized by markup category and dominant MathML element being tested. The existence of this test suite makes it possible, for example, to characterize quantitatively the hypothetical computer algebra interface mentioned above by saying that it is a MathML-input-conformant processor which meaningfully implements MathML content markup, including all of the expressions in the content markup section of the test suite.
Developers who choose not to implement parts of the MathML specification in a meaningful way are encouraged to itemize the parts they leave out by referring to specific categories in the test suite.
For MathML-output-conformant processors, information about currently available tools to validate MathML is maintained at the W3C MathML Validator. Developers of MathML-output-conformant processors are encouraged to verify their output using this validator.
Customers of MathML applications who wish to verify claims as to which parts of the MathML specification are implemented by an application are encouraged to use the test suites as a part of their decision processes.
MathML 3.0 contains a number of features of earlier MathML which are now deprecated. The following points define what it means for a feature to be deprecated, and clarify the relation between deprecated features and current MathML conformance.
In order to be MathML-output-conformant, authoring tools must not generate MathML markup containing deprecated features.
In order to be MathML-input-conformant, rendering and reading tools must support deprecated features if they are to be in conformance with MathML 1.x or MathML 2.x. They do not have to support deprecated features to be considered in conformance with MathML 3.0. However, all tools are encouraged to support the old forms as much as possible.
In order to be MathML-round-trip-conformant, a processor need only preserve MathML equivalence on expressions containing no deprecated features.
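An authoring tool could check its own output for deprecated features before emitting it. The Python sketch below is one possible approach, not a prescribed one. The set DEPRECATED_ATTRIBUTES is illustrative: it contains only the other attribute, which this specification describes as deprecated since MathML 2.0, and a real tool would need to extend it from the specification. The function name deprecated_uses is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative only: `other` is deprecated since MathML 2.0; a real tool
# would extend this set from the specification.
DEPRECATED_ATTRIBUTES = {"other"}


def deprecated_uses(source):
    """Return (element local name, attribute) pairs that use deprecated attributes."""
    found = []
    for element in ET.fromstring(source).iter():
        # ElementTree tags look like "{namespace-uri}local"; keep the local part.
        local_name = element.tag.rsplit("}", 1)[-1]
        for attribute in element.attrib:
            # Attributes from other namespaces carry a "{uri}" prefix and never match.
            if attribute in DEPRECATED_ATTRIBUTES:
                found.append((local_name, attribute))
    return found
```

An empty result means only that none of the listed features were found; it does not establish that the output is valid MathML.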
MathML 3.0 defines three basic extension mechanisms: the mglyph element provides a way of displaying glyphs for non-Unicode characters, and glyph variants for existing Unicode characters; the maction element uses attributes from other namespaces to obtain implementation-specific parameters; and content markup makes use of the definitionURL attribute, as well as Content Dictionaries and the cd attribute, to point to external definitions of mathematical semantics.
These extension mechanisms are important because they provide a way of encoding concepts that are beyond the scope of MathML 3.0 as currently specified, which allows MathML to be used for exploring new ideas not yet susceptible to standardization. However, as new ideas take hold, they may become part of future standards. For example, an emerging character that must be represented by an mglyph element today may be assigned a Unicode code point in the future. At that time, representing the character directly by its Unicode code point would be preferable. This transition into Unicode has already taken place for hundreds of characters used for mathematics.
Because the possibility of future obsolescence is inherent in the use of extension mechanisms to facilitate the discussion of new ideas, MathML can reasonably make no conformance requirements concerning the use of extension mechanisms, even when alternative standard markup is available. For example, using an mglyph element to represent an 'x' is permitted. However, authors and implementers are strongly encouraged to use standard markup whenever possible. Similarly, maintainers of documents employing MathML 3.0 extension mechanisms are encouraged to monitor relevant standards activity (e.g., Unicode, OpenMath, etc.) and to update documents as more standardized markup becomes available.
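A document maintainer who wants to review uses of mglyph periodically might start from a small audit script. The Python sketch below is one way to do this: it lists the alt text of every mglyph element in an expression, so that each can be checked against newly assigned Unicode characters. The function name mglyph_uses and the sample file name in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def mglyph_uses(source):
    """Return the alt text (or None if absent) of every mglyph element."""
    root = ET.fromstring(source)
    # iter() walks the whole tree, matching on the namespace-qualified name.
    return [glyph.get("alt") for glyph in root.iter(f"{{{MATHML_NS}}}mglyph")]
```

For instance, an expression containing a single element such as mglyph with src="my-glyph.png" and alt="my glyph" would yield a one-item list holding that alt text.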
If a MathML-input-conformant application receives input containing one or more elements with an illegal number or type of attributes or child schemata, it should nonetheless attempt to render all the input in an intelligible way, i.e., to render normally those parts of the input that were valid, and to render error messages (rendered as if enclosed in an merror element) in place of invalid expressions.
MathML-output-conformant applications such as editors and translators may choose to generate merror expressions to signal errors in their input. This is usually preferable to generating valid, but possibly erroneous, MathML.
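The error-handling behavior described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a full validator: the only rule checked is child count, the rule table REQUIRED_CHILDREN covers just mfrac and msup (each of which takes exactly two children), and the helper names q, make_merror, and mark_errors are hypothetical. Valid parts of the tree are left untouched, and each invalid element is replaced by an merror element carrying a message.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def q(name):
    # Build the namespace-qualified tag that ElementTree uses internally.
    return f"{{{MATHML_NS}}}{name}"


# Illustrative rule table: mfrac and msup each take exactly two children.
REQUIRED_CHILDREN = {q("mfrac"): 2, q("msup"): 2}


def make_merror(message):
    merror = ET.Element(q("merror"))
    mtext = ET.SubElement(merror, q("mtext"))
    mtext.text = message
    return merror


def mark_errors(element):
    """Replace descendants with a wrong child count by merror, in place."""
    for index, child in enumerate(list(element)):
        expected = REQUIRED_CHILDREN.get(child.tag)
        if expected is not None and len(child) != expected:
            local = child.tag.rsplit("}", 1)[-1]
            error = make_merror(
                f"{local}: expected {expected} children, found {len(child)}"
            )
            error.tail = child.tail  # keep any text that followed the element
            element[index] = error
        else:
            mark_errors(child)  # valid here, so keep checking deeper
    return element
```

The sketch does not check the root element itself, and it reports only the outermost invalid element on each branch, since the content of a replaced element is discarded.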
The MathML attributes described in the MathML specification are intended to allow for good presentation and content markup. However, it is never possible to cover all users' needs for markup. Ideally, the MathML attributes should be an open-ended list so that users can add specific attributes for specific renderers. However, this cannot be done within the confines of a single XML DTD or in a Schema. Although it can be done by, say, extending the standard DTD, some authors will wish to use non-standard attributes to take advantage of renderer-specific capabilities while remaining strictly in conformance with the standard DTD.
To allow this, the MathML 1.0 specification [MathML1] allowed the attribute other on all elements, for use as a hook to pass on renderer-specific information. In particular, it was intended as a hook for passing information to audio renderers, computer algebra systems, and for pattern matching in future macro/extension mechanisms. The motivation for this approach to the problem was historical, looking to PostScript, for example, where comments are widely used to pass information that is not part of PostScript.
In the next period of evolution of MathML, the development of a general XML namespace mechanism seemed to make the use of the other attribute obsolete. In MathML 2.0, the other attribute is deprecated in favor of the use of namespace prefixes to identify non-MathML attributes. The other attribute remains deprecated in MathML 3.0.
For example, in MathML 1.0, it was recommended that if additional information was used in a renderer-specific implementation for the maction element (Section 3.7.1 Bind Action to Sub-Expression <maction>), that information should be passed in using the other attribute:
<maction actiontype="highlight" other="color='#ff0000'"> expression </maction>
From MathML 2.0 onwards, a color attribute from another namespace would be used:
<body xmlns:my="http://www.example.com/MathML/extensions"> ... <maction actiontype="highlight" my:color="#ff0000"> expression </maction> ... </body>
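A renderer that supports such extensions has to tell its own namespaced attributes apart from the standard, unqualified MathML attributes. The following Python sketch shows one way this might be done with the standard library's ElementTree, which exposes a namespaced attribute under a key of the form {uri}local. The function name split_attributes is hypothetical.

```python
import xml.etree.ElementTree as ET


def split_attributes(element):
    """Separate unqualified attributes from those in another namespace."""
    standard, foreign = {}, {}
    for name, value in element.attrib.items():
        if name.startswith("{"):
            # Namespaced attribute: split "{uri}local" into its two parts.
            uri, local = name[1:].split("}", 1)
            foreign[(uri, local)] = value
        else:
            standard[name] = value
    return standard, foreign
```

Keying the foreign attributes by namespace URI rather than by prefix means the lookup still works if a document binds the same namespace to a prefix other than my.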
Note that the intent of allowing non-standard attributes is not to encourage software developers to use this as a loophole for circumventing the core conventions for MathML markup. Authors and applications should use non-standard attributes judiciously.