4. SMIL 3.0 Media Object

Editor for SMIL 3.0
Dick Bulterman, CWI
Eric Hyche, RealNetworks.
Editor for SMIL 2.0
Dick Bulterman, CWI
Rob Lanphier, RealNetworks.

4.1 Changes for SMIL 3.0

This section is informative.

There are three major changes to the Media Object modules for SMIL 3.0. First, the SMIL 2.1 MediaParam module is split into two modules, MediaParam and MediaRenderAttributes. Second, the MediaOpacity module is introduced, containing new rendering attributes for chroma key and opacity control. Third, the MediaPanZoom module is introduced. The rationale for these changes is:

  1. The splitting of the SMIL 2.1 MediaParam module provides a better differentiation of functionality, which may help user agent profile designers be more selective in the features they wish to support.
  2. The MediaOpacity module is added to define control over various aspects of media opacity using the mediaOpacity, mediaBackgroundOpacity, chromaKey, chromaKeyOpacity, and chromaKeyTolerance attributes.
  3. The MediaPanZoom module defines the panZoom attribute to provide a framework for panning and zooming over media content. (This attribute is based largely on equivalent functionality in the SVG viewBox attribute.)

The MediaParam module also includes new text that explains what happens when the media control attributes defined in that section are added to a SMIL layout region definition: they act as a global mechanism for applying default attribute settings to all content rendered within that region.

A number of editorial changes have also been integrated into the various Media Object modules descriptions; these do not impact the functionality defined in earlier versions of SMIL.

4.2 Introduction

This section is informative.

This section defines the SMIL media object modules, which are composed of the BasicMedia module and nine modules with additional functionality that build on top of the BasicMedia module: the BrushMedia, MediaClipping, MediaClipMarkers, MediaParam, MediaRenderAttributes, MediaOpacity, MediaAccessibility, MediaDescription, and MediaPanZoom modules. These modules contain elements and attributes used to reference external media objects or control media object rendering behavior. Since these elements and attributes are defined in a series of modules, designers of other markup languages may reuse the SMIL media object modules when they wish to include media objects in their language.

The differences between current media object functionality and that provided by the SMIL 1.0 specification are explained in Appendix A.

4.3 Definitions

This section is normative.

This section provides convenience definitions for common timing and resource identifier terms used in this module.

SMIL provides a number of timing-related concepts that are used to determine activation, duration and termination of media objects in a presentation. The temporal semantics of these concepts are discussed in the SMIL 3.0 Timing and Synchronization module.

Intrinsic Duration
The duration of a referenced media item based on the temporal properties of that item (defined next), without any explicit SMIL timing markup. Some media objects have a well-defined notion of implicit duration (such as a 7 second audio clip), while other objects do not have well-defined durations (such as a string of plain text). In SMIL, the implicit duration for any media object that does not have a well-defined duration is set to be zero seconds. The implicit duration is used to calculate scheduling information; it is sometimes independent of the actual duration of a media object (such as with a live media stream or with an image with multiple internal frames when no particular duration can be derived by the SMIL scheduler). From a scheduling perspective, an object's intrinsic duration forms the basis for the simple duration of the object during presentation. This duration may be shortened or extended using SMIL timing markup.
Continuous Media
Media objects, such as stored audio or video files, for which there is a measurable and well-understood duration. For example, a five second audio clip is continuous media, because it has a well-understood duration of five seconds. Opposite of "discrete media". See also the definition of continuous media in the Timing module.
Discrete Media
Media objects, such as images or non-timed text data, that have no obvious duration. For example, a JPEG image is generally considered discrete media, because there is nothing in the file indicating how long the JPEG should be displayed. Opposite of "continuous media". See also the definition of discrete media in the Timing module.

The distinction between continuous and discrete media is sometimes arbitrary and may be SMIL renderer dependent. For example, animated images that do not have a well-defined duration (simply a repeating collection of frames) are classified for SMIL scheduling purposes as being discrete media; such objects have an intrinsic scheduling duration of zero seconds.
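
As an informal, non-normative illustration of the two categories, the fragment below places a continuous and a discrete object in sequence. The file names are hypothetical, and the dur attribute is assumed to be available from the SMIL Timing and Synchronization module; it is not defined in this module.

```xml
<seq>
  <!-- Continuous media: the simple duration is based on the clip's
       intrinsic duration, so no timing markup is needed. -->
  <audio src="greeting.wav"/>

  <!-- Discrete media: the intrinsic duration is zero seconds, so explicit
       timing markup (here dur) extends the image to five seconds. -->
  <img src="logo.png" dur="5s"/>
</seq>
```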

In this specification, the term URI [URI] refers to a uniform resource identifier, as defined in [RFC3986] and subsequently extended under the name IRI in [RFC3987]. In some cases, the term URI has been retained in the specification to avoid using new names for concepts such as "Base URI" that are defined or referenced across a whole family of XML specifications.

4.4 SMIL BasicMedia Module

This section is normative.

This module defines the baseline media functionality of a SMIL player.

4.4.1 Media Object Elements - ref, and its synonyms animation, audio, img, text, textstream and video

SMIL defines a single generic media object element that allows the inclusion of external media objects into a SMIL presentation. Media objects are included by reference (using an IRI).

ref
Generic media reference

In addition to the ref element, SMIL allows the use of the following set of synonyms:

animation
Animated vector graphics or other animated format
audio
Audio clip
img
Still image, such as PNG or JPEG
text
External text reference
textstream
A text document that includes timing information for the purpose of time-dependent rendering of portions of the text document.
video
Video clip

All of these media elements are semantically identical. When playing back an external media object, the player must not derive the exact type of the media object from the name of the media object element. Instead, it must rely solely on other sources about the type, such as the type information communicated by a server or the operating system, or by using type information contained in the type attribute.

This section is informative.

Authors are encouraged to use meaningful synonyms (animation, audio, img, video, text or textstream) when referencing external media objects, since this increases the readability of the SMIL document. Some SMIL implementations may require the use of an element type that matches the information type of the object. When in doubt about which synonym applies to a media object, authors should use the generic "ref" element.
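
The following non-normative sketch shows these recommendations in markup. The URLs and the media type value are assumptions chosen purely for illustration.

```xml
<par>
  <!-- A meaningful synonym improves readability. The player still takes the
       media type from the server, the operating system, or the type
       attribute, never from the element name. -->
  <video src="http://www.example.org/media/intro.mpg" type="video/mpeg"/>

  <!-- When it is unclear which synonym fits, the generic element is used. -->
  <ref src="http://www.example.org/media/unknown-item"/>
</par>
```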

The animation element defined here should not be confused with the elements defined in the SMIL 3.0 Animation Module. The animation element defined in this module is used to include an external animation object file (such as a vector graphics animation) by reference. This is in contrast to the elements defined in the Animation module, which provide an in-line syntax for the animation of attributes and properties of other elements.

SMIL 3.0 also supports the smilText element for defining in-line timed text content. This functionality is described in the smilText Modules specification.

Anchors and links may be attached to visual media objects, i.e. media objects rendered on a visual abstract rendering surface.

Attributes Definitions

Languages implementing the SMIL BasicMedia Module must define which attributes may be attached to media object elements. In all languages implementing the SMIL BasicMedia module, media object elements may have the following attributes:

src
The value of the src attribute is the IRI [IRI] of the media element, used for locating and fetching the associated media.

The attribute supports fragment identifiers and the '#' connector in the IRI value. The fragment part is an id value that identifies one of the elements within the referenced media item. With this construct, SMIL 3.0 supports locators as currently used in HTML (that is, it uses locators of the form http://www.example.org/some/path#anchor1), with the difference that the fragment values are unique identifiers rather than the values of "name" attributes. Generally speaking, this type of addressing implies that the target media is of a structured type that supports the concept of id, such as HTML or XML-based languages.

Note that this attribute is not required. A media object with no src attribute has an intrinsic duration of zero, and participates in timing just as any other media element. No media will be fetched by the SMIL implementation for a media element without a src attribute.

type
Content type of the media object referenced by the src attribute. The usage of this attribute depends on the protocol of the src attribute.
RTSP [RTSP]
The type attribute is used for purposes of content selection and when the type of the referenced media is not otherwise available. It may be overridden by the contents of the RTSP DESCRIBE response or by the static RTP payload number.
HTTP [HTTP]
The type attribute is used as an alternative method of content selection and when the type of the referenced media is not otherwise available. It may override the contents of the "Content-type" field in an HTTP exchange only if a user has allowed such overrides, as specified in the TAG Finding Authoritative Metadata [AM]. The nominal precedence order for type resolution is: via the HTTP content-type field, via the type attribute, and then by using other clues (such as file inspection or use of the file extension).
FTP [FTP] and local file playback IRI [URI]
The type attribute value takes precedence over other possible sources of the media type (for instance, the file extension).

When the content represented by a URL is available in many data formats, implementations MAY use the type value to influence which of the multiple formats is used. For instance, on a server implementing HTTP content negotiation, the client may use the type attribute to order the preferences in the negotiation. The type attribute is not intended for use in media sub-stream selection.

For protocols not enumerated in this specification, implementations should use the following rules: When the media is encapsulated in a media file and delivered intact to the SMIL user agent via a protocol designed for delivery as a complete file, the media type as provided by this protocol should take precedence over the type attribute value. For protocols which deliver the media in a media-aware fashion, such as those delivering media in a manner using or dependent upon the specific type of media, the application of the type attribute is not defined by this specification.
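
A minimal, non-normative sketch of the src and type attributes is given below. The file URL, the media type value and the use of the dur attribute (from the Timing module) are assumptions made for illustration; "anchor1" is assumed to be an id inside the target document.

```xml
<par>
  <!-- Fragment identifier: addresses the element whose id is "anchor1"
       within the referenced structured document. -->
  <text src="http://www.example.org/some/path#anchor1" dur="10s"/>

  <!-- Local file playback: the type attribute takes precedence over other
       sources of the media type, such as the file extension. -->
  <ref src="file:///media/clip.dat" type="audio/mpeg"/>

  <!-- No src attribute: nothing is fetched and the intrinsic duration is
       zero; the element still participates in timing. -->
  <ref dur="3s"/>
</par>
```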

Element Content

Languages utilizing the SMIL BasicMedia module must define the complete set of elements which may act as children of media object elements. There are currently no required children of a media object defined in the BasicMedia Module, but languages utilizing the BasicMedia module may impose requirements beyond this specification.

4.4.2 Integration Requirements

If the including profile supports the XMLBase functionality [XMLBase], the values of the src and longdesc attributes on the media object elements must be interpreted in the context of the relevant XMLBase URI prefix.

User-agent implementations are responsible for defining the rendering behavior when fragment addressing is used in the src attribute. Such definitions should be added to language profiles that wish to include specific media addressing features. For example:
- User-agents should define the default behavior when a non-existent id in the target media document is referenced.
- User-agents should define the rendering method for the selected media fragment: in context, with or without highlighting and scrolling, or stand-alone (selective rendering only).
- User-agents should describe the timing implications of addressing timed content.

SMIL 3.0 allows but does not require user agents to be able to process XPointer values in the IRI value of the src attribute. The SMIL 3.0 Linking Module provides additional information related to XPointer.

4.5 SMIL MediaParam Module

This section is normative.

This section defines the elements and attributes that make up the SMIL MediaParam Module definition. The MediaParam module is intended to provide a uniform mechanism for media object initialization. Languages implementing elements and attributes found in the MediaParam module must implement all elements and attributes defined below, as well as BasicMedia.

4.5.1 The param element

The param element allows a general parameter value to be sent to a media object renderer as a name/value pair. This parameter is sent to the renderer at the time that the media object is processed by the scheduler. It is up to the media renderer to associate an action with the given param. The media renderer may choose to ignore any unknown or inappropriate param values (such as sending a font size to an audio object).

Any number of param elements may appear (in any order) in the content of a media object element or in a paramGroup element. If a given parameter is defined multiple times, the lexically last version of that parameter value should be used.

The syntax of names and values is assumed to be understood by the object's implementation. The SMIL specification does not specify how user agents should retrieve name/value pairs.

Attribute definitions
name
(CDATA) This attribute defines the name of a run-time parameter, assumed to be known by the inserted object. Whether the property name is case-sensitive depends on the specific object implementation.
value
(CDATA) This attribute specifies the value of a run-time parameter specified by name. Property values have no meaning to SMIL; their meaning is determined by the object in question.
valuetype
["data"|"ref"|"object"] This attribute specifies the type of the value attribute. Possible values:
  • data This is the default value for the attribute. It means that the value specified by value will be evaluated and passed to the object's implementation as a string.
  • ref The value specified by value is an IRI [IRI] that designates a resource where run-time values are stored. This allows support tools to identify URIs given as parameters. The IRI must be passed to the object as is, i.e., unresolved.
  • object The value specified by value is an identifier that refers to a media object declaration in the same document. The identifier must be the value of the id attribute set for the declared media object element.
type
This attribute specifies the content type of the resource designated by the value attribute only in the case where valuetype is set to "ref". This attribute thus tells the user agent the type of the values that will be found at the IRI designated by value. See 6.7 Content Type in [HTML4] for more information.

Example

This section is informative.

To illustrate the use of param, suppose that we have a facial animation plug-in that is able to accept different moods and accessories associated with characters. These could be defined in the following way:
<ref src="http://www.example.com/herbert.face">
  <param name="mood" value="surly" valuetype="data"/>
  <param name="accessories" value="baseball-cap,nose-ring" valuetype="data"/>
</ref>
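
The example above passes literal strings only. A sketch of the other two valuetype settings might look as follows. The parameter names, the URLs, the content type and the id value are all hypothetical, xml:id is assumed to serve as the id attribute, and whether these settings are honored depends on the renderer and on the integrating profile.

```xml
<par>
  <audio xml:id="laughTrack" src="http://www.example.com/laugh.wav"/>
  <ref src="http://www.example.com/herbert.face">
    <!-- ref: the value is an IRI that is passed to the renderer unresolved;
         type describes the resource found at that IRI. -->
    <param name="expressions" value="http://www.example.com/moods.xml"
           valuetype="ref" type="text/xml"/>
    <!-- object: the value is the id of a media object declared in the
         same document. -->
    <param name="soundtrack" value="laughTrack" valuetype="object"/>
  </ref>
</par>
```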

4.5.2 The paramGroup element

The paramGroup element provides a convenience mechanism for defining a collection of media parameters that may be reused with several different media objects. If present, the paramGroup element must appear in the head section of the document. The content of the paramGroup element consists of zero or more param elements. The paramGroup element may not contain nested paramGroup element definitions.

Element attributes

This element does not define any new attributes. Profiles integrating this element must specify an attribute of type ID [XML11] by which the param group is referenced in a media object reference. For SMIL 3.0, the xml:id attribute will typically be used.

Examples

This section is informative.

This section contains several fragments that illustrate uses of the paramGroup element.

In the following fragment, a paramGroup is created to define parameters that are passed to several different media objects:

<smil ... >
  <head>
    ...
    <paramGroup xml:id="clown">
       <param name="mood" value="upBeat" valuetype="data"/>
       <param name="accessories" value="flowers,dunceCap"/>
    </paramGroup>
    ...
  </head>
  <body>
    ...
    <ref src="http://www.example.com/andy.face" paramGroup="clown"/>
    ...
    <ref src="http://www.example.com/sally.face" paramGroup="clown"/>
    ...
  </body>
</smil>

In the following example, a media object provides an additional param value:

<smil ... >
  <head>
    ...
    <paramGroup xml:id="clown">
       <param name="mood" value="upBeat" valuetype="data"/>
       <param name="accessories" value="flowers,dunceCap"/>
    </paramGroup>
    ...
  </head>
  <body>
    ...
    <ref src="http://www.example.com/andy.face" paramGroup="clown">
      <param name="gender" value="male"/>
    </ref>
    ...
  </body>
</smil>

In this final example, a media object provides a duplicate param value. The behavior in this case depends on the media renderer; all param values are passed to the renderer in the lexical order of the SMIL source file. It is expected that the lexically last value for any parameter sent to the renderer be used, if possible.

<smil ... >
  <head>
    ...
    <paramGroup xml:id="clown">
       <param name="mood" value="upBeat" valuetype="data"/>
       <param name="accessories" value="flowers,dunceCap"/>
    </paramGroup>
    ...
  </head>
  <body>
    ...
    <ref src="http://www.example.com/andy.face" paramGroup="clown">
      <param name="gender" value="male"/>
      <param name="mood" value="depressed" valuetype="data"/>
    </ref>
    ...
  </body>
</smil>

4.5.3 Element Attributes for Media Object Initialization

In addition to the element attributes defined in BasicMedia, media object elements and layout regions may add the media initialization attribute defined below.

paramGroup
Used to specify the name of a paramGroup that was defined in the document head. The value is a single IDREF [XML11] that refers to the ID [XML11] of a paramGroup element. If the named paramGroup does not exist, this attribute is ignored. If this attribute is defined on a SMIL layout region definition, it specifies a default value for all content displayed within that region.
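
The region-level default described above can be sketched as follows. This is a non-normative illustration: the region name "stage" is hypothetical, the layout and region elements are assumed to come from the SMIL layout modules, and the ellipsis in the smil start tag follows the convention of the earlier examples.

```xml
<smil ... >
  <head>
    <layout>
      <!-- Region-level default: all content displayed in "stage" receives
           the parameters of the "clown" group. -->
      <region xml:id="stage" paramGroup="clown"/>
    </layout>
    <paramGroup xml:id="clown">
      <param name="mood" value="upBeat" valuetype="data"/>
    </paramGroup>
  </head>
  <body>
    <!-- No paramGroup attribute on the media object itself; the default
         set on the region applies. -->
    <ref src="http://www.example.com/andy.face" region="stage"/>
  </body>
</smil>
```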

4.5.4 Integration Requirements

Any profile that integrates the functionality of this module is strongly encouraged to define a set of common parameter names that may be used to initialize common media object types for that profile. This can significantly increase interoperability of user agents and media rendering libraries.

The supported uses of the type and valuetype attributes on the param element must be specified by the integrating profile. If a profile does not specify this, the type and valuetype attributes will be ignored in that profile.

4.6 SMIL MediaRenderAttributes Module

This section is normative.

This section defines the elements and attributes that make up the SMIL MediaRenderAttributes Module definition. Languages implementing elements and attributes found in the MediaRenderAttributes module must implement all elements and attributes defined below, as well as BasicMedia.

4.6.1 Elements

This module does not define any elements.

4.6.2 Element Rendering Attributes for All Media Objects

In addition to the element attributes defined in BasicMedia, media object elements and layout regions may have the attributes and attribute extensions defined below.

erase
Controls the behavior of the media object after the effects of any timing are complete. For example, when SMIL Timing is applied to a media element, erase controls the display of the media once the active duration of the element and the freeze period defined by the fill attribute are complete (see the SMIL Timing and Synchronization module). If this attribute is defined on a SMIL layout region definition, it specifies a default value for all content displayed within that region.

Values:

whenDone (default)
When this is specified (or implied) the media removal occurs at the end of any applied timing.
never
When this value is specified, the last state of the media is kept displayed until the display area is reused (or if the display area is already being used by another media object). Any profile that integrates this element must define what is meant by "display area" and further define the interaction. Intrinsic hyperlinks (e.g., Flash, HTML) and explicit hyperlinks (e.g., area, a) stay active as long as the hyperlink is displayed. If timing is re-applied to an element, the effect of the erase=never is cleared. For example, when an element is restarted according to the SMIL Timing and Synchronization module, the element is cleared immediately before it restarts.

Example:

This section is informative.

<par>
  <seq>
    <par>
      <img src="image1.jpg" region="foo1" fill="freeze" erase="never" .../>
      <audio src="audio1.au"/>        
    </par>

    <par>
      <img src="image2.jpg" region="foo2" fill="freeze" erase="never" .../>
      <audio src="audio2.au"/>        
    </par>
     ...
    <par>
      <img src="imageN.jpg" region="fooN" fill="freeze" erase="never" .../>
      <audio src="audioN.au"/>        
    </par>
  </seq>
</par>

In this example, each image is successively displayed and remains displayed until the end of the presentation.

mediaRepeat
Used to strip the intrinsic repeat value of the underlying media object. The interpretation of this attribute is specific to the media type of the media object, and is only applicable to those media types for which there is a definition of a repeat value found in the media type format specification. Media type viewers used in SMIL implementations should expose an interface for controlling the repeat value of the media for this attribute to be applied. For all media types where there is an expectation of interoperability between SMIL implementations, there should be a formal specification of the exact repeat value to which the mediaRepeat attribute applies. If this attribute is defined on a SMIL layout region definition, it specifies a default value for all content displayed within that region.

Values:

strip
Strip the intrinsic repeat value of the media object.
preserve (default)
Leave the intrinsic repeat value of the media object intact.

As an example of how this would be used, many animated GIFs intrinsically repeat indefinitely. The application of mediaRepeat="strip" allows an author to remove the intrinsic repeat behavior of an animated GIF on a per-reference basis, causing the animation to display only once, regardless of the repeat value embedded in the GIF.

When mediaRepeat is used in conjunction with SMIL Timing Module attributes, this attribute is applied first, so that the repeat behavior can then be controlled with the SMIL Timing Module attributes such as repeatCount and repeatDur.
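
A minimal sketch of this ordering is shown below. The file name is hypothetical, and the fragment assumes a player that exposes control over the GIF repeat value and can derive a duration for a single pass of the animation.

```xml
<!-- mediaRepeat is applied first and strips the loop count embedded in the
     GIF; repeatCount (SMIL Timing) then repeats the single pass three times. -->
<img src="spinner.gif" mediaRepeat="strip" repeatCount="3"/>
```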

sensitivity
Used to provide author control over the sensitivity of media to user interface selection events, such as the SMIL 2.1 activateEvent, and hyperlink activation. If the media is sensitive at the event location, it captures the event, and will not pass the event through to underlying media objects. If not, it allows the event to be passed through to any media objects lower in the display hierarchy. If this attribute is defined on a SMIL layout region definition, it specifies a default value for all content displayed within that region.

Values:

opaque
The media is sensitive to user interface selection events over the entire area of the media. This is the default.
transparent
The media is not sensitive to user interface selection events over the entire area of the media. Any user interface selection events will be "passed through" to any underlying media.
percentage-value
The media sensitivity to user interface selection events is dependent upon the opacity of the media at the location of the event (the alpha channel value). If rendered media supports an alpha channel and the opacity of the media is less than the given percentage value at the event location, the behavior will be transparent as specified above. Otherwise the behavior will be the same as for opaque. Valid values are non-negative CSS2 percentage values.
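
The three kinds of sensitivity value can be illustrated with the non-normative fragment below. The file names and region names are hypothetical; the regions "overlay" and "badge" are assumed to be stacked above "base" by the layout.

```xml
<par>
  <!-- Lower layer: sensitive over its entire area (opaque is the default). -->
  <img src="map.png" region="base"/>

  <!-- Upper layer: never captures selection events; they pass through to
       the underlying media. -->
  <img src="frame.png" region="overlay" sensitivity="transparent"/>

  <!-- Upper layer: captures events only where the alpha-channel opacity at
       the event location is at least 50%. -->
  <img src="seal.png" region="badge" sensitivity="50%"/>
</par>
```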

4.6.3 Integration Requirements

Any profile that supports the erase attribute must define what is meant by "display area" and further define the interaction. See the definition of erase for more details.

4.7 SMIL MediaOpacity Module

This section is normative.

This section defines the elements and attributes that make up the SMIL MediaOpacity Module definition. Languages implementing elements and attributes found in the MediaOpacity module must implement all elements and attributes defined below, as well as BasicMedia.

4.7.1 Elements

This module does not define any elements.

4.7.2 Element Attributes for All Media Objects

In addition to the element attributes defined in BasicMedia, media object elements and layout regions may have the attributes and attribute extensions defined below.

chromaKey
This attribute defines the color to be used for chroma key opacity manipulation. It accepts a single CSS2 color value. If media objects or implementations cannot support manipulation of the chroma key value, this attribute is ignored. If this attribute is defined on a SMIL layout region definition, it specifies a default value for all content displayed within that region.
chromaKeyOpacity
This attribute defines the opacity of the chroma key value defined with the chromaKey attribute. It accepts a percentage value in the range 0-100% or a number in the range 0.0-1.0, with 100% or 1.0 meaning fully opaque. If a chroma key color is defined, the default value is 0% (fully transparent). If no chroma key color is defined or if implementations cannot support manipulation of the media opacity value, this attribute is ignored. If this attribute is defined on a SMIL layout region definition, it specifies a default value for all content displayed within that region.
chromaKeyTolerance
This attribute defines a color value that specifies a tolerance that is added to and subtracted from the effective chroma key. If a chroma key color was defined, the default value of this attribute is #000000. If no chroma key color was defined or if implementations cannot support manipulation of the chroma key value, this attribute is ignored. If this attribute is defined on a SMIL layout region definition, it specifies a default value for all content displayed within that region.
mediaOpacity
This attribute defines the opacity of the media object. It accepts a percentage value in the range 0-100% or a number in the range 0.0-1.0, with 100% or 1.0 meaning fully opaque. If implementations cannot support manipulation of the media opacity value, this attribute is ignored. The default value of this attribute is 100%. If this attribute is defined on a SMIL layout region definition, it specifies a default value for all content displayed within that region. The media opacity manipulation does not apply to a background color for a media object, if such a color is defined. The background color opacity is manipulated using the mediaBackgroundOpacity attribute.
mediaBackgroundOpacity
This attribute defines the background color opacity of the media object for media objects that explicitly define a media background color. It accepts a percentage value in the range 0-100% or a number in the range 0.0-1.0, with 100% or 1.0 meaning fully opaque. If either media objects or implementations cannot support manipulation of the media background color opacity, this attribute is ignored. The default value of this attribute is 100%. If this attribute is defined on a SMIL layout region definition, it specifies a default value for all media background opacity displayed within that region.

This section is informative.

The attributes in this module allow the opacity (that is, the degree to which a media object is transparent) to be defined. Opacity may be controlled in several ways, depending on the type of media being used. For unstructured media (that is, media that does not contain an explicitly-defined background color), the chromaKey attribute may be used to identify a particular color that will serve as the background color for purposes of opacity manipulation. If a chromaKey is used, the chromaKeyOpacity attribute may specify the degree of transparency desired. Since the color used to define a background may not be exactly preserved within a media object, the chromaKeyTolerance attribute allows a tolerance range to be defined for the chroma key color.

Some media objects, such as RealText, smilText, GIF, PNG, and Flash, define an explicit background color. In these cases, the specification of the opacity of that color can be done using the mediaBackgroundOpacity attribute. In these cases, only the defined color is manipulated.

In addition to specifying the transparency level of a particular background color, SMIL also allows the specification of the transparency level of a total media object. This is accomplished using the mediaOpacity attribute.

Note that SMIL layout also defines the backgroundOpacity attribute to control the transparency of a layout region.
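
The three forms of opacity control described above might be combined as in the following sketch. The file names, region names and attribute values are assumptions chosen for illustration, and players that cannot manipulate opacity simply ignore these attributes.

```xml
<par>
  <!-- Chroma key: colors within the tolerance range around #00FF00 become
       fully transparent (0% is also the default when a key is defined). -->
  <video src="presenter.mpg" region="main"
         chromaKey="#00FF00" chromaKeyTolerance="#101010"
         chromaKeyOpacity="0%"/>

  <!-- Whole-object opacity: the image is rendered at 40% opacity. -->
  <img src="watermark.png" region="corner" mediaOpacity="40%"/>

  <!-- Background-only opacity: affects only the background color that the
       media object itself defines. -->
  <text src="captions.rt" region="captions" mediaBackgroundOpacity="0.5"/>
</par>
```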

4.7.3 Integration Requirements

This module does not introduce any special integration constraints.

4.8 SMIL MediaClipping Module

This section is normative.

This section defines the attributes that make up the SMIL MediaClipping Module definition. Languages implementing the attributes found in the MediaClipping module must implement the attributes defined below, as well as BasicMedia.

4.8.1 MediaClipping Attributes

clipBegin (clip-begin)
The clipBegin attribute specifies the beginning of a sub-clip of a continuous media object as an offset from the start of the media object. This offset is measured in normal media playback time from the beginning of the media.
Values in the clipBegin attribute have the following syntax:
Clip-value-MediaClipping ::= [ Metric "=" ] ( Clock-val | Smpte-val )
Metric            ::= Smpte-type | "npt"
Smpte-type        ::= "smpte" | "smpte-30-drop" | "smpte-25"
Smpte-val         ::= Hours ":" Minutes ":" Seconds
                      [ ":" Frames [ "." Subframes ]]
Hours             ::= DIGIT+
Minutes           ::= DIGIT DIGIT /* range from 00 to 59 */
Seconds           ::= DIGIT DIGIT /* range from 00 to 59 */
Frames            ::= DIGIT DIGIT /* smpte range = 00-29, smpte-30-drop range = 00-29, smpte-25 range = 00-24 */
Subframes         ::= DIGIT DIGIT /* smpte range = 00-01, smpte-30-drop range = 00-01, smpte-25 range = 00-01 */
DIGIT             ::= [0-9]

The value of this attribute consists of a metric specifier, followed by a time value whose syntax and semantics depend on the metric specifier. The following formats are allowed:

SMPTE Timestamp
SMPTE time codes [SMPTE] may be used for frame-level access accuracy. The metric specifier may have the following values:
smpte
smpte-30-drop
These values indicate the use of the "SMPTE 30 drop" format (approximately 29.97 frames per second), as defined in the SMPTE specification (also referred to as "NTSC drop frame"). The "frames" field in the time value may assume the values 0 through 29. The difference between 30 and 29.97 frames per second is handled by dropping the first two frame indices (values 00 and 01) of every minute, except every tenth minute.
smpte-25
The "frames" field in the time specification may assume the values 0 through 24. This corresponds to the PAL standard as noted in [SMPTE].

The time value has the format hours:minutes:seconds:frames.subframes. If the subframe value is zero, it may be omitted. Subframes are measured in one-hundredths of a frame.
Examples:
clipBegin="smpte=10:12:33"
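The frame counting described for the SMPTE metrics can be sketched in Python. This is an illustration only, not part of the module definition: the function name smpte_to_frames is hypothetical, plain "smpte" is assumed to be counted like "smpte-30-drop" (as the text above groups them), the conventional drop-frame arithmetic is assumed, and subframes are ignored.

```python
import re

# Frame indices per second for each metric.
# Assumption: "smpte" is counted like "smpte-30-drop".
NOMINAL_FPS = {"smpte": 30, "smpte-30-drop": 30, "smpte-25": 25}

# hours:minutes:seconds[:frames[.subframes]]
SMPTE_RE = re.compile(r"^(\d+):(\d\d):(\d\d)(?::(\d\d)(?:\.(\d\d))?)?$")


def smpte_to_frames(metric, value):
    """Return the frame offset from the media start for an SMPTE time value."""
    match = SMPTE_RE.match(value)
    if metric not in NOMINAL_FPS or match is None:
        raise ValueError(f"not a valid SMPTE clip value: {metric}={value}")
    hours, minutes, seconds = (int(match.group(i)) for i in (1, 2, 3))
    frames = int(match.group(4) or 0)  # subframes (group 5) are ignored
    fps = NOMINAL_FPS[metric]
    if minutes > 59 or seconds > 59 or frames >= fps:
        raise ValueError(f"field out of range in {value}")
    count = (hours * 3600 + minutes * 60 + seconds) * fps + frames
    if fps == 30:
        # Drop-frame: indices 00 and 01 are skipped at the start of every
        # minute, except every tenth minute.
        if seconds == 0 and frames < 2 and minutes % 10 != 0:
            raise ValueError(f"{value} names a dropped frame index")
        total_minutes = hours * 60 + minutes
        count -= 2 * (total_minutes - total_minutes // 10)
    return count
```

Under these assumptions, ten minutes of "smpte-30-drop" material spans 17982 frames rather than 18000, which is how the 29.97 frames per second rate is approximated.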

This section is informative.

The subframe notation added in SMIL 2.1 introduced an inconsistency with SMIL 1.0. As of this draft, SMIL 3.0 has deprecated the subframe notation.

Normal Play Time
Normal Play Time expresses time in terms of SMIL clock values. The metric specifier is "npt", and the syntax of the time value is identical to the syntax of SMIL clock values.
Examples:
clipBegin="npt=123.45s"
clipBegin="npt=12:05:35.3"
Marker
Not defined in this module. See clipBegin Media Marker attribute extension in the MediaClipMarkers module.

If no metric specifier is given, then a default of "npt=" is presumed.
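The metric default and the "npt" value syntax can be illustrated with a short Python sketch. The helper names split_clip_value and npt_to_seconds are hypothetical, and the clock value handling assumes the usual SMIL clock value forms (full clock, partial clock, and timecount with an optional h, min, s, or ms unit).

```python
import re

KNOWN_METRICS = ("npt", "smpte", "smpte-30-drop", "smpte-25")

# [hours:]minutes:seconds[.fraction]
CLOCK_RE = re.compile(r"^(?:(\d+):)?(\d\d):(\d\d(?:\.\d+)?)$")
# number[.fraction][unit]; seconds are assumed when no unit is given
TIMECOUNT_RE = re.compile(r"^(\d+(?:\.\d+)?)(h|min|s|ms)?$")
UNIT_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 0.001}


def split_clip_value(text):
    """Split a clip value into (metric, time value), defaulting to npt."""
    metric, separator, value = text.strip().partition("=")
    if not separator:
        return "npt", metric  # no metric specifier: "npt=" is presumed
    if metric not in KNOWN_METRICS:
        raise ValueError(f"unknown metric specifier: {metric}")
    return metric, value


def npt_to_seconds(value):
    """Convert an npt time value (a clock value) to seconds."""
    match = CLOCK_RE.match(value)
    if match:
        hours = int(match.group(1) or 0)
        minutes = int(match.group(2))
        seconds = float(match.group(3))
        if minutes > 59 or seconds >= 60:
            raise ValueError(f"field out of range in {value}")
        return hours * 3600 + minutes * 60 + seconds
    match = TIMECOUNT_RE.match(value)
    if match:
        return float(match.group(1)) * UNIT_SECONDS[match.group(2) or "s"]
    raise ValueError(f"not a clock value: {value}")
```

With this sketch, clipBegin="10s" and clipBegin="npt=10s" both yield the metric "npt" and an offset of 10 seconds.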

When used in conjunction with the timing attributes from the SMIL Timing Module, this attribute is applied before any SMIL Timing Module attributes.

clipBegin may also be expressed as clip-begin for compatibility with SMIL 1.0. Software supporting the SMIL 2.1 Language Profile must be able to handle both clipBegin and clip-begin, whereas software supporting only the SMIL MediaClipping module only needs to support clipBegin. If an element contains both a clipBegin and a clip-begin attribute, then clipBegin takes precedence over clip-begin.

Example:

This section is informative.

<audio src="radio.wav" clip-begin="5s" clipBegin="10s" />

The clip begins at second 10 of the audio, and not at second 5, since the clip-begin attribute is ignored. A strict SMIL 1.0 implementation will start the clip at second 5 of the audio, since the clipBegin attribute will not be recognized by that implementation. See Changes to SMIL 1.0 Media Object Attributes for more discussion on this topic.

clipEnd (clip-end)
The clipEnd attribute specifies the end of a sub-clip of a continuous media object as an offset from the start of the media object. This offset is measured in normal media playback time from the beginning of the media. It uses the same attribute value syntax as the clipBegin attribute.
If the value of the clipEnd attribute exceeds the duration of the media object, the value is ignored, and the clip end is set equal to the effective end of the media object.

clipEnd may also be expressed as clip-end for compatibility with SMIL 1.0. Software supporting the SMIL 2.1 Language Profile must be able to handle both clipEnd and clip-end, whereas software supporting only the SMIL media object module only needs to support clipEnd. If an element contains both a clipEnd and a clip-end attribute, then clipEnd takes precedence over clip-end.

When used in conjunction with the timing attributes from the SMIL Timing Module, this attribute is applied before any SMIL Timing Module attributes.

See Changes to SMIL 1.0 Media Object Attributes for more discussion on this topic.
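The precedence and clamping rules for the clipping attributes can be summarized in a small Python sketch. It is illustrative only: the function name effective_clip is hypothetical, and the attribute values are assumed to have been converted to seconds already.

```python
def effective_clip(attributes, media_duration):
    """Return the (begin, end) of the sub-clip in seconds.

    attributes maps attribute names to offsets in seconds (assumed to be
    parsed already); media_duration is the media duration in seconds.
    """
    def pick(new_name, old_name, default):
        # The camelCase attribute takes precedence over the hyphenated one.
        if new_name in attributes:
            return attributes[new_name]
        return attributes.get(old_name, default)

    begin = pick("clipBegin", "clip-begin", 0.0)
    end = pick("clipEnd", "clip-end", media_duration)
    if end > media_duration:
        # A clipEnd beyond the media duration is ignored.
        end = media_duration
    return begin, end
```

For the audio example above, a 60 second file with clip-begin="5s" and clipBegin="10s" gives a sub-clip running from second 10 to second 60.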

4.9 SMIL MediaClipMarkers Module

This section is normative.

This section defines the attribute extensions that make up the SMIL MediaClipMarkers Module definition. Languages implementing elements and attributes found in the MediaClipMarkers module must implement all elements and attributes defined below, as well as BasicMedia and MediaClipping.

4.9.1 MediaClipMarkers Attribute Extensions

clipBegin Media Marker attribute extension
Used to define a clip using named time points in a media object, rather than using clock values or SMPTE values. The metric specifier is "marker", and the marker value is an IRI (see [IRI]). The IRI is relative to the src attribute, rather than to the document root or the XML base of the SMIL document.

Clip-value-MediaClipMarkers ::= Clip-value-MediaClipping |
                      "marker" "=" URI-reference
   /* "URI-reference" is defined in  [URI]  */

Example: Assume that a recorded radio transmission consists of a sequence of songs, which are separated by announcements by a disk jockey. The audio format supports marked time points, and the beginning of each song or announcement with number X is marked as songX or djX respectively. To extract the first song using the "marker" metric, the following audio media element may be used:

<audio clipBegin="marker=#song1" clipEnd="marker=#dj1" />
clipEnd Media Marker attribute extension
clipEnd media markers use the same attribute value syntax as the clipBegin media marker attribute extension. For the complete description, see clipBegin media marker extension.
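Because a marker reference is resolved against the src attribute rather than the document base, a user agent might resolve it as sketched below in Python. The names resolve_marker and marker_time, the example.com address, and the marker table are hypothetical; how a media format actually exposes its named time points is outside the scope of this module.

```python
from urllib.parse import urldefrag, urljoin


def resolve_marker(src, clip_value):
    """Resolve a "marker=..." clip value against the media src."""
    metric, separator, reference = clip_value.partition("=")
    if metric != "marker" or not separator:
        raise ValueError(f"not a marker clip value: {clip_value}")
    # The reference is relative to src, not to the SMIL document.
    return urljoin(src, reference)


def marker_time(resolved, marker_table):
    """Look up the fragment name in a table of named time points.

    marker_table is a hypothetical mapping from marker name to seconds,
    standing in for whatever the media format provides.
    """
    return marker_table[urldefrag(resolved).fragment]


# Hypothetical media location and marker table for the radio example.
src = "http://example.com/audio/radio.wav"
markers = {"song1": 12.0, "dj1": 215.5}
begin = marker_time(resolve_marker(src, "marker=#song1"), markers)
end = marker_time(resolve_marker(src, "marker=#dj1"), markers)
```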

4.10 SMIL BrushMedia Module

This section is normative.

This section defines the elements and attributes that make up the SMIL BrushMedia Module definition. Languages implementing elements and attributes found in the BrushMedia module must implement all elements and attributes defined below.

4.10.1 The brush element

The brush element is a lightweight media object element which allows an author to paint a solid color in place of a media object. Attributes associated with media objects may also be applied to the brush element. (A specific profile will determine the attribute set applied to this element.)

Attribute definitions
color
The use and definition of this attribute are identical to the "background-color" property in the CSS2 specification.

4.10.2 Integration Requirements

Profiles including the BrushMedia module must provide semantics for using a color attribute value of inherit on the brush element. Because inherit doesn't make sense in all contexts, the value of inherit is prohibited on the color attribute of the brush element for profiles that do not otherwise define these semantics.

4.11 SMIL MediaAccessibility Module

This section is normative.

This section defines the elements and attributes that make up the SMIL MediaAccessibility Module definition. Languages implementing elements and attributes found in the MediaAccessibility module must implement all elements and attributes defined below, as well as MediaDescription.

4.11.1 MediaAccessibility Attributes

Attribute definitions
alt
For user agents that cannot display a particular media object, this attribute specifies alternate text. alt may be displayed in addition to the media, or instead of media when the user has configured the user agent to not display the given media type.

It is strongly recommended that all media object elements have an "alt" attribute with a brief, meaningful description. Authoring tools should ensure that no element may be introduced into a SMIL document without this attribute.

The value of this attribute is a CDATA text string.

longdesc
This attribute specifies an IRI link ([IRI]) to a long description of a media object. This description should supplement the short description provided using the alt attribute or the abstract attribute. When the media object has associated hyperlinked content, this attribute should provide information about the hyperlinked content.

readIndex
This attribute specifies the position of the current element in the order in which longdesc, title and alt text are read aloud by assistive devices (such as screen readers) for the current document. User agents should ignore leading zeros. The default value is 0.

Elements that contain alt, title or longdesc attributes are read by the assistive technology according to the following rules:

  • Those elements that assign a positive value to the readIndex attribute are read out first. Navigation proceeds from the element with the lowest readIndex value to the element with the highest value. Values need not be sequential, nor must they begin with any particular value. Elements that have identical readIndex values should be read out in the order they appear in the character stream of the document.
  • Those elements that assign it a value of "0" are read out in the order they appear in the character stream of the document.
  • Elements in a switch element that have test attributes which evaluate to "false" are not read out.

Example

This section is informative.

<par>
  <video xml:id="carvideo" src="car.rm" region="videoregion" title="Car video"
         alt="Illustration of relativistic time dilation and length 
              contraction." 
         longdesc="carvideodesc.html" readIndex="3"/>
  <audio xml:id="caraudio" src="caraudio.rm" region="videoregion" 
         title="Car presentation voiceover" begin="bar.begin"/>
  <animation xml:id="cardiagram" src="car.svg" region="animregion" 
         title="Diagram of the car" readIndex="2"/>
  <img xml:id="scvad" src="scv.png" region="videoregion" 
         title="Advertisement for Sugar Coated Vegetables"
         readIndex="1"/>
</par>

In this example, an assistive device that is presenting titles should present the "scvad" element title first (having the lowest readIndex value of "1"), followed by the "cardiagram" title, followed by the "carvideo" element title, and finally present the "caraudio" element title (having an implicit readIndex value of "0").
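The ordering rules can be checked against this example with a short Python sketch. The function name reading_order is hypothetical, the elements are represented simply as (identifier, readIndex text) pairs in document order, and switch processing is assumed to have happened already.

```python
def reading_order(elements):
    """Order elements for reading by an assistive device.

    elements is a list of (xml_id, read_index_text) pairs in document
    order; read_index_text is None when the attribute is absent.
    """
    # int() ignores leading zeros; a missing attribute defaults to 0.
    indexed = [(int(text) if text is not None else 0, xml_id)
               for xml_id, text in elements]
    positive = [item for item in indexed if item[0] > 0]
    # Assumption: anything that is not positive is handled like "0".
    rest = [item for item in indexed if item[0] <= 0]
    # list.sort is stable, so equal values keep their document order.
    positive.sort(key=lambda item: item[0])
    return [xml_id for _, xml_id in positive + rest]


example = [("carvideo", "3"), ("caraudio", None),
           ("cardiagram", "2"), ("scvad", "1")]
order = reading_order(example)
```

For the example above, order is scvad, cardiagram, carvideo, caraudio, matching the sequence described in the text.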

Note that not all examples in this specification use all media accessibility attributes because the purpose of the sample code is to illustrate specific language features.

4.12 SMIL MediaDescription Module

This section is normative.

This section defines the elements and attributes that make up the SMIL MediaDescription Module definition. Languages implementing elements and attributes found in the MediaDescription module must implement all elements and attributes defined below.

4.12.1 MediaDescription Attributes

Attribute definitions
abstract
A brief description of the content contained in the element. Unlike alt, this attribute is generally not displayed as alternate content to the media object. It is typically used as a description when table of contents information is generated from a SMIL presentation, and typically contains more information than would be advisable to put in an alt attribute.

This attribute is deprecated in favor of using appropriate SMIL metadata markup in RDF. For example, this attribute maps well to the "description" attribute as defined by the Dublin Core Metadata Initiative [DC] .

author
The name of the author of the content contained in the element.

The value of this attribute is a CDATA text string.

copyright
The copyright notice of the content contained in the element.

The value of this attribute is a CDATA text string.

title
The title attribute as defined in the SMIL Structure module. It is strongly recommended that all media object elements have a title attribute with a brief, meaningful description. Authoring tools should ensure that no element may be introduced into a SMIL document without this attribute.
xml:lang
Used to identify the natural or formal language for the element. For a complete description, see section 2.12 Language Identification of [XML11].

xml:lang differs from the systemLanguage test attribute in one important respect. xml:lang provides information about the content's language independent of what implementations do with the information, whereas systemLanguage is a test attribute with specific associated behavior (see systemLanguage in SMIL Content Control Module for details).

This section is informative.

SMIL 3.0 also supports the use of the element within the MetaInformation Module to supply additional or alternative forms of metainformation for any media object.

4.13 MediaPanZoom Module

This section is normative.

4.13.1 Overview

This section is informative.

The SMIL MediaPanZoom module integrates the functionality of the SVG viewBox attribute and adapts it for use within the SMIL media framework. The SMIL panZoom attribute allows a SMIL author to define a two-dimensional extent over the visible surface of a media object and to subsequently project the contents within the panZoom area into a SMIL presentation.

Most of SMIL's layout elements and attributes provide the ability to define and manage a two-dimensional rendering space. This space is defined relative to a root-layout (or topLayout) specification. All of the coordinate and size specifications are in terms of the coordinate space defined for the layout root. In contrast, the panZoom attribute allows users to define an area in terms of the coordinate space used by the media object that is associated with the panZoom area. The panZoom area may be smaller, equal to, or larger than the related media object.

The following illustration shows three views of a 300x200 pixel image. In the left view, a panZoom area is shown that is the same size as the media object; in the middle view, a panZoom area is defined that covers the middle part of the image only; in the right view, a panZoom area is illustrated that is positioned (in both dimensions) partially outside the media object. Note that while this illustration shows the panZoom area projected onto an image, similar illustrations could be defined for videos or text objects, or any other object that may be mapped to a particular media bounding box.

Picture showing a base image and three panZoom area examples

Once a portion of a media object's visible area is defined with a panZoom area, the portion within the panZoom area is processed further as if it defined the full native view of the media object. The content within the panZoom area is projected into a region in a manner that is dependent on the region element associated with that object, including any scaling dictated by the fit attribute or (if appropriate) sub-region positioning and alignment directives.

If the region and the panZoom area have the same aspect ratio, then the panZoom area will, by default, fill the entire region. If the effective pixel dimensions of the region are larger than those of the panZoom area, the effect will be an enlargement of the media content. If the effective pixel dimensions of the region are smaller than those of the panZoom area, the effect will be a reduction in size of the media object. Other effects may be obtained by manipulating the fit attribute of the region.

If supported by the profile implementing this module, a dynamic pan-and-zoom effect may be obtained by applying standard SMIL animation primitives to the dimensions of the panZoom area. A pan effect may be obtained by varying the X and Y positioning values, and a zoom effect may be obtained by changing the size dimensions of the panZoom area. Examples of these effects are given later in this module description. Given the nature of independently animating collections of attribute values, care should be taken when specifying animation behavior.

If a panZoom area extends past the viewable extents of a media object (such as in the rightmost illustration, above), then the effective contents of these extended areas will be transparent.
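The split between media content and transparent padding can be computed by intersecting the panZoom area with the media bounds. The Python sketch below is illustrative; the function name visible_part is hypothetical, and all values are assumed to be pixels in media coordinates.

```python
def visible_part(pan_zoom, media_size):
    """Return the part of a panZoom area that is covered by media.

    pan_zoom is (left, top, width, height) and media_size is
    (width, height), all in media pixels. The result is a rectangle in
    the same form, or None when the area lies entirely outside the media.
    Everything in the panZoom area outside this rectangle is transparent.
    """
    left, top, width, height = pan_zoom
    media_width, media_height = media_size
    x0, y0 = max(left, 0), max(top, 0)
    x1 = min(left + width, media_width)
    y1 = min(top + height, media_height)
    if x1 <= x0 or y1 <= y0:
        return None
    return (x0, y0, x1 - x0, y1 - y0)
```

For a 300x200 image and a panZoom area of 240,120,85,110, only the 60x80 pixel rectangle at 240,120 contains image content; the remainder of the area is transparent.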

4.13.2 Elements and Attributes for the MediaPanZoom Module

This module does not define any new elements. It provides extensions to the ref element (and its synonyms), and to the region element.

The ref Element

The panZoom attribute is added to media object references.

Element attributes
panZoom
This attribute specifies a rectangular area in media coordinates that defines the portion of a media object that is to be used within a SMIL presentation. The panZoom attribute defines an ordered list of four values, separated by commas:
left
A value (using CSS2 pixel or percentage values) that defines the minimum X coordinate of a rectangle in media space that serves as the X origin of the panZoom area. If pixel notation is used, the 'px' suffix may be omitted. An effective value of '0px' represents the left edge of the media object.
top
A value (using CSS2 pixel or percentage values) that defines the minimum Y coordinate of a rectangle in media space that serves as the Y origin of the panZoom area. If pixel notation is used, the 'px' suffix may be omitted. A value of '0' represents the top edge of the media object.
width
A non-negative length value (using CSS2 pixel or non-negative percentage values) that defines the horizontal dimension of the panZoom area. If pixel notation is used, the 'px' suffix may be omitted. A negative value is an error. The default value of width is set to the intrinsic width of the associated media object.
height
A non-negative length value (using CSS2 pixel or non-negative percentage values) that defines the vertical dimension of the panZoom area. If pixel notation is used, the 'px' suffix may be omitted. A negative value is an error. The default value of height is set to the intrinsic height of the associated media object.
The default panZoom area behavior is to select the entire visual space of the media object; this is equivalent to panZoom="0, 0, 100%, 100%".

The panZoom area is processed on the media object before any other SMIL layout processing occurs. The actual visual rendering of the content resulting from the processed panZoom area will be determined by, among other factors: the size of the target region, the application of sub-region positioning in that region (if supported by the profile), the value of the fit attribute on the region, and the effect of SMIL alignment attributes (if supported by the profile).
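As an illustration of the value syntax, the following Python sketch converts a panZoom attribute value into a pixel rectangle in media coordinates. The function name parse_pan_zoom is hypothetical, and the sketch assumes that percentages for left and width are taken relative to the intrinsic media width, and those for top and height relative to the intrinsic media height.

```python
def parse_pan_zoom(text, media_width, media_height):
    """Return (left, top, width, height) in media pixels."""
    parts = [part.strip() for part in text.split(",")]
    if len(parts) != 4:
        raise ValueError(f"panZoom needs four values: {text!r}")

    def to_pixels(token, reference):
        if token.endswith("%"):
            # Assumption: percentages refer to the intrinsic media size.
            return float(token[:-1]) * reference / 100.0
        if token.endswith("px"):
            token = token[:-2]  # the 'px' suffix is optional
        return float(token)

    left = to_pixels(parts[0], media_width)
    top = to_pixels(parts[1], media_height)
    width = to_pixels(parts[2], media_width)
    height = to_pixels(parts[3], media_height)
    if width < 0 or height < 0:
        # Negative left/top are allowed; negative sizes are an error.
        raise ValueError("negative width or height is an error")
    return (left, top, width, height)
```

Under these assumptions, the default value "0, 0, 100%, 100%" applied to a 300x200 image yields the rectangle (0, 0, 300, 200), the entire visual space of the media object.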

This section is informative.

If the profile integrating the panZoom attribute allows each of its values to be animated, care should be taken to choose an animation calculation mode that will yield predictable results (such as using a linear mode). The animation of mixed percentage/pixel values for height and width is not recommended.

Note that the specification of negative values for left and top is not an error; this allows placing (a portion of) the panZoom area outside of the media.

Element content

The SMIL MediaPanZoom module does not extend the content model for the ref element integrating these attributes.

The region Element

The panZoom attribute is added to region definitions.

Element attributes
panZoom
This attribute is identical in definition to the panZoom attribute defined for the ref element in this section, with the exception that it defines a default panZoom area that is applied to all media rendered in the associated region. All other aspects of panZoom area processing are the same as with the ref element, except that the values defined for the panZoom area on a region may be overridden by a panZoom area specification on the ref element.
Element content

The SMIL MediaPanZoom module does not extend the content model for the region element integrating these attributes.

Attribute Examples

This section is informative.

Assume the following SMIL example:

<smil ...>
  <head>
  ...
    <layout>
      <root-layout height="200" width="300" backgroundColor="red" />
      <region xml:id="I" top="0" left="0" height="200" width="300"  backgroundColor="blue" />
    </layout>
  </head>
  <body>
    <seq> 
      <ref xml:id="R1" src="table.jpg" panZoom="0,0,300,200" dur="5s" region="I" />
      <ref xml:id="R2" src="table.jpg" panZoom="80,50,160,125" dur="5s" region="I" fit="meet"/>
      <ref xml:id="R3" src="table.jpg" panZoom="80,50,160,125" dur="5s" region="I" fit="meetBest"/>
      <ref xml:id="R4" src="table.jpg" panZoom="240,120,85,110" dur="5s" region="I" fit="meet"/>
    </seq>
  </body>
</smil>

In this example, a single region is defined that is used to display four instances of the same image. Each media reference within the sequence contains a different panZoom area definition, resulting in the following behavior:

  1. The media reference R1 defines a panZoom area that encompasses the entire media object space; the full image will be shown in region I, as is shown in the following image:
    A panZoom area projection that is the same size as the target region.
    Note that the origin of the image is aligned with the origin of the media object, at the top-left of the region.
  2. The media reference R2 defines a panZoom area that encompasses the center portion of the media object space. The projection of the media into region I will result in a zoom into the source image, as is shown in the following image:
    A panZoom area projection that is smaller than the target region, resulting in a zoom effect.

    Note that the origin of the sub-image defined by the panZoom area is placed at the top-left origin of the region. Note also that the value of the fit attribute determines that the image is scaled (while maintaining the aspect ratio), resulting in the zoom effect.

  3. The media reference R3 defines a panZoom area that is the same as in reference R2; the difference in this example is that the value of the fit attribute does not permit enlargement of the source image into the region. As a result, the image is placed at top-left in an unscaled rendering:
    A panZoom area projection that is smaller than the target region, but with a fit=
  4. The media reference R4 defines a panZoom area that extends beyond the boundaries of the media object. When it is projected into the region I with a fit value that scales the image with preserved aspect ratio, the entire extent of the panZoom area is scaled: the areas that extend beyond the image content are rendered as (scaled) transparent content:
    A panZoom area projection that extends beyond the right/bottom edge of the image -- the extended part of the box will be transparent.
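The rendered sizes in these four cases can be approximated with a Python sketch of the scaling step. The function name projected_size is hypothetical, and only three fit values are covered: "hidden" is assumed to render unscaled, "meet" to scale uniformly until one dimension fills the region, and "meetBest" to behave like "meet" without ever enlarging.

```python
def projected_size(pan_zoom_size, region_size, fit):
    """Return the rendered (width, height) of a panZoom area in a region."""
    area_width, area_height = pan_zoom_size
    region_width, region_height = region_size
    # Largest uniform scale at which the whole area still fits the region.
    fill = min(region_width / area_width, region_height / area_height)
    if fit == "meet":
        scale = fill
    elif fit == "meetBest":
        scale = min(fill, 1.0)  # never enlarge beyond the native size
    elif fit == "hidden":
        scale = 1.0  # no scaling
    else:
        raise ValueError(f"fit value not covered by this sketch: {fit}")
    return (area_width * scale, area_height * scale)


region = (300, 200)
r1 = projected_size((300, 200), region, "hidden")
r2 = projected_size((160, 125), region, "meet")
r3 = projected_size((160, 125), region, "meetBest")
r4 = projected_size((85, 110), region, "meet")
```

Under these assumptions R2 is enlarged by a factor of 1.6 to about 256x200 pixels, R3 stays at 160x125, and R4 (including its transparent margins) is scaled to the full region height of 200 pixels.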

All of the previous examples illustrate how a panZoom area operates on a media object that contains a media-defined viewable extent. The panZoom attribute may also be applied to visual objects that do not have predefined extents. Consider the following example, in which an unstructured text object is placed in a region:

<smil ...>
  <head>
  ...
    <layout>
      <root-layout height="200" width="300" backgroundColor="red" />
      <region xml:id="T" top="0" left="0" height="50" width="300"  backgroundColor="blue" />
    </layout>
  </head>
  <body>
    <seq> 
      <ref xml:id="R0" src="short_story.txt" panZoom="0,10,50,200" dur="10s" region="T" />
    </seq>
  </body>
</smil>

In this example, a single region is defined that is used to display an undimensioned text object. In SMIL 3.0, the text object would first be rendered to an off-screen bitmap based on the default settings for the media object (font, font size, font color) and then a panZoom area of the defined size would be overlaid on this text representation. This facility is especially useful when combined with SMIL Animation, as discussed in the next example.

The ability to define a panZoom area, when combined with SMIL animation primitives, provides a simple mechanism for doing pan/zoom animations over a visual object. (These pan/zoom animations are often called 'Ken Burns' animations.) The following example illustrates how a pan window may be positioned and moved over an image area:

<smil ...>
  <head>
  ...
    <layout>
      <root-layout height="200" width="300" backgroundColor="red" />
      <region xml:id="B" top="0" left="0" height="50" width="75"  backgroundColor="blue" />
    </layout>
  </head>
  <body>
    <seq> 
      <ref xml:id="R0" src="table_233x150.jpg" panZoom="0,0,50,75" dur="20s" region="B" fit="meet" >
         <animate attributeName="panZoom" 
                     values="25,20,50,75; 45,55,50,75; 140,40,50,75; 35,0,100,150; 0,0,100,150" 
                     dur="20s" />
      </ref>
      ...
    </seq>
  </body>
</smil>

In this example, an image with intrinsic size of 233x150 pixels is rendered into a region of size 50x75. An initial panZoom area is defined that displays a 50x75 portion of that image, positioned in its top-left corner. During the following 20 seconds, the panZoom area is moved across the image according to the behavior of the animate element; the panZoom area changes are scheduled at equal points across the animation timeline (in this case, every 5 seconds). During the final animation, the panZoom area is extended to implement a zoom-out across the entire image. An illustration of the rendering results is shown below:


A panZoom area projection and a set of animations that move the panZoom area across the source image.
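The intermediate panZoom areas produced by such an animation can be estimated with a Python sketch. It assumes a linear calculation mode with the values spaced evenly over the duration and each of the four components interpolated independently; the function name pan_zoom_at is hypothetical, and the actual behavior is defined by the animation modules of the integrating profile.

```python
def pan_zoom_at(values, duration, t):
    """Return the interpolated (left, top, width, height) at time t.

    values is the list of panZoom tuples from the animate element,
    duration is the animation duration in seconds, and t is the time
    in seconds since the animation began.
    """
    if len(values) < 2:
        raise ValueError("at least two values are needed to interpolate")
    t = min(max(t, 0.0), duration)
    segments = len(values) - 1
    position = t / duration * segments
    index = min(int(position), segments - 1)
    fraction = position - index
    start, end = values[index], values[index + 1]
    # Interpolate each component independently.
    return tuple(a + (b - a) * fraction for a, b in zip(start, end))


values = [(25, 20, 50, 75), (45, 55, 50, 75), (140, 40, 50, 75),
          (35, 0, 100, 150), (0, 0, 100, 150)]
halfway_first_move = pan_zoom_at(values, 20.0, 2.5)
```

With the values of the example, the area 2.5 seconds into the animation is (35, 37.5, 50, 75), halfway between the first two listed positions; the size only starts to change during the third interval, which produces the zoom-out.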

4.13.3 MediaPanZoom Module Events

This module does not define any SMIL events.

4.13.4 SMIL MediaPanZoom Implementation and Integration

Implementation Details

The MediaPanZoom module allows individual media object references to override the default values for certain attributes. In all cases, the attributes will apply only to the (sub-)region referenced by the media object. Changes will not propagate to child sub-regions or to parent regions.

Integration Requirements

The functionality in this module builds on top of the functionality in the Media module, which is a required prerequisite for inclusion of the MediaPanZoom module.

Differences with the SVG viewBox Attribute

The functionality in this module builds on the viewBox definition of SVG. Unlike SVG, the SMIL panZoom attribute defines a logical sub-image that contains only content within the panZoom area; SVG uses the viewBox to define a minimum viewing dimension for content, but allows content outside the viewBox to be displayed in the region.

The MediaPanZoom module does not define a preserveAspectRatio attribute, since this functionality is already provided by the SMIL fit and registration/alignment attributes.

4.13.5 Document Type Definition (DTD) for the MediaPanZoom Module

See the full DTD for the SMIL Layout modules.

4.14 Appendices

This section is informative.

4.14.1 Appendix A: Changes to SMIL 1.0 Media Object Attributes

clipBegin, clipEnd, clip-begin, clip-end

With regard to the clipBegin/clip-begin and clipEnd/clip-end attributes, SMIL 3.0 defines the following changes to the syntax defined in SMIL 1.0:

Handling of clipBegin/clipEnd syntax in SMIL 1.0 software

Using attribute names with hyphens such as clip-begin and clip-end is problematic when using a scripting language and the DOM to manipulate these attributes. Therefore, this specification adds the attribute names clipBegin and clipEnd as an equivalent alternative to the SMIL 1.0 clip-begin and clip-end attributes. The attribute names with hyphens are deprecated.

Authors may use two approaches for writing SMIL 3.0 presentations that use the new clipping syntax and functionality ("marker", default metric) defined in this specification, but can still be handled by SMIL 1.0 software. First, authors may use non-hyphenated versions of the new attributes that use the new functionality, and add SMIL 1.0 conformant clipping attributes later in the text.

Example:

<audio src="radio.wav" clipBegin="marker=song1" clipEnd="marker=moderator1" 
       clip-begin="npt=0s" clip-end="npt=3:50" />

SMIL 1.0 players implementing the recommended extensibility rules of SMIL 1.0 [SMIL10] will ignore the clip attributes using the new functionality, since they are not part of SMIL 1.0. SMIL 3.0 players, in contrast, will ignore the clip attributes using SMIL 1.0 syntax, because the SMIL 3.0 syntax takes precedence over the SMIL 1.0 syntax.

The second approach is to use the following steps:

  1. Add a "system-required" test attribute to media object elements using the new functionality. The value of the "system-required" attribute would correspond to a namespace prefix whose namespace IRI ([IRI]) points to a SMIL specification which integrates the new functionality.
  2. Add an alternative version of the media object element that conforms to SMIL 1.0.
  3. Include these two elements in a "switch" element.

Example:

<smil xmlns="http://www.w3.org/ns/SMIL" version="3.0" baseProfile="Language">
...
<switch>
  <audio src="radio.wav" clipBegin="marker=song1" clipEnd="marker=moderator1" 
   system-required="smil2" />
  <audio src="radio.wav" clip-begin="npt=0s" clip-end="npt=3:50" />
</switch>

Additional Accessibility Attributes

readIndex
Allows explicit ordering for controlling assistive technology.

Additional Advanced Media Attributes

mediaRepeat
The mediaRepeat attribute was added to provide better timing control over media with intrinsic repeat behavior (such as animated GIFs).
erase
Provides a way for visual media to remain visible throughout the duration of a presentation by overriding the default erase behavior.
