The video element
ISSUE-9 (video-accessibility) blocks progress to Last Call.
Categories: if the element has a controls attribute: Interactive content.
Content model: if the element has a src attribute: zero or more track elements, then transparent, but with no media element descendants. If the element does not have a src attribute: one or more source elements, then zero or more track elements, then transparent, but with no media element descendants.
Content attributes: src, poster, preload, autoplay, loop, audio, controls, width, height.
DOM interface:
interface HTMLVideoElement : HTMLMediaElement {
  attribute unsigned long width;
  attribute unsigned long height;
  readonly attribute unsigned long videoWidth;
  readonly attribute unsigned long videoHeight;
  attribute DOMString poster;
  [PutForwards=value] readonly attribute DOMSettableTokenList audio;
};
A video element is used for playing videos or movies.
Content may be provided inside the video element. User agents should not show this content to the user; it is intended for older Web browsers which do not support video, so that legacy video plugins can be tried, or to show text to the users of these older browsers informing them of how to access the video contents.
In particular, this content is not intended to address accessibility concerns. To make video content accessible to the blind, deaf, and those with other physical or cognitive disabilities, authors are expected to provide alternative media streams and/or to embed accessibility aids (such as caption or subtitle tracks, audio description tracks, or sign-language overlays) into their media streams.
The video element is a media element whose media data is ostensibly video data, possibly with associated audio data.
The src, preload, autoplay, loop, and controls attributes are the attributes common to all media elements. The audio attribute controls the audio channel.
The poster attribute gives the address of an image file that the user agent can show while no video data is available. The attribute, if present, must contain a valid non-empty URL potentially surrounded by spaces.
If the specified resource is to be used, then, when the element is created or when the poster attribute is set, changed, or removed, the user agent must run the following steps to determine the element's poster frame:
If there is an existing instance of this algorithm running for this video element, abort that instance of this algorithm without changing the poster frame.
If the poster attribute's value is the empty string, then there is no poster frame; abort these steps.
Resolve the poster attribute's value relative to the element. If this fails, then there is no poster frame; abort these steps.
Fetch the resulting absolute URL, from the element's Document's origin. This must delay the load event of the element's document.
If an image is thus obtained, the poster frame is that image. Otherwise, there is no poster frame.
The image given by the poster attribute, the poster frame, is intended to be a representative frame of the video (typically one of the first non-blank frames) that gives the user an idea of what the video is like.
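The poster frame steps above can be sketched as a small function. This is an illustrative model, not user-agent code: the names determinePosterFrame, baseURL, and fetchImage are hypothetical, the standard URL constructor stands in for the resolve step, and the steps that abort a running instance and delay the load event are not modelled.

```javascript
// Hypothetical model of the poster frame steps; not a real user-agent API.
// fetchImage(url) is an assumed callback that returns an image object or null.
function determinePosterFrame(posterValue, baseURL, fetchImage) {
  // An empty attribute value means there is no poster frame.
  if (posterValue === "") return null;

  // Resolve the value relative to the element; failure means no poster frame.
  let absoluteURL;
  try {
    // The attribute value may be surrounded by spaces, so trim it first.
    absoluteURL = new URL(posterValue.trim(), baseURL).href;
  } catch (e) {
    return null;
  }

  // Fetch the resource; if no image is obtained, there is no poster frame.
  const image = fetchImage(absoluteURL);
  return image || null;
}
```

A caller would pass the attribute value, the document's base URL, and whatever image loader it uses; a null result corresponds to "there is no poster frame".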
When no video data is available (the element's readyState attribute is either HAVE_NOTHING, or HAVE_METADATA but no video data has yet been obtained at all), the video element represents either the poster frame, or nothing.
When a video element is paused and the current playback position is the first frame of video, the element represents either the frame of video corresponding to the current playback position or the poster frame, at the discretion of the user agent.
Notwithstanding the above, the poster frame should be preferred over nothing, but the poster frame should not be shown again after a frame of video has been shown.
When a video element is paused at any other position, the element represents the frame of video corresponding to the current playback position, or, if that is not yet available (e.g. because the video is seeking or buffering), the last frame of the video to have been rendered.
When a video element is potentially playing, it represents the frame of video at the continuously increasing "current" position. When the current playback position changes such that the last frame rendered is no longer the frame corresponding to the current playback position in the video, the new frame must be rendered. Similarly, any audio associated with the video must, if played, be played synchronized with the current playback position, at the specified volume with the specified mute state.
When a video element is neither potentially playing nor paused (e.g. when seeking or stalled), the element represents the last frame of the video to have been rendered.
Which frame in a video stream corresponds to a particular playback position is defined by the video stream's format.
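The rules above for what a video element represents can be summarised as a decision function. This is only a sketch: the state object and its field names are hypothetical, and where the text leaves the choice to the user agent (paused at the first frame) the function picks one permitted option, preferring the poster frame until a frame of video has been shown.

```javascript
// Hypothetical model: returns a label for what the video element represents.
// The state object and its field names are assumptions made for illustration.
function representedContent(state) {
  const { hasVideoData, paused, potentiallyPlaying, atFirstFrame,
          frameShown, currentFrameAvailable, hasPoster } = state;

  if (!hasVideoData) {
    // No video data yet: the poster frame is preferred over nothing.
    return hasPoster ? "poster frame" : "nothing";
  }
  if (potentiallyPlaying) return "current frame";
  if (paused && atFirstFrame) {
    // User agent's discretion; the poster is not shown again after a frame was shown.
    return hasPoster && !frameShown ? "poster frame" : "current frame";
  }
  if (paused) {
    // Paused elsewhere: current frame, or the last rendered one while it is unavailable.
    return currentFrameAvailable ? "current frame" : "last rendered frame";
  }
  // Neither potentially playing nor paused (e.g. seeking or stalled).
  return "last rendered frame";
}
```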
The video element also represents any text track cues whose text track cue active flag is set and whose text track is in the showing or showing by default modes.
In addition to the above, the user agent may provide messages to the user (such as "buffering", "no video loaded", "error", or more detailed information) by overlaying text or icons on the video or other areas of the element's playback area, or in another appropriate manner.
User agents that cannot render the video may instead make the element represent a link to an external video playback utility or to the video data itself.
videoWidth
videoHeight
These attributes return the intrinsic dimensions of the video, or zero if the dimensions are not known.
The intrinsic width and intrinsic height of the media resource are the dimensions of the resource in CSS pixels after taking into account the resource's dimensions, aspect ratio, clean aperture, resolution, and so forth, as defined for the format used by the resource. If an anamorphic format does not define how to apply the aspect ratio to the video data's dimensions to obtain the "correct" dimensions, then the user agent must apply the ratio by increasing one dimension and leaving the other unchanged.
The videoWidth IDL attribute must return the intrinsic width of the video in CSS pixels. The videoHeight IDL attribute must return the intrinsic height of the video in CSS pixels. If the element's readyState attribute is HAVE_NOTHING, then the attributes must return 0.
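The rule for anamorphic formats (apply the ratio by increasing one dimension and leaving the other unchanged) can be illustrated with a little arithmetic. The function name, the pixelAspectRatio parameter (pixel width divided by pixel height), and the rounding to whole CSS pixels are assumptions made for this sketch; the text above does not prescribe them.

```javascript
// Hypothetical helper: derive intrinsic dimensions from coded dimensions.
// pixelAspectRatio = pixel width / pixel height (an assumed input).
function intrinsicSize(codedWidth, codedHeight, pixelAspectRatio) {
  if (pixelAspectRatio >= 1) {
    // Wide pixels: increase the width, leave the height unchanged.
    return { width: Math.round(codedWidth * pixelAspectRatio), height: codedHeight };
  }
  // Tall pixels: increase the height, leave the width unchanged.
  return { width: codedWidth, height: Math.round(codedHeight / pixelAspectRatio) };
}
```

For instance, a 1440 by 1080 frame with a 4:3 pixel aspect ratio yields intrinsic dimensions of 1920 by 1080: only the width grows.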
The video element supports dimension attributes.
Video content should be rendered inside the element's playback area such that the video content is shown centered in the playback area at the largest possible size that fits completely within it, with the video content's aspect ratio being preserved. Thus, if the aspect ratio of the playback area does not match the aspect ratio of the video, the video will be shown letterboxed or pillarboxed. Areas of the element's playback area that do not contain the video represent nothing.
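The centering and scaling rule above amounts to a "contain" fit. A minimal sketch, assuming a hypothetical fitVideo helper that works in CSS pixels and returns the rendered size plus the offsets of the video inside the playback area:

```javascript
// Hypothetical helper: largest centered rectangle with the video's aspect
// ratio that fits completely inside the playback area.
function fitVideo(areaWidth, areaHeight, videoWidth, videoHeight) {
  // The smaller scale factor guarantees that both dimensions fit.
  const scale = Math.min(areaWidth / videoWidth, areaHeight / videoHeight);
  const width = videoWidth * scale;
  const height = videoHeight * scale;
  return {
    width,
    height,
    x: (areaWidth - width) / 2,   // non-zero when pillarboxed
    y: (areaHeight - height) / 2, // non-zero when letterboxed
  };
}
```

A 320 by 180 video in a 640 by 480 area is scaled to 640 by 360 and letterboxed with 60 pixels above and below; the uncovered areas represent nothing.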
The intrinsic width of a video element's playback area is the intrinsic width of the video resource, if that is available; otherwise it is the intrinsic width of the poster frame, if that is available; otherwise it is 300 CSS pixels.
The intrinsic height of a video element's playback area is the intrinsic height of the video resource, if that is available; otherwise it is the intrinsic height of the poster frame, if that is available; otherwise it is 150 CSS pixels.
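The fallback chain for the playback area's intrinsic size can be written down directly. In this sketch the function name and the shape of the two arguments (objects with optional width and height, or null when unavailable) are assumptions for illustration.

```javascript
// Hypothetical helper: video resource first, then poster frame, then 300x150.
function playbackAreaIntrinsicSize(videoSize, posterSize) {
  const pick = (dimension, fallback) => {
    if (videoSize && videoSize[dimension] != null) return videoSize[dimension];
    if (posterSize && posterSize[dimension] != null) return posterSize[dimension];
    return fallback;
  };
  return { width: pick("width", 300), height: pick("height", 150) };
}
```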
User agents should provide controls to enable or disable the display of closed captions, audio description tracks, and other additional data associated with the video stream, though such features should, again, not interfere with the page's normal rendering.
User agents may allow users to view the video content in manners more suitable to the user (e.g. full-screen or in an independent resizable window). As for the other user interface features, controls to enable this should not interfere with the page's normal rendering unless the user agent is exposing a user interface. In such an independent context, however, user agents may make full user interfaces visible, with, e.g., play, pause, seeking, and volume controls, even if the controls attribute is absent.
User agents may allow video playback to affect system features that could interfere with the user's experience; for example, user agents could disable screensavers while video playback is in progress.
The poster IDL attribute must reflect the poster content attribute.
The audio IDL attribute must reflect the audio content attribute.
This example shows how to detect when a video has failed to play correctly:
<script>
 function failed(e) {
   // video playback failed - show a message saying why
   switch (e.target.error.code) {
     case e.target.error.MEDIA_ERR_ABORTED:
       alert('You aborted the video playback.');
       break;
     case e.target.error.MEDIA_ERR_NETWORK:
       alert('A network error caused the video download to fail part-way.');
       break;
     case e.target.error.MEDIA_ERR_DECODE:
       alert('The video playback was aborted due to a corruption problem or because the video used features your browser did not support.');
       break;
     case e.target.error.MEDIA_ERR_SRC_NOT_SUPPORTED:
       alert('The video could not be loaded, either because the server or network failed or because the format is not supported.');
       break;
     default:
       alert('An unknown error occurred.');
       break;
   }
 }
</script>
<p><video src="tgif.vid" autoplay controls onerror="failed(event)"></video></p>
<p><a href="tgif.vid">Download the video file</a>.</p>
The audio element
Categories: if the element has a controls attribute: Interactive content.
Content model: if the element has a src attribute: zero or more track elements, then transparent, but with no media element descendants. If the element does not have a src attribute: one or more source elements, then zero or more track elements, then transparent, but with no media element descendants.
Content attributes: src, preload, autoplay, loop, controls.
DOM interface:
[NamedConstructor=Audio(), NamedConstructor=Audio(in DOMString src)] interface HTMLAudioElement : HTMLMediaElement {};
An audio element represents a sound or audio stream.
Content may be provided inside the audio element. User agents should not show this content to the user; it is intended for older Web browsers which do not support audio, so that legacy audio plugins can be tried, or to show text to the users of these older browsers informing them of how to access the audio contents.
In particular, this content is not intended to address accessibility concerns. To make audio content accessible to the deaf or to those with other physical or cognitive disabilities, authors are expected to provide alternative media streams and/or to embed accessibility aids (such as transcriptions) into their media streams.
The audio element is a media element whose media data is ostensibly audio data.
The src, preload, autoplay, loop, and controls attributes are the attributes common to all media elements.
When an audio element is potentially playing, it must have its audio data played synchronized with the current playback position, at the specified volume with the specified mute state.
When an audio element is not potentially playing, audio must not play for the element.
Audio( [ url ] )
Returns a new audio element, with the src attribute set to the value passed in the argument, if applicable.
Two constructors are provided for creating HTMLAudioElement objects (in addition to the factory methods from DOM Core such as createElement()): Audio() and Audio(src). When invoked as constructors, these must return a new HTMLAudioElement object (a new audio element). The element must have its preload attribute set to the literal value "auto". If the src argument is present, the object created must have its src content attribute set to the provided value, and the user agent must invoke the object's resource selection algorithm before returning. The element's document must be the active document of the browsing context of the Window object on which the interface object of the invoked constructor is found.
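The source element
Contexts in which this element can be used: as a child of a media element, before any flow content or track elements.
Content attributes: src, type, media.
DOM interface: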
interface HTMLSourceElement : HTMLElement { attribute DOMString src; attribute DOMString type; attribute DOMString media; };
The source element allows authors to specify multiple alternative media resources for media elements. It does not represent anything on its own.
The src attribute gives the address of the media resource. The value must be a valid non-empty URL potentially surrounded by spaces. This attribute must be present.
Dynamically modifying a source element and its attributes when the element is already inserted in a video or audio element will have no effect. To change what is playing, either just use the src attribute on the media element directly, or call the load() method on the media element after manipulating the source elements.
The type attribute gives the type of the media resource, to help the user agent determine if it can play this media resource before fetching it. If specified, its value must be a valid MIME type. The codecs parameter, which certain MIME types define, might be necessary to specify exactly how the resource is encoded. [RFC4281]
The following list shows some examples of how to use the codecs= MIME parameter in the type attribute.
<source src='video.mp4' type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'>
<source src='video.mp4' type='video/mp4; codecs="avc1.58A01E, mp4a.40.2"'>
<source src='video.mp4' type='video/mp4; codecs="avc1.4D401E, mp4a.40.2"'>
<source src='video.mp4' type='video/mp4; codecs="avc1.64001E, mp4a.40.2"'>
<source src='video.mp4' type='video/mp4; codecs="mp4v.20.8, mp4a.40.2"'>
<source src='video.mp4' type='video/mp4; codecs="mp4v.20.240, mp4a.40.2"'>
<source src='video.3gp' type='video/3gpp; codecs="mp4v.20.8, samr"'>
<source src='video.ogv' type='video/ogg; codecs="theora, vorbis"'>
<source src='video.ogv' type='video/ogg; codecs="theora, speex"'>
<source src='audio.ogg' type='audio/ogg; codecs=vorbis'>
<source src='audio.spx' type='audio/ogg; codecs=speex'>
<source src='audio.oga' type='audio/ogg; codecs=flac'>
<source src='video.ogv' type='video/ogg; codecs="dirac, vorbis"'>
<source src='video.mkv' type='video/x-matroska; codecs="theora, vorbis"'>
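To make the shape of these type values concrete, the following sketch splits a type string into its MIME type and its list of codecs. The parseSourceType name is hypothetical, and the parsing is deliberately simplified: it assumes no semicolons or escaped quotes inside the quoted codecs value, which holds for the examples above but is not a full MIME parameter parser.

```javascript
// Hypothetical, simplified parser for a source element's type attribute value.
function parseSourceType(type) {
  const [mime, ...params] = type.split(";");
  let codecs = [];
  for (const param of params) {
    const eq = param.indexOf("=");
    if (eq === -1) continue;
    const name = param.slice(0, eq).trim().toLowerCase();
    if (name !== "codecs") continue;
    // Strip optional surrounding double quotes, then split the comma-separated list.
    const value = param.slice(eq + 1).trim().replace(/^"|"$/g, "");
    codecs = value.split(",").map((c) => c.trim()).filter(Boolean);
  }
  return { mime: mime.trim().toLowerCase(), codecs };
}
```

Applied to the first example in the list, this gives the MIME type video/mp4 and the codecs avc1.42E01E and mp4a.40.2.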
The media attribute gives the intended media type of the media resource, to help the user agent determine if this media resource is useful to the user before fetching it. Its value must be a valid media query.
The default, if the media attribute is omitted, is "all", meaning that by default the media resource is suitable for all media.
If a source element is inserted as a child of a media element that has no src attribute and whose networkState has the value NETWORK_EMPTY, the user agent must invoke the media element's resource selection algorithm.
The IDL attributes src, type, and media must reflect the respective content attributes of the same name.
If the author isn't sure if the user agents will all be able to render the media resources provided, the author can listen to the error event on the last source element and trigger fallback behavior:
<script>
 function fallback(video) {
   // replace <video> with its contents
   while (video.hasChildNodes()) {
     if (video.firstChild instanceof HTMLSourceElement)
       video.removeChild(video.firstChild);
     else
       video.parentNode.insertBefore(video.firstChild, video);
   }
   video.parentNode.removeChild(video);
 }
</script>
<video controls autoplay>
 <source src='video.mp4' type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'>
 <source src='video.ogv' type='video/ogg; codecs="theora, vorbis"'
         onerror="fallback(parentNode)">
 ...
</video>
The track element
ISSUE-9 (video-accessibility) blocks progress to Last Call.
Content attributes: kind, src, srclang, label, default.
DOM interface:
interface HTMLTrackElement : HTMLElement {
  attribute DOMString kind;
  attribute DOMString src;
  attribute DOMString srclang;
  attribute DOMString label;
  attribute boolean default;
  readonly attribute TextTrack track;
};
The track element allows authors to specify explicit external timed text tracks for media elements. It does not represent anything on its own.
The kind attribute is an enumerated attribute. The following table lists the keywords defined for this attribute. The keyword given in the first cell of each row maps to the state given in the second cell.
Keyword | State | Brief description |
---|---|---|
subtitles | Subtitles | Transcription or translation of the dialogue, suitable for when the sound is available but not understood (e.g. because the user does not understand the language of the media resource's soundtrack). Displayed over the video. |
captions | Captions | Transcription or translation of the dialogue, sound effects, relevant musical cues, and other relevant audio information, suitable for when the soundtrack is unavailable (e.g. because it is muted or because the user is deaf). Displayed over the video; labeled as appropriate for the hard-of-hearing. |
descriptions | Descriptions | Textual descriptions of the video component of the media resource, intended for audio synthesis when the visual component is unavailable (e.g. because the user is interacting with the application without a screen while driving, or because the user is blind). Synthesized as separate audio track. |
chapters | Chapters | Chapter titles, intended to be used for navigating the media resource. Displayed as an interactive list in the user agent's interface. |
metadata | Metadata | Tracks intended for use from script. Not displayed by the user agent. |
The attribute may be omitted. The missing value default is the subtitles state.
The src attribute gives the address of the text track data. The value must be a valid non-empty URL potentially surrounded by spaces. This attribute must be present.
If the element has a src attribute whose value is not the empty string and whose value, when the attribute was set, could be successfully resolved relative to the element, then the element's track URL is the resulting absolute URL. Otherwise, the element's track URL is the empty string.
The srclang attribute gives the language of the text track data. The value must be a valid BCP 47 language tag. This attribute must be present if the element's kind attribute is in the subtitles state. [BCP47]
If the element has a srclang attribute whose value is not the empty string, then the element's track language is the value of the attribute. Otherwise, the element has no track language.
The label attribute gives a user-readable title for the track. This title is used by user agents when listing subtitle, caption, and audio description tracks in their user interface.
The value of the label attribute, if the attribute is present, must not be the empty string. Furthermore, there must not be two track element children of the same media element whose kind attributes are in the same state, whose srclang attributes are both missing or have values that represent the same language, and whose label attributes are again both missing or both have the same value.
If the element has a label attribute whose value is not the empty string, then the element's track label is the value of the attribute. Otherwise, the element's track label is a user-agent defined string (e.g. the string "untitled" in the user's locale, or a value automatically generated from the other attributes).
The default attribute, if specified, indicates that the track is to be enabled if the user's preferences do not indicate that another track would be more appropriate. There must not be more than one track element with the same parent node with the default attribute specified.
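The authoring requirements on track elements stated above lend themselves to a small checker. The sketch below is an assumption-laden illustration, not a conformance tool: checkTracks is a hypothetical name, tracks are modelled as plain objects with a missing property standing for a missing attribute, and "values that represent the same language" is approximated by a case-insensitive comparison of the srclang strings.

```javascript
// Hypothetical checker. Each track is a plain object such as
// { kind: "subtitles", src: "a.vtt", srclang: "en", label: "English", default: true }.
function checkTracks(tracks) {
  const errors = [];
  const seen = new Set();
  let defaults = 0;
  tracks.forEach((t, i) => {
    // Missing value default for kind is the subtitles state.
    const kind = t.kind === undefined ? "subtitles" : t.kind;
    if (!t.src) errors.push(`track ${i}: src must be present and non-empty`);
    if (kind === "subtitles" && !t.srclang) {
      errors.push(`track ${i}: srclang is required for subtitles`);
    }
    if (t.label === "") errors.push(`track ${i}: label must not be the empty string`);
    // Approximation: language tags are compared case-insensitively as strings.
    const key = JSON.stringify([
      kind,
      (t.srclang || "").toLowerCase(),
      t.label === undefined ? null : t.label,
    ]);
    if (seen.has(key)) {
      errors.push(`track ${i}: duplicates kind, language and label of an earlier track`);
    }
    seen.add(key);
    if (t.default) defaults += 1;
  });
  if (defaults > 1) errors.push("more than one track has the default attribute");
  return errors;
}
```

An empty result means none of the modelled requirements were violated; it does not establish that the markup is conforming in other respects.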
track
Returns the TextTrack object corresponding to the text track of the track element.
The track IDL attribute must, on getting, return the track element's text track's corresponding TextTrack object.
The src, srclang, label, and default IDL attributes must reflect the respective content attributes of the same name. The kind IDL attribute must reflect the content attribute of the same name, limited to only known values.
This video has subtitles in several languages:
<video src="brave.webm">
 <track kind=subtitles src=brave.en.vtt srclang=en label="English">
 <track kind=captions src=brave.en.vtt srclang=en label="English for the Hard of Hearing">
 <track kind=subtitles src=brave.fr.vtt srclang=fr label="Français">
 <track kind=subtitles src=brave.de.vtt srclang=de label="Deutsch">
</video>
Media elements (audio and video, in this specification) implement the following interface:
interface HTMLMediaElement : HTMLElement {
// error state
readonly attribute MediaError error;
// network state
attribute DOMString src;
readonly attribute DOMString currentSrc;
const unsigned short NETWORK_EMPTY = 0;
const unsigned short NETWORK_IDLE = 1;
const unsigned short NETWORK_LOADING = 2;
const unsigned short NETWORK_NO_SOURCE = 3;
readonly attribute unsigned short networkState;
attribute DOMString preload;
readonly attribute TimeRanges buffered;
void load();
DOMString canPlayType(in DOMString type);
// ready state
const unsigned short HAVE_NOTHING = 0;
const unsigned short HAVE_METADATA = 1;
const unsigned short HAVE_CURRENT_DATA = 2;
const unsigned short HAVE_FUTURE_DATA = 3;
const unsigned short HAVE_ENOUGH_DATA = 4;
readonly attribute unsigned short readyState;
readonly attribute boolean seeking;
// playback state
attribute double currentTime;
readonly attribute double initialTime;
readonly attribute double duration;
readonly attribute Date startOffsetTime;
readonly attribute boolean paused;
attribute double defaultPlaybackRate;
attribute double playbackRate;
readonly attribute TimeRanges played;
readonly attribute TimeRanges seekable;
readonly attribute boolean ended;
attribute boolean autoplay;
attribute boolean loop;
void play();
void pause();
// controls
attribute boolean controls;
attribute double volume;
attribute boolean muted;
// text tracks
readonly attribute TextTrack[] tracks;
MutableTextTrack addTrack(in DOMString kind, in optional DOMString label, in optional DOMString language);
};
The media element attributes, src, preload, autoplay, loop, and controls, apply to all media elements. They are defined in this section.
Media elements are used to present audio data, or video and audio data, to the user. This is referred to as media data in this section, since this section applies equally to media elements for audio or for video. The term media resource is used to refer to the complete set of media data, e.g. the complete video file, or complete audio file.
Except where otherwise specified, the task source for all the tasks queued in this section and its subsections is the media element event task source.
error
Returns a MediaError object representing the current error state of the element.
Returns null if there is no error.
All media elements have an associated error status, which records the last error the element encountered since its resource selection algorithm was last invoked. The error attribute, on getting, must return the MediaError object created for this last error, or null if there has not been an error.
interface MediaError {
  const unsigned short MEDIA_ERR_ABORTED = 1;
  const unsigned short MEDIA_ERR_NETWORK = 2;
  const unsigned short MEDIA_ERR_DECODE = 3;
  const unsigned short MEDIA_ERR_SRC_NOT_SUPPORTED = 4;
  readonly attribute unsigned short code;
};
error . code
Returns the current error's error code, from the list below.
The code attribute of a MediaError object must return the code for the error, which must be one of the following:
MEDIA_ERR_ABORTED (numeric value 1)
MEDIA_ERR_NETWORK (numeric value 2)
MEDIA_ERR_DECODE (numeric value 3)
MEDIA_ERR_SRC_NOT_SUPPORTED (numeric value 4)
The media resource indicated by the src attribute was not suitable.
The src content attribute on media elements gives the address of the media resource (video, audio) to show. The attribute, if present, must contain a valid non-empty URL potentially surrounded by spaces.
If a src attribute of a media element is set or changed, the user agent must invoke the media element's media element load algorithm. (Removing the src attribute does not do this, even if there are source elements present.)
The src IDL attribute on media elements must reflect the content attribute of the same name.
currentSrc
Returns the address of the current media resource.
Returns the empty string when there is no media resource.
The currentSrc IDL attribute is initially the empty string. Its value is changed by the resource selection algorithm defined below.
There are two ways to specify a media resource: the src attribute, or source elements. The attribute overrides the elements.
A media resource can be described in terms of its type, specifically a MIME type, in some cases with a codecs parameter. (Whether the codecs parameter is allowed or not depends on the MIME type.) [RFC4281]
Types are usually somewhat incomplete descriptions; for example "video/mpeg" doesn't say anything except what the container type is, and even a type like 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"' doesn't include information like the actual bitrate (only the maximum bitrate). Thus, given a type, a user agent can often only know whether it might be able to play media of that type (with varying levels of confidence), or whether it definitely cannot play media of that type.
A type that the user agent knows it cannot render is one that describes a resource that the user agent definitely does not support, for example because it doesn't recognize the container type, or it doesn't support the listed codecs.
The MIME type "application/octet-stream" with no parameters is never a type that the user agent knows it cannot render. User agents must treat that type as equivalent to the lack of any explicit Content-Type metadata when it is used to label a potential media resource.
In the absence of a specification to the contrary, the MIME type "application/octet-stream" when used with parameters, e.g. "application/octet-stream;codecs=theora", is a type that the user agent knows it cannot render, since that parameter is not defined for that type.
canPlayType(type)
Returns the empty string (a negative response), "maybe", or "probably" based on how confident the user agent is that it can play media resources of the given type.
The canPlayType(type) method must return the empty string if type is a type that the user agent knows it cannot render or is the type "application/octet-stream"; it must return "probably" if the user agent is confident that the type represents a media resource that it can render if used with this audio or video element; and it must return "maybe" otherwise.
Implementors are encouraged to return "maybe" unless the type can be confidently established as being supported or not. Generally, a user agent should never return "probably" for a type that allows the codecs parameter if that parameter is not present.
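The three possible answers can be illustrated with a toy model of the decision. Everything here is hypothetical: the SUPPORTED table describes an imaginary user agent, canPlayTypeModel is not a real API, and the parameter parsing is simplified (no semicolons inside the quoted codecs value). A real user agent's confidence depends on far more than a lookup table.

```javascript
// Hypothetical support table: container type -> codecs this imaginary user agent decodes.
const SUPPORTED = new Map([
  ["video/ogg", new Set(["theora", "vorbis"])],
  ["audio/ogg", new Set(["vorbis"])],
]);

// Toy model of the canPlayType() answer for the imaginary user agent above.
function canPlayTypeModel(type) {
  const [mimePart, ...params] = type.split(";");
  const mime = mimePart.trim().toLowerCase();
  // application/octet-stream always gets the negative answer, with or without parameters.
  if (mime === "application/octet-stream") return "";
  const known = SUPPORTED.get(mime);
  // Unrecognized container: a type the user agent knows it cannot render.
  if (!known) return "";
  const codecsParam = params
    .map((p) => p.trim())
    .find((p) => p.toLowerCase().startsWith("codecs="));
  // The type allows a codecs parameter but none was given: never "probably".
  if (!codecsParam) return "maybe";
  const codecs = codecsParam
    .slice("codecs=".length)
    .replace(/^"|"$/g, "")
    .split(",")
    .map((c) => c.trim());
  // Any unsupported codec makes the whole type unrenderable.
  return codecs.every((c) => known.has(c)) ? "probably" : "";
}
```

With this table, 'video/ogg; codecs="theora, vorbis"' yields "probably", plain "video/ogg" yields "maybe", and an unknown container yields the empty string.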
This script tests to see if the user agent supports a (fictional) new format to dynamically decide whether to use a video element or a plugin:
<section id="video">
 <p><a href="playing-cats.nfv">Download video</a></p>
</section>
<script>
 var videoSection = document.getElementById('video');
 var videoElement = document.createElement('video');
 var support = videoElement.canPlayType('video/x-new-fictional-format;codecs="kittens,bunnies"');
 if (support != "probably" && "New Fictional Video Plug-in" in navigator.plugins) {
   // not confident of browser support
   // but we have a plugin
   // so use plugin instead
   videoElement = document.createElement("embed");
 } else if (support == "") {
   // no support from browser and no plugin
   // do nothing
   videoElement = null;
 }
 if (videoElement) {
   while (videoSection.hasChildNodes())
     videoSection.removeChild(videoSection.firstChild);
   videoElement.setAttribute("src", "playing-cats.nfv");
   videoSection.appendChild(videoElement);
 }
</script>
The type attribute of the source element allows the user agent to avoid downloading resources that use formats it cannot render.
networkState
Returns the current state of network activity for the element, from the codes in the list below.
As media elements interact with the network, their current network activity is represented by the networkState attribute. On getting, it must return the current network state of the element, which must be one of the following values:
NETWORK_EMPTY (numeric value 0)
NETWORK_IDLE (numeric value 1)
NETWORK_LOADING (numeric value 2)
NETWORK_NO_SOURCE (numeric value 3)
The resource selection algorithm defined below describes exactly when the networkState attribute changes value and what events fire to indicate changes in this state.
load()
Causes the element to reset and start selecting and loading a new media resource from scratch.
All media elements have an autoplaying flag, which must begin in the true state, and a delaying-the-load-event flag, which must begin in the false state. While the delaying-the-load-event flag is true, the element must delay the load event of its document.
When the load() method on a media element is invoked, the user agent must run the media element load algorithm.
The media element load algorithm consists of the following steps.
Abort any already-running instance of the resource selection algorithm for this element.
If there are any tasks from the media element's media element event task source in one of the task queues, then remove those tasks.
Basically, pending events and callbacks for the media element are discarded when the media element starts loading a new resource.
If the media element's networkState is set to NETWORK_LOADING or NETWORK_IDLE, queue a task to fire a simple event named abort at the media element.
If the media element's networkState is not set to NETWORK_EMPTY, then run these substeps:
  If a fetching process is in progress for the media element, the user agent should stop it.
  Set the networkState attribute to NETWORK_EMPTY.
  Forget the media element's media-resource-specific text tracks.
  If readyState is not set to HAVE_NOTHING, then set it to that state.
  If the paused attribute is false, then set it to true.
  If seeking is true, set it to false.
  Set the current playback position to 0. If this changed the current playback position, then queue a task to fire a simple event named timeupdate at the media element.
  Set the initial playback position to 0.
  Set the timeline offset to Not-a-Number (NaN).
  Update the duration attribute to Not-a-Number (NaN). The user agent will not fire a durationchange event for this particular change of the duration.
  Queue a task to fire a simple event named emptied at the media element.
Set the playbackRate attribute to the value of the defaultPlaybackRate attribute.
Set the error attribute to null and the autoplaying flag to true.
Invoke the media element's resource selection algorithm.
Playback of any previously playing media resource for this element stops.
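The state changes made by the media element load algorithm can be traced with a model that operates on a plain object. This is a sketch under stated assumptions: mediaElementLoad and the field names on the state object are hypothetical, the substeps are assumed to run from stopping the fetch through queueing the emptied event, and the steps that abort a running resource selection, discard pending tasks, forget text tracks, and invoke resource selection are only noted in comments.

```javascript
const NETWORK_EMPTY = 0, NETWORK_IDLE = 1, NETWORK_LOADING = 2;
const HAVE_NOTHING = 0;

// Hypothetical model of the load algorithm on a plain state object.
// Returns the names of the events that would be queued, in order.
function mediaElementLoad(el) {
  const queued = [];
  // Not modelled: aborting resource selection and removing pending tasks.
  if (el.networkState === NETWORK_LOADING || el.networkState === NETWORK_IDLE) {
    queued.push("abort");
  }
  if (el.networkState !== NETWORK_EMPTY) {
    // Not modelled: stopping the fetch and forgetting text tracks.
    el.networkState = NETWORK_EMPTY;
    el.readyState = HAVE_NOTHING;
    el.paused = true;
    el.seeking = false;
    if (el.currentTime !== 0) {
      el.currentTime = 0;
      queued.push("timeupdate"); // only if the position actually changed
    }
    el.initialTime = 0;
    el.duration = NaN; // no durationchange event for this change
    queued.push("emptied");
  }
  el.playbackRate = el.defaultPlaybackRate;
  el.error = null;
  el.autoplaying = true;
  // The resource selection algorithm would be invoked here.
  return queued;
}
```

For an element that was loading and had played for a while, the model queues abort, timeupdate, and emptied in that order; for an element whose networkState is already NETWORK_EMPTY it queues nothing and only resets the playback rate, error, and autoplaying flag.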
The resource selection algorithm for a media element is as follows. This algorithm is always invoked synchronously, but one of the first steps in the algorithm is to return and continue running the remaining steps asynchronously, meaning that it runs in the background with scripts and other tasks running in parallel. In addition, this algorithm interacts closely with the event loop mechanism; in particular, it has synchronous sections (which are triggered as part of the event loop algorithm). Steps in such sections are marked with ⌛.
Set the networkState to NETWORK_NO_SOURCE.
Asynchronously await a stable state, allowing the task that invoked this algorithm to continue. The synchronous section consists of all the remaining steps of this algorithm until the algorithm says the synchronous section has ended. (Steps in synchronous sections are marked with ⌛.)
⌛ If the media element has a src attribute, then let mode be attribute.
⌛ Otherwise, if the media element does not have a src attribute but has a source element child, then let mode be children and let candidate be the first such source element child in tree order.
⌛ Otherwise the media element has neither a src attribute nor a source element child: set the networkState to NETWORK_EMPTY, and abort these steps; the synchronous section ends.
⌛ Set the media element's delaying-the-load-event flag to true (this delays the load event), and set its networkState to NETWORK_LOADING.
⌛ Queue a task to fire a simple event named loadstart at the media element.
If mode is attribute, then run these substeps:
⌛ Process candidate: If the src
attribute's value is the empty
string, then end the synchronous section, and jump
down to the failed step below.
⌛ Let absolute URL be the
absolute URL that would have resulted from resolving the URL
specified by the src
attribute's value relative to the media element when
the src
attribute was last
changed.
⌛ If absolute URL was obtained
successfully, set the currentSrc
attribute to absolute URL.
End the synchronous section, continuing the remaining steps asynchronously.
If absolute URL was obtained successfully, run the resource fetch algorithm with absolute URL. If that algorithm returns without aborting this one, then the load failed.
Failed: Reaching this step indicates that the media resource failed to load or that the given URL could not be resolved. In one atomic operation, run the following steps:
Set the error
attribute to a new MediaError
object whose code
attribute is set to
MEDIA_ERR_SRC_NOT_SUPPORTED
.
Forget the media element's media-resource-specific text tracks.
Set the element's networkState
attribute to
the NETWORK_NO_SOURCE
value.
Queue a task to fire a simple
event named error
at the media element.
Set the element's delaying-the-load-event flag to false. This stops delaying the load event.
Abort these steps. Until the load()
method is invoked or the
src
attribute is changed, the
element won't attempt to load another resource.
Otherwise, the source
elements will be used; run
these substeps:
⌛ Let pointer be a position defined by two adjacent nodes in the media element's child list, treating the start of the list (before the first child in the list, if any) and end of the list (after the last child in the list, if any) as nodes in their own right. One node is the node before pointer, and the other node is the node after pointer. Initially, let pointer be the position between the candidate node and the next node, if there are any, or the end of the list, if it is the last node.
As nodes are inserted and removed into the media element, pointer must be updated as follows:
Other changes don't affect pointer.
⌛ Process candidate: If candidate does not have a src
attribute, or if its src
attribute's value is the empty
string, then end the synchronous section, and jump
down to the failed step below.
⌛ Let absolute URL be the
absolute URL that would have resulted from resolving the URL
specified by candidate's src
attribute's value relative to
the candidate when the src
attribute was last
changed.
⌛ If absolute URL was not obtained successfully, then end the synchronous section, and jump down to the failed step below.
⌛ If candidate has a type
attribute whose value, when
parsed as a MIME type (including any codecs
described by the codecs
parameter, for
types that define that parameter), represents a type that
the user agent knows it cannot render, then end the
synchronous section, and jump down to the failed step below.
⌛ If candidate has a media
attribute whose value does
not match the
environment, then end the synchronous
section, and jump down to the failed step
below.
⌛ Set the currentSrc
attribute to absolute URL.
End the synchronous section, continuing the remaining steps asynchronously.
Run the resource fetch algorithm with absolute URL. If that algorithm returns without aborting this one, then the load failed.
Failed: Queue a task to
fire a simple event named error
at the candidate element, in the context of the fetching process that was used to try to
obtain candidate's corresponding media
resource in the resource fetch
algorithm.
Asynchronously await a stable state. The synchronous section consists of all the remaining steps of this algorithm until the algorithm says the synchronous section has ended. (Steps in synchronous sections are marked with ⌛.)
⌛ Forget the media element's media-resource-specific text tracks.
⌛ Find next candidate: Let candidate be null.
⌛ Search loop: If the node after pointer is the end of the list, then jump to the waiting step below.
⌛ If the node after pointer is
a source
element, let candidate
be that element.
⌛ Advance pointer so that the node before pointer is now the node that was after pointer, and the node after pointer is the node after the node that used to be after pointer, if any.
⌛ If candidate is null, jump back to the search loop step. Otherwise, jump back to the process candidate step.
⌛ Waiting: Set the element's networkState
attribute to
the NETWORK_NO_SOURCE
value.
⌛ Set the element's delaying-the-load-event flag to false. This stops delaying the load event.
End the synchronous section, continuing the remaining steps asynchronously.
Wait until the node after pointer is a node other than the end of the list. (This step might wait forever.)
Asynchronously await a stable state. The synchronous section consists of all the remaining steps of this algorithm until the algorithm says the synchronous section has ended. (Steps in synchronous sections are marked with ⌛.)
⌛ Set the element's delaying-the-load-event flag back to true (this delays the load event again, in case it hasn't been fired yet).
⌛ Set the networkState
back to NETWORK_LOADING
.
⌛ Jump back to the find next candidate step above.
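The candidate-selection part of the algorithm above can be sketched as a pure function. This is a simplified illustration under stated assumptions, not the normative algorithm: the element is modeled as a plain object, `canRender` and `mediaMatches` are hypothetical stand-ins for the user agent's MIME type and media query checks, URL resolution and fetching are assumed to succeed, and the pointer bookkeeping, event dispatch, and waiting step are omitted.

```javascript
// Simplified model of choosing which URL a media element would try first.
// `element` is a plain object: { src?: string, children: [{ tag, src?, type?, media? }] }.
// `canRender(type)` returns false only for types known to be unsupported and
// `mediaMatches(media)` evaluates a media query; both are hypothetical helpers.
function selectResource(element, canRender, mediaMatches) {
  if (typeof element.src === "string") {
    // mode = attribute: an empty src fails at once; source children are never consulted.
    return element.src === ""
      ? { networkState: "NETWORK_NO_SOURCE", error: "MEDIA_ERR_SRC_NOT_SUPPORTED" }
      : { networkState: "NETWORK_LOADING", currentSrc: element.src };
  }
  const sources = element.children.filter((child) => child.tag === "source");
  if (sources.length === 0) {
    // Neither a src attribute nor a source element child.
    return { networkState: "NETWORK_EMPTY" };
  }
  // mode = children: walk the source elements in tree order.
  // (The real algorithm also fires an error event at each rejected candidate.)
  for (const candidate of sources) {
    if (!candidate.src) continue; // missing or empty src attribute
    if (candidate.type !== undefined && !canRender(candidate.type)) continue;
    if (candidate.media !== undefined && !mediaMatches(candidate.media)) continue;
    return { networkState: "NETWORK_LOADING", currentSrc: candidate.src };
  }
  // No usable candidate: the algorithm waits for further source elements.
  return { networkState: "NETWORK_NO_SOURCE" };
}
```

Note that a `src` attribute on the media element takes priority even when `source` children are present, which is why the attribute branch returns before the children are inspected.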
The resource fetch algorithm for a media element and a given absolute URL is as follows:
Let the current media resource be the resource given by the absolute URL passed to this algorithm. This is now the element's media resource.
Begin to fetch the current media
resource, from the media element's
Document
's origin, with the force
same-origin flag set.
Every 350ms (±200ms) or for every byte received, whichever
is least frequent, queue a task to
fire a simple event named progress
at the element.
The stall timeout is a user-agent defined length of
time, which should be about three seconds. When a media
element that is actively attempting to obtain media
data has failed to receive any data for a duration equal to
the stall timeout, the user agent must queue a
task to fire a simple event named stalled
at the element.
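As an illustration of the stall timeout, the following sketch tracks when data last arrived and reports when a stalled event would be due. The class name, the caller-supplied timestamps, and the choice to report only once per period without data are assumptions of this sketch; the text above only requires that the event be queued once the timeout has elapsed.

```javascript
// Tracks when data last arrived and reports when a "stalled" event is due.
// Times are in milliseconds and supplied by the caller, so no real timers are used.
class StallDetector {
  constructor(stallTimeoutMs = 3000) {
    // "About three seconds"; the exact value is user-agent defined.
    this.stallTimeoutMs = stallTimeoutMs;
    this.lastDataAt = null;
    this.stalledReported = false;
  }

  // Call when the element starts actively fetching, and whenever data arrives.
  onData(now) {
    this.lastDataAt = now;
    this.stalledReported = false;
  }

  // Returns "stalled" once per period without data, otherwise null.
  poll(now) {
    if (this.lastDataAt === null || this.stalledReported) return null;
    if (now - this.lastDataAt >= this.stallTimeoutMs) {
      this.stalledReported = true;
      return "stalled";
    }
    return null;
  }
}
```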
User agents may allow users to selectively block or slow media data downloads. When a media element's download has been blocked altogether, the user agent must act as if it was stalled (as opposed to acting as if the connection was closed). The rate of the download may also be throttled automatically by the user agent, e.g. to balance the download with other connections sharing the same bandwidth.
User agents may decide to not download more content at any
time, e.g. after buffering five minutes of a one hour media
resource, while waiting for the user to decide whether to play the
resource or not, or while waiting for user input in an interactive
resource. When a media element's download has been
suspended, the user agent must set the networkState
to NETWORK_IDLE
and queue
a task to fire a simple event named suspend
at the element. If and
when downloading of the resource resumes, the user agent must set
the networkState
to
NETWORK_LOADING
.
The preload
attribute provides a
hint regarding how much buffering the author thinks is advisable,
even in the absence of the autoplay
attribute.
When a user agent decides to completely stall a download, e.g. if it is waiting until the user starts playback before downloading any further content, the element's delaying-the-load-event flag must be set to false. This stops delaying the load event.
The user agent may use whatever means necessary to fetch the resource (within the constraints put forward by this and other specifications); for example, reconnecting to the server in the face of network errors, using HTTP range retrieval requests, or switching to a streaming protocol. The user agent must consider a resource erroneous only if it has given up trying to fetch it.
The networking task source tasks to process the data as it is being fetched must, when appropriate, include the relevant substeps from the following list:
codecs
parameter, if the
parameter is defined for that type), represents a type that
the user agent knows it cannot render (even if the actual
media data is in a supported format)
DNS errors, HTTP 4xx and 5xx errors (and equivalents in other protocols), and other fatal network errors that occur before the user agent has established whether the current media resource is usable, as well as the file using an unsupported container format, or using unsupported codecs for all the data, must cause the user agent to execute the following steps:
The user agent should cancel the fetching process.
Abort this subalgorithm, returning to the resource selection algorithm.
This indicates that the resource is usable. The user agent must follow these substeps:
Establish the media timeline for the purposes of the current playback position, the earliest possible position, and the initial playback position, based on the media data.
Update the timeline offset to the date and time that corresponds to the zero time in the media timeline established in the previous step, if any. If no explicit time and date is given by the media resource, the timeline offset must be set to Not-a-Number (NaN).
Set the current playback position to the earliest possible position.
Update the duration
attribute with the time of the last frame of the resource, if
known, on the media timeline established above.
If it is not known (e.g. a stream that is in principle
infinite), update the duration
attribute to the
value positive Infinity.
The user agent will queue a task to
fire a simple event named durationchange
at the
element at this point.
Set the readyState
attribute to
HAVE_METADATA
.
For video
elements, set the videoWidth
and videoHeight
attributes.
Queue a task to fire a simple
event named loadedmetadata
at the
element.
Before this task is run, as part of the event
loop mechanism, the rendering will have been updated to resize
the video
element if appropriate.
If either the media resource or the address of the current media resource indicates a particular start time, then set the initial playback position to that time and then seek to that time. Ignore any resulting exceptions (if the position is out of range, it is effectively ignored).
For example, a fragment identifier could be used to indicate a start position.
Once the readyState
attribute
reaches HAVE_CURRENT_DATA
,
after the loadeddata
event has been
fired, set the element's delaying-the-load-event
flag to false. This stops delaying the load event.
A user agent that is attempting to reduce
network usage while still fetching the metadata for each
media resource would also stop buffering at this
point, causing the networkState
attribute
to switch to the NETWORK_IDLE
value.
The user agent is required to determine the duration of the media resource and go through this step before playing.
Queue a task to fire a simple event
named progress
at the
media element.
Fatal network errors that occur after the user agent has established whether the current media resource is usable must cause the user agent to execute the following steps:
The user agent should cancel the fetching process.
Set the error
attribute to a new MediaError
object whose code
attribute is set to
MEDIA_ERR_NETWORK
.
Queue a task to fire a simple
event named error
at the media element.
If the media element's readyState
attribute has a
value equal to HAVE_NOTHING
, set the
element's networkState
attribute to
the NETWORK_EMPTY
value and queue a task to fire a simple
event named emptied
at the element. Otherwise, set the element's networkState
attribute to
the NETWORK_IDLE
value.
Set the element's delaying-the-load-event flag to false. This stops delaying the load event.
Abort the overall resource selection algorithm.
Fatal errors in decoding the media data that occur after the user agent has established whether the current media resource is usable must cause the user agent to execute the following steps:
The user agent should cancel the fetching process.
Set the error
attribute to a new MediaError
object whose code
attribute is set to
MEDIA_ERR_DECODE
.
Queue a task to fire a simple
event named error
at the media element.
If the media element's readyState
attribute has a
value equal to HAVE_NOTHING
, set the
element's networkState
attribute to
the NETWORK_EMPTY
value and queue a task to fire a simple
event named emptied
at the element. Otherwise, set the element's networkState
attribute to
the NETWORK_IDLE
value.
Set the element's delaying-the-load-event flag to false. This stops delaying the load event.
Abort the overall resource selection algorithm.
If the fetching process is aborted by the user, e.g. because the
user navigated the browsing context to another page, the user
agent must execute the following steps. These steps are not
followed if the load()
method itself is invoked while these steps are running, as the
steps above handle that particular kind of abort.
The user agent should cancel the fetching process.
Set the error
attribute to a new MediaError
object whose code
attribute is set to
MEDIA_ERR_ABORTED
.
Queue a task to fire a simple
event named abort
at the media element.
If the media element's readyState
attribute has a
value equal to HAVE_NOTHING
, set the
element's networkState
attribute to
the NETWORK_EMPTY
value and queue a task to fire a simple
event named emptied
at the element. Otherwise, set the element's networkState
attribute to
the NETWORK_IDLE
value.
Set the element's delaying-the-load-event flag to false. This stops delaying the load event.
Abort the overall resource selection algorithm.
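The three failure cases above (fatal network error, fatal decoding error, and abort by the user) share the same shape, which the following sketch summarizes. The function name and the string labels are hypothetical; the sketch only tabulates the resulting state and does not cancel a fetch or dispatch real events.

```javascript
// Maps a failure kind ("network", "decode", or "abort") and the current
// ready state to the state the element ends up in.
function handleFatalFailure(kind, readyState) {
  const table = {
    network: { code: "MEDIA_ERR_NETWORK", event: "error" },
    decode: { code: "MEDIA_ERR_DECODE", event: "error" },
    abort: { code: "MEDIA_ERR_ABORTED", event: "abort" },
  };
  const entry = table[kind];
  if (!entry) throw new RangeError(`unknown failure kind: ${kind}`);

  const events = [entry.event];
  let networkState;
  if (readyState === "HAVE_NOTHING") {
    // Nothing usable was obtained: the element goes back to being empty.
    networkState = "NETWORK_EMPTY";
    events.push("emptied");
  } else {
    networkState = "NETWORK_IDLE";
  }
  // In every case the load event is no longer delayed.
  return { errorCode: entry.code, networkState, events, delayingLoadEvent: false };
}
```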
The server returning data that is partially usable but cannot be optimally rendered must cause the user agent to render just the bits it can handle, and ignore the rest.
If the media resource's origin is
the same origin as the media element's
Document
's origin, queue a
task to run the steps to expose a
media-resource-specific text track with the relevant
data.
Cross-origin files do not expose their subtitles in the DOM, for security reasons. However, user agents may still provide the user with access to such data in their user interface.
When the networking task source has queued the last task as part of fetching the media resource (i.e. once the download has completed), if the fetching process completes without errors, including decoding the media data, and if all of the data is available to the user agent without network access, then the user agent must move on to the next step. This might never happen, e.g. when streaming an infinite resource such as Web radio, or if the resource is longer than the user agent's ability to cache data.
While the user agent might still need network access to obtain parts of the media resource, the user agent must remain on this step.
For example, if the user agent has discarded
the first half of a video, the user agent will remain at this step
even once the playback has
ended, because there is always the chance the user will
seek back to the start. In fact, in this situation, once playback has ended, the user agent
will end up dispatching a stalled
event, as described
earlier.
If the user agent ever reaches this step (which can only happen if the entire resource gets loaded and kept available): abort the overall resource selection algorithm.
The preload
attribute is an enumerated attribute. The following table
lists the keywords and states for the attribute — the keywords
in the left column map to the states in the cell in the second
column on the same row as the keyword.
Keyword | State | Brief description |
---|---|---|
none | None | Hints to the user agent that either the author does not expect the user to need the media resource, or that the server wants to minimise unnecessary traffic. |
metadata | Metadata | Hints to the user agent that the author does not expect the user to need the media resource, but that fetching the resource metadata (dimensions, first frame, track list, duration, etc) is reasonable. |
auto | Automatic | Hints to the user agent that the user agent can put the user's needs first without risk to the server, up to and including optimistically downloading the entire resource. |
The empty string is also a valid keyword, and maps to the Automatic state. The attribute's missing value default is user-agent defined, though the Metadata state is suggested as a compromise between reducing server load and providing an optimal user experience.
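The keyword-to-state mapping can be sketched as a small function. The function name and the `fallback` parameter are hypothetical: `fallback` models the user-agent-defined missing value default, for which Metadata is only the suggested choice. Treating an unrecognized value the same way as a missing attribute is an assumption of this sketch.

```javascript
// Maps a preload content attribute value (or null when absent) to a state name.
function preloadState(value, fallback = "Metadata") {
  if (value === null || value === undefined) return fallback;
  // Enumerated attribute keywords are matched case-insensitively.
  switch (value.toLowerCase()) {
    case "none":
      return "None";
    case "metadata":
      return "Metadata";
    case "auto":
    case "": // the empty string is also a valid keyword
      return "Automatic";
    default:
      return fallback; // assumption: unknown values behave like a missing attribute
  }
}
```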
The preload
attribute is
intended to provide a hint to the user agent about what the author
thinks will lead to the best user experience. The attribute may be
ignored altogether, for example based on explicit user preferences
or based on the available connectivity.
The preload
IDL
attribute must reflect the content attribute of the
same name, limited to only known values.
The autoplay
attribute can override
the preload
attribute (since
if the media plays, it naturally has to buffer first, regardless of
the hint given by the preload
attribute). Including
both is not an error, however.
buffered
Returns a TimeRanges
object that represents the
ranges of the media resource that the user agent has
buffered.
The buffered
attribute must return a new static normalized
TimeRanges
object that represents the ranges of
the media resource, if any, that the user agent has
buffered, at the time the attribute is evaluated. User agents must
accurately determine the ranges available, even for media streams
where this can only be determined by tedious inspection.
Typically this will be a single range anchored at the zero point, but if, e.g. the user agent uses HTTP range requests in response to seeking, then there could be multiple ranges.
User agents may discard previously buffered data.
Thus, a time position included within a range of the
objects returned by the buffered
attribute at one time can
end up being not included in the range(s) of objects returned by the
same attribute at later times.
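To illustrate what a normalized set of ranges looks like, the sketch below merges raw buffered ranges. It assumes that "normalized" means the ranges are ordered, non-empty, and neither overlap nor touch; ranges are modeled as `[start, end]` pairs in seconds rather than as a real TimeRanges object, and the function name is hypothetical.

```javascript
// Returns sorted, non-overlapping, non-touching [start, end] pairs.
function normalizeRanges(ranges) {
  const sorted = ranges
    .filter(([start, end]) => end > start) // drop empty ranges
    .map(([start, end]) => [start, end])   // copy so the input is not mutated
    .sort((a, b) => a[0] - b[0]);

  const normalized = [];
  for (const range of sorted) {
    const last = normalized[normalized.length - 1];
    if (last && range[0] <= last[1]) {
      // Overlapping or touching ranges fold into one bigger range.
      last[1] = Math.max(last[1], range[1]);
    } else {
      normalized.push(range);
    }
  }
  return normalized;
}

// Two ranges remain after seeking ahead to the 30 second mark.
console.log(normalizeRanges([[30, 40], [0, 10], [10, 15], [12, 20]])); // [[0, 20], [30, 40]]
```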
duration
Returns the length of the media resource, in seconds, assuming that the start of the media resource is at time zero.
Returns NaN if the duration isn't available.
Returns Infinity for unbounded streams.
currentTime [ = value ]
Returns the current playback position, in seconds.
Can be set, to seek to the given time.
Will throw an INVALID_STATE_ERR
exception if there
is no selected media resource. Will throw an
INDEX_SIZE_ERR
exception if the given time is not
within the ranges to which the user agent can seek.
initialTime
Returns the initial playback position, that is, the time to which the media resource was automatically seeked when it was loaded. Returns zero if the initial playback position is still unknown.
A media resource has a media timeline that maps times (in seconds) to positions in the media resource. The origin of a timeline is its earliest defined position. The duration of a timeline is its last defined position.
Establishing the media timeline: If the media
resource somehow specifies an explicit timeline whose origin
is not negative, then the media timeline should be that
timeline. (Whether the media resource can specify a
timeline or not depends on the media
resource's format.) If the media resource
specifies an explicit start time and date, then that time
and date should be considered the zero point in the media
timeline; the timeline offset will be the time
and date, exposed using the startOffsetTime
attribute.
If the media resource has a discontinuous timeline, the user agent must extend the timeline used at the start of the resource across the entire resource, so that the media timeline of the media resource increases linearly starting from the earliest possible position (as defined below), even if the underlying media data has out-of-order or even overlapping time codes.
For example, if two clips have been concatenated into one video file, but the video format exposes the original times for the two clips, the video data might expose a timeline that goes, say, 00:15..00:29 and then 00:05..00:38. However, the user agent would not expose those times; it would instead expose the times as 00:15..00:29 and 00:29..01:02, as a single video.
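The arithmetic in this example can be reproduced with a short sketch. The function name is hypothetical and clips are modeled as `[start, end]` pairs in seconds: the first clip keeps its own times, and each later clip is shifted to begin where the previous one ended.

```javascript
// Extends the timeline used at the start of the resource across all clips,
// so that the exposed times increase linearly.
function linearizeTimeline(clips) {
  const exposed = [];
  let cursor = null;
  for (const [start, end] of clips) {
    const begin = cursor === null ? start : cursor;
    const finish = begin + (end - start);
    exposed.push([begin, finish]);
    cursor = finish;
  }
  return exposed;
}

// 00:15..00:29 followed by 00:05..00:38 is exposed as 00:15..00:29 and 00:29..01:02.
console.log(linearizeTimeline([[15, 29], [5, 38]])); // [[15, 29], [29, 62]]
```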
In the absence of an explicit timeline, the zero time on the media timeline should correspond to the first frame of the media resource. For static audio and video files this is generally trivial. For streaming resources, if the user agent will be able to seek to an earlier point than the first frame originally provided by the server, then the zero time should correspond to the earliest seekable time of the media resource; otherwise, it should correspond to the first frame received from the server (the point in the media resource at which the user agent began receiving the stream).
Another example would be a stream that carries a
video with several concatenated fragments, broadcast by a server
that does not allow user agents to request specific times but
instead just streams the video data in a predetermined order. If a
user agent connects to this stream and receives fragments defined as
covering timestamps 2010-03-20 23:15:00 UTC to 2010-03-21 00:05:00
UTC and 2010-02-12 14:25:00 UTC to 2010-02-12 14:35:00 UTC, it would
expose this with a media timeline starting at 0s and
extending to 3,600s (one hour). Assuming the streaming server
disconnected at the end of the second clip, the duration
attribute would then
return 3,600. The startOffsetTime
attribute
would return a Date
object with a time corresponding to
2010-03-20 23:15:00 UTC. However, if a different user agent
connected five minutes later, it would (presumably) receive
fragments covering timestamps 2010-03-20 23:20:00 UTC to 2010-03-21
00:05:00 UTC and 2010-02-12 14:25:00 UTC to 2010-02-12 14:35:00 UTC,
and would expose this with a media timeline starting at
0s and extending to 3,300s (fifty five minutes). In this case, the
startOffsetTime
attribute would return a Date
object with a time
corresponding to 2010-03-20 23:20:00 UTC.
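The figures in this streaming example can be checked with a small sketch. It assumes fragments are given as `[start, end]` pairs of UTC timestamps in milliseconds, that the exposed timeline starts at zero, and that the timeline offset is the date of the first frame received; the function name is hypothetical.

```javascript
// Computes the exposed duration (in seconds) and timeline offset for a
// stream received as a sequence of timestamped fragments.
function exposeStream(fragments) {
  const totalMs = fragments.reduce((sum, [start, end]) => sum + (end - start), 0);
  return {
    duration: totalMs / 1000,
    startOffsetTime: new Date(fragments[0][0]),
  };
}

// Date.UTC takes a zero-based month: 2 is March, 1 is February.
const secondClip = [Date.UTC(2010, 1, 12, 14, 25), Date.UTC(2010, 1, 12, 14, 35)];
const firstViewer = exposeStream([
  [Date.UTC(2010, 2, 20, 23, 15), Date.UTC(2010, 2, 21, 0, 5)],
  secondClip,
]);
const laterViewer = exposeStream([
  [Date.UTC(2010, 2, 20, 23, 20), Date.UTC(2010, 2, 21, 0, 5)],
  secondClip,
]);
console.log(firstViewer.duration, laterViewer.duration); // 3600 3300
```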
In any case, the user agent must ensure that the earliest possible position (as defined below) using the established media timeline, is greater than or equal to zero.
Media elements have a current playback position, which must initially (i.e. in the absence of media data) be zero seconds. The current playback position is a time on the media timeline.
The currentTime
attribute must, on getting, return the current playback
position, expressed in seconds. On setting, the user agent
must seek to the new value
(which might raise an exception).
Media elements have an initial playback position, which must initially (i.e. in the absence of media data) be zero seconds. The initial playback position is updated when a media resource is loaded. The initial playback position is a time on the media timeline.
The initialTime
attribute must, on getting, return the initial playback
position, expressed in seconds.
If the media resource is a streaming resource, then the user agent might be unable to obtain certain parts of the resource after it has expired from its buffer. Similarly, some media resources might have a media timeline that doesn't start at zero. The earliest possible position is the earliest position in the stream or resource that the user agent can ever obtain again. It is also a time on the media timeline.
The earliest possible position is not
explicitly exposed in the API; it corresponds to the start time of
the first range in the seekable
attribute's
TimeRanges
object, if any, or the current
playback position otherwise.
When the earliest possible position changes, then:
if the current playback position is before the
earliest possible position, the user agent must seek to the earliest possible
position; otherwise, if the user agent has not fired a timeupdate
event at the
element in the past 15 to 250ms and is not still running event
handlers for such an event, then the user agent must queue a
task to fire a simple event named timeupdate
at the element.
Because of the above requirement and the requirement in the resource fetch algorithm that kicks in when the metadata of the clip becomes known, the current playback position can never be less than the earliest possible position.
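The rule for a change in the earliest possible position can be sketched as a decision function. The function name, the parameter names, and the returned action labels are hypothetical, and the sketch reads "in the past 15 to 250ms" as a threshold that the user agent picks from that window.

```javascript
// Decides what happens when the earliest possible position changes.
// msSinceLastTimeupdate: time since the last timeupdate event (Infinity if none).
// handlersRunning: true while timeupdate event handlers are still running.
// thresholdMs: value chosen by the user agent from the 15 to 250 ms window.
function onEarliestPositionChange(
  currentPosition,
  earliestPosition,
  msSinceLastTimeupdate,
  handlersRunning,
  thresholdMs = 250
) {
  if (currentPosition < earliestPosition) {
    // The current position is no longer obtainable, so seek forward to it.
    return { action: "seek", to: earliestPosition };
  }
  if (msSinceLastTimeupdate >= thresholdMs && !handlersRunning) {
    return { action: "timeupdate" };
  }
  return { action: "none" };
}
```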
The duration
attribute must return the time of the end of the media
resource, in seconds, on the media timeline. If
no media data is available, then the attribute must
return the Not-a-Number (NaN) value. If the media
resource is known to be unbounded (e.g. a streaming radio),
then the attribute must return the positive Infinity value.
The user agent must determine the duration of the media
resource before playing any part of the media
data and before setting readyState
to a value equal to
or greater than HAVE_METADATA
, even if doing
so requires fetching multiple parts of the resource.
When the length of the media
resource changes to a known value (e.g. from being unknown to
known, or from a previously established length to a new length) the
user agent must queue a task to fire a simple
event named durationchange
at the
media element. (The event is not fired when the
duration is reset as part of loading a new media resource.)
If an "infinite" stream ends for some reason,
then the duration would change from positive Infinity to the time of
the last frame or sample in the stream, and the durationchange
event would
be fired. Similarly, if the user agent initially estimated the
media resource's duration instead of determining it
precisely, and later revises the estimate based on new information,
then the duration would change and the durationchange
event would
be fired.
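The possible values of the duration attribute, and the cases in which a durationchange event is fired, can be sketched as follows. The resource is modeled as a plain object with hypothetical fields (`hasMediaData`, `unbounded`, `endTime`), and both function names are assumptions of this sketch.

```javascript
// Value returned by the duration attribute for a simplified resource model.
function durationValue(resource) {
  if (!resource || !resource.hasMediaData) return NaN; // nothing available yet
  if (resource.unbounded) return Infinity;             // e.g. a streaming radio
  return resource.endTime;                              // seconds on the media timeline
}

// Whether a durationchange event should be queued for a change in duration.
// resetForNewLoad: the duration is being reset as part of loading a new resource.
function shouldFireDurationChange(oldDuration, newDuration, resetForNewLoad) {
  if (resetForNewLoad) return false;
  // Object.is treats NaN as equal to NaN, unlike ===.
  return !Object.is(oldDuration, newDuration) && !Number.isNaN(newDuration);
}
```

For instance, an "infinite" stream that ends changes from Infinity to a finite value, so `shouldFireDurationChange(Infinity, 95.5, false)` returns true.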
Some video files also have an explicit date and time corresponding to the zero time in the media timeline, known as the timeline offset. Initially, the timeline offset must be set to Not-a-Number (NaN).
The startOffsetTime
attribute must return a new Date
object representing
the current timeline offset.
The loop
attribute is a boolean attribute that, if specified,
indicates that the media element is to seek back to the
start of the media resource upon reaching the end.
The loop
IDL
attribute must reflect the content attribute of the
same name.
readyState
Returns a value that expresses the current state of the element with respect to rendering the current playback position, from the codes in the list below.
Media elements have a ready state, which describes to what degree they are ready to be rendered at the current playback position. The possible values are as follows; the ready state of a media element at any particular time is the greatest value describing the state of the element:
HAVE_NOTHING (numeric value 0)
networkState attribute are set to NETWORK_EMPTY are always in the HAVE_NOTHING state.
HAVE_METADATA (numeric value 1)
video element, the dimensions of the video are also available. The API will no longer raise an exception when seeking. No media data is available for the immediate current playback position. The text tracks are ready.
HAVE_CURRENT_DATA (numeric value 2)
HAVE_METADATA state, or there is no more data to obtain in the direction of playback. For example, in video this corresponds to the user agent having data from the current frame, but not the next frame; and to when playback has ended.
HAVE_FUTURE_DATA (numeric value 3)
HAVE_METADATA state. For example, in video this corresponds to the user agent having data for at least the current frame and the next frame. The user agent cannot be in this state if playback has ended, as the current playback position can never advance in this case.
HAVE_ENOUGH_DATA (numeric value 4)
HAVE_FUTURE_DATA state are met, and, in addition, the user agent estimates that data is being fetched at a rate where the current playback position, if it were to advance at the rate given by the defaultPlaybackRate attribute, would not overtake the available data before playback reaches the end of the media resource.
When the ready state of a media element whose networkState is not NETWORK_EMPTY changes, the user agent must follow the steps given below:
If the previous ready state was HAVE_NOTHING, and the new ready state is HAVE_METADATA
A loadedmetadata DOM event will be fired as part of the load() algorithm.
If the previous ready state was HAVE_METADATA and the new ready state is HAVE_CURRENT_DATA or greater
If this is the first time this occurs for this media element since the load() algorithm was last invoked, the user agent must queue a task to fire a simple event named loadeddata at the element.
If the new ready state is HAVE_FUTURE_DATA
or
HAVE_ENOUGH_DATA
,
then the relevant steps below must then be run also.
If the previous ready state was HAVE_FUTURE_DATA or more, and the new ready state is HAVE_CURRENT_DATA or less
A waiting DOM event can be fired, depending on the current state of playback.
If the previous ready state was HAVE_CURRENT_DATA or less, and the new ready state is HAVE_FUTURE_DATA
The user agent must queue a task to fire a simple event named canplay.
If the element is potentially playing, the user agent must queue a task to fire a simple event named playing.
If the new ready state is HAVE_ENOUGH_DATA
If the previous ready state was HAVE_CURRENT_DATA
or
less, the user agent must queue a task to fire
a simple event named canplay
, and, if the element is also
potentially playing, queue a task to
fire a simple event named playing
.
If the autoplaying flag is true, and the paused
attribute is true, and the
media element has an autoplay
attribute specified,
and the media element's Document
's
browsing context did not have the sandboxed
automatic features browsing context flag set when the
Document
was created, then the user agent may also
set the paused
attribute to
false, queue a task to fire a simple
event named play
, and
queue a task to fire a simple event
named playing
.
User agents are not required to autoplay, and it
is suggested that user agents honor user preferences on the
matter. Authors are urged to use the autoplay
attribute rather than
using script to force the video to play, so as to allow the user
to override the behavior if so desired.
In any case, the user agent must finally queue a
task to fire a simple event named canplaythrough
.
It is possible for the ready state of a media
element to jump between these states discontinuously. For example,
the state of a media element can jump straight from HAVE_METADATA
to HAVE_ENOUGH_DATA
without
passing through the HAVE_CURRENT_DATA
and
HAVE_FUTURE_DATA
states.
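The events that accompany a ready state change, including a discontinuous jump, can be sketched as a function from the previous and new states to an ordered list of event names. This is a simplified reading of the steps above, not a definitive implementation: the function name and the `opts` flags are hypothetical, the waiting event is left out because it depends on the state of playback, and loadedmetadata is omitted because it is fired as part of the load() algorithm.

```javascript
const HAVE_NOTHING = 0;
const HAVE_METADATA = 1;
const HAVE_CURRENT_DATA = 2;
const HAVE_FUTURE_DATA = 3;
const HAVE_ENOUGH_DATA = 4;

// Returns the event names queued for one ready state change, in order.
// opts.playing: the element is potentially playing in the new state.
// opts.firstData: loadeddata has not been fired since load() was last invoked.
// opts.autoplay: all autoplay conditions hold and the user agent chooses to autoplay.
function eventsForReadyStateChange(prev, next, opts = {}) {
  const { playing = false, firstData = true, autoplay = false } = opts;
  const events = [];
  if (prev === HAVE_METADATA && next >= HAVE_CURRENT_DATA && firstData) {
    events.push("loadeddata");
  }
  if (prev <= HAVE_CURRENT_DATA && next >= HAVE_FUTURE_DATA) {
    events.push("canplay");
    if (playing) events.push("playing");
  }
  if (next === HAVE_ENOUGH_DATA) {
    if (autoplay) events.push("play", "playing");
    events.push("canplaythrough"); // fired in any case
  }
  return events;
}

// A jump straight from HAVE_METADATA to HAVE_ENOUGH_DATA still yields every event.
console.log(eventsForReadyStateChange(HAVE_METADATA, HAVE_ENOUGH_DATA));
// ["loadeddata", "canplay", "canplaythrough"]
```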
The readyState
IDL
attribute must, on getting, return the value described above that
describes the current ready state of the media
element.
The autoplay
attribute is a boolean attribute. When present, the
user agent (as described in the algorithm
herein) will automatically begin playback of the
media resource as soon as it can do so without
stopping.
Authors are urged to use the autoplay
attribute rather than
using script to trigger automatic playback, as this allows the user
to override the automatic playback when it is not desired, e.g. when
using a screen reader. Authors are also encouraged to consider not
using the automatic playback behavior at all, and instead to let the
user agent wait for the user to start playback explicitly.
The autoplay
IDL attribute must reflect the content attribute of the
same name.
paused
Returns true if playback is paused; false otherwise.
ended
Returns true if playback has reached the end of the media resource.
defaultPlaybackRate [ = value ]
Returns the default rate of playback, for when the user is not fast-forwarding or reversing through the media resource.
Can be set, to change the default rate of playback.
The default rate has no direct effect on playback, but if the user switches to a fast-forward mode, when they return to the normal playback mode, it is expected that the rate of playback will be returned to the default rate of playback.
playbackRate [ = value ]
Returns the current rate of playback, where 1.0 is normal speed.
Can be set, to change the rate of playback.
played
Returns a TimeRanges
object that represents the
ranges of the media resource that the user agent has
played.
play()
Sets the paused
attribute
to false, loading the media resource and beginning
playback if necessary. If the playback had ended, will restart it
from the start.
pause()
Sets the paused
attribute
to true, loading the media resource if necessary.
The paused
attribute represents whether the media element is
paused or not. The attribute must initially be true.
A media element is said to be potentially
playing when its paused
attribute is false, the readyState
attribute is either
HAVE_FUTURE_DATA
or
HAVE_ENOUGH_DATA
,
the element has not ended playback, playback has not
stopped due to errors, and the element has not paused
for user interaction.
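As an illustration only, the definition of potentially playing can be read as a single predicate over the element's state. The sketch below is not part of any interface defined here: the `PlaybackSnapshot` shape and the `isPotentiallyPlaying` name are hypothetical, and the numeric ready-state values are assumed to ascend in the order the states are listed.

```typescript
// Ready-state constants in ascending order (values are an assumption).
const ReadyState = {
  HAVE_NOTHING: 0,
  HAVE_METADATA: 1,
  HAVE_CURRENT_DATA: 2,
  HAVE_FUTURE_DATA: 3,
  HAVE_ENOUGH_DATA: 4,
} as const;

// Hypothetical snapshot of the conditions the definition refers to.
interface PlaybackSnapshot {
  paused: boolean;
  readyState: number;
  endedPlayback: boolean;
  stoppedDueToErrors: boolean;
  pausedForUserInteraction: boolean;
}

// True only when every clause of the definition holds at once.
function isPotentiallyPlaying(s: PlaybackSnapshot): boolean {
  const hasPlayableData =
    s.readyState === ReadyState.HAVE_FUTURE_DATA ||
    s.readyState === ReadyState.HAVE_ENOUGH_DATA;
  return (
    !s.paused &&
    hasPlayableData &&
    !s.endedPlayback &&
    !s.stoppedDueToErrors &&
    !s.pausedForUserInteraction
  );
}
```

Note that an element whose paused attribute is false is not necessarily potentially playing; it may simply be waiting for data.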
A media element is said to have ended
playback when the element's readyState
attribute is HAVE_METADATA
or greater, and
either the current playback position is the end of the
media resource and the direction of
playback is forwards and the media element does
not have a loop
attribute
specified, or the current playback position is the
earliest possible position and the direction of
playback is backwards.
The ended
attribute must return true if the media element has
ended playback and the direction of
playback is forwards, and false otherwise.
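The difference between having ended playback and the value of the ended attribute can be sketched as follows. This is a minimal model under stated assumptions: the `EndCheckInputs` shape and both function names are hypothetical, times are in seconds, and HAVE_METADATA is assumed to have the numeric value 1.

```typescript
// Hypothetical inputs; times are in seconds of media time.
interface EndCheckInputs {
  readyState: number; // 0 = HAVE_NOTHING ... 4 = HAVE_ENOUGH_DATA (assumed)
  position: number; // current playback position
  earliestPosition: number; // earliest possible position
  endOfResource: number; // end of the media resource
  playbackRate: number; // positive or zero means forwards
  loop: boolean; // whether the loop attribute is specified
}

// "Ended playback" covers both directions of playback.
function hasEndedPlayback(m: EndCheckInputs): boolean {
  const HAVE_METADATA = 1;
  if (m.readyState < HAVE_METADATA) return false;
  if (m.playbackRate >= 0) {
    return m.position === m.endOfResource && !m.loop;
  }
  return m.position === m.earliestPosition;
}

// The ended attribute only reports the forwards case.
function endedAttribute(m: EndCheckInputs): boolean {
  return hasEndedPlayback(m) && m.playbackRate >= 0;
}
```

In this model an element playing backwards that reaches the earliest possible position has ended playback, yet its ended attribute remains false.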
A media element is said to have stopped due to
errors when the element's readyState
attribute is HAVE_METADATA
or greater, and
the user agent encounters a
non-fatal error during the processing of the media
data, and due to that error, is not able to play the content
at the current playback position.
A media element is said to have paused for user
interaction when its paused
attribute is false, the readyState
attribute is either
HAVE_FUTURE_DATA
or
HAVE_ENOUGH_DATA
and
the user agent has reached a point in the media
resource where the user has to make a selection for the
resource to continue.
It is possible for a media element to have both ended playback and paused for user interaction at the same time.
When a media element that is potentially
playing stops playing because it has paused for user
interaction, the user agent must queue a task to
fire a simple event named timeupdate
at the element.
When a media element
that is potentially playing stops playing because its
readyState
attribute
changes to a value lower than HAVE_FUTURE_DATA
, without
the element having ended playback, or playback having
stopped due to errors, or playback having paused
for user interaction, or the seeking algorithm being invoked, the
user agent must queue a task to fire a simple
event named timeupdate
at the element, and queue a task to fire a simple
event named waiting
at
the element.
When the current playback position reaches the end of the media resource when the direction of playback is forwards, then the user agent must follow these steps:
If the media element has a loop
attribute specified, then seek to the earliest possible
position of the media resource and abort these
steps.
Stop playback.
The ended
attribute becomes
true.
The user agent must queue a task to fire
a simple event named timeupdate
at the element.
The user agent must queue a task to fire
a simple event named ended
at the element.
When the current playback position reaches the earliest possible position of the media resource when the direction of playback is backwards, then the user agent must follow these steps:
Stop playback.
The user agent must queue a task to fire
a simple event named timeupdate
at the element.
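The two sets of steps above might be modelled as shown below. This is a sketch, not a definitive implementation: the `EndOfPlaybackModel` shape and function names are hypothetical, queued tasks are represented by pushing event names onto an array, and the looping case simply moves the position rather than running the full seeking algorithm with its own events.

```typescript
// Hypothetical model; queuedEvents stands in for the task queue.
interface EndOfPlaybackModel {
  position: number;
  earliestPosition: number;
  loop: boolean;
  playing: boolean;
  ended: boolean;
  queuedEvents: string[];
}

function makeEndModel(overrides: Partial<EndOfPlaybackModel>): EndOfPlaybackModel {
  return {
    position: 0,
    earliestPosition: 0,
    loop: false,
    playing: true,
    ended: false,
    queuedEvents: [],
    ...overrides,
  };
}

// Position reached the end of the resource while playing forwards.
function reachedEndForwards(m: EndOfPlaybackModel): void {
  if (m.loop) {
    // Simplification: a real seek also fires its own events.
    m.position = m.earliestPosition;
    return;
  }
  m.playing = false; // stop playback
  m.ended = true; // the ended attribute becomes true
  m.queuedEvents.push("timeupdate");
  m.queuedEvents.push("ended");
}

// Position reached the earliest possible position while playing backwards.
function reachedStartBackwards(m: EndOfPlaybackModel): void {
  m.playing = false;
  m.queuedEvents.push("timeupdate");
}
```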
The defaultPlaybackRate
attribute gives the desired speed at which the media
resource is to play, as a multiple of its intrinsic
speed. The attribute is mutable: on getting it must return the last
value it was set to, or 1.0 if it hasn't yet been set; on setting
the attribute must be set to the new value.
The playbackRate
attribute gives the speed at which the media resource
plays, as a multiple of its intrinsic speed. If it is not equal to
the defaultPlaybackRate
,
then the implication is that the user is using a feature such as
fast forward or slow motion playback. The attribute is mutable: on
getting it must return the last value it was set to, or 1.0 if it
hasn't yet been set; on setting the attribute must be set to the new
value, and the playback must change speed (if the element is
potentially playing).
If the playbackRate
is positive or zero, then the direction of playback is
forwards. Otherwise, it is backwards.
The "play" function in a user agent's interface must set the
playbackRate
attribute
to the value of the defaultPlaybackRate
attribute before invoking the play()
method's steps. Features such
as fast-forward or rewind must be implemented by only changing the
playbackRate
attribute.
When the defaultPlaybackRate
or
playbackRate
attributes
change value (either by being set by script or by being changed
directly by the user agent, e.g. in response to user control) the
user agent must queue a task to fire a simple
event named ratechange
at the media element.
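The relationship between the two rate attributes, the direction of playback, and the ratechange event can be sketched as follows. The `RateState` shape and all function names are hypothetical. The sketch also assumes that "change value" means the new value differs from the old one, which is one possible reading of the requirement above.

```typescript
// Hypothetical model; queuedEvents stands in for the task queue.
interface RateState {
  defaultPlaybackRate: number;
  playbackRate: number;
  queuedEvents: string[];
}

// Both attributes report 1.0 until they are set.
function createRateState(): RateState {
  return { defaultPlaybackRate: 1.0, playbackRate: 1.0, queuedEvents: [] };
}

function setDefaultPlaybackRate(s: RateState, value: number): void {
  if (value === s.defaultPlaybackRate) return; // assumption: no event without a change
  s.defaultPlaybackRate = value;
  s.queuedEvents.push("ratechange");
}

function setPlaybackRate(s: RateState, value: number): void {
  if (value === s.playbackRate) return; // assumption: no event without a change
  s.playbackRate = value;
  s.queuedEvents.push("ratechange");
}

// Positive or zero is forwards; anything else is backwards.
function directionOfPlayback(s: RateState): "forwards" | "backwards" {
  return s.playbackRate >= 0 ? "forwards" : "backwards";
}

// The user agent's "play" control resets the rate before the play() steps run.
function userInterfacePlay(s: RateState): void {
  setPlaybackRate(s, s.defaultPlaybackRate);
  // ...the play() method's steps would follow here.
}
```

A rewind feature, for example, would call only `setPlaybackRate` with a negative value and leave `defaultPlaybackRate` untouched, so that the "play" control later restores normal speed.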
The played
attribute must return a new static normalized
TimeRanges
object that represents the ranges of
the media resource, if any, that the user agent has so
far rendered, at the time the attribute is evaluated.
When the play()
method on a media element is invoked, the user agent
must run the following steps.
If the media element's networkState
attribute has
the value NETWORK_EMPTY
, invoke the
media element's resource selection
algorithm.
If the playback has ended and the direction of playback is forwards, seek to the earliest possible position of the media resource.
This will cause the user
agent to queue a task to fire a simple
event named timeupdate
at the media
element.
If the media element's paused
attribute is true, run
the following substeps:
Change the value of paused
to false.
Queue a task to fire a simple event
named play
at the element.
If the media element's readyState
attribute has the
value HAVE_NOTHING
,
HAVE_METADATA
, or
HAVE_CURRENT_DATA
,
queue a task to fire a simple event
named waiting
at the
element.
Otherwise, the media element's readyState
attribute has the
value HAVE_FUTURE_DATA
or
HAVE_ENOUGH_DATA
;
queue a task to fire a simple event
named playing
at the
element.
Set the media element's autoplaying flag to false.
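The steps for play() can be traced with a small model. This is a sketch under stated assumptions rather than a definitive implementation: the `PlayModel` shape is hypothetical, the resource selection algorithm is reduced to a counter, the seek is reduced to moving the position and queuing timeupdate, and HAVE_FUTURE_DATA is assumed to have the numeric value 3.

```typescript
// Hypothetical model; queuedEvents stands in for the task queue.
interface PlayModel {
  networkEmpty: boolean; // networkState is NETWORK_EMPTY
  readyState: number; // 0 = HAVE_NOTHING ... 4 = HAVE_ENOUGH_DATA (assumed)
  paused: boolean;
  endedPlayback: boolean;
  playbackRate: number;
  position: number;
  earliestPosition: number;
  autoplaying: boolean;
  resourceSelectionInvocations: number;
  queuedEvents: string[];
}

function makePlayModel(overrides: Partial<PlayModel>): PlayModel {
  return {
    networkEmpty: false, readyState: 0, paused: true, endedPlayback: false,
    playbackRate: 1, position: 0, earliestPosition: 0, autoplaying: true,
    resourceSelectionInvocations: 0, queuedEvents: [],
    ...overrides,
  };
}

function playSteps(m: PlayModel): void {
  const HAVE_FUTURE_DATA = 3;
  if (m.networkEmpty) {
    m.resourceSelectionInvocations += 1; // placeholder for resource selection
  }
  if (m.endedPlayback && m.playbackRate >= 0) {
    m.position = m.earliestPosition; // simplified seek
    m.endedPlayback = false;
    m.queuedEvents.push("timeupdate");
  }
  if (m.paused) {
    m.paused = false;
    m.queuedEvents.push("play");
    m.queuedEvents.push(m.readyState >= HAVE_FUTURE_DATA ? "playing" : "waiting");
  }
  m.autoplaying = false;
}
```

Calling `playSteps` on an element that is already playing queues no events in this model; only the autoplaying flag is cleared.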
When the pause()
method is invoked, the user agent must run the following steps:
If the media element's networkState
attribute has
the value NETWORK_EMPTY
, invoke the
media element's resource selection
algorithm.
Set the media element's autoplaying flag to false.
If the media element's paused
attribute is false, run the
following steps:
Change the value of paused
to true.
Queue a task to fire a simple
event named timeupdate
at the
element.
Queue a task to fire a simple
event named pause
at the element.
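The pause() steps are shorter and can be sketched the same way. The `PauseModel` shape is hypothetical, the resource selection algorithm is again reduced to a counter, and queued tasks are represented by event names in an array.

```typescript
// Hypothetical model; queuedEvents stands in for the task queue.
interface PauseModel {
  networkEmpty: boolean; // networkState is NETWORK_EMPTY
  paused: boolean;
  autoplaying: boolean;
  resourceSelectionInvocations: number;
  queuedEvents: string[];
}

function makePauseModel(overrides: Partial<PauseModel>): PauseModel {
  return {
    networkEmpty: false,
    paused: false,
    autoplaying: true,
    resourceSelectionInvocations: 0,
    queuedEvents: [],
    ...overrides,
  };
}

function pauseSteps(m: PauseModel): void {
  if (m.networkEmpty) {
    m.resourceSelectionInvocations += 1; // placeholder for resource selection
  }
  m.autoplaying = false;
  if (!m.paused) {
    m.paused = true;
    m.queuedEvents.push("timeupdate");
    m.queuedEvents.push("pause");
  }
}
```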
When a media element is
potentially playing and its Document
is a
fully active Document
, its current
playback position must increase monotonically at playbackRate
units of media
time per unit time of wall clock time.
This specification doesn't define how the user agent achieves the appropriate playback rate — depending on the protocol and media available, it is plausible that the user agent could negotiate with the server to have the server provide the media data at the appropriate rate, so that (except for the period between when the rate is changed and when the server updates the stream's playback rate) the client doesn't actually have to drop or interpolate any frames.
When the playbackRate
is negative (playback is backwards), any corresponding audio must be
muted. When the playbackRate
is so low or so
high that the user agent cannot play audio usefully, the
corresponding audio must also be muted. If the playbackRate
is not 1.0, the
user agent may apply pitch adjustments to the audio as necessary to
render it faithfully.
The playbackRate
can
be 0.0, in which case the current playback position
doesn't move, despite playback not being paused (paused
doesn't become true, and the
pause
event doesn't fire).
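The arithmetic behind these requirements is simple enough to show directly. In the sketch below, both function names are hypothetical, and the `minUsefulRate` and `maxUsefulRate` thresholds are placeholders for whatever limits a given user agent considers useful for audio; no such values are defined here.

```typescript
// New playback position after some wall clock time has elapsed while the
// element is potentially playing.
function advancePosition(
  position: number,
  playbackRate: number,
  elapsedWallClockSeconds: number
): number {
  return position + playbackRate * elapsedWallClockSeconds;
}

// Whether audio is muted for a given rate. The thresholds are hypothetical
// user agent limits, not values taken from this specification.
function shouldMuteAudio(
  playbackRate: number,
  minUsefulRate: number,
  maxUsefulRate: number
): boolean {
  if (playbackRate < 0) return true; // backwards playback is always muted
  return playbackRate < minUsefulRate || playbackRate > maxUsefulRate;
}
```

For instance, half a second of wall clock time at a rate of 2.0 advances the position by one second of media time, while any amount of wall clock time at a rate of 0.0 leaves the position unchanged.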
Media elements that are
potentially playing while not in a
Document
must not play any video, but should
play any audio component. Media elements must not stop playing just
because all references to them have been removed; only once a media
element to which no references exist has reached a point where no
further audio remains to be played for that element (e.g. because
the element is paused, or because the end of the clip has been
reached, or because its playbackRate
is 0.0) may the
element be garbage collected.
When the current playback position of a media element changes (e.g. due to playback or seeking), the user agent must run the following steps. If the current playback position changes while the steps are running, then the user agent must wait for the steps to complete, and then must immediately rerun the steps. (These steps are thus run as often as possible or needed — if one iteration takes a long time, this can cause certain cues to be skipped over as the user agent rushes ahead to "catch up".)
Let current cues be an ordered list of cues, initialized to contain all the cues of all the hidden, showing, or showing by default text tracks of the media element (not the disabled ones) whose start times are less than or equal to the current playback position and whose end times are greater than the current playback position, in text track cue order.
Let other cues be an ordered list of cues, initialized to contain all the cues of hidden, showing, and showing by default text tracks of the media element that are not present in current cues, also in text track cue order.
If the time was reached through the usual monotonic increase
of the current playback position during normal playback, and if the
user agent has not fired a timeupdate
event at the
element in the past 15 to 250ms and is not still running event
handlers for such an event, then the user agent must queue a
task to fire a simple event named timeupdate
at the
element. (In the other cases, such as explicit seeks, relevant
events get fired as part of the overall process of changing the
current playback position.)
The event thus is not to be fired faster than about 66Hz or slower than 4Hz (assuming the event handlers don't take longer than 250ms to run). User agents are encouraged to vary the frequency of the event based on the system load and the average cost of processing the event each time, so that the UI updates are not any more frequent than the user agent can comfortably handle while decoding the video.
If all of the cues in current cues have their text track cue active flag set, and none of the cues in other cues have their text track cue active flag set, then abort these steps.
If the time was reached through the usual monotonic increase
of the current playback position during normal playback, and there
are cues in other cues that have both their text track
cue active flag set and their text track cue
pause-on-exit flag set, then immediately act as if the
element's pause()
method had
been invoked. (In the other cases, such as explicit seeks,
playback is not paused by going past the end time of a cue, even if that cue has its text track cue pause-on-exit
flag set.)
Let affected tracks be a list of text tracks, initially empty.
For each text track
cue in other cues that has its
text track cue active flag set, in list order,
queue a task to fire a simple event named
exit
at the
TextTrackCue
object, and add the cue's text track to affected tracks, if it's not already in the
list.
For each text track
cue in current cues that does not have
its text track cue active flag set, in list order,
queue a task to fire a simple event named
enter
at the
TextTrackCue
object, and add the cue's text track to affected tracks, if it's not already in the
list.
For each text track in affected
tracks, in the order they were added to the list (which will
match the relative order of the text
tracks in the media element's list of
text tracks), queue a task to fire a
simple event named cuechange
at the
TextTrack
object, and, if the text
track has a corresponding track
element, to
then fire a simple event named cuechange
at the track
element as well.
Set the text track cue active flag of all the cues in the current cues, and unset the text track cue active flag of all the cues in the other cues.
Run the rules for updating the text track rendering of each of the text tracks in affected tracks that are showing or showing by default.
For the purposes of the algorithm above, a text track cue is considered to be part of a text track only if it is listed in the text track list of cues, not merely if it is associated with the text track.
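The algorithm above can be sketched in two parts: the rate limit on timeupdate and the bookkeeping of cue active flags. This is a simplified model, not a definitive implementation. All names are hypothetical, cues are assumed to be supplied already in text track cue order and to belong only to tracks that are not disabled, events are recorded as strings, and the rendering update is omitted.

```typescript
// Hypothetical cue model.
interface CueModel {
  id: string;
  track: string;
  start: number;
  end: number;
  active: boolean; // text track cue active flag
  pauseOnExit: boolean; // text track cue pause-on-exit flag
}

interface CueUpdateResult {
  queuedEvents: string[];
  affectedTracks: string[];
  pauseRequested: boolean;
}

// intervalMs is the user agent's chosen spacing, between 15 ms and 250 ms.
function shouldQueueTimeupdate(
  nowMs: number,
  lastFiredMs: number | null,
  handlersStillRunning: boolean,
  intervalMs: number
): boolean {
  if (intervalMs < 15 || intervalMs > 250) {
    throw new RangeError("intervalMs must be within 15..250");
  }
  if (handlersStillRunning) return false;
  return lastFiredMs === null || nowMs - lastFiredMs >= intervalMs;
}

function isCurrent(c: CueModel, position: number): boolean {
  return c.start <= position && c.end > position;
}

function updateCueStates(
  cues: CueModel[],
  position: number,
  reachedByNormalPlayback: boolean
): CueUpdateResult {
  const result: CueUpdateResult = { queuedEvents: [], affectedTracks: [], pauseRequested: false };
  const currentCues = cues.filter((c) => isCurrent(c, position));
  const otherCues = cues.filter((c) => !isCurrent(c, position));
  // Nothing to do if every flag already matches the position.
  if (currentCues.every((c) => c.active) && otherCues.every((c) => !c.active)) {
    return result;
  }
  if (reachedByNormalPlayback && otherCues.some((c) => c.active && c.pauseOnExit)) {
    result.pauseRequested = true; // act as if pause() had been invoked
  }
  const record = (c: CueModel, eventName: string): void => {
    result.queuedEvents.push(`${eventName}:${c.id}`);
    if (result.affectedTracks.indexOf(c.track) === -1) result.affectedTracks.push(c.track);
  };
  for (const c of otherCues) if (c.active) record(c, "exit");
  for (const c of currentCues) if (!c.active) record(c, "enter");
  for (const t of result.affectedTracks) result.queuedEvents.push(`cuechange:${t}`);
  for (const c of currentCues) c.active = true;
  for (const c of otherCues) c.active = false;
  return result;
}
```

The 15 ms and 250 ms bounds correspond to the roughly 66Hz and 4Hz limits mentioned above, since 1000/15 is about 66.7 and 1000/250 is 4.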
When a media element is removed from a
Document
, if the media element's
networkState
attribute
has a value other than NETWORK_EMPTY
then the user
agent must act as if the pause()
method had been invoked.
If the media element's
Document
stops being a fully active
document, then the playback will stop
until the document is active again.
seeking
Returns true if the user agent is currently seeking.
seekable
Returns a TimeRanges
object that represents the
ranges of the media resource to which it is possible
for the user agent to seek.
The seeking
attribute must initially have the value false.
When the user agent is required to seek to a particular new playback position in the media resource, it means that the user agent must run the following steps. This algorithm interacts closely with the event loop mechanism; in particular, it has a synchronous section (which is triggered as part of the event loop algorithm). Steps in that section are marked with ⌛.
If the media element's readyState
is HAVE_NOTHING
, then raise an
INVALID_STATE_ERR
exception (if the seek was in
response to a DOM method call or setting of an IDL attribute), and
abort these steps.
If the element's seeking
IDL attribute is true,
then another instance of this algorithm is already running. Abort
that other instance of the algorithm without waiting for the step
that it is running to complete.
Set the seeking
IDL
attribute to true.
If the seek was in response to a DOM method call or setting of an IDL attribute, then continue the script. The remainder of these steps must be run asynchronously. With the exception of the steps marked with ⌛, they could be aborted at any time by another instance of this algorithm being invoked.
If the new playback position is later than the end of the media resource, then let it be the end of the media resource instead.
If the new playback position is less than the earliest possible position, let it be that position instead.
If the (possibly now changed) new playback
position is not in one of the ranges given in the seekable
attribute, then let it
be the position in one of the ranges given in the seekable
attribute that is the
nearest to the new playback position. If two
positions both satisfy that constraint (i.e. the new
playback position is exactly in the middle between two ranges
in the seekable
attribute)
then use the position that is closest to the current playback
position. If there are no ranges given in the seekable
attribute then set the
seeking
IDL attribute to
false and abort these steps.
Set the current playback position to the given new playback position.
Queue a task to fire a simple
event named seeking
at the element.
Queue a task to fire a
simple event named timeupdate
at the
element.
If the media element was potentially
playing immediately before it started seeking, but seeking
caused its readyState
attribute to change to a value lower than HAVE_FUTURE_DATA
, then
queue a task to fire a simple event named
waiting
at the
element.
Wait until the user agent has established whether or not the media data for the new playback position is available, and, if it is, until it has decoded enough data to play back that position.
Await a stable state. The synchronous section consists of all the remaining steps of this algorithm. (Steps in the synchronous section are marked with ⌛.)
⌛ Set the seeking
IDL attribute to
false.
⌛ Queue a task to fire a simple
event named seeked
at the element.
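The part of the seeking algorithm that adjusts the new playback position can be sketched as a pure function. This covers only the clamping and nearest-range steps; events, the seeking attribute, and the asynchronous portions are omitted. The `TimeRange` shape and the `resolveSeekTarget` name are hypothetical, and range ends are assumed to be inclusive.

```typescript
// Hypothetical stand-in for one range of the seekable attribute.
interface TimeRange {
  start: number;
  end: number;
}

// Returns the adjusted position, or null when there is nothing to seek to
// (in which case the algorithm sets seeking to false and aborts).
function resolveSeekTarget(
  requested: number,
  current: number,
  earliest: number,
  endOfResource: number,
  seekable: TimeRange[]
): number | null {
  let target = Math.min(requested, endOfResource);
  target = Math.max(target, earliest);
  if (seekable.length === 0) return null;
  if (seekable.some((r) => r.start <= target && target <= r.end)) return target;
  let best: number | null = null;
  for (const r of seekable) {
    // Nearest point of this range to a target that lies outside it.
    const candidate = target < r.start ? r.start : r.end;
    if (best === null) {
      best = candidate;
      continue;
    }
    const distance = Math.abs(candidate - target);
    const bestDistance = Math.abs(best - target);
    const closerToCurrent = Math.abs(candidate - current) < Math.abs(best - current);
    if (distance < bestDistance || (distance === bestDistance && closerToCurrent)) {
      best = candidate;
    }
  }
  return best;
}
```

With seekable ranges of 0 to 10 and 20 to 30, a request for position 15 is equidistant from both ranges, so the result depends on the current playback position: 20 if playback is currently at 25, and 10 if it is at 2.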
The seekable
attribute must return a new static normalized
TimeRanges
object that represents the ranges of
the media resource, if any, that the user agent is able
to seek to, at the time the attribute is evaluated.
If the user agent can seek to anywhere in the
media resource, e.g. because it is a simple movie file
and the user agent and the server support HTTP Range requests, then
the attribute would return an object with one range, whose start is
the time of the first frame (the earliest possible
position, typically zero), and whose end is the same as the
time of the first frame plus the duration
attribute's value (which
would equal the time of the last frame, and might be positive
Infinity).
The range might be continuously changing, e.g. if the user agent is buffering a sliding window on an infinite stream. This is the behavior seen with DVRs viewing live TV, for instance.
Media resources might be internally scripted or interactive. Thus, a media element could play in a non-linear fashion. If this happens, the user agent must act as if the algorithm for seeking was used whenever the current playback position changes in a discontinuous fashion (so that the relevant events fire).
A media element can have a group of associated text tracks, known as the media element's list of text tracks. The text tracks are sorted as follows:
The text tracks corresponding to track
element children of the media
element, in tree order.
Any text tracks added using the addTrack()
method, in
the order they were added, oldest first.
A text track consists of:
This decides how the track is handled by the user agent. The kind is represented by a string. The possible strings are:
subtitles
captions
descriptions
chapters
metadata
The kind of track can
change dynamically, in the case of a text track
corresponding to a track
element.
This is a human-readable string intended to identify the track for the user. In certain cases, the label might be generated automatically.
The label of a track can
change dynamically, in the case of a text track
corresponding to a track
element or in the case of an
automatically-generated label whose value depends on variable
factors such as the user's preferred user interface language.
This is a string (a BCP 47 language tag) representing the language of the text track's cues. [BCP47]
The language of a text
track can change dynamically, in the case of a text
track corresponding to a track
element.
One of the following:
Indicates that the text track is known to exist (e.g. it has
been declared with a track
element), but its cues
have not been obtained.
Indicates that the text track is loading and there have been no fatal errors encountered so far. Further cues might still be added to the track.
Indicates that the text track has been loaded with no fatal
errors. No new cues will be added to the track except if the
text track corresponds to a
MutableTextTrack
object.
Indicates that the text track was enabled, but when the user agent attempted to obtain it, this failed in some way (e.g. URL could not be resolved, network error, unknown text track format). Some or all of the cues are likely missing and will not be obtained.
The readiness state of a text track changes dynamically as the track is obtained.
One of the following:
Indicates that the text track is not active. Other than for the purposes of exposing the track in the DOM, the user agent is ignoring the text track. No cues are active, no events are fired, and the user agent will not attempt to obtain the track's cues.
Indicates that the text track is active, but that the user agent is not actively displaying the cues. If no attempt has yet been made to obtain the track's cues, the user agent will perform such an attempt momentarily. The user agent is maintaining a list of which cues are active, and events are being fired accordingly.
Indicates that the text track is active. If no attempt has
yet been made to obtain the track's cues, the user agent will
perform such an attempt momentarily. The user agent is
maintaining a list of which cues are active, and events are
being fired accordingly. In addition, for text tracks whose
kind is subtitles
or captions
, the cues
are being displayed over the video as appropriate; for text
tracks whose kind is descriptions
,
the user agent is making the cues available to the user in a
non-visual fashion; and for text tracks whose kind is chapters
, the user
agent is making available to the user a mechanism by which the
user can navigate to any point in the media
resource by selecting a cue.
The showing by
default state is used in conjunction with the default
attribute on
track
elements to indicate that the text track was
enabled due to that attribute. This allows the user agent to
override the state if a later track is discovered that is more
appropriate per the user's preferences.
A list of text track cues, along with rules for updating the text track rendering.
The list of cues of a
text track can change dynamically, either because the
text track has not yet been loaded or is still loading, or because the text
track corresponds to a MutableTextTrack
object, whose API allows individual cues to be added or removed
dynamically.
Each text track has a corresponding
TextTrack
object.
The text tracks of a media element are ready if all the text tracks whose mode was not in the disabled state when the element's resource selection algorithm last started now have a text track readiness state of loaded or failed to load.
A text track cue is the unit of time-sensitive data in a text track. For subtitles and captions, for instance, it corresponds to the text that appears at a particular time and disappears at another time.
Each text track cue consists of:
An arbitrary string.
A time, in seconds and fractions of a second, at which the cue becomes relevant.
A time, in seconds and fractions of a second, at which the cue stops being relevant.
A boolean indicating whether playback of the media resource is to pause when the cue stops being relevant.
A writing direction, either horizontal (a line extends horizontally and is positioned vertically, with consecutive lines displayed below each other), vertical growing left (a line extends vertically and is positioned horizontally, with consecutive lines displayed to the left of each other), or vertical growing right (a line extends vertically and is positioned horizontally, with consecutive lines displayed to the right of each other).
A number giving the size of the box within which the text of each line of the cue is to be aligned, to be interpreted as a percentage of the video, as defined by the writing direction.
The raw text of the cue, and rules for its interpretation, allowing the text to be rendered and converted to a DOM fragment.
A text track cue is immutable.
Each text track cue has a corresponding
TextTrackCue
object, and can be associated with a
particular text track. Once a text track
cue is associated with a particular text track,
the association is permanent.
In addition, each text track cue has two pieces of dynamic information:
This flag must be initially unset. The flag is used to ensure events are fired appropriately when the cue becomes active or inactive, and to make sure the right cues are rendered.
The user agent must synchronously unset this flag whenever the
text track cue is removed from its text
track's text track list of cues; whenever the
text track itself is removed from its media
element's list of text tracks or has its
text track mode changed to disabled; and whenever the media
element's readyState
is changed back to
HAVE_NOTHING
. When the
flag is unset in this way for one or more cues in text tracks that were showing or showing by default prior to the
relevant incident, the user agent must, after having unset the
flag for all the affected cues, apply the rules for updating
the text track rendering of those text tracks.
This is used as part of the rendering model, to keep cues in a consistent position. It must initially be empty. Whenever the text track cue active flag is unset, the user agent must empty the text track cue display state.
The text track cues of a media element's text tracks are ordered relative to each other in the text track cue order, which is determined as follows: first group the cues by their text track, with the groups being sorted in the same order as their text tracks appear in the media element's list of text tracks; then, within each group, cues must be sorted by their start time, earliest first; then, any cues with the same start time must be sorted by their end time, earliest first; and finally, any cues with identical end times must be sorted in the order they were created (so e.g. for cues from a WebVTT file, that would be the order in which the cues were listed in the file).
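The text track cue order amounts to a four-level comparison, which can be sketched as a sort comparator. The `OrderedCue` shape is hypothetical: `trackIndex` stands for the position of the cue's text track in the media element's list of text tracks, and `createdSeq` for a creation counter.

```typescript
// Hypothetical cue record carrying just the fields the ordering needs.
interface OrderedCue {
  label: string;
  trackIndex: number; // position of the cue's track in the list of text tracks
  start: number;
  end: number;
  createdSeq: number; // lower means created earlier
}

// Track first, then start time, then end time, then creation order.
function compareCues(a: OrderedCue, b: OrderedCue): number {
  return (
    a.trackIndex - b.trackIndex ||
    a.start - b.start ||
    a.end - b.end ||
    a.createdSeq - b.createdSeq
  );
}

// Returns a new array in text track cue order, leaving the input untouched.
function inCueOrder(cues: OrderedCue[]): OrderedCue[] {
  return cues.slice().sort(compareCues);
}
```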
A media-resource-specific text track is a text track that corresponds to data found in the media resource.
Rules for processing and rendering such data are defined by the relevant specifications, e.g. the specification of the video format if the media resource is a video.
When a media resource contains data that the user agent recognises and supports as being equivalent to a text track, the user agent runs the steps to expose a media-resource-specific text track with the relevant data, as follows:
Associate the relevant data with a new text
track and its corresponding new TextTrack
object. The text track is a
media-resource-specific text track.
Set the new text track's kind, label, and language based on the semantics of the relevant data, as defined by the relevant specification.
Populate the new text track's list of cues with the cues parsed so far, following the guidelines for exposing cues, and begin updating it dynamically as necessary.
Set the new text track's readiness state to the value that most correctly describes the current state, and begin updating it dynamically as necessary.
For example, if the relevant data in the media resource has been fully parsed and completely describes the cues, then the text track would be loaded. On the other hand, if the data for the cues is interleaved with the media data, and the media resource as a whole is still being downloaded, then the loading state might be more accurate.
Set the new text track's mode to the mode consistent with the user's preferences and the requirements of the relevant specification for the data.
Leave the text track list of cues empty, and associate with it the rules for updating the text track rendering appropriate for the format in question.
Add the new text track to the media element's list of text tracks.
When a media element is to forget the media element's media-resource-specific text tracks, the user agent must remove from the media element's list of text tracks all the media-resource-specific text tracks.
When a track
element is created, it must be
associated with a new text track (with its value set
as defined below) and its corresponding new TextTrack
object.
The text track kind is determined from the state of
the element's kind
attribute
according to the following table; for a state given in a cell of the
first column, the kind is the
string given in the second column:
State | String |
---|---|
Subtitles | subtitles |
Captions | captions |
Descriptions | descriptions |
Chapters | chapters |
Metadata | metadata |
The text track label is the element's track label.
The text track language is the element's track language, if any, or the empty string otherwise.
As the kind
, label
, and srclang
attributes are added,
removed, or changed, the text track must update
accordingly, as per the definitions above.
Changes to the track URL are handled in the algorithm below.
The text track list of cues is initially empty. It is dynamically modified when the referenced file is parsed. Associated with the list are the rules for updating the text track rendering appropriate for the format in question; for WebVTT, this is the rules for updating the display of WebVTT text tracks.
When a track
element's parent element changes and
the new parent is a media element, then the user agent
must add the track
element's corresponding text
track to the media element's list of text
tracks.
When a track
element's parent element changes and
the old parent was a media element, then the user agent
must remove the track
element's corresponding
text track from the media element's
list of text tracks.
When a text track corresponding to a
track
element is added to a media
element's list of text tracks, the user agent
must set the text track mode appropriately, as
determined by the following conditions:
If the text track kind is subtitles
or captions
and the user
has indicated an interest in having a track with this text
track kind, text track language, and
text track label enabled, and there is no other
text track in the media element's
list of text tracks with a text track
kind of either subtitles
or captions
whose
text track mode is showing
If the text track kind is descriptions
and
the user has indicated an interest in having text descriptions with
this text track language and text track
label enabled, and there is no other text
track in the media element's list of
text tracks with a text track kind of descriptions
whose
text track mode is showing
If the text track kind is chapters
and the
text track language is one that the user agent has
reason to believe is appropriate for the user, and there is no
other text track in the media element's
list of text tracks with a text track
kind of chapters
whose
text track mode is showing
Let the text track mode be showing.
If there is a text track in the media element's list of text tracks whose text track mode is showing by default, the user agent must furthermore change that text track's text track mode to hidden.
If the track
element has a default
attribute specified, and
there is no other text track in the media
element's list of text tracks whose
text track mode is showing or showing by default
Let the text track mode be showing by default.
Otherwise
Let the text track mode be disabled.
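The conditions above can be condensed into a small decision function. This is a sketch under stated assumptions, not a definitive implementation: the `TrackModel` shape and function names are hypothetical, and the checks on user interest, language, and label are collapsed into a single `matchesUserPreferences` flag that a real user agent would compute from its own settings.

```typescript
type TrackKind = "subtitles" | "captions" | "descriptions" | "chapters" | "metadata";
type TrackMode = "disabled" | "hidden" | "showing" | "showing by default";

interface TrackModel {
  kind: TrackKind;
  mode: TrackMode;
  hasDefaultAttribute: boolean;
  // Hypothetical flag: the user's preferences (or, for chapters, the user
  // agent's language heuristics) match this track.
  matchesUserPreferences: boolean;
}

// Kinds that compete with each other for the showing state.
function kindGroup(kind: TrackKind): string | null {
  if (kind === "subtitles" || kind === "captions") return "subtitles-or-captions";
  if (kind === "descriptions" || kind === "chapters") return kind;
  return null; // metadata tracks are never shown by preference
}

// Picks the mode for a newly added track; may demote tracks in `others`
// from "showing by default" to "hidden".
function chooseModeForAddedTrack(added: TrackModel, others: TrackModel[]): TrackMode {
  const group = kindGroup(added.kind);
  const rivalIsShowing = others.some(
    (t) => kindGroup(t.kind) === group && t.mode === "showing"
  );
  if (group !== null && added.matchesUserPreferences && !rivalIsShowing) {
    for (const t of others) {
      if (t.mode === "showing by default") t.mode = "hidden";
    }
    return "showing";
  }
  const anyShown = others.some(
    (t) => t.mode === "showing" || t.mode === "showing by default"
  );
  if (added.hasDefaultAttribute && !anyShown) return "showing by default";
  return "disabled";
}
```

The demotion to hidden is what lets a later, better-matching track take over from one that was enabled only because of its default attribute.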
When a text track corresponding to a
track
element is created with text track
mode set to hidden,
showing, or showing by default,
and when a text track corresponding to a
track
element is created with text track
mode set to disabled and subsequently changes its text
track mode to hidden,
showing, or showing by default for
the first time, the user agent must immediately and synchronously
run the following algorithm. This algorithm interacts closely with
the event loop mechanism; in particular, it has a
synchronous section (which is triggered as part of the
event loop algorithm). The step in that section is
marked with ⌛.
Set the text track readiness state to loading.
Asynchronously run the remaining steps, while continuing with whatever task was responsible for creating the text track or changing the text track mode.
Download: If URL is not the empty
string, and its origin is the same as the media
element's Document
's origin, then
fetch URL, from the media
element's Document
's origin, with
the force same-origin flag set.
The tasks queued by the fetching algorithm on the networking task source to process the data as it is being fetched must examine the resource's Content Type metadata, once it is available, if it ever is. If no Content Type metadata is ever available, or if the type is not recognised as a text track format, then the resource's format must be assumed to be unsupported (this causes the load to fail, as described below). If a type is obtained, and represents a supported text track format, then the resource's data must be passed to the appropriate parser as it is received, with the text track list of cues being used for that parser's output.
If the fetching algorithm fails for
any reason (network error, the server returns an error code, a
cross-origin check fails, etc.), or if URL is
the empty string or has the wrong origin as
determined by the condition at the start of this step, or if the
fetched resource is not in a supported format, then queue a
task to first change the text track readiness
state to failed to
load and then fire a simple event named error
at the track
element; and then, once that task is queued, move on to the step below labeled
monitoring.
If the fetching algorithm does not
fail, then, when it completes, queue a task to first
change the text track readiness state to loaded and then fire a
simple event named load
at
the track
element; and then, once that task is queued, move on to the step below labeled
monitoring.
If, while the fetching algorithm is active, either:
...then the user agent must run the following steps:
Abort the fetching algorithm.
Queue a task to fire a simple
event named abort
at
the track
element.
Let URL be the new track URL.
Jump back to the top of the step labeled download.
Until one of the above circumstances occurs, the user agent must remain on this step.
Monitoring: Wait until the track URL is no longer equal to URL, at the same time as the text track mode is set to hidden, showing, or showing by default.
Wait until the text track readiness state is no longer set to loading.
Await a stable state. The synchronous section consists of the following step. (The step in the synchronous section is marked with ⌛.)
⌛ Set the text track readiness state to loading.
End the synchronous section, continuing the remaining steps asynchronously.
Jump to the step labeled download.
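This example is non-normative. The outcome of the download step can be modelled as a small pure function. The following is a minimal sketch under stated assumptions: the function name resolveTrackLoad, its parameter object, and the supportedTypes list are hypothetical names introduced for illustration and are not part of the API.

```javascript
// Sketch of how the download step picks a readiness state and an event.
// All names here are illustrative assumptions, not part of the API.
const READY_STATE = { NONE: 0, LOADING: 1, LOADED: 2, ERROR: 3 };

function resolveTrackLoad({ url, urlOrigin, documentOrigin, fetchSucceeded, contentType, supportedTypes }) {
  // An empty URL or a URL with the wrong origin fails without a usable fetch.
  if (url === '' || urlOrigin !== documentOrigin) {
    return { readyState: READY_STATE.ERROR, event: 'error' };
  }
  // Network errors, server error codes, and failed origin checks fail the load.
  if (!fetchSucceeded) {
    return { readyState: READY_STATE.ERROR, event: 'error' };
  }
  // Missing or unrecognised Content Type metadata is treated as unsupported.
  if (!contentType || !supportedTypes.includes(contentType)) {
    return { readyState: READY_STATE.ERROR, event: 'error' };
  }
  return { readyState: READY_STATE.LOADED, event: 'load' };
}
```

In either outcome the algorithm then moves on to the monitoring step; the sketch only covers the choice between the load and error events.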
tracks.length
Returns the number of text tracks associated with the media element (e.g. from track elements). This is the number of text tracks in the media element's list of text tracks.
tracks[n]
Returns the TextTrack object representing the nth text track in the media element's list of text tracks.
track
Returns the TextTrack object representing the track element's text track.
The tracks
attribute of media elements must
return an array host object
for objects of type TextTrack
that is fixed
length and read only. The same object must be returned
each time the attribute is accessed. [WEBIDL]
The array must contain the TextTrack
objects of the
text tracks in the media
element's list of text tracks, in the same
order as in the list of text tracks.
interface TextTrack {
  readonly attribute DOMString kind;
  readonly attribute DOMString label;
  readonly attribute DOMString language;

  const unsigned short NONE = 0;
  const unsigned short LOADING = 1;
  const unsigned short LOADED = 2;
  const unsigned short ERROR = 3;
  readonly attribute unsigned short readyState;
  readonly attribute Function onload;
  readonly attribute Function onerror;

  const unsigned short OFF = 0;
  const unsigned short HIDDEN = 1;
  const unsigned short SHOWING = 2;
  attribute unsigned short mode;

  readonly attribute TextTrackCueList cues;
  readonly attribute TextTrackCueList activeCues;
  readonly attribute Function oncuechange;
};
kind
Returns the text track kind string.
label
Returns the text track label.
language
Returns the text track language string.
readyState
Returns the text track readiness state, represented by a number from the following list:
TextTrack.NONE (0): The text track not loaded state.
TextTrack.LOADING (1): The text track loading state.
TextTrack.LOADED (2): The text track loaded state.
TextTrack.ERROR (3): The text track failed to load state.
mode
Returns the text track mode, represented by a number from the following list:
TextTrack.OFF (0): The text track disabled mode.
TextTrack.HIDDEN (1): The text track hidden mode.
TextTrack.SHOWING (2): The text track showing and showing by default modes.
Can be set, to change the mode.
cues
Returns the text track list of cues, as a TextTrackCueList
object.
activeCues
Returns the text track cues from the text track list of cues that are currently active (i.e. that start before the current playback position and end after it), as a TextTrackCueList
object.
The kind
attribute must return the text track kind of the
text track that the TextTrack
object
represents.
The label
attribute must return the text track label of the
text track that the TextTrack
object
represents.
The language
attribute must return the text track language of the
text track that the TextTrack
object
represents.
The readyState
attribute must return the numeric value corresponding to the
text track readiness state of the text
track that the TextTrack
object represents, as
defined by the following list:
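NONE (numeric value 0): The text track not loaded state.
LOADING (numeric value 1): The text track loading state.
LOADED (numeric value 2): The text track loaded state.
ERROR (numeric value 3): The text track failed to load state.
The mode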
attribute, on getting, must return the numeric value corresponding
to the text track mode of the text track
that the TextTrack
object represents, as defined by
the following list:
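OFF (numeric value 0): The text track disabled mode.
HIDDEN (numeric value 1): The text track hidden mode.
SHOWING (numeric value 2): The text track showing and showing by default modes.
On setting, if the new value is not 0, 1, or 2, the user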
agent must throw an INVALID_ACCESS_ERR
exception. Otherwise, if the new value isn't equal to what the
attribute would currently return, the new value must be processed as
follows:
If the new value is 0: Set the text track mode of the text track that the TextTrack object represents to the text track disabled mode.
If the new value is 1: Set the text track mode of the text track that the TextTrack object represents to the text track hidden mode.
If the new value is 2: Set the text track mode of the text track that the TextTrack object represents to the text track showing mode.
If the mode had been showing by default, this will change it
to showing, even though
the value of mode
would
appear not to change.
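This example is non-normative. The mapping between the numeric mode values and the underlying text track modes can be sketched as follows. The names TRACK_MODE, modeToNumber, and applyModeSetter are hypothetical, and the treatment of the showing by default case follows the note above, which is an interpretation rather than a requirement.

```javascript
// Sketch only: internal mode names and helper names are illustrative assumptions.
const TRACK_MODE = {
  DISABLED: 'disabled',
  HIDDEN: 'hidden',
  SHOWING: 'showing',
  SHOWING_BY_DEFAULT: 'showing by default'
};

// What the mode getter would return for a given internal mode.
function modeToNumber(mode) {
  if (mode === TRACK_MODE.DISABLED) return 0; // OFF
  if (mode === TRACK_MODE.HIDDEN) return 1;   // HIDDEN
  return 2;                                   // SHOWING (both showing modes)
}

// Returns the internal mode after assigning newValue, or throws.
function applyModeSetter(currentMode, newValue) {
  if (newValue !== 0 && newValue !== 1 && newValue !== 2) {
    const error = new Error('mode must be 0, 1, or 2');
    error.name = 'INVALID_ACCESS_ERR';
    throw error;
  }
  if (newValue === 0) return TRACK_MODE.DISABLED;
  if (newValue === 1) return TRACK_MODE.HIDDEN;
  // Assumption: setting 2 always yields plain "showing", which also
  // covers the showing-by-default case described in the note.
  return TRACK_MODE.SHOWING;
}
```

Because both showing modes read back as 2, a script cannot distinguish them through the getter; only the setter's effect differs.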
If the text track mode of the text
track that the TextTrack
object represents is
not the text track disabled mode, then the cues
attribute must
return a live TextTrackCueList
object
that represents the subset of the text track list of
cues of the text track that the
TextTrack
object represents whose start times occur before the
earliest possible position when the script started, in
text track cue order. Otherwise, it must return
null. When an object is returned, the same object must be returned
each time.
The earliest possible position when the script started is whatever the earliest possible position was the last time the event loop reached step 1.
If the text track mode of the text
track that the TextTrack
object represents is
not the text track disabled mode, then the activeCues
attribute must return a live
TextTrackCueList
object that represents the subset of
the text track list of cues of the text
track that the TextTrack
object represents
whose active flag was set when the script started, in
text track cue order. Otherwise, it must return
null. When an object is returned, the same object must be returned
each time.
A text track cue's active flag was set when the script started if its text track cue active flag was set the last time the event loop reached step 1.
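This example is non-normative. Selecting the active cues for a given playback position can be sketched as a filter over a list that is already in text track cue order. The function name activeCuesAt is hypothetical, and the boundary handling (a cue counts as active at its exact start time but not at its exact end time) is an assumption of this sketch.

```javascript
// Sketch only: activeCuesAt is an illustrative helper, not part of the API.
// cues is assumed to be an array of { startTime, endTime } objects that is
// already sorted in text track cue order; filtering preserves that order.
function activeCuesAt(cues, position) {
  return cues.filter(function (cue) {
    // Assumption: start is inclusive, end is exclusive.
    return cue.startTime <= position && cue.endTime > position;
  });
}
```

A user agent would evaluate this against the state captured the last time the event loop reached step 1, so the result does not change while a script is running.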
interface MutableTextTrack : TextTrack {
  void addCue(in TextTrackCue cue);
  void removeCue(in TextTrackCue cue);
};
addTrack( kind [, label [, language ] ] )
Creates and returns a new MutableTextTrack object, which is also added to the media element's list of text tracks.
addCue( cue )
Adds the given cue to mutableTextTrack's text track list of cues.
Raises an exception if the argument is null, associated with another text track, or already in the list of cues.
removeCue( cue )
Removes the given cue from mutableTextTrack's text track list of cues.
Raises an exception if the argument is null, associated with another text track, or not in the list of cues.
The addTrack(kind, label, language)
method of media elements, when invoked, must run the following
steps:
If kind is not one of the following
strings, then throw a SYNTAX_ERR
exception and abort
these steps:
If the label argument was omitted, let label be the empty string.
If the language argument was omitted, let language be the empty string.
Create a new text track, and set its text track kind to kind, its text track label to label, its text track language to language, its text track readiness state to the text track loaded state, its text track mode to the text track hidden mode, and its text track list of cues to an empty list.
Add the new text track to the media element's list of text tracks.
The addCue(cue)
method of
MutableTextTrack
objects, when invoked, must run the
following steps:
If cue is null, then throw an
INVALID_ACCESS_ERR
exception and abort these
steps.
If the given cue is already associated
with a text track other than the method's
MutableTextTrack
object's text track,
then throw an INVALID_STATE_ERR
exception and abort
these steps.
Associate cue with the method's
MutableTextTrack
object's text track,
if it is not currently associated with a text
track.
If the given cue is already listed in
the method's MutableTextTrack
object's text
track's text track list of cues, then throw an
INVALID_STATE_ERR
exception.
Add cue to the method's
MutableTextTrack
object's text track's
text track list of cues.
The removeCue(cue)
method of
MutableTextTrack
objects, when invoked, must run the
following steps:
If cue is null, then throw an
INVALID_ACCESS_ERR
exception and abort these
steps.
If the given cue is not associated with
the method's MutableTextTrack
object's text
track, then throw an INVALID_STATE_ERR
exception.
If the given cue is not currently listed
in the method's MutableTextTrack
object's text
track's text track list of cues, then throw a
NOT_FOUND_ERR
exception.
Remove cue from the method's
MutableTextTrack
object's text track's
text track list of cues.
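This example is non-normative. The addCue() and removeCue() steps above can be modelled with plain objects. In this sketch, SketchMutableTrack and makeTrackError are hypothetical names, exceptions are ordinary Error objects whose name carries the exception code, and cues are appended rather than kept in text track cue order.

```javascript
// Sketch only: a plain-object model of the addCue()/removeCue() steps.
function makeTrackError(name, message) {
  const error = new Error(message);
  error.name = name;
  return error;
}

class SketchMutableTrack {
  constructor() {
    this.cues = []; // a real list stays in text track cue order; this one appends
  }

  addCue(cue) {
    if (cue === null) throw makeTrackError('INVALID_ACCESS_ERR', 'cue is null');
    if (cue.track && cue.track !== this) {
      throw makeTrackError('INVALID_STATE_ERR', 'cue belongs to another track');
    }
    if (!cue.track) cue.track = this; // associate the cue with this track
    if (this.cues.includes(cue)) {
      throw makeTrackError('INVALID_STATE_ERR', 'cue is already listed');
    }
    this.cues.push(cue);
  }

  removeCue(cue) {
    if (cue === null) throw makeTrackError('INVALID_ACCESS_ERR', 'cue is null');
    if (cue.track !== this) {
      throw makeTrackError('INVALID_STATE_ERR', 'cue is not associated with this track');
    }
    const index = this.cues.indexOf(cue);
    if (index === -1) throw makeTrackError('NOT_FOUND_ERR', 'cue is not listed');
    this.cues.splice(index, 1);
  }
}
```

Note that removing a cue does not dissociate it from its track in these steps, so removing the same cue twice reports NOT_FOUND_ERR rather than INVALID_STATE_ERR.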
In this example, an audio
element is used to play a
specific sound-effect from a sound file containing many sound
effects. A cue is used to pause the audio, so that it ends exactly
at the end of the clip, even if the browser is busy running some
script. If the page had relied on script to pause the audio, then
the start of the next clip might be heard if the browser was not
able to run the script at the exact time specified.
var sfx = new Audio('sfx.wav');
var sounds = sfx.addTrack('metadata');

// add sounds we care about
sounds.addCue(new TextTrackCue('dog bark', 12.783, 13.612, '', '', '', true));
sounds.addCue(new TextTrackCue('kitten mew', 13.612, 15.091, '', '', '', true));

function playSound(id) {
  // getCueById() is defined on the TextTrackCueList returned by cues
  sfx.currentTime = sounds.cues.getCueById(id).startTime;
  sfx.play();
}

sfx.oncanplaythrough = function () {
  playSound('dog bark');
}
window.onbeforeunload = function () {
  playSound('kitten mew');
  return 'Are you sure you want to leave this awesome page?';
}
interface TextTrackCueList {
  readonly attribute unsigned long length;
  getter TextTrackCue (in unsigned long index);
  TextTrackCue getCueById(in DOMString id);
};
length
Returns the number of cues in the list.
Returns the text track cue with index index in the list. The cues are sorted in text track cue order.
getCueById( id )
Returns the first text track cue (in text track cue order) with text track cue identifier id.
Returns null if none of the cues have the given identifier or if the argument is the empty string.
A TextTrackCueList
object represents a dynamically
updating list of text track
cues in a given order.
The length
attribute must return the number of cues in the list represented by the
TextTrackCueList
object.
The supported property indices of a
TextTrackCueList
object at any instant are the numbers
from zero to the number of cues
in the list represented by the TextTrackCueList
object
minus one, if any. If there are no cues in the list, there are no supported property
indices.
To determine the value of an indexed property for a
given index index, the user agent must return
the indexth text track cue in the
list represented by the TextTrackCueList
object.
The getCueById(id)
method, when called with an argument
other than the empty string, must return the first text track
cue in the list represented by the
TextTrackCueList
object whose text track cue
identifier is id, if any, or null
otherwise. If the argument is the empty string, then the method must
return null.
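This example is non-normative. The lookup performed by getCueById() can be sketched over a plain array that is assumed to be in text track cue order already. The standalone function form and the name findCueById are illustrative assumptions.

```javascript
// Sketch only: models getCueById() over an array of objects with an id property.
function findCueById(cues, id) {
  // The empty string never matches, even if a cue has an empty identifier.
  if (id === '') return null;
  for (const cue of cues) {
    if (cue.id === id) return cue; // first match in list order wins
  }
  return null;
}
```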
interface TextTrackCue {
  readonly attribute TextTrack track;
  readonly attribute DOMString id;
  readonly attribute double startTime;
  readonly attribute double endTime;
  readonly attribute boolean pauseOnExit;
  DOMString getCueAsSource();
  DocumentFragment getCueAsHTML();
  readonly attribute Function onenter;
  readonly attribute Function onexit;
};
track
Returns the TextTrack object to which this text track cue belongs, if any, or null otherwise.
id
Returns the text track cue identifier.
startTime
Returns the text track cue start time, in seconds.
endTime
Returns the text track cue end time, in seconds.
pauseOnExit
Returns true if the text track cue pause-on-exit flag is set, false otherwise.
getCueAsSource()
Returns the text track cue text in raw unparsed form.
getCueAsHTML()
Returns the text track cue text as a DocumentFragment of HTML elements and other DOM nodes.
The track
attribute must return the TextTrack
object of the
text track with which the text track cue
that the TextTrackCue
object represents is associated,
if any; or null otherwise.
The id
attribute must return the text track cue identifier of
the text track cue that the TextTrackCue
object represents.
The startTime
attribute must return the text track cue start time of
the text track cue that the TextTrackCue
object represents, in seconds.
The endTime
attribute must return the text track cue end time of
the text track cue that the TextTrackCue
object represents, in seconds.
The pauseOnExit
attribute must return true if the text track cue
pause-on-exit flag of the text track cue that
the TextTrackCue
object represents is set; or false
otherwise.
The direction
attribute must return the text track cue writing
direction of the text track cue that the
TextTrackCue
object represents.
The getCueAsSource()
method must return the raw text track cue text.
The getCueAsHTML()
method must convert the text track cue text to a
DocumentFragment
for the media element's
Document
, using the appropriate rules for doing
so.
The following are the event handlers that must be
supported, as IDL attributes, by all objects implementing the
TextTrack
interface:
Event handler | Event handler event type |
---|---|
onload | load |
onerror | error |
oncuechange | cuechange |
The following are the event handlers that must be
supported, as IDL attributes, by all objects implementing the
TextTrackCue
interface:
Event handler | Event handler event type |
---|---|
onenter | enter |
onexit | exit |
The controls
attribute is a boolean attribute. If present, it
indicates that the author has not provided a scripted controller and
would like the user agent to provide its own set of controls.
If the attribute is present, or if scripting is disabled for the media element, then the user agent should expose a user interface to the user. This user interface should include features to begin playback, pause playback, seek to an arbitrary position in the content (if the content supports arbitrary seeking), change the volume, change the display of closed captions or embedded sign-language tracks, select different audio tracks or turn on audio descriptions, and show the media content in manners more suitable to the user (e.g. full-screen video or in an independent resizable window). Other controls may also be made available.
Even when the attribute is absent, however, user agents may provide controls to affect playback of the media resource (e.g. play, pause, seeking, and volume controls), but such features should not interfere with the page's normal rendering. For example, such features could be exposed in the media element's context menu.
Where possible (specifically, for starting, stopping, pausing, and unpausing playback, for seeking, for listing, enabling, and disabling text tracks, and for muting or changing the volume of the audio), user interface features exposed by the user agent must be implemented in terms of the DOM API described above, so that, e.g., all the same events fire.
For the purposes of listing chapters in the media
resource, only text tracks
in the media element's list of text
tracks showing or
showing by
default and whose text track kind is chapters
should be used.
Each cue in such a text
track represents a chapter starting at the cue's start time. The name of
the chapter is the text track cue text, interpreted
literally.
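This example is non-normative. Building a chapter list from the rules above can be sketched as follows. The function name listChapters and the shape of the returned objects are assumptions made for illustration; the sketch relies only on the kind, mode, and cues attributes and the getCueAsSource() method described earlier.

```javascript
// Sketch only: listChapters is a hypothetical helper, not part of the API.
const MODE_SHOWING = 2; // TextTrack.SHOWING covers showing and showing by default

function listChapters(tracks) {
  const chapters = [];
  for (const track of Array.from(tracks)) {
    // Only chapter tracks that are showing contribute chapters.
    if (track.kind !== 'chapters' || track.mode !== MODE_SHOWING) continue;
    for (const cue of Array.from(track.cues)) {
      // The chapter name is the raw cue text, interpreted literally.
      chapters.push({ startTime: cue.startTime, name: cue.getCueAsSource() });
    }
  }
  return chapters;
}
```

Array.from() is used so that the function accepts both plain arrays and array-like host objects with a length and indexed getters.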
The controls
IDL attribute must reflect the content attribute of the
same name.
volume [ = value ]
Returns the current playback volume, as a number in the range 0.0 to 1.0, where 0.0 is the quietest and 1.0 the loudest.
Can be set, to change the volume.
Throws an INDEX_SIZE_ERR if the new value is not in the range 0.0 .. 1.0.
muted [ = value ]
Returns true if audio is muted, overriding the volume attribute, and false if the volume attribute is being honored.
Can be set, to change whether the audio is muted or not.
The volume
attribute must return the playback volume of any audio portions of
the media element, in the range 0.0 (silent) to 1.0
(loudest). Initially, the volume must be 1.0, but user agents may
remember the last set value across sessions, on a per-site basis or
otherwise, so the volume may start at other values. On setting, if
the new value is in the range 0.0 to 1.0 inclusive, the attribute
must be set to the new value and the playback volume must be
correspondingly adjusted as soon as possible after setting the
attribute, with 0.0 being silent, and 1.0 being the loudest setting,
values in between increasing in loudness. The range need not be
linear. The loudest setting may be lower than the system's loudest
possible setting; for example the user could have set a maximum
volume. If the new value is outside the range 0.0 to 1.0 inclusive,
then, on setting, an INDEX_SIZE_ERR
exception must be
raised instead.
The muted
attribute must return true if the audio channels are muted and false
otherwise. Initially, the audio channels should not be muted
(false), but user agents may remember the last set value across
sessions, on a per-site basis or otherwise, so the muted state may
start as muted (true). On setting, the attribute must be set to the
new value; if the new value is true, audio playback for this
media resource must then be muted, and if false, audio
playback must then be enabled.
Whenever either the muted
or
volume
attributes are changed,
the user agent must queue a task to fire a simple
event named volumechange
at the media
element.
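This example is non-normative. The behaviour of the volume and muted setters can be modelled with a plain object. In this sketch, createAudioState and pendingEvents are hypothetical names, the pendingEvents array stands in for tasks queued on the event loop, and queuing volumechange only when a value actually differs is an assumption about what "changed" means.

```javascript
// Sketch only: createAudioState and pendingEvents are illustrative names.
function createAudioState() {
  let volume = 1.0;  // initial volume
  let muted = false; // initially not muted
  const pendingEvents = []; // stands in for tasks queued on the event loop

  return {
    pendingEvents,
    get volume() { return volume; },
    set volume(value) {
      // The negated comparison also rejects NaN.
      if (!(value >= 0.0 && value <= 1.0)) {
        const error = new Error('volume must be in the range 0.0 to 1.0');
        error.name = 'INDEX_SIZE_ERR';
        throw error;
      }
      if (value !== volume) {
        volume = value;
        pendingEvents.push('volumechange');
      }
    },
    get muted() { return muted; },
    set muted(value) {
      const next = Boolean(value);
      if (next !== muted) {
        muted = next;
        pendingEvents.push('volumechange');
      }
    }
  };
}
```

An out-of-range assignment leaves the previous volume in place, so a script can recover by catching the exception.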
The audio
attribute on the video
element controls the default
state of the audio channel of the media resource,
potentially overriding user preferences.
The audio
attribute, if
specified, must have a value that is an unordered set of
unique space-separated tokens, which are ASCII
case-insensitive. The tokens must be from the following list
(currently, only one allowed token is defined):
muted
Causes the user agent to override the user's preferences, if any, and always default the video to muted.
A future version of this specification will probably introduce new values here, e.g. to control the default volume, or to select a default audio track.
When a video
element is created, if it has an audio
attribute specified, the user
agent must split the
attribute's value on spaces; if any of the tokens are an
ASCII case-insensitive match for the string muted
, the user agent
must then set the muted
attribute to true, overriding any user preference.
This attribute has no dynamic effect (it only controls the default state of the element).
This video (an advertisement) autoplays, but to avoid annoying users, it does so without sound, and allows the user to turn the sound on.
<video src="adverts.cgi?kind=video" controls autoplay loop audio=muted></video>
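This example is non-normative. The token check applied to the audio attribute when a video element is created can be sketched as follows. The function name audioAttributeRequestsMuted is hypothetical, and the sketch assumes that splitting on spaces means splitting on the space, tab, line feed, form feed, and carriage return characters.

```javascript
// Sketch only: audioAttributeRequestsMuted is an illustrative helper.
// Lowercases A-Z only, so the comparison is ASCII case-insensitive.
function asciiLowercase(text) {
  return text.replace(/[A-Z]/g, function (letter) {
    return letter.toLowerCase();
  });
}

function audioAttributeRequestsMuted(attributeValue) {
  // No attribute specified: nothing overrides the user's preference.
  if (attributeValue === null || attributeValue === undefined) return false;
  return attributeValue
    .split(/[ \t\n\f\r]+/)
    .filter(function (token) { return token !== ''; })
    .some(function (token) { return asciiLowercase(token) === 'muted'; });
}
```

Unknown tokens are ignored by the sketch, which leaves room for the additional values anticipated for future versions.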
Objects implementing the TimeRanges
interface
represent a list of ranges (periods) of time.
interface TimeRanges {
  readonly attribute unsigned long length;
  double start(in unsigned long index);
  double end(in unsigned long index);
};
length
Returns the number of ranges in the object.
start(index)
Returns the time for the start of the range with the given index.
Throws an INDEX_SIZE_ERR if the index is out of range.
end(index)
Returns the time for the end of the range with the given index.
Throws an INDEX_SIZE_ERR if the index is out of range.
The length
IDL attribute must return the number of ranges represented by the object.
The start(index)
method must return the position
of the start of the indexth range represented by
the object, in seconds measured from the start of the timeline that
the object covers.
The end(index)
method must return the position
of the end of the indexth range represented by
the object, in seconds measured from the start of the timeline that
the object covers.
These methods must raise INDEX_SIZE_ERR
exceptions
if called with an index argument greater than or
equal to the number of ranges represented by the object.
When a TimeRanges
object is said to be a
normalized TimeRanges
object, the ranges it
represents must obey the following criteria:
In other words, the ranges in such an object are ordered, don't overlap, aren't empty, and don't touch (adjacent ranges are folded into one bigger range).
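This example is non-normative. One way to produce ranges that are ordered, non-overlapping, non-empty, and non-touching is to sort and merge them. The sketch below represents each range as a [start, end] pair; the names normalizeRanges and makeTimeRanges are hypothetical, and the wrapper only imitates the shape of the TimeRanges interface.

```javascript
// Sketch only: ranges are [start, end] pairs in seconds.
function normalizeRanges(ranges) {
  const sorted = ranges
    .filter(function (range) { return range[1] > range[0]; }) // drop empty ranges
    .sort(function (a, b) { return a[0] - b[0]; });           // order by start time
  const result = [];
  for (const [start, end] of sorted) {
    const last = result[result.length - 1];
    if (last && start <= last[1]) {
      // Overlapping or touching: fold into the previous range.
      last[1] = Math.max(last[1], end);
    } else {
      result.push([start, end]);
    }
  }
  return result;
}

// A TimeRanges-like wrapper whose accessors reject out-of-range indexes.
function makeTimeRanges(ranges) {
  const list = normalizeRanges(ranges);
  function checkIndex(index) {
    if (!(index >= 0 && index < list.length)) {
      const error = new Error('index is out of range');
      error.name = 'INDEX_SIZE_ERR';
      throw error;
    }
  }
  return {
    length: list.length,
    start: function (index) { checkIndex(index); return list[index][0]; },
    end: function (index) { checkIndex(index); return list[index][1]; }
  };
}
```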
The timelines used by the objects returned by the buffered
, seekable
and played
IDL attributes of media elements must be that element's
media timeline.
This section is non-normative.
The following events fire on media elements as part of the processing model described above:
Event name | Interface | Dispatched when... | Preconditions |
---|---|---|---|
loadstart | Event | The user agent begins looking for media data, as part of the resource selection algorithm. | networkState equals NETWORK_LOADING |
progress | Event | The user agent is fetching media data. | networkState equals NETWORK_LOADING |
suspend | Event | The user agent is intentionally not currently fetching media data, but does not have the entire media resource downloaded. | networkState equals NETWORK_IDLE |
abort | Event | The user agent stops fetching the media data before it is completely downloaded, but not due to an error. | error is an object with the code MEDIA_ERR_ABORTED. networkState equals either NETWORK_EMPTY or NETWORK_IDLE, depending on when the download was aborted. |
error | Event | An error occurs while fetching the media data. | error is an object with the code MEDIA_ERR_NETWORK or higher. networkState equals either NETWORK_EMPTY or NETWORK_IDLE, depending on when the download was aborted. |
emptied | Event | A media element whose networkState was previously not in the NETWORK_EMPTY state has just switched to that state (either because of a fatal error during load that's about to be reported, or because the load() method was invoked while the resource selection algorithm was already running). | networkState is NETWORK_EMPTY; all the IDL attributes are in their initial states. |
stalled | Event | The user agent is trying to fetch media data, but data is unexpectedly not forthcoming. | networkState is NETWORK_LOADING. |
play | Event | Playback has begun. Fired after the play() method has returned, or when the autoplay attribute has caused playback to begin. | paused is newly false. |
pause | Event | Playback has been paused. Fired after the pause() method has returned. | paused is newly true. |
loadedmetadata | Event | The user agent has just determined the duration and dimensions of the media resource and the text tracks are ready. | readyState is newly equal to HAVE_METADATA or greater for the first time. |
loadeddata | Event | The user agent can render the media data at the current playback position for the first time. | readyState newly increased to HAVE_CURRENT_DATA or greater for the first time. |
waiting | Event | Playback has stopped because the next frame is not available, but the user agent expects that frame to become available in due course. | readyState is newly equal to or less than HAVE_CURRENT_DATA, and paused is false. Either seeking is true, or the current playback position is not contained in any of the ranges in buffered. It is possible for playback to stop for two other reasons without paused being false, but those two reasons do not fire this event: maybe playback ended, or playback stopped due to errors. |
playing | Event | Playback has started. | readyState is newly equal to or greater than HAVE_FUTURE_DATA, paused is false, seeking is false, or the current playback position is contained in one of the ranges in buffered. |
canplay | Event | The user agent can resume playback of the media data, but estimates that if playback were to be started now, the media resource could not be rendered at the current playback rate up to its end without having to stop for further buffering of content. | readyState newly increased to HAVE_FUTURE_DATA or greater. |
canplaythrough | Event | The user agent estimates that if playback were to be started now, the media resource could be rendered at the current playback rate all the way to its end without having to stop for further buffering. | readyState is newly equal to HAVE_ENOUGH_DATA. |
seeking | Event | The seeking IDL attribute changed to true and the seek operation is taking long enough that the user agent has time to fire the event. | |
seeked | Event | The seeking IDL attribute changed to false. | |
timeupdate | Event | The current playback position changed as part of normal playback or in an especially interesting way, for example discontinuously. | |
ended | Event | Playback has stopped because the end of the media resource was reached. | currentTime equals the end of the media resource; ended is true. |
ratechange | Event | Either the defaultPlaybackRate or the playbackRate attribute has just been updated. | |
durationchange | Event | The duration attribute has just been updated. | |
volumechange | Event | Either the volume attribute or the muted attribute has changed. Fired after the relevant attribute's setter has returned. | |
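As an illustration of how a page might observe the events summarised above, the following sketch registers one listener per event name. The helper name logMediaEvents and the log callback are hypothetical; the only API the sketch relies on is addEventListener() on the target.

```javascript
// Sketch only: logMediaEvents is a hypothetical helper, not part of the API.
const MEDIA_EVENT_NAMES = [
  'loadstart', 'progress', 'suspend', 'abort', 'error', 'emptied', 'stalled',
  'play', 'pause', 'loadedmetadata', 'loadeddata', 'waiting', 'playing',
  'canplay', 'canplaythrough', 'seeking', 'seeked', 'timeupdate', 'ended',
  'ratechange', 'durationchange', 'volumechange'
];

function logMediaEvents(target, log) {
  for (const name of MEDIA_EVENT_NAMES) {
    target.addEventListener(name, function (event) {
      log(event.type);
    });
  }
}
```

In a page, target would typically be a video or audio element and log a function that records the event type for debugging.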
The main security and privacy implications of the
video
and audio
elements come from the
ability to embed media cross-origin. There are two directions that
threats can flow: from hostile content to a victim page, and from a
hostile page to victim content.
If a victim page embeds hostile content, the threat is that the
content might contain scripted code that attempts to interact with
the Document
that embeds the content. To avoid this,
user agents must ensure that there is no access from the content to
the embedding page. In the case of media content that uses DOM
concepts, the embedded content must be treated as if it was in its
own unrelated top-level browsing context.
For instance, if an SVG animation was embedded in
a video
element, the user agent would not give it
access to the DOM of the outer page. From the perspective of scripts
in the SVG resource, the SVG file would appear to be in a lone
top-level browsing context with no parent.
If a hostile page embeds victim content, the threat is that the
embedding page could obtain information from the content that it
would not otherwise have access to. The API does expose some
information: the existence of the media, its type, its duration, its
size, and the performance characteristics of its host. Such
information is already potentially problematic, but in practice the
same information can more or less be obtained using the
img
element, and so it has been deemed acceptable.
However, significantly more sensitive information could be obtained if the user agent further exposes metadata within the content such as subtitles or chapter titles. This version of the API does not expose such information. Future extensions to this API will likely reuse a mechanism such as CORS to check that the embedded content's site has opted in to exposing such information. [CORS]
An attacker could trick a user running within a corporate network into visiting a site that attempts to load a video from a previously leaked location on the corporation's intranet. If such a video included confidential plans for a new product, then being able to read the subtitles would present a confidentiality breach.
This section is non-normative.
Playing audio and video resources on small devices such as
set-top boxes or mobile phones is often constrained by limited
hardware resources in the device. For example, a device might only
support three simultaneous videos. For this reason, it is a good
practice to release resources held by media elements when they are done playing, either by
being very careful about removing all references to the element and
allowing it to be garbage collected, or, even better, by removing
the element's src
attribute and
any source
element descendants, and invoking the
element's load()
method.
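The second approach described above can be sketched as a small helper. The function name releaseMediaResources is hypothetical; the sketch uses only removeAttribute(), querySelectorAll(), removeChild(), and the element's load() method.

```javascript
// Sketch only: releaseMediaResources is an illustrative helper name.
function releaseMediaResources(mediaElement) {
  // Drop the src attribute so no resource remains selected through it.
  mediaElement.removeAttribute('src');
  // Remove any source element descendants as well.
  const sources = Array.from(mediaElement.querySelectorAll('source'));
  for (const source of sources) {
    source.parentNode.removeChild(source);
  }
  // With nothing left to select, load() resets the element.
  mediaElement.load();
}
```

The list of source elements is copied with Array.from() before removal so that the loop does not depend on whether the returned collection is live.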
This section is non-normative.
How accurately various aspects of the media element API are implemented is considered a quality-of-implementation issue.
For example, when implementing the buffered
attribute, how precisely
an implementation reports the ranges that have been buffered depends
on how carefully the user agent inspects the data. Since the API
reports ranges as times, but the data is obtained in byte streams, a
user agent receiving a variable-bit-rate stream might only be able
to determine precise times by actually decoding all of the data.
User agents aren't required to do this, however; they can instead
return estimates (e.g. based on the average bit rate seen so far)
which get revised as more information becomes available.
As a general rule, user agents are urged to be conservative rather than optimistic. For example, it would be bad to report that everything had been buffered when it had not.
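A conservative estimate of the kind described above might be computed as follows. This is a minimal sketch, not a recommended algorithm: the function name estimateBufferedSeconds and the safetyFactor parameter are hypothetical, and a real user agent would revise the estimate as more of the stream is decoded.

```javascript
// Sketch only: estimates buffered time from the byte count and the average
// bit rate seen so far, scaled down so the estimate errs on the low side.
function estimateBufferedSeconds(bytesReceived, averageBitsPerSecond, safetyFactor) {
  // Hypothetical default margin; any value in (0, 1] keeps the estimate conservative.
  const factor = safetyFactor === undefined ? 0.9 : safetyFactor;
  // Without a usable bit rate, report nothing rather than guess.
  if (!(averageBitsPerSecond > 0)) return 0;
  return (bytesReceived * 8 / averageBitsPerSecond) * factor;
}
```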
Another quality-of-implementation issue would be playing a video backwards when the codec is designed only for forward playback (e.g. there aren't many key frames, and they are far apart, and the intervening frames only have deltas from the previous frame). User agents could do a poor job, e.g. only showing key frames; however, better implementations would do more work and thus do a better job, e.g. actually decoding parts of the video forwards, storing the complete frames, and then playing the frames backwards.
Similarly, while implementations are allowed to drop buffered data at any time (there is no requirement that a user agent keep all the media data obtained for the lifetime of the media element), it is again a quality of implementation issue: user agents with sufficient resources to keep all the data around are encouraged to do so, as this allows for a better user experience. For example, if the user is watching a live stream, a user agent could allow the user only to view the live video; however, a better user agent would buffer everything and allow the user to seek through the earlier material, pause it, play it forwards and backwards, etc.
When a media element that is paused is removed from a document and not reinserted before the next time the event loop spins, implementations that are resource constrained are encouraged to take that opportunity to release all hardware resources (like video planes, networking resources, and data buffers) used by the media element. (User agents still have to keep track of the playback position and so forth, though, in case playback is later restarted.)