Manipulating video stream components.
Modern web technologies provide ample ways to work with video. Media Stream API, Media Recording API, Media Source API, and WebRTC API add up to a rich tool set for recording, transferring, and playing video streams. While solving certain high-level tasks, these APIs don't let web programmers work with individual components of a video stream such as frames and unmuxed chunks of encoded video or audio. To get low-level access to these basic components, developers have been using WebAssembly to bring video and audio codecs into the browser. But given that modern browsers already ship with a variety of codecs (which are often accelerated by hardware), repackaging them as WebAssembly seems like a waste of human and computer resources.
WebCodecs API eliminates this inefficiency by giving programmers a way to use media components that are already present in the browser. Specifically:
- Video and audio decoders
- Video and audio encoders
- Raw video frames
- Image decoders
The WebCodecs API is useful for web applications that require full control over the way media content is processed, such as video editors, video conferencing, video streaming, etc.
Video processing workflow
Frames are the centerpiece in video processing. Thus in WebCodecs most classes either consume or produce frames. Video encoders convert frames into encoded chunks. Video decoders do the opposite.
Also, VideoFrame plays nicely with other Web APIs by being a CanvasImageSource and having a constructor that accepts a CanvasImageSource. So it can be used in functions like drawImage() and texImage2D(). It can also be constructed from canvases, bitmaps, video elements, and other video frames.
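As a minimal sketch of that interoperability, a VideoFrame can be handed straight to drawImage() on a 2D canvas context. The helper names fitRect and drawFrameFitted are hypothetical; the sketch assumes the caller no longer needs the frame after drawing and uses the frame's displayWidth and displayHeight to letterbox it into the canvas.

```javascript
// Hypothetical helper: compute a centered rectangle that fits a source of
// srcW x srcH inside a destination of dstW x dstH, preserving aspect ratio.
function fitRect(srcW, srcH, dstW, dstH) {
  const scale = Math.min(dstW / srcW, dstH / srcH);
  const width = srcW * scale;
  const height = srcH * scale;
  return { x: (dstW - width) / 2, y: (dstH - height) / 2, width, height };
}

// Hypothetical helper: draw a VideoFrame onto a 2D canvas context, then
// release it. Assumes `ctx` is a CanvasRenderingContext2D (or the
// offscreen equivalent) and `frame` is a VideoFrame.
function drawFrameFitted(ctx, frame) {
  const { x, y, width, height } = fitRect(
    frame.displayWidth,
    frame.displayHeight,
    ctx.canvas.width,
    ctx.canvas.height
  );
  ctx.drawImage(frame, x, y, width, height);
  frame.close();
}
```

Keeping the geometry in a separate pure function makes it easy to check without a browser.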
WebCodecs API works well in tandem with the classes from Insertable Streams API which connect WebCodecs to media stream tracks.
- MediaStreamTrackProcessor breaks media tracks into individual frames.
- MediaStreamTrackGenerator creates a media track from a stream of frames.
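The two classes can be chained through a TransformStream, roughly as sketched below. The helper names makeFrameTransformer and createProcessedTrack and the processFrame callback are hypothetical, and the sketch assumes a browser that exposes both classes with their dictionary-style constructors.

```javascript
// Hypothetical helper: build the transformer object for a TransformStream.
// processFrame receives a VideoFrame and returns the frame to pass on
// (either the same frame or a newly created one).
function makeFrameTransformer(processFrame) {
  return {
    transform(frame, controller) {
      const output = processFrame(frame);
      // If a new frame was produced, the original is no longer needed.
      if (output !== frame) frame.close();
      controller.enqueue(output);
    },
  };
}

// Hypothetical helper: returns a new video track that carries the
// processed frames of the given track.
function createProcessedTrack(track, processFrame) {
  const processor = new MediaStreamTrackProcessor({ track });
  const generator = new MediaStreamTrackGenerator({ kind: "video" });
  processor.readable
    .pipeThrough(new TransformStream(makeFrameTransformer(processFrame)))
    .pipeTo(generator.writable)
    .catch((e) => console.error("Pipeline stopped:", e));
  return generator;
}
```

Because the generator is itself a media stream track, the result can be wrapped in a MediaStream and assigned to a video element's srcObject.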
WebCodecs and web workers
By design WebCodecs API does all the heavy lifting asynchronously and off the main thread. But since frame and chunk callbacks can often be called multiple times a second, they might clutter the main thread and thus make the website less responsive. Therefore it is preferable to move handling of individual frames and encoded chunks into a web worker.
To help with that, ReadableStream provides a convenient way to automatically transfer all frames coming from a media track to the worker. For example, MediaStreamTrackProcessor can be used to obtain a ReadableStream for a media stream track coming from the web camera. After that the stream is transferred to a web worker where frames are read one by one and queued into a VideoEncoder.
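A rough sketch of that hand-off follows. The function names, the message shape, and the split into a main-thread part and a worker part are assumptions made for illustration; the worker is expected to have created and configured its own VideoEncoder.

```javascript
// Main thread (hypothetical helper): obtain the stream of frames for a
// camera track and transfer it to the worker instead of copying it.
function sendTrackToWorker(track, worker) {
  const processor = new MediaStreamTrackProcessor({ track });
  const readable = processor.readable;
  // The second argument lists the objects whose ownership moves to the worker.
  worker.postMessage({ type: "frames", readable }, [readable]);
}

// Worker (hypothetical helper): read frames one by one and queue them
// into an already configured VideoEncoder.
async function encodeFromStream(readable, encoder) {
  const reader = readable.getReader();
  while (true) {
    const { done, value: frame } = await reader.read();
    if (done) break;
    encoder.encode(frame);
    frame.close(); // the encoder keeps its own reference to the frame data
  }
  await encoder.flush();
}

// In the worker script, the two parts would be connected like this:
//   self.onmessage = ({ data }) => {
//     if (data.type === "frames") encodeFromStream(data.readable, encoder);
//   };
```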
With HTMLCanvasElement.transferControlToOffscreen even rendering can be done off the main thread. And if the high-level tools turn out to be inconvenient, VideoFrame itself is transferable and may be moved between workers.
WebCodecs in action
Encoding
It all starts with a VideoFrame.
There are three ways to construct video frames.
From an image source like a canvas, an image bitmap, or a video element:
const canvas = document.createElement("canvas");
// Draw something on the canvas...
const frameFromCanvas = new VideoFrame(canvas, { timestamp: 0 });
Use MediaStreamTrackProcessor to pull frames from a MediaStreamTrack:
const stream = await navigator.mediaDevices.getUserMedia({…});
const track = stream.getVideoTracks()[0];
// The constructor takes an init dictionary with the track to process.
const trackProcessor = new MediaStreamTrackProcessor({ track });
const reader = trackProcessor.readable.getReader();
while (true) {
  const result = await reader.read();
  if (result.done) break;
  const frameFromCamera = result.value;
}
Create a frame from its binary pixel representation in a BufferSource:
const pixelSize = 4;
const init = {
  timestamp: 0,
  codedWidth: 320,
  codedHeight: 200,
  format: "RGBA",
};
const data = new Uint8Array(init.codedWidth * init.codedHeight * pixelSize);
for (let x = 0; x < init.codedWidth; x++) {
  for (let y = 0; y < init.codedHeight; y++) {
    const offset = (y * init.codedWidth + x) * pixelSize;
    data[offset] = 0x7f; // Red
    data[offset + 1] = 0xff; // Green
    data[offset + 2] = 0xd4; // Blue
    data[offset + 3] = 0xff; // Alpha
  }
}
const frame = new VideoFrame(data, init);
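The same approach works for planar formats, where the buffer size depends on the chroma subsampling. The sketch below prepares a uniformly gray frame in the I420 format. The helper names are hypothetical, and it assumes the default tightly packed plane layout (Y plane first, then U, then V).

```javascript
// Hypothetical helper: byte length of a tightly packed I420 image.
// The Y plane has one byte per pixel; the U and V planes are subsampled
// by 2 in each direction (rounded up).
function i420ByteLength(width, height) {
  const chromaWidth = Math.ceil(width / 2);
  const chromaHeight = Math.ceil(height / 2);
  return width * height + 2 * chromaWidth * chromaHeight;
}

// Hypothetical helper: pixel data and init dictionary for a gray frame.
function makeGrayI420(width, height, timestamp) {
  const data = new Uint8Array(i420ByteLength(width, height));
  data.fill(0x80); // Y = U = V = 128: mid gray with neutral chroma
  const init = {
    format: "I420",
    codedWidth: width,
    codedHeight: height,
    timestamp,
  };
  return { data, init };
}

// In a browser:
//   const { data, init } = makeGrayI420(320, 200, 0);
//   const grayFrame = new VideoFrame(data, init);
```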
No matter where they are coming from, frames can be encoded into EncodedVideoChunk objects with a VideoEncoder.
Before encoding, VideoEncoder needs to be given two JavaScript objects:
- Init dictionary with two functions for handling encoded chunks and errors. These functions are developer-defined and can't be changed after they're passed to the VideoEncoder constructor.
- Encoder configuration object, which contains parameters for the output video stream. You can change these parameters later by calling configure().
The configure() method will throw NotSupportedError if the config is not supported by the browser. You are encouraged to call the static method VideoEncoder.isConfigSupported() with the config to check beforehand whether the config is supported and wait for its promise.
const init = {
  output: handleChunk,
  error: (e) => {
    console.log(e.message);
  },
};
const config = {
  codec: "vp8",
  width: 640,
  height: 480,
  bitrate: 2_000_000, // 2 Mbps
  framerate: 30,
};
// Declared outside the if block so that later code can use the encoder.
let encoder;
const { supported } = await VideoEncoder.isConfigSupported(config);
if (supported) {
  encoder = new VideoEncoder(init);
  encoder.configure(config);
} else {
  // Try another config.
}
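One way to fill in the "try another config" branch is to walk a list of candidates in order of preference. The sketch below is only an illustration: pickSupportedConfig is a hypothetical helper, the candidate codec strings are assumptions, and the optional second parameter exists so the logic can be exercised outside a browser.

```javascript
// Hypothetical helper: return the first config the browser reports as
// supported, or null if none is. `checker` must offer a static-style
// isConfigSupported() method; it defaults to the global VideoEncoder.
async function pickSupportedConfig(candidates, checker = VideoEncoder) {
  for (const candidate of candidates) {
    const { supported, config } = await checker.isConfigSupported(candidate);
    if (supported) return config;
  }
  return null;
}

// Example candidate list (the codec strings are assumptions for illustration).
const baseConfig = { width: 640, height: 480, bitrate: 2_000_000, framerate: 30 };
const candidateConfigs = [
  { ...baseConfig, codec: "vp09.00.10.08" }, // VP9, profile 0
  { ...baseConfig, codec: "vp8" },
  { ...baseConfig, codec: "avc1.42001f" }, // H.264 Baseline
];

// In a browser:
//   const chosen = await pickSupportedConfig(candidateConfigs);
//   if (chosen) encoder.configure(chosen);
```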
After the encoder has been set up, it's ready to accept frames via the encode() method. Both configure() and encode() return immediately without waiting for the actual work to complete. This allows several frames to queue for encoding at the same time, while encodeQueueSize shows how many requests are waiting in the queue for previous encodes to finish.
Errors are reported either by immediately throwing an exception, in case the arguments or the order of method calls violates the API contract, or by calling the error() callback for problems encountered in the codec implementation.
If encoding completes successfully the output() callback is called with a new encoded chunk as an argument.
Another important detail here is that frames need to be told when they are no longer needed by calling close().
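The synchronous half of that error model, together with the close() requirement, can be sketched as a small wrapper. The name encodeAndClose is hypothetical; codec-level failures would still arrive through the error() callback given to the constructor.

```javascript
// Hypothetical helper: queue a frame for encoding and always release it.
// Returns true if the encoder accepted the frame, false if encode() threw.
function encodeAndClose(encoder, frame, options) {
  try {
    encoder.encode(frame, options);
    return true;
  } catch (e) {
    // Contract violations are thrown synchronously, for example an
    // InvalidStateError when the encoder is not configured, or a
    // TypeError when the frame has already been closed.
    console.error("encode() rejected the call:", e.name);
    return false;
  } finally {
    // The frame is released whether or not it was accepted.
    frame.close();
  }
}
```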
let frameCounter = 0;
const track = stream.getVideoTracks()[0];
// The constructor takes an init dictionary with the track to process.
const trackProcessor = new MediaStreamTrackProcessor({ track });
const reader = trackProcessor.readable.getReader();
while (true) {
  const result = await reader.read();
  if (result.done) break;
  const frame = result.value;
  if (encoder.encodeQueueSize > 2) {
    // Too many frames in flight, encoder is overwhelmed
    // let's drop this frame.
    frame.close();
  } else {
    frameCounter++;
    // Request a key frame every 150 frames.
    const keyFrame = frameCounter % 150 === 0;
    encoder.encode(frame, { keyFrame });
    frame.close();
  }
}
Finally, it's time to finish the encoding code by writing a function that handles chunks of encoded video as they come out of the encoder. Usually this function sends data chunks over the network or muxes them into a media container for storage.
function handleChunk(chunk, metadata) {
  if (metadata.decoderConfig) {
    // Decoder needs to be configured (or reconfigured) with new parameters
    // when metadata has a new decoderConfig.
    // Usually it happens in the beginning or when the encoder has a new
    // codec specific binary configuration. (VideoDecoderConfig.description).
    fetch("/upload_extra_data", {
      method: "POST",
      headers: { "Content-Type": "application/octet-stream" },
      body: metadata.decoderConfig.description,
    });
  }
  // actual bytes of encoded data
  const chunkData = new Uint8Array(chunk.byteLength);
  chunk.copyTo(chunkData);
  fetch(`/upload_chunk?timestamp=${chunk.timestamp}&type=${chunk.type}`, {
    method: "POST",
    headers: { "Content-Type": "application/octet-stream" },
    body: chunkData,
  });
}
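When chunks are stored or sent over a single connection rather than uploaded one request at a time, the timestamp and type have to travel with the bytes. The sketch below uses a made-up record layout purely for illustration; it is not a real container format, and packChunk and unpackChunk are hypothetical names.

```javascript
// Hypothetical record layout (an assumption, not a standard format):
//   byte 0     : 1 for a "key" chunk, 0 for a "delta" chunk
//   bytes 1-8  : timestamp in microseconds, big-endian signed 64-bit
//   bytes 9... : encoded payload
const HEADER_SIZE = 9;

function packChunk(chunk) {
  const record = new Uint8Array(HEADER_SIZE + chunk.byteLength);
  const view = new DataView(record.buffer);
  view.setUint8(0, chunk.type === "key" ? 1 : 0);
  view.setBigInt64(1, BigInt(chunk.timestamp));
  // copyTo() writes the encoded bytes into the given buffer view.
  chunk.copyTo(record.subarray(HEADER_SIZE));
  return record;
}

function unpackChunk(record) {
  const view = new DataView(record.buffer, record.byteOffset, record.byteLength);
  return {
    type: view.getUint8(0) === 1 ? "key" : "delta",
    timestamp: Number(view.getBigInt64(1)),
    data: record.subarray(HEADER_SIZE),
  };
}
```

The object returned by unpackChunk has the fields the EncodedVideoChunk constructor expects, so on the receiving side it can be passed to new EncodedVideoChunk(...) directly.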
If at some point you need to make sure that all pending encoding requests have been completed, you can call flush() and wait for its promise.
await encoder.flush();
Decoding
Setting up a VideoDecoder is similar to what's been done for the VideoEncoder: two functions are passed when the decoder is created, and codec parameters are given to configure().
The set of codec parameters varies from codec to codec. For example, the H.264 codec might need a binary blob of AVCC, unless it's encoded in the so-called Annex B format (encoderConfig.avc = { format: "annexb" }).
const init = {
  output: handleFrame,
  error: (e) => {
    console.log(e.message);
  },
};
const config = {
  codec: "vp8",
  codedWidth: 640,
  codedHeight: 480,
};
// Declared outside the if block so that later code can use the decoder.
let decoder;
const { supported } = await VideoDecoder.isConfigSupported(config);
if (supported) {
  decoder = new VideoDecoder(init);
  decoder.configure(config);
} else {
  // Try another config.
}
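For the H.264 case mentioned above, the choice of bitstream format on the encoder side decides whether the decoder needs the binary description. The sketch below illustrates this under assumptions: the codec string, resolution, and helper names are examples rather than values from the article.

```javascript
// Assumed example parameters: H.264 Baseline profile at 640x480.
const baseH264 = {
  codec: "avc1.42001f",
  width: 640,
  height: 480,
  bitrate: 2_000_000,
  framerate: 30,
};

// Hypothetical helper: encoder config for the given bitstream format,
// either "avc" (parameter sets delivered separately) or "annexb"
// (parameter sets carried inside the bitstream).
function h264EncoderConfig(format) {
  return { ...baseH264, avc: { format } };
}

// Hypothetical helper: does the decoder need a `description` blob?
// When no format is given, "avc" is the default.
function needsDescription(encoderConfig) {
  const format = (encoderConfig.avc && encoderConfig.avc.format) || "avc";
  return format === "avc";
}

// Hypothetical helper: build the matching decoder config. `description`
// would come from metadata.decoderConfig.description in the encoder's
// output callback.
function h264DecoderConfig(encoderConfig, description) {
  const config = {
    codec: encoderConfig.codec,
    codedWidth: encoderConfig.width,
    codedHeight: encoderConfig.height,
  };
  if (needsDescription(encoderConfig)) config.description = description;
  return config;
}
```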
Once the decoder is initialized, you can start feeding it with EncodedVideoChunk objects.
To create a chunk, you'll need:
- A BufferSource of encoded video data
- the chunk's start timestamp in microseconds (media time of the first encoded frame in the chunk)
- the chunk's type, one of:
  - key if the chunk can be decoded independently from previous chunks
  - delta if the chunk can only be decoded after one or more previous chunks have been decoded
Also any chunks emitted by the encoder are ready for the decoder as is. All of the things said above about error reporting and the asynchronous nature of encoder's methods are equally true for decoders as well.
const responses = await downloadVideoChunksFromServer(timestamp);
for (let i = 0; i < responses.length; i++) {
  const chunk = new EncodedVideoChunk({
    timestamp: responses[i].timestamp,
    type: responses[i].key ? "key" : "delta",
    data: new Uint8Array(responses[i].body),
  });
  decoder.decode(chunk);
}
await decoder.flush();
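Since delta chunks depend on earlier ones, the first chunk given to a decoder after configure() or flush() has to be a key chunk. When playback starts in the middle of a stream, leading delta chunks can be skipped, as in this sketch (skipToFirstKey is a hypothetical helper).

```javascript
// Hypothetical helper: drop leading delta chunks so that decoding
// starts at a key chunk. Works on EncodedVideoChunk objects or on any
// objects with a `type` field of "key" or "delta".
function skipToFirstKey(chunks) {
  const firstKey = chunks.findIndex((chunk) => chunk.type === "key");
  return firstKey === -1 ? [] : chunks.slice(firstKey);
}

// In a browser:
//   for (const chunk of skipToFirstKey(chunks)) decoder.decode(chunk);
```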
Now it's time to show how a freshly decoded frame can be shown on the page. It's better to make sure that the decoder output callback (handleFrame()) quickly returns. In the example below, it only adds a frame to the queue of frames ready for rendering.
Rendering happens separately, and consists of two steps:
- Waiting for the right time to show the frame.
- Drawing the frame on the canvas.
Once a frame is no longer needed, call close() to release underlying memory before the garbage collector gets to it. This will reduce the average amount of memory used by the web application.
const canvas = document.getElementById("canvas");
const ctx = canvas.getContext("2d");
let pendingFrames = [];
let underflow = true;
let baseTime = 0;
function handleFrame(frame) {
  pendingFrames.push(frame);
  if (underflow) setTimeout(renderFrame, 0);
}
function calculateTimeUntilNextFrame(timestamp) {
  if (baseTime == 0) baseTime = performance.now();
  let mediaTime = performance.now() - baseTime;
  return Math.max(0, timestamp / 1000 - mediaTime);
}
async function renderFrame() {
  underflow = pendingFrames.length == 0;
  if (underflow) return;
  const frame = pendingFrames.shift();
  // Based on the frame's timestamp calculate how much of real time waiting
  // is needed before showing the next frame.
  const timeUntilNextFrame = calculateTimeUntilNextFrame(frame.timestamp);
  await new Promise((r) => {
    setTimeout(r, timeUntilNextFrame);
  });
  ctx.drawImage(frame, 0, 0);
  frame.close();
  // Immediately schedule rendering of the next frame
  setTimeout(renderFrame, 0);
}
Dev Tips
Use the Media Panel in Chrome DevTools to view media logs and debug WebCodecs.
Demo
The demo below shows how animation frames from a canvas are:
- captured at 25fps into a ReadableStream by MediaStreamTrackProcessor
- transferred to a web worker
- encoded into H.264 video format
- decoded again into a sequence of video frames
- and rendered on the second canvas using transferControlToOffscreen()
Other demos
Also check out our other demos:
Using the WebCodecs API
Feature detection
To check for WebCodecs support:
if ('VideoEncoder' in window) {
  // WebCodecs API is supported.
}
Keep in mind that WebCodecs API is only available in secure contexts, so detection will fail if self.isSecureContext is false.
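Inside a web worker there is no window object, so a check that should run in both places can test the global scope instead. The sketch below assumes that checking for both VideoEncoder and VideoDecoder is what the application needs; supportsWebCodecs is a hypothetical name.

```javascript
// Hypothetical helper: works on the main thread and in workers because
// globalThis refers to the global scope in both.
function supportsWebCodecs(scope = globalThis) {
  return (
    scope.isSecureContext === true &&
    typeof scope.VideoEncoder === "function" &&
    typeof scope.VideoDecoder === "function"
  );
}
```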
Feedback
The Chrome team wants to hear about your experiences with the WebCodecs API.
Tell us about the API design
Is there something about the API that doesn't work like you expected? Or are there missing methods or properties that you need to implement your idea? Have a question or comment on the security model? File a spec issue on the corresponding GitHub repo, or add your thoughts to an existing issue.
Report a problem with the implementation
Did you find a bug with Chrome's implementation? Or is the implementation different from the spec? File a bug at new.crbug.com. Be sure to include as much detail as you can, simple instructions for reproducing, and enter Blink>Media>WebCodecs in the Components box.
Glitch works great for sharing quick and easy repros.
Show support for the API
Are you planning to use the WebCodecs API? Your public support helps the Chrome team to prioritize features and shows other browser vendors how critical it is to support them.
Send emails to [email protected] or send a tweet to @ChromiumDev using the hashtag #WebCodecs and let us know where and how you're using it.
Hero image by Denise Jans on Unsplash.