usable in <audio> and <video> elements, via a new srcObject property
can be used as input to WebRTC connections, Web Audio API handling, recording API
Sources of MediaStream
from a WebRTC peer connection :)
from Web Audio API
from <audio>, <video>, and <canvas> elements
from screen sharing API
MediaStream semantics
a MediaStream can’t be seeked or paused: it is optimized for real-time media
MediaStream contains audio and video MediaStreamTracks
MediaStream tracks
Tracks in a MediaStream are kept in sync by the browser
a track is attached to a source (a media capture device for getUserMedia)
Capabilities
what mics / cams are capable of, expressed similarly to constraints
available via getCapabilities() method of MediaStreamTracks
list of capture devices available via navigator.mediaDevices.enumerateDevices()
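A minimal sketch of sorting the device list by kind. The groupByKind helper and the sample list are illustrative assumptions, not part of the API; in a page, the list would come from the promise returned by navigator.mediaDevices.enumerateDevices().

```javascript
// Hypothetical helper: groups MediaDeviceInfo-like objects by their `kind`
// ("audioinput", "audiooutput" or "videoinput").
function groupByKind(devices) {
  var groups = {};
  devices.forEach(function (d) {
    if (!groups[d.kind]) {
      groups[d.kind] = [];
    }
    groups[d.kind].push(d);
  });
  return groups;
}

// Sample data standing in for the real result (labels and ids are made up)
var sampleDevices = [
  {kind: "audioinput", label: "Built-in mic", deviceId: "a1"},
  {kind: "videoinput", label: "Front camera", deviceId: "v1"},
  {kind: "audiooutput", label: "Speakers", deviceId: "o1"},
  {kind: "audioinput", label: "Headset mic", deviceId: "a2"}
];
var grouped = groupByKind(sampleDevices);
console.log(Object.keys(grouped));
```

In a browser, the same helper could be chained directly: navigator.mediaDevices.enumerateDevices().then(groupByKind).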
GET https://example.org/chatrooms/foo
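var signalingChannel = new WebSocket("wss://example.org/signaling");
var pc;
// function declarations are hoisted, so the handler exists when it is registered
function startCall() {
  pc = new RTCPeerConnection();
  pc.createOffer().then(startSignalling);
}
function startSignalling(offer) {
  // wait until the local description is applied before sending the offer
  return pc.setLocalDescription(offer).then(function () {
    // WebSocket.send() takes a string, so the offer is serialized as JSON
    signalingChannel.send(JSON.stringify(offer));
  });
}
document.getElementById("call").addEventListener("click", startCall);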
Signaling
Session establishment, maintenance and control
Web server acts as a relay between peers
Any transport can be used for signaling (HTTP via XHR, WebSockets, …)
No specified signaling protocol (à la SIP or XMPP)
Signaling functions
Signals start / end of communication
Helps find optimal network route between peers
Brokers negotiations on formats and constraints
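Since no signaling protocol is specified, the application picks its own message format. A minimal sketch, assuming JSON messages that carry a type field; the dispatchSignal helper and the "bye" message are hypothetical.

```javascript
// Hypothetical dispatcher: routes a JSON signaling message to the handler
// registered for its `type`, whatever transport delivered the text.
function dispatchSignal(text, handlers) {
  var msg = JSON.parse(text);
  if (!Object.prototype.hasOwnProperty.call(handlers, msg.type)) {
    return false; // unknown message types are ignored
  }
  handlers[msg.type](msg);
  return true;
}

// Example: signaling the end of a communication
var received = [];
var wire = JSON.stringify({type: "bye", reason: "hangup"});
var handled = dispatchSignal(wire, {
  bye: function (msg) { received.push(msg.reason); }
});
console.log(handled, received);
```

The same function works unchanged whether the text arrives over XHR or a WebSocket, which is the point of keeping signaling transport-agnostic.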
If both peers have public IP addresses, finding a direct path between them only requires exchanging these IP addresses
If one or both peers are behind firewalls or NAT, they don’t know how to reach one another:
they don’t know the public IP address their firewalls might allow them to be contacted through
they don’t know if the firewalls will allow direct media exchange
STUN
lets a peer discover its public IP address
determines if one is UDP-reachable at that address
If STUN works, then the peers will be able to exchange media directly via their discovered IP addresses
If STUN is not enough (e.g. UDP traffic is restricted), we have one last option: using an authorized relay for UDP traffic.
TURN
Server to relay real-time traffic
With authentication
Can be managed by network operator
Costly to operate, may impact quality
ICE
Protocol to determine best option among direct routes, STUN, TURN
Built into WebRTC
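One input to that choice is the candidate priority. A sketch of the formula and type preferences recommended in RFC 5245 (section 4.1.2.1); the function and variable names are illustrative, and browsers compute this internally.

```javascript
// Type preferences recommended by RFC 5245: direct (host) candidates rank
// highest, relayed (TURN) candidates lowest.
var TYPE_PREFERENCE = {host: 126, prflx: 110, srflx: 100, relay: 0};

// priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)
function candidatePriority(type, localPreference, componentId) {
  return TYPE_PREFERENCE[type] * Math.pow(2, 24) +
         localPreference * Math.pow(2, 8) +
         (256 - componentId);
}

var ranked = ["relay", "host", "srflx"].map(function (type) {
  return {type: type, priority: candidatePriority(type, 65535, 1)};
});
ranked.sort(function (a, b) { return b.priority - a.priority; });
console.log(ranked.map(function (c) { return c.type; }));
// host first, then srflx (address discovered via STUN), then relay (TURN)
```

Priority only orders the attempts; connectivity checks still decide which candidate pairs actually work.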
var pc = new RTCPeerConnection(
  { iceServers: [
      {urls: "stun:stun.example.com:19302"},
      {urls: "turns:turn.example.com:45522", username: "foo", credential: "bar"}
    ]
  }
);
pc.addEventListener("icecandidate", sendOverSignalingChannel);
⚠ Private IP
ICE shares all IP addresses with the server
Might leak IP addresses expected to be private (e.g. VPN)
Ongoing discussion about mitigations
Understanding each other
Alice’s browser needs to be able to play Bob’s media streams
and vice versa
i.e. they need to share media codecs
Codecs
browsers negotiate to find common codecs
for audio and for video
to ensure the negotiation succeeds, we need “mandatory to implement” (MTI) codecs
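The idea behind finding common codecs can be sketched as a list intersection. This is a simplification under stated assumptions: the real negotiation happens inside the browser through offers and answers, and the commonCodecs helper and the two codec lists are hypothetical.

```javascript
// Keeps the codecs both sides support, in the order preferred by `local`
function commonCodecs(local, remote) {
  var remoteNames = remote.map(function (c) { return c.toLowerCase(); });
  return local.filter(function (c) {
    return remoteNames.indexOf(c.toLowerCase()) !== -1;
  });
}

// PCMU and PCMA are the two G.711 encodings
var alice = ["opus", "G722", "PCMU"];
var bob = ["PCMU", "PCMA", "opus"];
var shared = commonCodecs(alice, bob);
console.log(shared);
```

If both lists always contain the MTI codecs, the intersection can never be empty, which is why MTI codecs guarantee that the negotiation succeeds.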
Audio MTI Codecs
G.711 (legacy compatibility)
Opus (recent, royalty-free, bit-rate adaptable)
Video Codecs
lengthy discussions
H.264 (not royalty-free, standardized)
vs VP8 (fuzzy royalty-free status, not standardized)
browsers required to support both (?)
Negotiations
each browser describes its capabilities and preferences, based on
webcam / mic, hardware
application-specific aspects (e.g. number and type of streams)
Negotiations
send this description to the peer and search common ground
exchange of “offers” and “answers”
JavaScript Session Establishment Protocol (JSEP)
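The offer/answer exchange can be read as a small state machine. A simplified sketch: the state names mirror the values of RTCPeerConnection.signalingState, but the action labels and the nextState helper are hypothetical, and provisional answers and rollback are left out.

```javascript
// Simplified transition table for the offer/answer exchange
var TRANSITIONS = {
  "stable": {
    "setLocal(offer)": "have-local-offer",
    "setRemote(offer)": "have-remote-offer"
  },
  "have-local-offer": {"setRemote(answer)": "stable"},
  "have-remote-offer": {"setLocal(answer)": "stable"}
};

function nextState(state, action) {
  var next = TRANSITIONS[state] && TRANSITIONS[state][action];
  if (!next) {
    throw new Error("invalid " + action + " in state " + state);
  }
  return next;
}

// Caller: creates the offer, then receives the answer
var callerAfterOffer = nextState("stable", "setLocal(offer)");
var callerAfterAnswer = nextState(callerAfterOffer, "setRemote(answer)");

// Callee: receives the offer, then creates the answer
var calleeAfterOffer = nextState("stable", "setRemote(offer)");
var calleeAfterAnswer = nextState(calleeAfterOffer, "setLocal(answer)");
console.log(callerAfterOffer, callerAfterAnswer, calleeAfterOffer, calleeAfterAnswer);
```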
SDP
Capabilities and preferences are described with “Session Description Protocol” (SDP)
text-based format for describing media streams and more
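To give a feel for the format, a sketch that lists the codecs declared in an SDP blob by reading its a=rtpmap lines. The listCodecs helper is hypothetical, and the sample is a hand-written excerpt, not a complete offer produced by a browser.

```javascript
// Extracts payload type, codec name and clock rate from "a=rtpmap:" lines
function listCodecs(sdp) {
  var codecs = [];
  sdp.split(/\r?\n/).forEach(function (line) {
    var m = /^a=rtpmap:(\d+) ([^\/]+)\/(\d+)/.exec(line);
    if (m) {
      codecs.push({
        payloadType: Number(m[1]),
        name: m[2],
        clockRate: Number(m[3])
      });
    }
  });
  return codecs;
}

// Hand-written excerpt for illustration only
var sampleSdp = [
  "v=0",
  "o=- 0 0 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111 0",
  "a=rtpmap:111 opus/48000/2",
  "a=rtpmap:0 PCMU/8000"
].join("\r\n");
var codecs = listCodecs(sampleSdp);
console.log(codecs);
```

In a page, the same function could be applied to offer.sdp to inspect what the browser proposes.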
var pc = new RTCPeerConnection(iceConfiguration);
pc.createOffer().then(function (offer) {
  // offer.sdp has SDP representation of preferences and capabilities
  // in theory, the SDP can be tweaked here before being accepted
  // setLocalDescription also triggers ICE candidate gathering
  return pc.setLocalDescription(offer);
}).then(function () {
  // We send our offer to the peer, serialized as JSON since
  // WebSocket.send() takes a string
  signalingChannel.send(JSON.stringify(pc.localDescription));
});
pc.addEventListener("icecandidate", function (e) {
  // e.candidate is null once candidate gathering is complete
  if (e.candidate) {
    signalingChannel.send(JSON.stringify({type: "icecandidate", candidate: e.candidate}));
  }
});
addTrack
before an offer can be created, the browser needs to know what will be transmitted
we need to plug in the MediaStream we obtained from getUserMedia
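// a function declaration is hoisted, so it can be passed before its definition
webcamP.then(setupCall);
function setupCall(stream) {
  // passing the stream lets the remote side group the tracks it receives
  pc.addTrack(stream.getAudioTracks()[0], stream);
  pc.addTrack(stream.getVideoTracks()[0], stream);
  // as tracks of a new type have been added, the browser knows
  // negotiation is needed
  // → "negotiationneeded" event triggered
}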
changing SDP before “installing it” is… not well defined
ORTC advocates getting rid of it
signalingChannel.addEventListener("message", function (e) {
if (!pc) {
pc = new RTCPeerConnection(iceConfiguration);
navigator.mediaDevices.getUserMedia({audio: true, video: true})
.then(function(stream) { pc.addTrack(…);});
}
var msg = JSON.parse(e.data);
switch(msg.type) {
case "icecandidate": pc.addIceCandidate(new RTCIceCandidate(msg.candidate)); break;
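case "offer": pc.setRemoteDescription(msg)
  // methods are called through pc so that `this` stays bound to the connection
  .then(function () { return pc.createAnswer(); })
  .then(function (answer) { return pc.setLocalDescription(answer); })
  // setLocalDescription resolves with no value: read the answer back from pc
  .then(function () { signalingChannel.send(JSON.stringify(pc.localDescription)); });
break;
case "answer": pc.setRemoteDescription(msg); break;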
}
});
var remote = document.getElementById("remote");
pc.addEventListener("track", function (e) {
  // e is an RTCTrackEvent; the received track is e.track
  if (!remote.srcObject) {
    remote.srcObject = new MediaStream();
  }
  remote.srcObject.addTrack(e.track);
});
Media transmission
RTP — real-time transport protocol
Encrypted → SRTP
Key exchange via DTLS — TLS adapted to non-TCP transport
var channel = pc.createDataChannel("filetransfer");
channel.addEventListener("open", function () {
  startFileTransfer();
});
var fileData = [];
channel.addEventListener("message", function(e) {
  fileData.push(e.data);
});
Data Channels
Similar to WebSockets
Can be configured to be reliable or not, ordered or not
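Reliability and ordering are chosen when the channel is created, through the ordered and maxRetransmits options of createDataChannel(). A minimal sketch: the createLossyChannel helper, the "game-state" label and the stand-in connection object are assumptions made so the example can run outside a browser.

```javascript
// Hypothetical helper: opens a channel that favors latency over reliability
function createLossyChannel(pc, label) {
  return pc.createDataChannel(label, {
    ordered: false,     // messages may be delivered out of order
    maxRetransmits: 0   // lost messages are not resent
  });
}

// Stand-in object for illustration; in a page, pass a real RTCPeerConnection
var fakePc = {
  createDataChannel: function (label, options) {
    return {label: label, options: options};
  }
};
var gameChannel = createLossyChannel(fakePc, "game-state");
console.log(gameChannel.options);
```

Without any options, a data channel is reliable and ordered, which is the right default for something like a file transfer.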
Challenging security-wise — at odds with same-origin policy
Screen sharing: the user can choose to share the whole screen, an app, a window, or a browser tab
<select class=audio></select>
var audioSelectorUI = document.querySelector("select.audio");
var outputDevices = [];
// enumerateDevices() returns a promise, so the list is filtered once it resolves
navigator.mediaDevices.enumerateDevices().then(function (devices) {
  outputDevices = devices.filter(function(d) { return d.kind === "audiooutput";});
  outputDevices.forEach(function (o) {
    var opt = document.createElement("option");
    opt.appendChild(document.createTextNode(o.label));
    audioSelectorUI.appendChild(opt);
  });
});
audioSelectorUI.addEventListener("change", function (e) {
  // route the remote media element's audio to the selected output device
  remote.setSinkId(outputDevices[this.selectedIndex].deviceId);
});
Trust model
by default, the Web server is assumed to be trusted:
to put you in touch with the right person
not to listen to, record or alter the media flow unexpectedly
Extra security
a WebRTC app can opt in to extra security:
Web server can’t interact with the media flow
peer is identified by third-party
navigator.mediaDevices.getUserMedia({video: true,
peerIdentity: "[email protected]"});
// the stream obtained from there can only be transmitted via
// a peer connection constructed with
var pc = new RTCPeerConnection({peerIdentity: "[email protected]"});
// This assumes the browser has been configured to know how to contact
// the identity provider who can verify that Bob is the Bob Alice knows
// The app can also help:
pc.setIdentityProvider("example.net", "default", "alice");