Conditional Focus

Abstract

When an application captures a [=display-surface=], the user agent faces a decision: should the captured [=display-surface=] be brought to the forefront of the user's screen ("focused"), or should the capturing application retain focus? This document proposes a mechanism by which an application can influence this decision.

The Conditional-Focus Mechanism

The conditional-focus mechanism allows the capturing application to instruct the user agent to either switch focus to the captured [=display-surface=], or to avoid such a focus change.

The application can only make this decision within a well-defined window of opportunity. If the mechanism is not invoked within that window, the user agent takes over and makes its own decision.

getDisplayMedia() Extensions

{{MediaDevices/getDisplayMedia()}} is currently defined such that it returns a {{Promise}}<{{MediaStream}}>. We extend this definition such that when {{MediaDevices/getDisplayMedia()}} is called, if the user elects to capture an [=application=], [=browser=], or [=window=] [=display-surface=], the video track of the resulting {{MediaStream}} will be of type {{FocusableMediaStreamTrack}}.

FocusableMediaStreamTrack

{{MediaStreamTrack}} is subclassed as {{FocusableMediaStreamTrack}}.

          [Exposed=Window]
          interface FocusableMediaStreamTrack : MediaStreamTrack {
            undefined focus(CaptureStartFocusBehavior focus_behavior);
          };

          enum CaptureStartFocusBehavior {
            "focus-captured-surface",
            "no-focus-change"
          };

focus()

Recall that the {{FocusableMediaStreamTrack}} object was instantiated in response to a call to {{MediaDevices/getDisplayMedia()}}. That call to {{MediaDevices/getDisplayMedia()}} returned a {{Promise}}<{{MediaStream}}> PRMS. Like any {{Promise}}, PRMS is settled on a microtask, which we will name MT.

When MT starts executing, a window of opportunity opens for the application to inform the user agent as to whether it wants the captured [=display-surface=] to be focused or not. Calls to {{focus()}} may only have an effect while this window of opportunity is open. It closes as soon as one of the following happens:

- {{focus()}} is called for the first time.
- MT finishes.
- One second passes since the capture was started.

When the window of opportunity closes, if an explicit decision was not made through calling {{focus()}}, then the user agent MUST make its own decision.

Therefore, when {{focus()}} is called, the user agent MUST run the following steps:

1. If this object is a clone, raise an {{InvalidStateError}}. Otherwise, proceed.
2. If {{focus()}} was previously called on [=this=], raise an {{InvalidStateError}}. Otherwise, proceed.
3. If this call to {{focus()}} is not on MT, the user agent will have already made a decision, so raise an {{InvalidStateError}}. Otherwise, proceed.
4. If this call to {{focus()}} occurs more than one second after the start of the capture, the user agent will have already made a decision. The user agent MUST silently ignore this call to {{focus()}}.
5. This call to {{focus()}} occurred on MT and within one second of the capture starting. Therefore, the user agent MUST NOT make its own decision with respect to focusing the captured [=display-surface=], but rather:
- If focus_behavior is set to {{CaptureStartFocusBehavior/"focus-captured-surface"}}, then the user agent MUST focus the captured [=display-surface=].
- If focus_behavior is set to {{CaptureStartFocusBehavior/"no-focus-change"}}, then the user agent MUST NOT focus the captured [=display-surface=].
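
The steps above can be modeled, for illustration only, as a small state machine. The sketch below is not part of any user agent: the class name FocusDecisionModel, the onMT and nowMs parameters, and the invalidStateError() helper (a stand-in for a DOMException) are all hypothetical names introduced here. It also assumes that any call getting past the first two steps counts as a previous call, which the text above does not spell out.

```javascript
// Stand-in for DOMException so the sketch also runs outside a browser.
function invalidStateError(message) {
  const error = new Error(message);
  error.name = "InvalidStateError";
  return error;
}

const ONE_SECOND_MS = 1000;

// Hypothetical model of a focusable track's decision state.
class FocusDecisionModel {
  constructor({ isClone = false, captureStartMs = 0 } = {}) {
    this.isClone = isClone;
    this.captureStartMs = captureStartMs;
    this.focusCalled = false;
    this.decision = null; // Stays null unless the application decides in time.
  }

  // onMT: whether the call happens on the microtask MT.
  // nowMs: current time, in the same units as captureStartMs.
  focus(focusBehavior, { onMT, nowMs }) {
    // Step 1: clones cannot be used to make the decision.
    if (this.isClone) throw invalidStateError("focus() called on a clone");
    // Step 2: only the first call may have an effect.
    if (this.focusCalled) throw invalidStateError("focus() was already called");
    // Assumption: any call that passes steps 1 and 2 counts as "previously called".
    this.focusCalled = true;
    // Step 3: outside MT, the user agent has already decided.
    if (!onMT) throw invalidStateError("focus() called outside MT");
    // Step 4: too late, so the call is silently ignored.
    if (nowMs - this.captureStartMs > ONE_SECOND_MS) return;
    // Step 5: the application's decision is honored.
    this.decision = focusBehavior;
  }
}
```

In this model, a timely call on MT records the requested behavior in decision, a late call leaves decision as null without throwing, and the remaining failure cases throw an error named InvalidStateError.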

Usage Samples

All examples will assume a predicate named shouldFocus() which accepts a video {{MediaStreamTrack}} as input. It is a synchronous function returning either {{CaptureStartFocusBehavior/"no-focus-change"}} or {{CaptureStartFocusBehavior/"focus-captured-surface"}}.

            function shouldFocus(mediaStreamTrack) {
              // Synchronous.
              // Returns "no-focus-change" or "focus-captured-surface".
              // Has access to Capture Handle.
            }

Reasonable implementations of this predicate include:

- Hard-code to always focus.
- Hard-code to never focus.
- Base the decision on a user preference obtained in-app.
- Base the decision on the captured [=display-surface=]'s {{DisplayCaptureSurfaceType}}.
- Base the decision on the captured application (using Capture Handle).
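
As one possible illustration of the surface-type option, the sketch below reads the displaySurface member that getSettings() reports for display-capture video tracks. The policy itself (focus captured tabs, leave focus unchanged for anything else) is an arbitrary assumption chosen for the example, not a recommendation.

```javascript
// Illustrative predicate only; the policy below is a hypothetical choice.
function shouldFocus(mediaStreamTrack) {
  // getSettings() is synchronous, so the predicate stays synchronous too.
  const { displaySurface } = mediaStreamTrack.getSettings();
  // Assumed policy: switch to a captured tab, otherwise keep the
  // capturing application in front.
  return displaySurface === "browser"
    ? "focus-captured-surface"
    : "no-focus-change";
}
```

Because the predicate performs no asynchronous work, calling it inline as the argument to focus() does not let MT finish.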

Correct Usage Sample

            const mediaStream = await navigator.mediaDevices.getDisplayMedia();
            const [track] = mediaStream.getVideoTracks();
            if (!!track.focus) {
              track.focus(shouldFocus(track));  // Correct.
            }

Incorrect Usage Samples

              const mediaStream = await navigator.mediaDevices.getDisplayMedia();
              const [track] = mediaStream.getVideoTracks();
              await someOtherFunction();  // Mistake: Allows MT to finish its execution.
              if (!!track.focus) {
                track.focus(shouldFocus(track));
              }

              const mediaStream = await navigator.mediaDevices.getDisplayMedia();
              const [track] = mediaStream.getVideoTracks();
              setTimeout(() => {  // Mistake: Allows MT to finish its execution.
                if (!!track.focus) {
                  track.focus(shouldFocus(track));
                }
              }, 1);

              const mediaStream = await navigator.mediaDevices.getDisplayMedia();
              const [track] = mediaStream.getVideoTracks();
              timeConsumingFunc();  // Mistake: Might take longer than 1s.
              if (!!track.focus) {
                track.focus(shouldFocus(track));
              }
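
The mistakes above share a cause: other work runs between the promise settling and the call to focus(). A minimal sketch of one way to avoid this is to make the focus decision first and defer everything else until afterwards. The helper name captureThenContinue and its injected parameters are hypothetical; they are passed in only so that the ordering can be exercised outside a browser. In a page, getDisplayMedia would be a wrapper around navigator.mediaDevices.getDisplayMedia().

```javascript
// Sketch only: all three dependencies are injected by the caller.
async function captureThenContinue({ getDisplayMedia, shouldFocus, someOtherFunction }) {
  const mediaStream = await getDisplayMedia();
  const [track] = mediaStream.getVideoTracks();
  // Decide immediately, before any further await or slow synchronous work.
  if (track && typeof track.focus === "function") {
    track.focus(shouldFocus(track));
  }
  // Yield only after the focus decision has been made.
  await someOtherFunction();
  return mediaStream;
}
```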