More Private enumerateDevices
#849
Comments
Does this also expose the absence/presence of an audio output device? If chromium based browsers were to stop exposing the presence of an audiooutput device, this would break quality of service checks and therefore block everyone from the entire video call flow on software I used to work on and I suspect many other similar applications. |
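For illustration, a quality-of-service check of the kind described above might look like the following minimal sketch. The helper name `hasAudioOutput` is hypothetical and not taken from any application mentioned in this thread; it only assumes that the list returned by `navigator.mediaDevices.enumerateDevices()` contains entries with a `kind` field.

```javascript
// Hypothetical helper: reports whether any audio output device is listed.
// `devices` is the array resolved by navigator.mediaDevices.enumerateDevices().
function hasAudioOutput(devices) {
  return devices.some((device) => device.kind === "audiooutput");
}

// In a page this might be used as (warnAboutMissingSpeakers is hypothetical):
//   const devices = await navigator.mediaDevices.enumerateDevices();
//   if (!hasAudioOutput(devices)) warnAboutMissingSpeakers();
```

A check like this only works where the browser actually lists `audiooutput` entries, which is the dependency the question above is about.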
Thanks for filing this issue. We agree that WebRTC's distributed nature sometimes makes in-the-wild problems difficult to reproduce. However, this issue is scoped to the enumerateDevices API, which is a strictly local API where issues are much easier to reproduce. With regards to the past fingerprinting issues of enumerateDevices, Chrome has implemented all the PING-driven measures that were added to the spec to address those problems, except the requirement to gate enumerateDevices results on active capture. Chrome optimistically agreed to this measure, but unfortunately it broke existing applications when we tried to deploy it. Chrome instead gates enumerateDevices results on camera and microphone permissions, which we believe are sufficient to address the fingerprinting issues. Another enumerateDevices issue is support for the exposure of non-miked audio output devices when microphone exposure is allowed. This is necessary to support the common use case of a laptop user who uses non-miked headphones and the laptop microphone in VC calls. So we have filed w3c/mediacapture-main#1019 and w3c/mediacapture-output#147, which propose to change the content of the corresponding enumerateDevices tests, and so they should be resolved before adding these tests to the Interop list. We don't think either of those issues will regress the fingerprinting protection added by the overall PING-driven changes. |
Hi @lukewarlow, it's unclear whether you're asking about the PING review or Interop 2025. Audio output devices exposed through microphone permission appear alongside microphones, so they were categorically included in the PING review. But audio output is covered by separate WPT tests https://wpt.fyi/results/audio-output not presently included in this proposal for Interop 2025. I'm fine with leaving it that way to have time to resolve the details in w3c/mediacapture-output#147 (which is about what set of speakers the microphone permission exception applies to).
Does this software and the other "similar applications" work in Firefox and Safari? If not, they seem an example of the problem Interop 2025 exists to solve. |
I can only speak directly to the application that I worked on, but that currently uses user agent detection to skip the audio output check (assume it passes) in Firefox and Safari. Hence, in this case, it's only a web compat risk for Chromium to stop exposing them, rather than a direct interop issue. Though the status quo can lead to situations where Firefox and WebKit users that don't have speakers have a degraded experience (they can join a video call that would never work because they have no audio output). Having said all that, assuming this interop proposal only covers microphone and camera and makes no decision one way or the other for audio output, then that's okay for now, but my concerns would apply to any decisions made about audio output exposure in future. |
To satisfy the proposal, browsers would need to limit microphone exposure ahead of gUM success. I can't speak to how an implementation that is already out of spec on the matter of speakers would treat speakers. |
That was 4 years ago. Today, both Safari and Firefox ship this measure. With those browsers already having taken the compatibility hit, what is holding Chrome back from fixing crbug 40138537 today? What websites do you remain concerned would still break? Wouldn't they only work in chromium browsers? Isn't that problematic? So far we've encountered zero breakage from Firefox 132 implementing this measure.
This isn't just about fingerprinting — which remains an issue: having scanned a QR code on a website last year shouldn't opt people into tracking — it's also about interoperability across permission models. What Chrome has implemented was never in the spec: near zero exposure without permission and full exposure with. The old spec had allowances that let websites build device pickers (albeit without labels) ahead of permission that sorta worked in all browsers. In contrast, what Chrome has implemented unilaterally has created a huge interoperability issue for other browsers where websites can reliably implement a device chooser on pageload (after priming for permission just once) only in chromium browsers. Maybe it's time to try again? |
We should continue this discussion in w3c/mediacapture-main#1019, where I'll leave a full reply, since this issue is just about the interop request, and Chrome has nothing to add about that request. The Interop guide explicitly says that Interop is not a venue for performing standards work. However, for people not interested in following w3c/mediacapture-main#1019, I just want to clarify here that Chrome did not unilaterally implement anything, much less something that was never in the spec. The actual story is:
As you can see, there is nothing unilateral here. As for the "device pickers (albeit without labels) ahead of permission that sorta worked in all browsers", no one ever built those pickers because they would have been useless for users, with zero human-readable information. Nothing broke when it became impossible to create those theoretical pickers, as they never existed. |
No standards work is being performed here. I'm asking what stands in the way of Chrome fixing a bug open for four years to adhere to the existing standard. The claim that websites would be broken seems unsupported given that Firefox and Safari have shipped this. No list of such websites has been produced.
That was on October 5th, 2020.
That was on October 15th, 2024, four years later (6 days after this interop proposal). The actual timeline:
I don't think implementation experience that predates CR snapshot 3 years ago qualifies as new information for the WG to revisit this issue. |
The pickers required trial and error between "camera 1" and "camera 2", but worked across browsers (unlike what you suggested in the comment you retracted). Labels filled in after use. You might not have seen this being in Chrome. But yes, poor usability interop and fingerprinting concerns are why the WG moved away from eD-before-gUM. But the WG moved to gUM-before-eD, not eD-before-gUM-with-persistent-permission which Chrome invented. The WG would never have moved to the latter as it is not interoperable (doesn't work in Safari or Firefox by default).
So which websites still exist that rely on eD-before-gUM? |
Compatibility with existing applications. Risk of breaking compatibility with Most video conferencing sites is what prevents Chrome from fixing that "bug".
Again, in your own words, Most video conferencing sites did code similar to
You gave yourself the answer.
Yes. When we analyzed this interop proposal we realized that we had this issue with gum-before-eD and noticed that we had not filed a spec issue about it. Then we filed one so that we could discuss it in the WG and reference it here.
Just like Firefox did NOT implement the PING-driven changes for 5 years, until a couple of weeks ago.
The thinking at the time was that there was no reason to block it because we thought we could redeploy the change. However, as you so eloquently stated years later, Most video conferencing sites are still written against permissions-before-eD-labels.
In the email thread we had with Youenn, I confirmed that we supported including "RTCRtpScriptTransform" and "RTCDataChannels transferable to workers". We had worked recently on these items and we did not have any blocking spec issue.
Why not? You confirmed that this was still a problem for Most video conferencing sites last year, and we have no reason to believe that the potential compatibility problems have gone away. |
No real VC application ever built those pickers because they would have been useless. If VC sites had done that, we would have seen major breakage when we implemented the PING-driven changes, and I do not recall a single bug filed related to that. And "retract" is not the appropriate term. I just simplified the comment with the information that is most relevant for this venue. I will add the more detailed content in our conversation in the WG issue.
What is eD-before-gUM? The model was permissions-before-eD-with-labels. And permissions often means a gUM call.
First off, whether permissions are persistent or ephemeral is outside the spec. Each UA is free to implement permission persistence in any way it prefers. The old spec had permissions-before-eD-with-labels. And Chrome reverted to it.
It was and still is Safari's model. It's just that Safari's model is compatible with both the old and new specs.
According to you (and also our observations), just last year it was Most video conferencing sites. I believe this is still the case. And the model is not eD-before-gUM. It is permissions-before-eD-with-labels. If most sites have actually moved to gUM-before-eD-with-labels, it will work fine with Chrome too, so Chrome is not preventing sites from moving to the current spec, but Chrome will not break compatibility to try to force the move. |
Chrome has no plans to make incompatible changes in this area. |
Those would continue to detect Chrome's persistent permissions just fine. It's the correct way to query permission state. But what does that have to do with device enumeration? 🤔 Is your concern over websites that (wrongly) use device enumeration to query permission state? Something like:
|
A slightly more complete code sketch for those applications is:
let perm = await navigator.permissions.query({name: "camera"});
if (perm.state == "prompt") {
  // includes calling getUserMedia() and updating perm with the new permission state after the nag
  nagTheUserAboutEnablingPermission();
}
if (perm.state == "granted") {
  showFullApplicationIncludingPickers(); // includes calling eD()
}
The case that you refer to as "smoother user experience to returning Chrome users" is when the permission has been persisted and the application goes directly to gUM-before-eD-with-labels breaks this use case, as
No, that is of no concern. |
You're mischaracterizing w3c/mediacapture-main#928, where "smoother user experience" refers to avoiding an extra click on a permission priming page ahead of every meeting. That issue was not about eD-before-gUM. Firefox solved that issue in 132 by returning
False Equivalence. Safari and Firefox follow spec, Chrome does not.
This is the eD-before-gUM use case which the WG abandoned in 2020. The old spec allowed this (with labels in some browsers and initially without in others depending on permission). The spec since then does not.
I wasn't talking about eD-before-gUM in that issue (Whereby.com, the example in that issue, does not do eD-before-gUM). In hindsight I shouldn't even have said "most" as that issue turned out to be a lot smaller than expected. We've also worked with different services over the last year leading up to 132. I've tried all the major services, and haven't run into any problems yet. Most seem to have a lobby with a comb-check, and turn the camera on if they can. For there to be a problem, a video conferencing website would need to drop users into a meeting without camera and microphone on (e.g. based on a previous setting or maybe size of meeting), even though the user has granted persistent permission. This might be plausible, but seems a minor inconvenience. If you still think this is an issue on "Most video conferencing sites", can you give an example? |
If the application produces broken pickers for returning users, that is not a smooth user experience for returning users. It is in fact a broken user experience and is a blocker for Chrome.
I'm not making any equivalence. I'm saying the old spec had permissions-before-eD-with-labels and Chrome reverted back to that model after gUM-before-eD-with-labels broke applications. You were claiming that Chrome invented some new model, which is an incorrect statement. Chrome simply could not migrate from permissions-before-eD-with-labels to gUM-before-eD-with-labels because it breaks applications.
I just tried Zoom as a returning user on Firefox to see if I could get a smooth user experience.
This is not just plausible, but a common use case with Zoom, broken by gum-before-eD-with-labels.
Zoom. |
Not as a returning user. |
Select "Allow on every visit" on that dialog in your Chrome screenshot, and select "Remember for all cameras and microphones" for the equivalent dialog on Firefox so that both browsers provide a promptless experience for returning users. |
Whereby still requires an extra click on a priming page on Firefox. The step is for the gUM call prior to enumerateDevices, so that Whereby can create a proper UI with functional pickers. So, despite the fix in the permissions API, returning users still don't get the "smoother user experience". At this point I think it is clear that gUM-before-eD-with-labels introduces serious compatibility problems with existing applications and is not ready to be considered for Interop. I propose that we continue the discussion in w3c/mediacapture-main#1019, where I listed even more examples of breakage. |
Yes, as a returning user. Did you miss the "Allow this time" option?
That is false. Select "Allow this time" on that dialog in the Chrome screenshot. Close the tab, open a new one, and go to Zoom again. You'll see the "breakage" in Chrome again. You're falsely assuming every "returning user" is using persistent permission. What you call "breakage" is normal behavior for many users and they haven't complained. It's not even technically "breakage" because the websites are handling it, substituting "Unrecognized microphone1" etc. for an empty string.
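As a rough illustration of how a site can handle empty labels, the sketch below fills in numbered placeholder names for picker entries whose `label` is an empty string. The function name `pickerLabels` and the placeholder wording are assumptions for illustration, not code from Zoom or any other site mentioned here.

```javascript
// Hypothetical placeholder names, keyed by MediaDeviceInfo.kind.
const FALLBACK_NAMES = {
  audioinput: "Microphone",
  audiooutput: "Speaker",
  videoinput: "Camera",
};

// Returns one display string per device. Devices without a label get a
// numbered placeholder; the number counts devices of the same kind.
function pickerLabels(devices) {
  const counters = {};
  return devices.map((device) => {
    counters[device.kind] = (counters[device.kind] ?? 0) + 1;
    if (device.label) return device.label;
    const name = FALLBACK_NAMES[device.kind] ?? "Device";
    return `${name} ${counters[device.kind]}`;
  });
}
```

Once real labels become available, the same function returns them unchanged, so the picker can simply be re-rendered.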
These arguments seem to rest on false equivalence: returning users shouldn't have to give up privacy by escalating their trust in the browser to be considered "returning users". |
I mean as a returning user that gave persistent permissions in order to have a smoother experience.
That is because you chose not to persist the permission. There is no expectation of a smoother experience in this case. The behavior should be the same as in Safari in this case.
The use case that breaks is the one of a user who gives a persistent permission and has an expectation of a smoother experience without prompts and with the UI working correctly. This use case breaks with gUM-before-eD and is a blocker for Chrome.
It's broken because users have no idea what camera or microphone to select. Choosing the wrong microphone can be a serious privacy issue. And it's a major regression if we make an update that breaks this important use case.
I'm using the term "returning users" only because you introduced it in this thread. Also, users are not giving up any privacy. They are giving persistent permissions to an application they trust. gUM-before-eD introduces privacy issues too. Consider the use case of a user who trusts the application, gives it persistent permissions to avoid prompts, and configures the application to start/join meetings with the mic off. The only way to provide a proper UI with functional pickers is to open the microphone first, violating the user's privacy. The alternative is to provide a broken picker, but then the user can't select the right microphone, again with potential negative consequences for the user's privacy. So the argument of gUM-before-eD being categorically better for privacy is incorrect. |
If it's a serious privacy issue, it should be solved for all users, not just those who persist permission. This seems better left to apps to solve. Zoom gives no indication of which camera or microphone is used except for users who hit the little If they thought this was an issue, Zoom could simply call gUM when the user changes camera instead of waiting until the user unmutes themselves in the meeting (like they do for microphone). Problem solved for all users in all browsers. getUserMedia() also accepts cached deviceIds from localStorage, which apps can use to remember user settings to ensure users aren't surprised by which camera is used. This also works for all users. |
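A minimal sketch of the cached-deviceId idea, assuming a hypothetical storage key `preferredCameraId` and hypothetical helper names. The storage object is passed in so the same code works with `localStorage` in a page or with a stand-in elsewhere.

```javascript
// Hypothetical storage key; any stable name would do.
const STORAGE_KEY = "preferredCameraId";

// Builds getUserMedia() constraints from a previously cached deviceId.
// With `exact`, getUserMedia() rejects (OverconstrainedError) if that
// camera is no longer available, so callers may want to retry with
// { video: true } in that case.
function cameraConstraints(storage) {
  const cachedId = storage.getItem(STORAGE_KEY);
  return cachedId
    ? { video: { deviceId: { exact: cachedId } } }
    : { video: true };
}

// Remembers which camera a successfully opened stream is using.
function rememberCamera(storage, stream) {
  const [track] = stream.getVideoTracks();
  const deviceId = track?.getSettings().deviceId;
  if (deviceId) storage.setItem(STORAGE_KEY, deviceId);
}
```

In a page, this might be used as `navigator.mediaDevices.getUserMedia(cameraConstraints(localStorage))`, followed by `rememberCamera(localStorage, stream)` once the promise resolves.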
The problem would exist only for those who persist permission.
It is impossible to properly support the use case (promptless experience for users who trust the application) with gUM-before-eD.
I can't speak for Zoom developers or read their minds, so I don't know if this problem in Firefox is a high priority for them. |
It's not "most video conferencing", because they have lobbies so it's quite hard to get into a meeting without microphone (so they can implement "are you talking?") and camera (for self-view comb-check in the lobby). Zoom should be commended as an outlier for letting users in without accessing camera. This is great for users of one-time permission. I hope more websites adopt this model going forward. But it will be imperative to solve problems for all users of those sites. The spec is trying to not give preferential treatment to one set of users. The goal of persistent permission is to avoid prompts, not enable special device selection use flows that only work for half the users of one browser. |
Applications shouldn't call gUM if their users' preference is to keep the microphone and/or camera off. Applications should be able to implement this user journey without having to choose between a broken UI or violating users' privacy settings. gUM-before-eD makes this impossible.
That's up to Web site authors. For years Web sites have been able to provide a smoother user experience to users who trust the site. We shouldn't remove this choice just because we want to force gUM-before-eD.
Users with persistent permission on Chrome can choose the correct device because device information is available ahead of gUM for them.
You are describing the use case of users that have not given persistent permission (a prompt from gUM). The experience is less smooth for them, and that is expected. This use case is not broken by gum-before-eD.
The problem doesn't exist in practice for users without persistent permissions since they are prompted whenever a device is going to be opened.
Calling gUM when the user has told the application not to call gUM (i.e., keep the camera/mic off) would be a violation of the user's privacy settings. gUM-before-eD forces applications to do this if they want to show a correct UI for users with persistent permissions, or it forces applications to show an incorrect UI if they want to respect the user's preference. permission-before-eD allows applications to implement a correct UI without violating the user's privacy settings. |
It doesn't have to be "most" for the change to negatively affect many users.
I agree 100% that Zoom, or any other site, should be able to support the use case of not accessing the camera, including for users who trust the site with persistent permissions, without being forced to open the camera to provide a correct UI.
I am not sure the spec ever intended to forbid the use case of applications providing a smoother experience to users who persist permissions, but the fact is that the old spec allowed it and many sites provide it. Also, I don't think this discussion is productive anymore, since we've reached a point where we're just repeating the same arguments. |
Description
WebRTC is one of the most significant compatibility challenges on the modern web, and Mozilla's experience is that implementation differences in this area are a leading cause of breakage on top sites. The nature of the use cases means that in-the-wild problems are usually very difficult to reproduce, and therefore debug and fix reactively, so this area has even higher than normal dependence on good up-front interoperability.
The navigator.mediaDevices.enumerateDevices() API allows websites unprompted access to information about a user's cameras, microphones and speakers, which is a fingerprinting surface. The API is called by 7.3% of the web, compared with 0.2% for getUserMediaPromise, which suggests this API is used extensively for tracking.
A review by the Privacy Interest Group (PING) in 2020 tightened the spec to only reveal absence of camera or microphone to all sites, and to require active camera and microphone access (not just permission) for anything else.
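As a rough illustration of the capture-first ordering that the tightened spec calls for, the sketch below requests camera and microphone access and only then enumerates devices. The function name `listDevicesAfterCapture` and the injected `mediaDevices` parameter are assumptions made for this example; in a page the argument would be `navigator.mediaDevices`.

```javascript
// Sketch of the capture-first ordering: getUserMedia() first, then
// enumerateDevices() while the capture is still active.
// `mediaDevices` is navigator.mediaDevices in a page; it is a parameter
// here so the sketch can be exercised with a stand-in object.
async function listDevicesAfterCapture(mediaDevices) {
  // May prompt the user; rejects if access is denied.
  const stream = await mediaDevices.getUserMedia({ audio: true, video: true });
  // Enumerate while the stream is live.
  const devices = await mediaDevices.enumerateDevices();
  // The caller owns the stream and should stop its tracks when done.
  return { stream, devices };
}
```

Returning the stream alongside the device list leaves it to the caller to decide how long capture stays active, for example for a self-view in a lobby.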
Tests
https://wpt.fyi/results/mediacapture-streams?q=enumeratedevices
Specification
https://w3c.github.io/mediacapture-main/ https://w3c.github.io/mediacapture-output/
Additional Signals
No response