<trevorfsmith> Hey, folks. We don't have wifi at the venue, yet, but we're working on it.
<trevorfsmith> We're doing a round of intros in the meantime.
<trevorfsmith> For some reason the WebEx chat isn't working.
<fernandojsg> trevorfsmith: I just wrote something and it seems to work on my side
<trevorfsmith> Hmmm. The third try worked. Strange.
<trevorfsmith> Ok, we have no wifi in the venue yet (😢) and the video stream isn't working. The Samsung site people are working on it, but it's a tough situation.
<fernandojsg> ok
<fernandojsg> I guess we will have audio at least right? :P
<trevorfsmith> We're setting up a WebEx audio channel.
<fernandojsg> cool, thanks!
<trevorfsmith> Ok, we're calling a 10 minute break while we attempt to get this sorted. Sorry, folks! We thought this was taken care of beforehand.
<cwilso> No audio on WebEx?
<fernandojsg> cwilso: it's working already, I can hear ada talking
<cwilso> Cool, thx
<dom> scribenick: johnpallett
<scribe> chair: trevorfsmith
<scribe> chair: cwilso
<scribe> chair: ada
<trevorfsmith> Ok, we're going to get started. The IRC channel is the place where notes will be taken and where you can send in questions. The Audio in WebEx should be working.
<trevorfsmith> Unfortunately, there's no video. 😭
<trevorfsmith> We'll link to the PRs and Issues in this channel so that remote folks can see what we're talking about.
nell: this issue is
long-standing; late 2018 bajones put forward a PR to expose
input information
... had some input about the problem space, now presenting an
alternate proposal
... framing of problem space
... when talking about motion controllers there are 5
things:
... (1) render it in the scene (2) get axis data from
controller (3) map the first two things together, i.e. render
like it is in the real world
... (4) render a legend around the motion controller to teach
users what to do
... (5) get events that can drive actions
... first proposal only addressed #1 and #2; thank you for
excellent feedback
bajones: going through proposal now... this is PR 462 and very recent PR 499
<dom> Custom interface for controller button/axis state (Variant A) #462
<dom> Gamepad based button/axis state (Variant B) #499
bajones: two variants of how to
expose more of the intrinsic state to developers. Started off
originally saying 'input is a messy thing, several APIs will
have conflicting ideas, maybe we can expose a single button'
but that had restricted utility and only worked for some use
cases.
... lots of helpful feedback saying we needed more
capabilities.
... e.g. need to get access to whole state of controller, e.g.
buttons and axis
... Also, what is the controller that the user is holding? This
is important so that rendering the virtual controller looks
like the actual controller being held.
... That's extra-important for tutorials where you need to
point at specific buttons and say what they do
... ended up with a unique ID that specifies which controller
type the user is holding
... e.g. "Oculus-Touch" or (per Alex) something a little more
mechanical such as USBVendor+ProductID - not sure yet if this
works for Android+Bluetooth and certain other platforms
... but otherwise, ideally it's not something that everyone has
to hardcode themselves.
... A few caveats - (1) no handedness included, since there's
already data on the controller that says this
... e.g. oculus touch controllers have handedness
... And (2) this is explicitly carved out for privacy-sensitive
situations, where the UA can report "unknown"
... e.g. the UA's privacy policy says you can't provide the
controller ID, and then the site needs to render something
generic
... we're trying to avoid enabling developers to filter out
users based on input type, i.e. "If your controller isn't a
Vive, you can't access this site"
... but it's probably a good idea to use this ID as a basis for
a lot more mapping... and ideally there's a community-driven,
open database that allows developers to fall back to a database
of controller IDs and types instead of hard-coding into the
site or the UA
... proposal breaks into two parts regarding how we expose axis
and button state
<NellWaliczek> The renderid proposal part is here: https://github.com/immersive-web/webxr/pull/479
bajones: They are labelled "Variant A" and "Variant B" in the PR
<dom> Add renderId to XRInputSource #479
bajones: tried previously to do
something based on the GamePad API, where we'd inject XR-style
gamepads into the gamepad array
... it was weird, and there wasn't a proper mapping, and
because of that there would be cases where developers would
improperly mask out gamepads because they weren't identified
properly
... so instead we're inventing a new thing that looks kind of
like GamePad but is a little more specifically structured
towards what we're trying to do directly
... [in PR see interface XRTrackedControllerState and
XRTrackedController for more details]
... this is a more attractive option because it's
purpose-built, it's clearly XR-centric - makes types of
controllers easier, e.g. triggers, touchpads, joysticks
<dom> [reviewing IDL at https://github.com/immersive-web/webxr/pull/462/files#diff-6ea1f8ee087a12d7d770e854f7dbadb7R428]
bajones: also allows the group to extend if we get new requirements
bajones: in
XRTrackedControllerState there is also a name string; it's a
localized string so that it can work across languages if the
developer puts it on-screen
... we don't love the interface name. "Input" is overloaded,
also can become XRInputInputtyInput if we keep using it... but
in the process of talking through the proposal we reviewed the
GamePad API, it's already shipping in browsers and the language
already exists.
<dom> [reviewing https://github.com/immersive-web/webxr/pull/499/files#diff-6ea1f8ee087a12d7d770e854f7dbadb7R416]
bajones: When we reviewed the Gamepad API structure and the GamepadButton API, we found things like 'connected', which we'd ignore; timestamps, which we don't need; and 'mapping', which could be reused to communicate that we're using an XR standard
<dom> [reviewing mapping at https://github.com/immersive-web/webxr/pull/499/files#diff-6ea1f8ee087a12d7d770e854f7dbadb7R344]
bajones: When using the mapping
value xr-standard, buttons[0] would be primary trigger,
buttons[1] is always touchpad/joystick click, etc. (see
PR)
... so we could take all the common elements that we expose and
then, rather than putting the gamepad into the gamepad array
and doing a weird mapping, instead we'd put a gamepad source
into the XR input sources. It wouldn't show up in the
traditional gamepad array and that'd only be used for
traditional game console-style controllers.
... upside of this approach - more compatible with gamepads.
Downside - more documentation required for how mappings should
be interpreted.
<Zakim> ada, you wanted to show how to add a message
bajones: we're going to have to do a lot of that work in either case, so it might be to our benefit to rely on work already done w.r.t. the gamepad API
nell: two related thoughts.
First, I'm inclined to not hold up 'finishing' WebXR to create
a brand new design for something that solves many of the
gamepad problems.
... one opportunity with this approach is that we can work on
improving gamepad, and decouples the spec work for XR from
detailed input spec work
... tough to re-invent gamepad without all the gamepad people
involved (though bajones is also involved with gamepad
API)
... so that's the second thought, it'd be ideal if we can
separate XR from detailed input work
... that's going on already with gamepad
bajones: after nell gives her section, let's do a strawpoll with people in the room, on IRC on whether the custom gamepad idea is good or not
<klausw> cwilso: wait for Nell
nell: [going through some more
details on spec design] - recapped 5 ideas above on gamepad
requirements
... 1. draw it. 2. when are buttons pressed, axis, 3. visualize
data from #2, 4. legend for how to use controller, 5.
eventing
... focusing on data sources, mapping back to visuals, and
labelling for a minute (items #2-#4)
... web has rich tradition of things starting as open source
projects
... when thinking about how to animate models, a user agent is
not going to deliver a 3D model file for the controller - not
their business, and they're big binaries, and it's not clear
how to rig the model
... so what about a schema where for a given gamepad ID, what
if we could define a mapping that broke down expectations for
that model
... then we put those expectations in an open source public
repo next to a 3D model showing how to rig that model based on
those expectations
[webex fails, then recovers. apologies.]
nell: ideally then vendors can
contribute their own models and schemas for how the models
should be represented.
... now talking through schema and how it should be represented
(this is brand new)
... [referencing branch at: webxr repo -> branches ->
gamepad mapping schema] - this isn't a diff, they are brand new
files
<dom> gamepad-mapping-schema branch
nell: referencing:
https://github.com/immersive-web/webxr/tree/gamepad-mapping-schema
... talking through schema in the context of the schema (JSON)
files for a few controllers
<dom> oculus touch description
nell: thanks dom
... example:
https://github.com/immersive-web/webxr/blob/gamepad-mapping-schema/gamepad-descriptions/Oculus-Touch.json
... id tells you what the controller is
... some of the names aren't ideal, ignore them for now.
<dom> [reviewing https://github.com/immersive-web/webxr/blob/gamepad-mapping-schema/gamepad-descriptions/Oculus-Touch.json#L71]
nell: looking at the physical
building blocks of motion controllers: thumb sticks, D-pads,
buttons (analog or digital, touch or not), and ...
... buttons: analog, digital, touch or not... but they're
pressed, or not.
... thumbsticks: have a direction, and a pressed state
... touchpads provide X/Y data, which is where your finger is
on the pad
<dom> [reviewing https://github.com/immersive-web/webxr/blob/gamepad-mapping-schema/gamepad-descriptions/045E-065D.json#L79]
nell: touchpads might also have
edge or center-press buttons; almost D-pad like where you can
put your finger on the diagonal
... note that that has rendering consequences, but this file
breaks apart rendering (animation) from data sources
... note that schema file follows gltf conventions - could
switch to ids later to match
... then in each element, 'gamepadAxisIndex' is how you map
back to the gamepad mapping
... where this gets interesting is where we look at
responses section
<dom> [reviewing https://github.com/immersive-web/webxr/blob/gamepad-mapping-schema/gamepad-descriptions/045E-065D.json#L100]
nell: it's another array of
chunks with a set of response types. These are the pieces of
the model that contribute data to decide how the model is
deformed
... for example, if you touch the thumbstick it deforms the
thumbstick parts that move, relative to the parts that don't;
the value of each x/y axis and button value, to create a
combined transform that should be applied to the
thumbstick.
... the maker of the model file should put nodes that have no
model data, just used for transforms.
... then the schema defines which node in the tree should have
the transform applied to it
... there isn't a dedicated thumbstick press item since you
need to touch it to press it
... touchpad 'touch' moves a dot indicating where it'd have to
be on the model.
... extents show bounds. Don't need to know anything about the
model file itself, schema maps to the file directly.
<dom> [reviewing https://github.com/immersive-web/webxr/blob/gamepad-mapping-schema/gamepad-descriptions/045E-065D.json#L17]
nell: wanted to make sure we were
as flexible as possible
... in our example, data source 3 is the touchpad - the
'labelTransformId' is what you use to indicate the safe place
on the model for a label
... that's the components that say what makes up the
visualization of the controller
... if you look at the oculus-Touch file
<dom> [reviewing https://github.com/immersive-web/webxr/blob/gamepad-mapping-schema/gamepad-descriptions/Oculus-Touch.json#L5]
nell: there's also a 'hands'
section which gives you a connection between the visualization
of hands with the controller IDs and how they should be
connected
... note that the left and right hands aren't the same in this
case since the controller isn't symmetrical
... looking at primaryAxes, there's typically a default
suggestion from manufacturers as to what the default button
should be. This is still hand-wavy
... in general this approach could work with either of the
proposed Variant A or Variant B
<Leonard> +q
nell: there's also one more top-level thing not in this example, which is user agent overrides. This section would let developers avoid manually hacking in workarounds keyed to particular names; it gives an escape hatch around bugs in the browser.
<dom> Schema explanation
bajones: So this is a lot. The
intent is to NOT require the use of XR controllers in the
browser, rather it's a way to provide a more robust,
professional experience to web applications that want it.
... if you're just building a video player you can probably
ignore all of this, just render a remote control in the user's
hand.
... you can build something consistent and reusable without
diving into the deep end.
... but we know there are developers out there that want to
provide the best experience for users. This is an
attempt to provide that without being a nightmare of encoding
for everyone involved.
nell: related to that, if it's in an open source repo then proposed models could be ingested and tested online so that nodes are hooked up correctly
josh@mozilla: to clarify, this isn't a spec, this is something you'd bundle into your application, right? (nell: yes)
josh: I like the idea of reusing
old APIs. The problem I see is the advantage/disadvantage is
that it's tightly coupled to the gamepad API and associated
bugs... generally the implementations are buggy and
inconsistent.
... concerned about coupling to something outside our
control
nell: me too. As part of the
schema, X-axis could have a left and right value assigned to
it, so that the file says what range to use instead of the
gamepad API to override bugs
... but if a UA has a failure to comply then that's where the
UA-override section would apply.
... one thing that's flaky with gamepad is the getGamepads()
call itself. That wouldn't happen here since gamepads are being
returned from XR input sources
bajones: and, as an implementer
of gamepad in Chrome.... I'm personally deeply sorry.
... and please log bugs!
... for this spec, though, we can say 'we're using the API in
this way, and when we do these restrictions apply'
... but there will always be bugs, we should fix them when they
come up. In general, though, I agree with your concern and I
believe we have mechanisms to address issues.
klaus_@_google: this proposal defines physical properties of the input device - question, it's common in some cases for buttons to be combined into quadrants
klaus: is the idea here that the community could help define nuances and different use cases and mappings?
nell: there's a lot of opportunity for improved API shape; community overriding buttons might be valid, but we're not sure yet, it's early. We can iterate on schema later. Not trying to pack all the solutions into the first iteration.
klaus: Do we want mapping to allow updates later?
bajones: it's not novel to have
an action mapping approach; from my perspective it feels like a
spec rabbit hole where we're trying to ship something.
... doesn't personally feel like it's the right approach for v1
of the API to have complex mappings. Could in theory have JS
libraries that provide mapping capabilities in the future;
maybe something at the UA level for mappings in the
future.
... e.g. VR API can give similar functionality by scheme. We'd
have opportunities to make those more nuanced at the UA level
without doing a web API
... but may have action mapping API later
<Zakim> ada, you wanted to ask about l3/r3 style buttons
ada_@_samsung: wanted feedback about touch on joystick -- could be useful for controllers for L3/R3, was trying to see if that was something useful
<Zakim> johnpallett, you wanted to ask whether specific types of inputs are necessary or whether generalized axis/input data is enough
<ada> JohnPallet, Google: Were alternatives considered for the schema implementation?
johnpallett: specifically, a more generalized axis/button approach that didn't rely on specific D-pad, Touchpad, Thumbstick types?
nell: one point of clarification,
hand input wouldn't be part of this, e.g. a knuckles controller
that has buttons and joint controllers would have multiple
inputs
... we can rev the schema to support new types of inputs - this
is a library in the proposal - e.g. some new 'D-pad'
... can support fallbacks in the future if a UA is behind the
times. For right now, we did an inventory of motion controller
hardware that I'm aware of to make sure all known components
are covered.
johnpallett: clarification: but do you need defined types? can you just use axis/buttons or is there a reason for 'd-pad'?
nell: not sure yet. Was thinking about things that contribute to motion. It's possible we could encapsulate all data source types into a single model for motion and inputs
<klausw> ^^^ clarification for 10:58 "combined into quadrants" - a trackpad may be treated as four buttons by splitting it into quadrants, for controllers such as the Vive wands which don't have many buttons. Question was if this is up to app/library, or if we want to expose a mapping layer at a lower level.
nell: this current revision was a way of thinking through ways of combining ideas... will continue to explore
bajones: also, this isn't tied to a particular file format
nell: used the term 'node' - we need to understand the dependency
leonard: likes the idea of not
holding back XR 'version 1' to get this all correct
... curious about capturing motion of the entire controller, or
detailed press like a Wacom tablet, will these be
considered?
bajones: actual motion (e.g.
velocity, acceleration) - because we're not trying to add onto
the gamepad API (unlike previous, not good versions) - any sort
of pose or velocity information would come from the input
source and space attached to that.
... we have an xrPose and there's a transform, and that's where
we could slot in acceleration and velocity. We haven't right
now because it's not clear whether can get a consistent signal
from the native APIs.
... but we have a way to scale up to that if necessary.
... larger-canvas tracking might require more axes in the
gamepad API, not sure about ways of doing multi-touch input. No
great answer at this time, may require something other than the
gamepad API
leonard: not clear on future-proofing, though?
bajones: could extend default mappings that we have for gamepad API. Likely evolution: Start with a gamepad approach, then move to an action-based approach.
<Zakim> alexturn, you wanted to discuss semantic clustering of the button mappings vs. explicit trigger/touchpad/joystick/grip mappings
alexturner_@_microsoft: This feels like layer A / layer B where things exist at the app layer, and there's a question about how things could get built on.
alexturner: Need to figure out
how much of this needs to appear in the standard itself vs. an
external thing.
... proposal may be mixing two separable things, (1) do we get
shot down if we do a new API? and (2) do we have explicit
grip/trigger/other types of inputs? Are these two design
principles that could be separated?
... could be some benefits to relying on a different approach
to semantic clustering
... and, what if someone uses button[0] and assumes it's
reliable in a particular way, and then ignore voice-based or
hand-based input
... could be subtle accidents as well - feeding data from one
type of input to another type by accident (e.g. default at 0,
vs not) - wonder if this next layer of abstractions will
help?
bajones: going with gamepads
forces us into semantic clustering, you get arrays and indices
and that's it.
... that's why we want this more detailed approach of
mapping
... this isn't the final say of how we expose input through the
API. It's intended to be 'most impact, least effort' to ship
API and then we'll have to layer more data on top of the API
later.
alexturner: I think we can still separate them. Think of gamepad inputs, if you had same degree of explicitness you'd have more empty slots but you'd have extended IDs and other mechanisms for dealing with explicit types.
cwilso: observation, if we are
going to rely on gamepad it seems like it'd behoove us to take
a stronger hand in developing the gamepad API
... right now that connection is <fanfare> bajones who is
an editor there
nell: At TPAC the chairs were asked whether the gamepad API should become part of XR
cwilso: gaming on the web is the other use case outside of XR. Gaming on the web doesn't have a group working on it.
nell: my answer at TPAC was -
let's prove we can ship one spec, then consider a 2nd
one.
... but yes we definitely should invest in gamepad API over
time
bajones: Let's do a strawpoll,
this is non-binding
... do you have an opinion on whether we should go with (1)
custom button/axis solution, or (2) ride on gamepad
coattails?
nell: (note that question about the schema is orthogonal, ignore that for now)
<ada> strawpoll imminent: If you are not in the room +1 for should we do a schema or +2 for gamepad
<albertoelias> +2
bajones: Type +1 for custom solution, +2 for gamepad solution
<lgombos> +2
room: 1 vote for custom solution (#1), many many votes for gamepad solution (#2)
<albertoelias> I think we should aim for the simplest route to get the spec out there, but that gives developers access to all the underlying details controllers provide. We can then aim for nicer APIs also looking at what kinds of things libraries do
(thank you IRC participants for voting, overwhelmingly for #2)
<ada> Ada would like to change her vote for the gamepad solution, so it's 100% support
thanks alberto, read your note to the room
<dom> Does originOffset behave differently for identity reference space? #477
ada: https://github.com/immersive-web/webxr/issues/477 Does originOffset behave differently for identity reference space? #477
bajones: there's ambiguity in the
text surrounding the originOffset for reference spaces
... it's designed to let developers say what origin all poses
should be relative to. Useful for touch scrolling on inline
videos, etc.
... but there's ambiguity in how it's described which could be
one of two things. Purpose here is to pick which one is
best.
... by default, origin of virtual space is on floor in center
of room
... first way of thinking about origin offsets is that you take
origin of physical room and shifted where it appears in virtual
space, in the image it's Z=-3
... and X=1. So the whole room moves by X+1 and Z-3
... this was the intent but it's easily interpreted as option
2, where I apply the same offset but it applies it to the
origin of the VIRTUAL world instead
... so it's either offsetting the origin of the real world, or
the virtual world. Either is valid.
<alexturn> +q to discuss if it's well-formed to reason about the relationship between two XRSpaces if you offset both of them
<Zakim> alexturn, you wanted to discuss if it's well-formed to reason about the relationship between two XRSpaces if you offset both of them
bajones: strawpoll, should origin
offsets be applied to the tracking origin (i.e. conceptually to
the physical world) - A
... or should it apply to the virtual scene's origin - B
nick_@_8thwall: it's common to see this in community apps where the user wants to position the camera in the scene they're creating
nick: so if you're starting at
t=0 you want to start at a position in the scene.
... that's consistent with A. What I'm stuck on with B is
directionality?
<Leonard> +q
bajones: order of operations is spec'd out in WebXR and should be unambiguous but...
<fernandojsg> Hirokazu_: could you please mute yourself on webex? I'm hearing some background noise coming from your mic
bajones: would expect room to
rotate around new origin in A
... but in B... would need to think about this a bit more. This
might be an argument for A?
alexturner: one argument for
option B, what we're offsetting is moving something in its
natural coordinate system. So it's fairly unambiguous to
describe coordinates as poses in the natural space, and then
you're just specifying the origin.
... I find it easier to offset spaces relative to each other;
this means moving the virtual space relative to a real-world
origin.
... this means I can multiply in the offset relative to poses
that I get back. Using B it's easy to address multiple offsets,
but with A it's harder.
bajones: the math is unambiguous, so it's really a question about which makes more sense.
leonard: does this apply to AR?
bajones: both in VR and AR you'd be adjusting virtual content and adjusting to physical world in some capacity, which we choose doesn't affect the math but might change usage and developer understanding
<alexturn> +q for alexis to ask a question
bajones: note that everyone's going to experiment anyway :)
brendan_@_apple: we flipped the trackpad axis to move towards a more direct metaphor for input
brendan: where people had a mouse
wheel rather than directly manipulating what user was working
on
... since we don't have a point of view in the XR spec, it
means it probably doesn't matter as much
<Zakim> johnpallett, you wanted to clarify understanding of Alex's point
josh@MOZ: Feels like the scrolling problem which dates to the late 70s so they just picked one
alexturner: Option B is about having a natural origin, and then you're moving virtual origins relative to it
<Zakim> alexturn, you wanted to discuss alexis to ask a question
johnpallett: is working on clarifying this for the IRC. :)
bajones: if you think about this
as window positioning - it's totally natural to position a
single window relative to my natural space
... and that matches AR as well
... but if you're fully immersed in the window, though, it
starts to feel weird and backwards.
... so either will feel wrong in some circumstances
... So, strawpoll time!
... IRC strawpoll - issue 477
<ada> vote with +a or +b
bajones: If you like Interpretation in issue 477 please put +a
johnpallett: (commentary: this is offsetting the physical, natural origin relative to virtual space)
bajones: If you like Interpretation B please put +b
johnpallett: 7 in the room vote
for A
... 12+ people in room vote for B
<fernandojsg> dom: could you please mute the people on webex? someone is snoring and I can't hear you correctly -_-
bajones: OK - there's no perfect solution, going with B for now. Thanks everyone.
<Manishearth> scribe: Manish Goregaokar
<Manishearth> scribenick: Manishearth
<josh_marinacci> blerh
<scribe> chair: trevorfsmith
<scribe> chair: cwilso
<scribe> chair: ada
<dom> Allow session creation to be blocked on required features #423
<dom> Define how to request features which require user consent #424
<dom> Added session feature handling to the explainer. #433
NellWaliczek: first a recap; this has been dormant for a bit
bajones: so for a little while
there was this sense that we really wanted to have a way to do
permissions within the api that tried to alleviate permissions
dialog fatigue for users
... esp in AR use cases if you want to share environment
information that's very privacy sensitive
... you need user consent, especially for cameras, environment
objects, etc
... we were envisioning all these apis where you need to be
able to request consent for various things but we don't want
them to show up as modals
... there was some r&d saying we could ask for a bundle of
perms at session creation time up front, which can show up as a
single modal
... we've gone back and forth on that, we haven't committed
yet, bc this would be quite a different pattern from what
happens on the web today
... it's unlikely that we as a tech focused WG will come up
with a solution for permissions fatigue on a whim
... so it feels weird to just say we wish to inject a new
security model in our api and run with it
... there was a very direct conversation with NellWaliczek
about what we actually need this for today (e.g. camera
permissions)
... we prob shouldn't design an api we don't have many uses for
... yet
... this has led us to a point where we're looking for the
required and desired features list, so it's kind of on hold on
the editor's mind
NellWaliczek: the required and
desired list ask comes from the permissions side but also
because we don't want to spin up sessions when we will shut
them down immediately
... things we're thinking of are: camera access,
geolocation/orientation, spacial tracking stuff
<johnpallett> +q to ask whether the privacy & security explainer has been used as an input to the session creation conversationon
NellWaliczek: our goal is not to
come up with a design, but to ask if the first version of webxr
should attempt to unify such a model
... want to make sure we're not overengineering a solution; see
if we can collab with privacy group (etc)
... i'd like to open the floor for thoughts on this
cabanier: we talked about this
before when we thought y'all were going to look into the
permissions api
... initially thought we could just inherit from the
permissions of the origin -- why can't we do that?
NellWaliczek: we may have looked into it at the time but i can't recall if we concluded on something. but also we need to address the other side of avoiding session creation when perms aren't available
<max> Manishearth - you can use ... if the same person keeps talking while scribing
max: thanks
oops
NellWaliczek: the only thing today that's qualifying is the spatial tracking thing
<dom> -> https://wicg.github.io/permissions-request/#api Permission Request API (split off from https://w3c.github.io/permissions/)
NellWaliczek: we can have it be an additive thing for now
<trevorfsmith> Nell and Rik are discussing that there are aspects of checking that hw can support a feature that are different than checking whether the user gives permissions.
NellWaliczek: some things may be permissions gates, but some may also be things the hardware doesn't support
<Zakim> johnpallett, you wanted to ask whether the privacy & security explainer has been used as an input to the session creation conversationon
johnpallett: so i don't have
answers. but i think part of the discussion is on inputs and
that's also been happening on the privacy and security
repo
... there are two conversations here, one is the challenges
with the permissions structure
... the other thing is a partial list of what the UA may wish
to ask user consent for
<dom> Immersive Web Privacy and Security
johnpallett: happy to have the discussion but a lot of this info already exists
<dom> Explainer += [Cameras, Permissions] #15
NellWaliczek: should we recap this, or table for now?
johnpallett: can recap what i did at tpac
NellWaliczek: let's strike for now, talk about it on the next WG call and sync with johnpallett later
cwilso: i think we should have this conversation now
bajones: to give johnpallett a bit of time to prep let's switch to talking about hit testing now
NellWaliczek: i have a couple PRs in the queue
<johnpallett> johnpallett is ready to present the privacy slides from TPAC
NellWaliczek: first is about viewer space, building on the work that was done to unify pose retrieval behavior
<dom> Add XRSession.viewerSpace and tidy XRSpace explanations #491
<johnpallett> (aimed at chairs) :)
NellWaliczek: #491, #492,
#493
... #491 adds a viewer space object so you can relate viewer to
other xr spaces in the world without mathematics
<dom> Restructures input-explainer.md to improve clarity and prep for hit-testing proposal #492
NellWaliczek: #492 should be fairly noncontroversial, not adding anything just refactoring. i realized that when getting into hit testing i wasn't sure where to add things so i restructured it to get a more logical flow
<dom> Adding an explainer for real-world hit testing #493
NellWaliczek: will merge both
unless someone complains by tomorrow
... moving to #493
... big shout out to max and blair and (?) and alex who did
some important prep work on the hittesting repo that was open
on the CG
... based on a bunch of the investigations and explorations
... i spent a bunch of time thinking about how to feather in
the requirements for real world hit testing that would work
across all platforms and hardware
... some points: #1 many uas are structured such that the
tracking stuff runs in a separate process from the user's
tab
... we have a choice where we can take that behavior and make
it an async request, but this makes it near impossible to
render a stable cursor
... the alternate is registering for hit test events from a
particular source, which is useful for cursors
... with async you can have results packaged with the xrframe
object
... if you look at how xr input sources are defined: they're
created and destroyed during a tap, which isn't great if you're
registering async handlers
... we want to do this in a way that avoids undesirable perf
hits on folks who don't wish to use this for hittesting
...
https://github.com/immersive-web/webxr/blob/eeb899d38657a6c2bded097566dc41912c2bb8da/hit-testing-explainer.md#requesting-a-hit-test-source
... here's an example
... i've added an alternate to address some concerns about
delay
...
https://github.com/immersive-web/webxr/blob/eeb899d38657a6c2bded097566dc41912c2bb8da/hit-testing-explainer.md#automatic-hit-test-source-creation
... the dev that actually wants to opt in to hit testing can do
so by providing a hit test source
... the third use case is when you want to just do a single hit
test
...
https://github.com/immersive-web/webxr/blob/eeb899d38657a6c2bded097566dc41912c2bb8da/hit-testing-explainer.md#hit-test-results
... goal for today is to make y'all familiar with this
... it's our first "real AR" thing
... on top of this is what we can use to build anchors
... thinking through it: if we want to place an anchor that
position is also tied to a frame
klausw: quick comment, we
probably need to return the hit test result relative to the
current frame (not the frame it was requested on)
... should we support both options?
NellWaliczek: yeah we can support both too
<alexturn> +q to talk about which frame
klausw: as long as we don't require impls to support both bc it may not be easy
<Zakim> klausw, you wanted to say distinguishing spec-level permissions from UA dialogs and recommendations? and to say I think impl restrictions may require hit test results for the frame
max: just wanted to say i love
this, it's great, it's addressing things i haven't thought of
before
... based on my understanding of the general use cases for
async you generally want to just place an object
somewhere
... i think ergonomically this is great, works well with thread
boundaries, great job!
RafaelCintron: correct me wrong
but the current way XRFrame is specced it's only valid within
the rAF of the session
... so if we use old frames they may not work when the user
calls functions on them
NellWaliczek: yeah, XRFrames are
currently short lived, but we could pin XRFrames till their
promises go away
... not sure how worried we should be about this
bajones: (this is mostly nell's
work, but): one thing i wanted to point out here is in the
ideal world you want to ask every single frame to give
instantaneous hit results immediately
... but it's not feasible
... the tradeoff is: do i want to know instantaneously what the
hit test is , i have to schedule that ahead of time so i get
results on a future frame
... the other alternative is "i want to know a one-time hit
from this ray, which comes back to me whenever". in that
scenario it's questionable how much you care about the exact
data of the frame it happened on
... in most cases it doesn't matter *exactly* where the hit
occurred. if the accuracy *is* critical we can do the hit test
source route which gets us sync results
... i think this api design gives us a balance between "i need
to know exactly" (slightly more latent), vs "i need to know
basically" (faster, but inaccurate)
RafaelCintron: the fact remains we need to store around all the XRFrame data
NellWaliczek: one alternative
design is to put the request on the session object with a
promise
... feedback im looking for is not whether or not this is a
problem : if this PR is a good start at a high level
... would hate to hold up the whole PR on individual issues we
can file
<Zakim> johnpallett, you wanted to ask nell to cover virtual scenario
johnpallett: could you cover the virtual object stuff, particularly how the 3d engine would hook into what the intent of this is
NellWaliczek: when i originally
sketched out the sample code i forgot we don't have
occlusion
... so i wrote it so that you request virtual hit test wrt your
pose and scene graph, and looked at which object was closer,
and that's what you got
... it's problematic since real world objects can get in the
way of the hit test
... sample code i put in here is that if you get a virtual hit
test result that will always win, since you may have
accidentally put a real world object in the way
... and app devs don't have enough info to prevent that
bajones: one quick footnote: the TLDR of virtual hit testing is "engine code goes here", the spec doesn't actually give more than examples
johnpallett: one possible answer is to somehow extend this with things which are rendered with xr and things that are rendered with .. other things, but there are privacy concerns
<Zakim> alexturn, you wanted to talk about which frame
NellWaliczek: (this is just sample code, not part of the spec)
alexturn: in terms of timing it seems like there are three times here, time a (button press), time b (frame hit test query), time c (hit test is answered, in a different frame)
<johnpallett> johnpallett figured out that the virtual object hit test section of the PR was purely sample code and not part of the spec. Sorry it took me a bit to get there.
alexturn: the middle frame where you happened to make the request but didn't have the answer seems a bit arbitrary, it's probably not meaningful so we shouldn't focus too much on it
cabanier: it seems like you could have hundreds of hit tests at the same time, that let you scan the entire room.
NellWaliczek: they have to come
from a source, can't have an offset on them. last time we
decided we were not worried about this
... the idea was that you have an AR light vs full mode and the
full mode just lets you hit test
cabanier: but such a session would need to ask for perms, yes
NellWaliczek: yes, this *must* be in an ar immersive session
<blair> we talked about AR Lite as being controlled by the UA, so it would be compatible with this
NellWaliczek: we can have lower-perm apis that let devs simply specify floor/etc info in ways that won't have leak issues
<blair> Ummmm ... what?
ada: you were saying you can only use this in immersive ar, what about inline-ar?
NellWaliczek: no such thing :)
<blair> what do you mean there is no lite thing?
<blair> oh, there's no inline, sorry. right.
inline-ar
we have inline-vr
<blair> audio in webex is crap, can't hear most of what's being said
max: i think we can just go ahead
with what you have now and file issues. this async version will
interact with anchors, so getting this out of the way so that
we can answer questions about anchors is good
... this can be a convo we revisit as we move further
(sounds of agreement across the room)
<blair> +1
NellWaliczek: straw poll on the PRs?
<adrian> test
<max> +1
<dom> +1
<bajones> +1
+1
<dkrowe> +1
<johnpallett> +1
<blair> I'm unable to understand what most people are saying ... lol
<adrian> +1
<cwilso> +1
<dulce303> +1
<blair> +1
<Kip> +1.1
<alexis_menard> +1
<daoshengmu> +1
<klausw> +1
<dom> +♥
<trevorfsmith> +1
<max> @blair :(
<bertf> +1
<blair> yes
<josh_marinacci> +1
<jungkees> +1
<blair> I vote in favor of all the mubling
<blair> mumbling
<trevorfsmith> +🌸
cwilso: i declare this passed, go forth and merge
<blair> good job, nell
NellWaliczek: congratulations! we now actually are XR!
<trevorfsmith> WOOT
(loud clapping and whooping across the room)
cwilso: 10min break
... we have rewrangled the schedule
... not going to cover privacy (which we added) bc john wants
us to look at the repo
bajones: there are several different topics we have clustered around dealing with different FOV forms
<dom> Remove ProjectionMatrix in favor of FOV values? #461
bajones: starting with #461
... we have two ways of doing FOV
... (snip)
... we also have largely for historical reasons a viewmatrix,
a thing you can feed directly into webgl
... and there's also the projection matrix, a 16 element
float32 array for feeding directly into webgl, which gives you
all of the FOV values for a given view encoded into the
format
the matrix is quite flexible
scribe: this can include many
things like scale, view, etc
... mostly not used. but we have heard in the past, certain
devices (hololens) broke the entire spec bc you can do
arbitrary projection matrices
... we want to undo those changes now, maybe?
... is there a use for continuing to provide matrices but can
we instead break this down into fov angles? some apis do this
(oculus)
... impetus: i've heard anecdotally that people are using the
projection matrix for fov values. i'm not sure if there's
hardware out there that supports arbitrary projection matrices
over FOV values
... would like to ask if we can/should remove the matrix in
favor of FOV or just keep it
cabanier: i've talked with the
openxr folks who are running into the same issue, and we do
need the 16-value matrix
... they say if you go and do away with the matrix the result
will look blurry
bajones: okay, that's an
important datapoint -- if we can identify at least one piece of
hardware with this limitation we must handle it
... it's not unreasonable similar issues will crop up on other
hardware
... not sure how we can address cases where people decompose
these matrices into the four values
... and now you've broken this code on some hardware
Klaus_Weidner_google: the way i understand it is that each view can have different directions for their transforms which can already cause issues for a threejs camera
etc
Klaus_Weidner_google: the projection matrix in ML is just a normal projection matrix
<Zakim> Manishearth, you wanted to give direct access to nullable fov values
Manishearth: perhaps we should just give them access to FOV values which are nullable so they're forced to think about it
bajones: folks may just make assumptions based on hardware they have and end up with lots of nulls
<alexturn> +q to talk about shader assumptions
bajones: kernel of truth: we should find out what the data folks are trying to actually get out is, and perhaps provide that directly
<Zakim> dkrowe, you wanted to see if we can provide both a matrix and additional data with no assumptions
dkrowe: is there also some value
in giving values like FOV with the understanding that folks
will still use projection matrix
... i.e. will people construct projection matrix from the
FOV
<kaiping> U
bajones: even when there are red
flags all over the place saying use THIS not THAT people will
still do it wrong from looking in the devtools or inspector or
something and it gets copied around
... we can't necessarily save people from themselves ;)
NellWaliczek: that said we try to
create pits of success, not pits of failure
... carefully providing bits of data like "the culling FOV" (not
the FOV) is one way to solve it
bajones: we can also do things like providing a culling frustum
<Zakim> alexturn, you wanted to talk about shader assumptions
<leonard> +q
alexturn: what do we do for the
actual rendering projection, how do we communicate this to the
user
... do apps assume that the two views are parallel, etc
... for the most part i've found that the projection
assumptions are less about your engine ingesting the
matrix
... but your shaders may not expect weird projection
matrices,
... it may not be an actual big deal, but perhaps we will see
it a lot, we should go for more compat
... the last part is: sometimes the way between this balance of
power vs success is a bit of a negotiation process, so if an
app is going to make such assumptions by default we provide a
"normal" projection matrix and engines can request the real
matrix
bajones: this is a good point. there's a difference between doing the wrong thing by default vs people stumbling upon the wrong thing; either way you're only providing the wrong info to a subset of users
<adrian> on the user end this is driven by copypasta
leonard: so this is a quick question: i know the matrices won't cause problems, but for simpler things will this interfere with or disable orthographic projections
bajones: i can't imagine what this kind of thing would be like
leonard: handheld or architectural situation?
bajones: might be out of scope,
the way our pipeline works has high affinity to this
... high affinity to projection views. have a hard time
envisioning it
... regardless, if such a thing were to happen, providing proj
matrices is probably the safer bet
<adrian> I'm afraid of the wrong results (correct on most headsets) becoming a snippet on stack overflow or something, but which is wrong on a lesser used platform (like say ML)
<adrian> so I would say projection matrices should be the only input, leaving decomposition to the engine/user
<adrian> for which there would be copypasta code for sure
<Zakim> Manishearth, you wanted to give an example use case
<Zakim> NellWaliczek, you wanted to ask about inline fovs
Manishearth: real world use case: porting an existing application that cares about these values. correct solution is to rewrite, but that's cumbersome
NellWaliczek: i'd hate for us to be in a situation where we encourage devs to always use projection in immersive but not in inline
<Zakim> klaus_google, you wanted to say quick handwaving of what is and isn't covered by FOV angles for matrices
klaus_google: to the best of my understanding if you express angles/etc as these matrices you can represent any rectangular thing as these. as long as the screen is rectangular you should be good with just angles
NellWaliczek: but that leads to the q: as that scrolls up the page (in inline), how do we give devs control over which way they want it to work
<Kip> Assumption: No curved displays in inline? https://images.techhive.com/images/article/2013/10/lg_g_flex_02-100066355-orig.jpg
bajones: straw poll time!
<cwilso> +1 if you think we should stick with a projection matrix, -1 if you think we should explore some other structure
bajones: who in the room feels reasonably strongly that we should stick with a projection matrix, or who feels we should be investigating alternatives to the proj matrix for communicating that info
<klaus_google> to clarify, left/right/top/bottom angles (+near/far distance) should be able to express a projection matrix for a rectangular screen in an arbitrary orientation in space, including tilted HMD screens. Forward vector points in a perpendicular direction to the plane the screen is in.
<cabanier> +1
<ada> +1
<RafaelCintron> +1
<adrian> +1
<klaus_google> -1
<dkrowe> +1
alexturn: are we voting on the
exact format of delivery or the constraints we have?
... i.e. if we say we vote for a matrix are we voting for a
fully general matrix
<trevorfsmith> +1
bajones: we are voting on a highly documented matrix or a hopefully self documented structure
<art> +1
<leonard> +1
<jungkees> +1
<Kip> =1
-1
<Kip> +1
<alexis_menard> +1
<alexturn> -1
bajones: straw poll result: no
clear conclusion. prob means addl discussion
... we should jump around to more FOV topics
... moving to #272
... great many cases in XR where we have the projection terms
dictated to us
... hardware says "this is how you show it or it's wrong"
<dom> Default FoV for magic window canvases #272
bajones: i.e. i'm on my phone,
just looking at the screen, for inline content, there's nothing
dictating what the FOV should be
... there's really nothing to go off of unless you have fancy
tracking tech
... ideally devs should be able to give us feedback on what the
fov should be
... (a) what should the default fov be?
... (b) also give devs a way to specify what they want the fov
to be
... do we want to allow devs to specify horiz or vertical FOV,
sometimes they want one sometimes they want the other
<leonard> +q
bajones: what would be a sensible way to come to a reasonable default FOV so when you create a session .. something useful comes out even if you haven't set the numbers
(?): certainly within a mobile context you can figure out some of these numbers like skew wrt eyes
(unsure if i got that right, it was very fast)
bajones: there are some cases
where we know exactly where your eyes are wrt that canvas
... this does bud up into privacy issues
... bc with this info you can do gaze tracking, which
advertisers would just ADORE
... so we need to have some perms model before we expose this,
so we still need a reasonable default
klaus_google: if you're using a
phone the actual fov the screen extends over is (??)
... possibly just return a null proj matrix and let the app
figure out that it's arbitrary
bajones: would love to avoid
people having lots of if statements in their render
pipelines
... don't want this to get wires crossed with immersive
mode
<Zakim> Kip, you wanted to recommend allowing user override (eg, wider to reduce dizziness)
<Zakim> klaus_google, you wanted to say do we even need a default FOV? Return a null projection matrix to clarify it's arbitrary?
<klaus_google> ^^^ 15:05 clarification "screen extends over" - the actual angular extent of a phone screen at arm's length is very small, so you'd only see a small part of the scene when you use it as-is. Usually you'd want an arbitrary larger FOV for that scenario.
leonard: 0.4 - 0.5 is good in my experience
bajones: we could allow for crazy
nonuniform scaling but this can break assumptions
... i feel like we should allow for both and default to horiz
if we have to guess
<klaus_google> ^^^ 15:07 clarification for "0.4 - 0.5" - that's 0.4-0.5 times Math.PI. 0.5 times PI would be 90 degrees.
<adrian> why not just leave it up to the UA
<adrian> and make it choose something "reasonable"
<dkrowe> +q for changing FOV after inline session begins
Nick-8thWall: just a couple
observations from what we see: at 8thwall we encourage devs to
not just show the camera feed, so we ask them to expand the
FOV
... so we actually encourage devs to be aware of this issue and
handle it themselves so to me it shouldn't be a requirement in
the api to return a fake value
... you're already communicating things like your clipping
planes in these apis
Nick-8thWall: so it's not too hard to ask what the fov should be
bajones: fascinated with the point about ar without the camera feed, thank you for bringing it up
Kip: mine kinda segues into that.
i shipped a game with a magic window without a camera
feed
... we got some insights about why people want the FOV
... there's a bit of a sensitive issue related to a11y. age of
users was also a factor. also limitations of eyesight
... given the ability to vary FOV they were able to hold the
devices more comfortably
... giving the UA control over FOV lets us tweak these things
per-user
bajones: this is a good point.
the structure of our api means the UA always has the final say,
so this is all hinting to the UA
... so the UA can have the ability to do what it wants
Kip: also, streaming contexts (e.g. on twitch) may wish to tweak this
<Zakim> dkrowe, you wanted to discuss changing FOV after inline session begins
dkrowe: question about fov in
inline sessions i had: i can see the value of having a
consistent rendering path but if someone already has some
application (some webgl app with a camera etc) and they're
using the inline session just to get the tracking
... (with a headset it makes sense to use the matrices), if you
have an inline session you may want to switch between multiple
cameras but in immersive you may want to lock to one
... rephrasing: seems like making them specify fov upfront,
does it mean like this is what folks will opt in to to keep
their rendering path
bajones: if you're talking about
an existing webgl app i feel like they won't take advantage of
the inline capabilities
... but if it's ground-up immersive we already have feedback
that folks want to keep a single rendering path
... we have an api like updateRenderState where changes you
make occur in the next frame, so you can't change properties in
the middle of a frame. but you can change them at any point in
the run time, and if you gave such a hint you can use that to
do things like pinch zoom
... it's not just something you set at creation time and it
stays forever
... (this is an old issue we should update)
<Zakim> NellWaliczek, you wanted to respond to nick
NellWaliczek: i want to comment
and clarify something Nick-8thWall said
... you said folks were using 8thwall as a control mechanism
for an opaque experience
... in the context of webxr this would be an inline session
that has gone fullscreen with tracking access
... it's not actually ar as you're not seeing the world, but
just using tracking -- it's an "opaque experience" that has
access to tracking
... related to the questions of should we or shouldn't we allow
inline sessions to have matrices go through XR, i don't see
much value here
... i feel they should be specced to be nullable but non-null
for immersive
... nothing XR adds to the mix by having them be non nullable,
aside from *one* case: drawing rays for a screen tap
... catch is if we do this we can't have screen-based input
sources behave well
... doesn't seem unreasonable to have a special system but
limited to inline sessions
... i think we have enough info to say: we don't want arbitrary
defaults
... and dev can provide them
bajones: i don't think we want to say we don't want arbitrary defaults
NellWaliczek: but we need a way to set them from the outset
bajones: agreed
<Zakim> johnpallett, you wanted to ask Kip whether they have details on how precise the custom FOV needs to be
NellWaliczek: okay, i think i have enough to make a PR here
johnpallett: followup about a11y
from Kip's comment: is there a fingerprinting risk?
... and do you have data on how precise this needs to be?
Kip: so the ui wasn't an analog control screen, it was basically a pinch-zoom-ish ui, which we can roughly quantize
johnpallett: can you list specific use cases?
Kip: some people needed to hold
the device further away to compensate for eyesight, but now you
need to view a narrower FOV so you can see the text
... and others are holding it closer and need a wider fov
johnpallett: will look into research from a privacy pov
<josh_marinacci> if we have extra time I would like to do a lightning talk.
RRSAgent please draft the minutes