[css-images] image-orientation:none violates same-origin policy #5165

Open
annevk opened this issue Jun 4, 2020 · 70 comments

Comments

@annevk
Member

annevk commented Jun 4, 2020

As I realized in whatwg/html#5603, this leaks an additional bit of information for opaque responses.

cc @heycam

@annevk
Member Author

annevk commented Jun 4, 2020

An alternative to consider is to act as if images generated from opaque responses never have EXIF data (or any kind of metadata other than width/height/ratio). Requiring each feature that can disable or override some EXIF data to take such images into account feels very brittle to me.

@noamr
Collaborator

noamr commented Jun 4, 2020

An alternative to consider is to act as if images generated from opaque responses never have EXIF data. Requiring each feature that can disable EXIF data to take such images into account feels very brittle to me.

I think that that would create a requirement that would make people enable CORS when they didn't otherwise have to - just to have their images display correctly.

I think there's no reason why this requirement should be on the usage of EXIF rather than on the features that override (and thus expose) EXIF, such as the image-orientation and image-resolution CSS properties.

Otherwise, it feels like we're trying to prevent a threat of a hypothetical future API. Is that a necessary thing to do?

@annevk
Member Author

annevk commented Jun 4, 2020

They're not hypothetical, this API (image-orientation) already exists, whatwg/html#5603 adds another. Making the APIs enforce the policy violates the principle of least privilege and will likely lead to numerous bugs.

@heycam
Contributor

heycam commented Jun 5, 2020

cc @stephenmcgruer

@heycam
Contributor

heycam commented Jun 5, 2020

If we do have to prevent image-orientation from working on images that came from opaque responses, it would be nice if we could unconditionally apply the orientation (and I guess pretend from the whatwg/html#5603 APIs that there was no orientation metadata), so that we can try to treat orientation as an implementation detail of the image file representation. But that would make it tricky for authors wanting to use image-orientation: none to turn off the new re-orientation effects for their pages that rely on it not being applied.

@noamr
Collaborator

noamr commented Jun 5, 2020

If we do have to prevent image-orientation from working on images that came from opaque responses, it would be nice if we could unconditionally apply the orientation (and I guess pretend from the whatwg/html#5603 APIs that there was no orientation metadata), so that we can try to treat orientation as an implementation detail of the image file representation. But that would make it tricky for authors wanting to use image-orientation: none to turn off the new re-orientation effects for their pages that rely on it not being applied.

I think that's a better approach... the threat comes from the "overriding" feature, not from the implementation detail of using EXIF. An image format may similarly have an internal representation of orientation/resolution supported internally in the decoder - would that also be limited to same-origin/CORS images?

EXIF is not the issue here - it's the mixing of image-originated data and markup-originated data, which is something that currently occurs only for naturalWidth/naturalHeight.
If we want to take a more generic approach - I think it should tackle those blurred boundaries between content and markup.

@annevk
Member Author

annevk commented Jun 5, 2020

@noamr again, it's not just overriding, it's also reading as linked above. There's various different ways this will end up being exposed.

@heycam how do we model it in such a way that we don't need security checks all over?

I guess what we could do is that we take the orientation into account for decoding purposes, but don't store it as a field on the resulting image if it was generated from an opaque response. So it appears rotated, but if you query its metadata it'll return the default orientation values.

The tricky aspect is when metadata can be overridden, as it can be here. If the internal representation still has non-default metadata you would need to take that into account somehow. I.e., if an image was rotated 90 degrees and an API asked for it not to be rotated, it would have to remain rotated at 90 degrees. Model-wise that follows from the preceding paragraph, but in implementations that might be a bit trickier.

@noamr
Collaborator

noamr commented Jun 5, 2020

@noamr again, it's not just overriding, it's also reading as linked above. There's various different ways this will end up being exposed.

Sure, I meant reading/overriding - anything that enables reading directly or indirectly.

The tricky aspect is when metadata can be overridden, as it can be here. If the internal representation still has non-default metadata you would need to take that into account somehow. I.e., if an image was rotated 90 degrees and an API asked for it not to be rotated, it would have to remain rotated at 90 degrees. Model-wise that follows from the preceding paragraph, but in implementations that might be a bit trickier.

I think that the overriding features in this case should be disabled for the opaque resource. E.g. CSS image-orientation would simply not apply, maybe even regarded as invalid style. I think that would be reasonable implementation-wise.

@annevk
Member Author

annevk commented Jun 5, 2020

I think the model we end up with shouldn't require each new feature to check whether the image was generated from an opaque response. So if some theoretical feature allowed setting EXIF rotation to 90 and the opaque image already had it 90 (exposed as 0 per the above model), it should be as if it was 180 and be exposed to the world as 90.
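
A minimal sketch of that model, assuming rotation is limited to quarter turns and that an opaque image keeps a hidden "baked" rotation next to the value it exposes. The names `DecodedImage`, `decodeImage`, `setRotation` and `renderedRotation` are made up for illustration and do not come from any spec or implementation.

```typescript
// Hypothetical model: rotation in degrees, quarter turns only.
type RotationDeg = 0 | 90 | 180 | 270;

interface DecodedImage {
  // Applied to the pixels at decode time and never exposed to the page.
  bakedRotation: RotationDeg;
  // What metadata queries see and what overriding features replace.
  exposedRotation: RotationDeg;
}

// For an opaque response the EXIF rotation is baked in and the exposed
// value is the default (0). Otherwise it stays ordinary, readable metadata.
function decodeImage(exifRotation: RotationDeg, opaque: boolean): DecodedImage {
  return opaque
    ? { bakedRotation: exifRotation, exposedRotation: 0 }
    : { bakedRotation: 0, exposedRotation: exifRotation };
}

// An overriding feature only ever touches the exposed value,
// so it needs no opaque-response check of its own.
function setRotation(img: DecodedImage, rotation: RotationDeg): DecodedImage {
  return { ...img, exposedRotation: rotation };
}

// What the user actually sees on screen.
function renderedRotation(img: DecodedImage): RotationDeg {
  return ((img.bakedRotation + img.exposedRotation) % 360) as RotationDeg;
}

// The case described above: an opaque image with EXIF rotation 90 is
// exposed as 0; setting 90 exposes 90 while the pixels end up at 180.
const opaqueExample = setRotation(decodeImage(90, true), 90);
```

Under these assumptions, `renderedRotation(opaqueExample)` is 180 and `opaqueExample.exposedRotation` is 90, while asking for no rotation on the same opaque image would still render it at 90.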

@noamr
Collaborator

noamr commented Jun 5, 2020

I think the model we end up with shouldn't require each new feature to check whether the image was generated from an opaque response. So if some theoretical feature allowed setting EXIF rotation to 90 and the opaque image already had it 90 (exposed as 0 per the above model), it should be as if it was 180 and be exposed to the world as 90.

I like that. Makes metadata "embedded" into the image for opaque images, but still working as expected if there's nothing that tries to override/read it.

@heycam
Contributor

heycam commented Jun 6, 2020

I guess what we could do is that we take the orientation into account for decoding purposes, but don't store it as a field on the resulting image if it was generated from an opaque response. So it appears rotated, but if you query its metadata it'll return the default orientation values.

I think this would be the right approach. The intrinsic orientation spec concept for opaque images would be "zero degrees, no flip", and the image dimensions (whether the image is opaque or not) would have the orientation taken into account. image-orientation's behavior would then be written in terms of the image's intrinsic orientation.

@noamr
Collaborator

noamr commented Jun 15, 2020

cc @chrishtr @smfr

@noamr
Collaborator

noamr commented Jun 21, 2020

I don't see a lot of movement on this ticket... does any implementer have an opinion about this?
It's currently blocking whatwg/html#5574, and these same-origin policy violations are already in the wild... would be good to figure out if we see EXIF orientation/(resolution) data as a cross-origin information leak, and if it is, how to mitigate it.

@chrishtr
Contributor

chrishtr commented Jul 1, 2020

I didn't see it stated very clearly in this issue, so let me first state what I think the information leak is:

Developers can detect whether there is EXIF rotation information in an image by rendering it twice - once with image-orientation: from-image and one with image-orientation: none, and observing if there is a difference in the layout size of the result.

Therefore, for a cross-domain image, the developer can obtain one bit of information about these images.

However, don't sites already know multiple "bits of information" about cross-origin images, such as their width and height?
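
To make the detection concrete, here is a rough model of it rather than real DOM code. It assumes the only effect of EXIF orientation on layout is swapping width and height for quarter-turn rotations; `CrossOriginImage`, `layoutSize` and `leaksExifRotation` are illustrative names.

```typescript
type CssOrientation = "from-image" | "none";

interface BoxSize {
  width: number;
  height: number;
}

// Stand-in for an image the page cannot read directly.
interface CrossOriginImage {
  pixelWidth: number;
  pixelHeight: number;
  // True if the EXIF orientation is a 90 or 270 degree rotation.
  exifQuarterTurn: boolean;
}

// Layout size the page can observe for a given image-orientation value.
function layoutSize(img: CrossOriginImage, mode: CssOrientation): BoxSize {
  const swap = mode === "from-image" && img.exifQuarterTurn;
  return swap
    ? { width: img.pixelHeight, height: img.pixelWidth }
    : { width: img.pixelWidth, height: img.pixelHeight };
}

// Render twice and compare: any difference reveals the EXIF rotation.
function leaksExifRotation(img: CrossOriginImage): boolean {
  const a = layoutSize(img, "from-image");
  const b = layoutSize(img, "none");
  return a.width !== b.width || a.height !== b.height;
}
```

In this model the size comparison only fires for quarter turns on non-square images; a 180 degree rotation or a square image leaves the layout size unchanged.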

@noamr
Collaborator

noamr commented Jul 1, 2020

I didn't see it stated very clearly in this issue, so let me first state what I think the information leak is:

Developers can detect whether there is EXIF rotation information in an image by rendering it twice - once with image-orientation: from-image and one with image-orientation: none, and observing if there is a difference in the layout size of the result.

Therefore, for a cross-domain image, the developer can obtain one bit of information about these images.

Yes, and same for a potential implementation of image-resolution, and for querying image orientation from javascript (whatwg/html#5602).

However, don't sites already know multiple "bits of information" about cross-origin images, such as their width and height?

I think the only bits of information they know right now are an image's width and height. Is exposing related information such as orientation/density a problem? It's hard for me to fathom how that info can be used, but it's difficult to be certain.

@annevk
Member Author

annevk commented Jul 2, 2020

I think it is a problem. There's a long history of all these communication channels leading to privacy and security issues. We should hold the line where we can and clearly we can here.

@yoavweiss

One concrete scenario that can be problematic:
PhotoSharing.example allows non-CORS cross-origin fetching of credentialed images, but only for logged-in users or users that belong to a certain group (which the image was shared with).
PhotoSharing.example already knows about the width and height leak, as well as timing attacks that may result from it not serving the image in the disallowed cases. As a result, it creates an empty image with the same dimensions and makes sure that the response timing looks similar to the real deal (without setting Timing-Allow-Origin on either image).

But, if the original image contains orientation or resolution information, adding those capabilities would surprise PhotoSharing folks and cause them to potentially expose log-in state or group affiliation across origins.

It seems like this is a problem that will go away when browsers limit cross-origin credentials, but we're not there yet.

Would it make sense to only respect orientation/resolution for CORP enabled images? CORP seems like a clear signal saying the image can be embedded. I wonder what would be the impact of that on deployability.

/cc @eeeps @mikewest @arturjanc

@annevk
Member Author

annevk commented Jul 6, 2020

I don't want to drag a dependency on CORP into this. That would change its semantics from a boolean check in fetch to a property of the response. All CORP: cross-origin means is that you're okay being side-channel attacked. This is not a side channel.

@noamr
Collaborator

noamr commented Jul 6, 2020

I found a scenario in the related issue whatwg/html#5574 where some indirect means can be used to figure out the image's resolution. See this comment. I am convinced that this needs to be addressed.

Recapping the two current proposals (following IRC discussion with @annevk):

  1. Ignore metadata for opaque-response images
  2. Bake the metadata in for opaque-response images (e.g. rotate and scale the image but ignore that notion when applying CSS rotation/srcset scaling).

In either case, a cross-origin image might appear different depending on which origin is embedding it. In (1), it will appear different by default. In (2), it will appear different only in certain cases, e.g. when CSS image-orientation, image-resolution or srcset is being used, or in future scenarios that we are not yet aware of.

Also both (1) and (2) would require changes in current implementations, as image-orientation: none is already shipped.

I believe that (1) is easier to implement and grasp, however, it would have a higher chance of breaking some current sites using EXIF-rotated images (if the images are cross-origin and don't have the CORS headers).
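
The difference between the two proposals can be sketched as a single function. This is a simplified model that only covers rotation; `Proposal`, `CssImageOrientation` and `displayedRotation` are names invented for the example.

```typescript
type Proposal = "ignore-metadata" | "bake-in";
type CssImageOrientation = "from-image" | "none";

// Rotation (in degrees) the user ends up seeing.
function displayedRotation(
  exifRotation: number,
  opaque: boolean,
  css: CssImageOrientation,
  proposal: Proposal
): number {
  if (!opaque) {
    // Same-origin or CORS image: CSS decides, as it does today.
    return css === "from-image" ? exifRotation : 0;
  }
  // Opaque image: the result never depends on the CSS value,
  // so rendering it twice reveals nothing.
  return proposal === "ignore-metadata" ? 0 : exifRotation;
}
```

With an EXIF rotation of 90, proposal (1) shows the opaque image at 0 where a same-origin page would show 90 by default, while proposal (2) only diverges when the page sets `image-orientation: none`.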

@yoavweiss

IMO, a CORS restriction would be too restrictive, and will pose a significant deployment hurdle.
@annevk - can you expand on why this wouldn't work well with CORP? It seems to be leaking the exact same information that CORP "allows" you to leak.

/cc @camillelamy

@annevk
Member Author

annevk commented Jul 6, 2020

CORP is about allowing a Spectre-read gadget to potentially get at your data. It's not an assertion that it's fine to make that data public and it's not guaranteed that Spectre-read gadgets will be able to get at it forever. Otherwise we might as well have required CORS there as the initial design did. Also, all data, not just the metadata.

@noamr
Collaborator

noamr commented Jul 15, 2020

I found a scenario in the related issue whatwg/html#5574 where some indirect means can be used to figure out the image's resolution. See this comment. I am convinced that this needs to be addressed.

Recapping the two current proposals (following IRC discussion with @annevk):

  1. Ignore metadata for opaque-response images
  2. Bake the metadata in for opaque-response images (e.g. rotate and scale the image but ignore that notion when applying CSS rotation/srcset scaling).

In either case, a cross-origin image might appear different depending on which origin is embedding it. In (1), it will appear different by default. In (2), it will appear different only in certain cases, e.g. when CSS image-orientation, image-resolution or srcset is being used, or in future scenarios that we are not yet aware of.

Also both (1) and (2) would require changes in current implementations, as image-orientation: none is already shipped.

I believe that (1) is easier to implement and grasp, however, it would have a higher chance of breaking some current sites using EXIF-rotated images (if the images are cross-origin and don't have the CORS headers).

Blocking metadata with CORS completely could cause an issue I haven't thought of earlier - it means that CSS-loaded images can't use orientation/resolution, as those don't expose a crossorigin attribute (which is currently only meaningful for canvas drawing). OTOH CSS-loaded images don't leak any of the metadata information as the image's size is not readable and doesn't affect layout.

In addition, it would require regular images to start including crossorigin when using a CDN, just to have their image displayed correctly. That doesn't seem reasonable.

As today so many images are CDN-delivered and don't bother with a crossorigin attribute (or can't because the image is CSS-loaded), I think it's a blocker for using (1) - it would make image orientation and resolution less than usable.

CORP seems less suitable as it's meant to block embedding at all, not just reading of metadata.

I believe that this should be blocked with an additional HTTP header (yikes), similar to Timing-Allow-Origin, or not at all - servers who want to offer this kind of protection to their images can bake the metadata into that image and not expose it.

@noamr
Collaborator

noamr commented Jul 15, 2020

TL;DR: proposing an HTTP header (maybe Media-Transform-Allow-Origin), similar to Timing-Allow-Origin. If that header is not present, image orientation/resolution from EXIF should be ignored.

@camillelamy
Member

I am wondering if there would be value in having a combined header like Metadata-Allow-Origin where we can specify which kind of metadata are allowed for which origins (eg, timing, image orientation/resolution). This way, when a similar issue comes up next we can extend this header instead of defining a new one.

@noamr
Collaborator

noamr commented Jul 17, 2020

I am wondering if there would be value in having a combined header like Metadata-Allow-Origin where we can specify which kind of metadata are allowed for which origins (eg, timing, image orientation/resolution). This way, when a similar issue comes up next we can extend this header instead of defining a new one.

Sounds like an interesting alternative, kind of like Access-Control-Allow-Headers.
Maybe something like this would have sufficient granularity:
Metadata-Allow-Origin: *; Metadata-Allow-Properties: Orientation,Resolution,Timing
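
As a rough illustration of how a browser-side check for such headers might look: both header names are only proposals from this thread, and the matching rules below (exact origin or `*`, comma-separated case-insensitive property list) are assumptions, not specified behavior. `HeaderMap` and `mayExposeMetadata` are hypothetical names.

```typescript
// Response headers keyed by lowercase header name.
type HeaderMap = Record<string, string>;

// Returns true if the embedding origin may see the given metadata property.
// Missing headers mean nothing is exposed.
function mayExposeMetadata(
  headers: HeaderMap,
  embedderOrigin: string,
  property: string
): boolean {
  const allowOrigin = headers["metadata-allow-origin"];
  const allowProperties = headers["metadata-allow-properties"];
  if (allowOrigin === undefined || allowProperties === undefined) {
    return false;
  }
  const origin = allowOrigin.trim();
  const originAllowed = origin === "*" || origin === embedderOrigin;
  const allowed = allowProperties
    .split(",")
    .map((p) => p.trim().toLowerCase());
  return originAllowed && allowed.indexOf(property.toLowerCase()) !== -1;
}
```

Defaulting to "not exposed" when either header is absent matches the TL;DR above, where EXIF orientation/resolution would be ignored unless the server opts in.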

@tabatkins
Member

OTOH CSS-loaded images don't leak any of the metadata information as the image's size is not readable and doesn't affect layout.

They do, fwiw - ::before { content: url(...); } creates an anonymous replaced box containing the specified image, which will affect layout (or makes the pseudo-element itself into a replaced element containing the image, to the same effect).

In either case, a cross-origin image might appear different depending on which origin is embedding it. In (1), it will appear different by default. In (2), it will appear different only in certain cases, e.g. when CSS image-orientation, image-resolution or srcset is being used, or in future scenarios that we are not yet aware of.

Just because it'll still allow images to look correct by default, I lean strongly toward (2). Each potentially-exposed bit of metadata just needs to define a "default" value that it'll masquerade as for the purpose of in-page manipulations. This is trivial for orientation, but I guess resolution will have to pretend to be 1x? That'll break srcset (it'll density-correct images twice), but that might be unavoidable here.
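
A small arithmetic sketch of that double correction, assuming the embedded density is applied at decode time and then reported as 1x, so the srcset density descriptor gets applied on top of it. The function names and numbers are illustrative only.

```typescript
// Width in CSS pixels the author intended: one density correction.
function intendedWidth(pixelWidth: number, density: number): number {
  return pixelWidth / density;
}

// Width under the "bake in, masquerade as 1x" model when srcset also
// carries a density descriptor: the correction happens twice.
function doubleCorrectedWidth(
  pixelWidth: number,
  embeddedDensity: number,
  srcsetDensity: number
): number {
  // Baked in at decode time; the page sees this as a 1x image.
  const baked = pixelWidth / embeddedDensity;
  // srcset cannot know the image was already corrected.
  return baked / srcsetDensity;
}

// A 400px-wide image tagged 2x in its metadata and listed as "2x" in srcset.
const intended = intendedWidth(400, 2); // 200
const actual = doubleCorrectedWidth(400, 2, 2); // 100
```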

@annevk
Member Author

annevk commented Oct 15, 2020

They already are more difficult to use, if you don't use CORS to fetch them you cannot paint them on canvas and then read from that canvas, for instance. If you want to make full use of a cross-origin image, use CORS.

@noamr
Collaborator

noamr commented Oct 15, 2020

This middle ground approach where some metadata (width and height) is available and others like orientation are not...

It's not a middle ground. It's as strict as we can make it, considering the unfortunate situation where exposing width/height for cross-origin images is something that is probably too late to backtrack.

@annevk
Member Author

annevk commented Nov 6, 2020

https://bugzilla.mozilla.org/show_bug.cgi?id=1655598 tracks this for Firefox.

@JiaboHu

JiaboHu commented Apr 1, 2021

I guess what we could do is that we take the orientation into account for decoding purposes, but don't store it as a field on the resulting image if it was generated from an opaque response. So it appears rotated, but if you query its metadata it'll return the default orientation values.

Such a big change deserves a big announcement, at least an online conference on youtube to give developers enough time to understand and update their apps.

@mzur

mzur commented May 28, 2021

I am the maintainer of an image annotation web application and in a similar position to @philcunliffe (only that I wasn't aware of this before it was implemented). Now I'm caught cold with the implementation in Chromium which totally breaks our application without any way to implement backwards compatibility. We already went to great lengths to ensure backwards compatibility with the previous breaking change of always respecting the EXIF orientation (I also commented on that here but nobody seems to be interested there anymore).

I realize that it's probably too late and the discussion is already over but I wanted to leave a note that the previous decision to apply EXIF rotation by default and now this decision are huge breaking changes for some of us. Sorry for the rant, I'm just a little frustrated.

@dlrobertson
Member

Just a note: When fixing this in Gecko with bug 1655598, it was not clear to me that this should use cors-same-origin. This was later found and fixed in bug 1822116, but it might be nice if it was explicitly stated that cors-same-origin should be used.

@annevk
Member Author

annevk commented Apr 28, 2023

@dlrobertson FWIW, the discussion above talks about "opaque responses". You don't get an opaque response when using CORS. You either get a non-opaque response or a network error.
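
A simplified model of that distinction, ignoring redirects, service workers and the other request modes. `FetchMode`, `Outcome` and `responseOutcome` are names chosen for this sketch, not Fetch API identifiers.

```typescript
type FetchMode = "no-cors" | "cors";
type Outcome = "basic" | "cors" | "opaque" | "network-error";

// Which kind of response an image fetch ends up with.
function responseOutcome(
  sameOrigin: boolean,
  mode: FetchMode,
  corsCheckPasses: boolean
): Outcome {
  if (sameOrigin) {
    return "basic";
  }
  if (mode === "no-cors") {
    // Cross-origin without CORS: usable for display, contents hidden.
    return "opaque";
  }
  // Cross-origin with CORS: readable if the server opts in, otherwise
  // the fetch fails outright. It never yields an opaque response.
  return corsCheckPasses ? "cors" : "network-error";
}
```

So in this model the restrictions discussed in this thread apply only to the `"opaque"` outcome, i.e. a plain cross-origin `<img>` without a `crossorigin` attribute.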

@dlrobertson
Member

@annevk makes sense. Thanks for the clarification!

@DavidJCobb

DavidJCobb commented Feb 14, 2024

In the context of image dimensions already leaking across origins, making image-orientation require CORS comes across as security theater, not as an actual mitigation of a real threat. Is there any informational aspect to orientation, any meaning that it could possibly ever have, besides the presence or absence of a swap operation on two pieces of information (width and height) that everyone is already allowed to read?

They already are more difficult to use, if you don't use CORS to fetch them you cannot paint them on canvas and then read from that canvas, for instance. If you want to make full use of a cross-origin image, use CORS.

In what world is this use case even remotely comparable to simply placing an image in a document with the expectation that you can ensure it displays properly for your use case? How are intermediate-to-advanced HTML5 canvas shenanigans and scripted interactivity regarded as being on the same level as a bog-standard <img> tag?

@eeeps
Contributor

eeeps commented Feb 14, 2024

@DavidJCobb

How are intermediate-to-advanced HTML5 canvas shenanigans and scripted interactivity regarded as being on the same level as a bog-standard <img> tag?

Because exploits will take advantage of anything that's possible, and we should have ~zero-tolerance for creating new avenues for user harm. I sketched out an example to answer this question for myself, here:

Let’s say there’s an image URL – https://coolbank.com/hero.jpg, that happens to return a different resource depending on whether or not a user is currently logged in at coolbank.com.

https://css-tricks.com/i-learned-to-love-the-same-origin-policy/

@philcunliffe

I've had a couple years to debrief on this now and while I still could argue that "embedding instructions" don't belong in the same bucket as other much more sensitive EXIF data I think the solution is mostly fine.

As someone who was hit hard by this change my one wish would be that the W3C and big browsers treat breaking changes with more respect. The decisions of this group directly affect the largest application platform in the world.

The fact that many developers were caught completely off guard by a breaking change (with no option for backwards compatibility) to the world's largest application platform seems like a process failure somewhere. In the image annotation industry alone this decision cost a lot of real people's jobs and changed the flow of tens of millions of dollars at a minimum.

I'm guessing people in this body will say that this is entirely the browsers' responsibility during implementation but there's an inescapable human element to these decisions now given the scale. Maybe the W3C should give guidelines for a time delay and warning mechanisms when a breaking decision is made? Even a simple console message when the image-orientation rule was present for 6 months before the change was actually made would have made a big difference.

@DavidJCobb

DavidJCobb commented Feb 15, 2024

I sketched out an example to answer this question for myself, here [...]

If exposing the intrinsic size of an image is enough to compromise an app, then that app is already insecure. Your example has us suppose the existence of a website that discloses image dimensions in a context where doing so is insecure, yet only discloses them in the specific ways that this change to the spec would block, and not in the very closely related, far more common, far more likely, and far more obvious ways that are currently treated (and will for the foreseeable future always be treated) as exceptions to same-origin restrictions.

You can contrive exceptions and edge-cases like, "Oh, but it only exposes the size of an image under this incredibly specific scenario that I hand-crafted explicitly to justify this one niche change to the spec that damages backward-compatibility," but that's equivalent to saying, "The problem isn't web developers shooting themselves in the leg. It's that they can aim at their leg, miss, and have the bullet ricochet off the floor and directly into their leg. The problem is that spot on the floor, and we should urgently get rid of it by annihilating the floor tiles there with a sledgehammer."

Cross-origin restrictions are a valuable principle by default, but it's not an absolute principle that's always intrinsically correct for the web, and it can't be cited as if it is one: the web is already filled with functionality that demands exceptions to it -- like being able to embed images across origins without setting up CORS -- and EXIF orientation belongs with that existing functionality. Like, what I'm seeing here is that focusing purely on what's practical, it's very easy to offer a straightforward disagreement with this change ("EXIF orientation is the same kind of data as intrinsic sizing information, which is already exposed") and it's very hard to offer a straightforward agreement with this change ("What if a web developer makes a site that's insecure but only in the exact ways that this otherwise nonsensical and inconsistent-with-everything-around-it change would band-aid for them, and also, what if we pretended that plain image tags are the same kind of content as scripted and interactive HTML5 canvases?").

This is security theater.

@eeeps
Contributor

eeeps commented Feb 15, 2024

I suppose I am more absolute, because it makes the security model of the web 1) stronger 2) easier to reason about. Web developers and users don't have to worry about this entire class of security problems if the platform can take care of it for them. My take is more like "unauthorized cross-origin reads of any kind should have never been allowed, and we certainly shouldn't open up new ones." Intrinsic sizing being exposed was, in hindsight, a mistake, and there are active efforts to fix it (although this is a very hard thing to do in a web compatible manner). In your foot-gunning metaphor, I wouldn't say I'm for "annihilating the floor", but I am for comprehensive gun control, and wish we'd had it thirty years ago.

@noamr
Collaborator

noamr commented Feb 15, 2024

Coming back to this after a few years - one same-origin policy violation (image dimensions) cannot justify another (orientation).

Because the former has been around for decades, authors had years to protect against this and be aware that image dimensions are something that's exposed to browsers and that's life. We have to be rigorous about not allowing new leaks because there is lots of content out there that is not protected against them.

Really the focus should go towards reducing the usage of no-cors in the web and moving towards CORS, even in CSS.
I've recently updated the CSS spec (https://drafts.csswg.org/css-values-5/#request-url-modifiers) to allow CORS images, however that's not implemented in browsers. I think implementing that would be a step forwards to actually resolving this in a holistic manner.

@yoavweiss

I'd even vote for a Document Policy that forces all subresources to be loaded with e.g. implicit crossorigin attributes. That would enable developers to always load their images CORS enabled and not worry about any of this.

@noamr
Collaborator

noamr commented Feb 15, 2024

I'd even vote for a Document Policy that forces all subresources to be loaded with e.g. implicit crossorigin attributes. That would enable developers to always load their images CORS enabled and not worry about any of this.

Yea I think we would need both, because some pages reference existing images that might not be CORS-enabled.

@philcunliffe

@yoavweiss @noamr

Where would be the right spot to open an issue about breaking change guidelines? I believe it fits under the mission statement for W3C to develop a standard operating procedure.

@noamr
Collaborator

noamr commented Mar 5, 2024

@yoavweiss @noamr

Where would be the right spot to open an issue about breaking change guidelines? I believe it fits under the mission statement for W3C to develop a standard operating procedure.

Perhaps @jyasskin can advise here.

@jyasskin
Member

jyasskin commented Mar 5, 2024

I don't have a good answer for you. Back in the day, there were documents like https://www.w3.org/TR/qaframe-spec/ that specified how to write specifications, but I don't think we have that sort of meta-WG right now in any of the standards bodies I follow. Instead, we have statements like https://whatwg.org/working-mode#removals that defer to implementer opinions, and https://www.chromium.org/blink/launching-features/#feature-deprecations that governs how Chrome in particular makes this sort of decision. The AB might be an appropriate body to think about this at the W3C-wide level, since they have an Incubation activity considering the other end of the pipeline. You can suggest they look at breaking changes by filing an issue in https://github.com/w3c/AB-public/issues. The CSSWG could also create an opinion for itself, but that wouldn't advise any other WGs.
