jQuery HTML parser corrupts script content that looks like HTML tags (including strings) #2668

JakeQZ · 2015-10-20T01:37:00Z

With functions like .append(html), .replaceWith(html) or even just $(html), if html contains JavaScript and some content of that script looks like an HTML tag, that content will be modified undesirably.

E.g.
<script type='text/javascript'> alert('<What?/>'); </script>
When used as the argument for .append() or similar, this produces "<What?></What>" in the alert when it should produce "<What?/>".

Example at http://jsfiddle.net/fwaoj0eh/2/ (where you can also see benign [in this case] replacement of <p/> with <p></p>), with workaround (and possible route to fix?). Bug present in jQuery 1.11.3 on all Windows browsers tested (IE7-12, FF, Chrome, Safari, Opera).
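For readers without access to the fiddle, the reported behaviour can be reproduced without a browser. This is only a sketch: the regex below is an assumed approximation of the self-closing-tag expansion that jQuery 1.x applies to HTML strings, and expandSelfClosing is a hypothetical helper name, not part of jQuery.

```javascript
// Assumed approximation of the self-closing-tag regex in jQuery 1.x;
// the exact pattern differs slightly between releases.
var rxhtmlTag = /<(?!area|br|col|embed|hr|img|input|link|meta|param)(([\w:]+)[^>]*)\/>/gi;

// Hypothetical helper: expands "<tag ... />" into "<tag ... ></tag>"
// everywhere in the string, with no awareness of <script> content.
function expandSelfClosing(html) {
  return html.replace(rxhtmlTag, "<$1></$2>");
}

var input = "<p/><script type='text/javascript'> alert('<What?/>'); </script>";

// The <p/> is expanded (benign), but so is the string literal inside the script.
console.log(expandSelfClosing(input));
```

Because the replacement runs over the whole string, anything shaped like `<name ... />` is rewritten, including text inside a string literal.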

gibson042 · 2015-10-20T02:21:27Z

Thanks for the report! This is related to (and a near-duplicate of) trac-14370 and trac-14464. It is possible to address on your end with jQuery.htmlPrefilter in 3.0, and the implementation in PR #1374 would be a good starting point. I'm still not sure about putting something like it in the official build, though.

JakeQZ · 2015-10-20T02:40:39Z

Yes, 14370, but not 14464 (which is not a bug). Seems the fix for 14370 was considered too big. But it ought to be possible to exclude content within <script...</script> without too much more regex?

JakeQZ · 2015-10-20T02:44:09Z

Above should read 'content within <script...</script>'

dmethvin · 2015-10-20T13:22:06Z

This is definitely unintended behavior, but screwing up this case isn't necessarily that bad. Putting executable scripts in HTML is something we'd love to remove, or at least make it much more difficult to do, because of its security implications. It's way too easy to inject attacker content that has scripts in it. For now I think we need to just reinforce the advice that it's bad practice, and use this example as one place where it fails.

gibson042 · 2015-10-20T14:03:04Z

> But it ought to be possible to exclude content within <script...</script> without too much more regex?

https://github.com/jquery/jquery/pull/1374/files#diff-169760a97de5c86a886842060321d2c8R44 , though I would probably update the first part per 85ffc6d and change the [^>]*s to (?:[\x20\t\r\n\f]+[^\0-\x20\x7f-\x9f="'/>]+(?:[\x20\t\r\n\f]*=[\x20\t\r\n\f]*(?:"[\w\W]*?"|'[\w\W]*?'|[^\x20\t\r\n\f>]+)|))* for maximum correctness.
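As a rough illustration of the idea (not the code from PR #1374), the sketch below skips script elements when expanding self-closing tags. Both regexes are assumptions: rxhtmlTag approximates jQuery's expansion pattern, and rscriptBlock uses the simple [^>]* attribute match rather than the stricter pattern suggested above. The function name prefilterSkippingScripts is hypothetical. The final lines assign it to jQuery.htmlPrefilter, the 3.0 hook mentioned earlier in the thread, only if jQuery is loaded.

```javascript
// Assumed approximation of jQuery's self-closing-tag expansion regex.
var rxhtmlTag = /<(?!area|br|col|embed|hr|img|input|link|meta|param)(([\w:]+)[^>]*)\/>/gi;

// Hypothetical pattern matching a whole <script ...>...</script> element.
// The capture group makes split() keep the script blocks in its result.
var rscriptBlock = /(<script\b[^>]*>[\w\W]*?<\/script\s*>)/gi;

function prefilterSkippingScripts(html) {
  return html.split(rscriptBlock).map(function (part, i) {
    // With one capture group, script blocks land at odd indexes:
    // leave those untouched and expand tags only in the other parts.
    return i % 2 ? part : part.replace(rxhtmlTag, "<$1></$2>");
  }).join("");
}

// Install the prefilter only where jQuery (with htmlPrefilter) is available.
if (typeof jQuery !== "undefined" && jQuery.htmlPrefilter) {
  jQuery.htmlPrefilter = prefilterSkippingScripts;
}
```

With this in place, markup outside the script is still expanded, while a string such as '<What?/>' inside the script is passed through unchanged.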

JakeQZ · 2015-10-20T22:12:47Z

FWIW, the use case in which I found the issue was obfuscation of contact details using AJAX with JavaScript that converts sequences of apparently random characters back into the original human readable form.

I don't see anything particularly bad practice about injecting content into the DOM that has come from a trusted source via AJAX and may by design contain script, more generally.

However, I can't think of a good case, other than encryption, where such sequences in string literals being modified unexpectedly would be anything other than benign. E.g. if the AJAX-returned script also contains DOM manipulation code, $('<image/>') and $('<image></image>') are equivalent, and in plain JavaScript the XML special characters are not used (though I don't know about other frameworks). I am also not aware of any case where /> is valid JavaScript outside a string literal.

So I would concur that adding something to the documentation 'Additional Notes' which begin "By design, any jQuery constructor or method that accepts an HTML string..." would resolve this ticket.

Perhaps: "Also note that if the HTML contains script with string literals from cryptographic functions, unexpected results may occur, e.g. '<EX@mple./>' will become '<EX@mple.></EX>'. This is also by design, as jQuery is intended to be lightweight, and it is not expected to be an issue in other genuine situations."

Something like this would have saved me several hours debugging, as I'd already scanned the docs and scoured t'internet to see if I might be missing something in the way .append(), etc. work.

But I don't know what to suggest as a workaround other than what's in the jsFiddle I've updated: http://jsfiddle.net/fwaoj0eh/3/, which is working fine in my case.

(I haven't looked into whether things like $('<image title="For historic reasons <image/> tags must be self-closed in HTML"/>') have issues, but some hackery types, on seeing an additional note on the limitations, may seek other ways of breaking it, which is probably a Good Thing...)

dmethvin · 2015-10-20T23:04:17Z

> I don't see anything particularly bad practice about injecting content into the DOM that has come from a trusted source via AJAX and may by design contain script, more generally.

From a Content Security Policy standpoint, it's very common to disable inline scripts as a policy because they are such a common attack vector. The problem is that it is really hard for most web developers to strongly assert that the content is from a trusted source. There are so many ways to mess that up.

> (I haven't looked into whether things like $('<image title="For historic reasons <image/> tags must be self-closed in HTML"/>') have issues

Characters inside the title attribute should be HTML-encoded. If you were building that title dynamically it's an example of the "so many ways to mess that up" mentioned above that could result in XSS.
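To make the encoding advice concrete, here is a minimal sketch of building that title dynamically. The helper name escapeAttr is hypothetical and not a jQuery API; it simply replaces the five characters that are commonly escaped in attribute values.

```javascript
// Hypothetical helper: HTML-encode a value before placing it in an attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")   // must run first so later entities are not re-encoded
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

var title = "For historic reasons <image/> tags must be self-closed in HTML";

// The encoded title no longer contains anything that looks like a tag.
var html = '<image title="' + escapeAttr(title) + '"/>';
console.log(html);
```

Once encoded, the attribute value contains no literal < or >, so a tag-matching regex has nothing inside the attribute to latch onto.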

JakeQZ · 2015-10-26T02:16:10Z

On your first point, why [rhetorically] is the Google search results page full of inline script?
On your second, my bad example. Was trying to come up with other possible related failures, and failed.
Would like to move this to documentation issue, but don't seem able to...

gibson042 · 2015-10-26T15:07:04Z

> On your second, my bad example. Was trying to come up with other possible related failures, and failed.

Not at all, <image title="For historic reasons <image/> tags must be self-closed in HTML"/> is valid HTML5.

> Would like to move this to documentation issue, but don't seem able to...

https://github.com/jquery/api.jquery.com/issues/new . Thanks!

timmywil · 2015-10-26T16:17:52Z

We will put together a htmlPrefilter example for the docs specifically for this issue. See jquery/api.jquery.com#727.

JakeQZ · 2015-10-27T02:16:57Z

Thanks all for your input on this issue.
Now it is closed, I'd just like to point out that because JS is client-side only, if you have a security issue with inline JS, then you MUST HAVE A SECURITY BREACH SOMEWHERE ELSE, and really should be looking there.

gibson042 added the Manipulation label Oct 20, 2015

timmywil mentioned this issue Oct 26, 2015

Document jQuery.htmlPrefilter jquery/api.jquery.com#727

Closed

timmywil closed this as completed Oct 26, 2015

gibson042 mentioned this issue Jan 8, 2016

jQuery.htmlPrefilter: add new entry jquery/api.jquery.com#858

Merged

arvgta mentioned this issue Apr 13, 2017

Uncaught SyntaxError: Unexpected identifier - bounty of €100 arvgta/ajaxify#125

Closed

arvgta mentioned this issue Aug 12, 2017

SyntaxError on page change - bounty of €100 arvgta/ajaxify#132

Closed

lock bot locked as resolved and limited conversation to collaborators Jun 19, 2018
