[css-fonts] Clarification font-variant-emoji should not affect characters 0-9#* #11014

Open
yisibl opened this issue Oct 9, 2024 · 18 comments
@yisibl
Contributor

yisibl commented Oct 9, 2024

Only the code points listed by Unicode as contributing to a Unicode emoji presentation sequence are affected by this property. Within this CSS specification, these characters are referred to as Emoji Presentation Participating Code Points. This property has no effect on any other characters.

Consider the following use case:

data:text/html;charset=UTF-8,<!doctype html>
<style>
  .test { font-size: 52px; }
  .test { font-variant-emoji: emoji; font-family: Arial; }
</style> 

<div class="test">⬆379 kB/s</div>

image

The author only wants to change the style of the arrow (⬆︎), but when font-variant-emoji: emoji is applied, the digits 379 are also rendered with the emoji font. This is something the author doesn't want.

Although UTS #51 gives 0-9, # and * the Emoji property, I think they should be excluded from font-variant-emoji.

emoji_character := \p{Emoji}
https://www.unicode.org/reports/tr51/#def_emoji_character

Since Chrome has shipped, we need to clarify this in the CSS specification as soon as possible.

I've already filed a bug for Chrome: https://issues.chromium.org/issues/369781730

cc @svgeesus @drott @jfkthame @markusicu

@drott
Collaborator

drott commented Oct 9, 2024

IMO there is no particular need or utility in the CSS spec deviating from the Unicode spec.

Only the code points listed by Unicode as contributing to a Unicode emoji presentation sequence are affected by this property. Within this CSS specification, these characters are referred to as Emoji Presentation Participating Code Points. This property has no effect on any other characters.

Digits, * and # belong to presentation sequences. They have the character properties Emoji=Yes and Emoji_Presentation=No. Excluding them from font-variant-emoji would break the property. font-variant-emoji: emoji should upgrade digits to emoji.

It seems to me the clarification you want is that, for the value text of font-variant-emoji, the font should not switch when the font-family selected for the text already covers these characters. This is what Munira (@tursunova), who implemented this in Chromium, already fixed in the Chromium issue https://issues.chromium.org/issues/369781730.

I don't think we need a normative change for that.

Code points are rendered as if U+FE0E VARIATION SELECTOR-15 was appended to every Presentation Participating Code Point.

should be sufficient.

@drott drott added the css-fonts-4 Current Work label Oct 9, 2024
@yisibl
Contributor Author

yisibl commented Oct 9, 2024

IMO, font-variant-emoji: text/emoji should not change the font-family of 0-9#* and should always use font-family: Arial as set by the author.

If the author wants to use an Emoji font, it should be used with vs15 or vs16 (e.g. <U+0039, U+FE0F>). Otherwise this will break the page styling and cause the confusion in my screenshot above.

If the author does not want to change the font-family of the digits, they have to wrap the specific emoji in a new element, so the markup becomes:

data:text/html;charset=UTF-8,<!doctype html>
<style>
  .test { font-size: 52px; font-family: Arial; }
  .emoji { font-variant-emoji: emoji; }
</style>
<div class="test"><span class="emoji">⬆</span> 379 kB/s</div>

It would be horrible if, in order to change the styling, we had to wrap each emoji with a separate element. And there are times when we can't change the HTML structure of that content.

This greatly reduces the ergonomics of font-variant-emoji.
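
To make that cost concrete, here is a rough sketch of the preprocessing step such wrapping would need when the markup cannot be edited by hand. It assumes a JavaScript runtime that supports `\p{Emoji}` in `u`-flag regular expressions. `wrapEmoji` and `escapeHtml` are made-up names, and the `emoji` class is the one from the example above. The sketch looks only at single code points and ignores variation selectors, ZWJ sequences, and modifiers, so it is an illustration rather than a robust solution.

```javascript
// Characters this issue proposes to leave alone: ASCII digits, '#', '*'.
const LEAVE_ALONE = /^[0-9#*]$/;
// Code points with the Unicode Emoji property (this includes 0-9, # and *).
const HAS_EMOJI_PROPERTY = /^\p{Emoji}$/u;

// Minimal HTML escaping for text content (hypothetical helper).
function escapeHtml(s) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;" };
  return s.replace(/[&<>"]/g, (c) => map[c]);
}

// Wrap each emoji-capable code point, except digits, '#' and '*',
// in <span class="emoji"> so font-variant-emoji can target it alone.
function wrapEmoji(text) {
  let out = "";
  for (const ch of text) { // for...of iterates by code point
    if (HAS_EMOJI_PROPERTY.test(ch) && !LEAVE_ALONE.test(ch)) {
      out += `<span class="emoji">${ch}</span>`;
    } else {
      out += escapeHtml(ch);
    }
  }
  return out;
}

const wrapped = wrapEmoji("\u2B06379 kB/s");
console.log(wrapped);
```

Even this small version has to hard-code the exclusion list, which is the same exception the issue asks the property itself to make.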

@drott
Collaborator

drott commented Oct 9, 2024

Yes, for value text the family should not change, and the spec does not suggest otherwise. For value emoji it is the purpose of the property to upgrade to emoji presentation, and render glyphs as if they were followed by VS16.

If the author wants to use an Emoji font, it should be used with vs15 or vs16 (e.g. U+0039, U+FE0F). Otherwise this will break the page styling and cause the confusion in my screenshot above.

Yes, if an emoji upgrade is not intended or desired, font-variant-emoji: still has values unicode and normal and I think the value unicode matches what you want here.

@yisibl
Contributor Author

yisibl commented Oct 9, 2024

For value emoji it is the purpose of the property to upgrade to emoji presentation, and render glyphs as if they were followed by VS16.

My point is that this scenario makes no sense for Digits to render as emoji fonts, and I've never seen a web page use it that way.

font-variant-emoji: still has values unicode and normal and I think the value unicode matches what you want here.

Still the same problem I mentioned, this requires a change in the HTML structure.

Also, the unicode keyword changes the font-family.

data:text/html;charset=UTF-8,<!doctype html>
<style>
  .test { font-size: 52px; font-family: Arial; }
  .normal { font-variant-emoji: normal; }
  .emoji { font-variant-emoji: emoji; }
  .text { font-variant-emoji: text; }
  .unicode { font-variant-emoji: unicode; }
</style>

<div class="test normal">0123456789*</div>
<div class="test unicode">0123456789*</div>
<div class="test emoji">0123456789*</div>
<div class="test text">0123456789*</div>

2024-11-27 Update

image

Old Chrome

image

@jfkthame
Contributor

jfkthame commented Oct 9, 2024

Yes, if an emoji upgrade is not intended or desired, font-variant-emoji: still has values unicode and normal and I think the value unicode matches what you want here.

font-variant-emoji: unicode doesn't achieve what @yisibl wants here, because the up-arrow is U+2B06. Checking Unicode's emoji-data.txt, it has the Emoji property:

2B05..2B07 ; Emoji # E0.6 [3] (⬅️..⬇️) left arrow..down arrow

but does not have Emoji_Presentation:

27BF ; Emoji_Presentation # E1.0 [1] (➿) double curly loop
2B1B..2B1C ; Emoji_Presentation # E0.6 [2] (⬛..⬜) black large square..white large square

so its default presentation will be the text form.

In terms of the Unicode Emoji and Emoji_Presentation properties, U+2B06 UPWARDS BLACK ARROW has the same properties as the ASCII digits, so it is expected that font-variant-emoji will affect them in the same way.
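
The property values quoted above can be checked with JavaScript's Unicode property escapes. This is a rough sketch that assumes a runtime supporting `\p{Emoji}` and `\p{Emoji_Presentation}` in `u`-flag regular expressions; `emojiProps` is a made-up helper name.

```javascript
// Report the two UTS #51 character properties discussed in this thread.
// `emojiProps` is a hypothetical helper, not part of any library.
function emojiProps(ch) {
  return {
    emoji: /^\p{Emoji}$/u.test(ch),
    emojiPresentation: /^\p{Emoji_Presentation}$/u.test(ch),
  };
}

// U+2B06 and an ASCII digit: Emoji=Yes, Emoji_Presentation=No.
const arrowProps = emojiProps("\u2B06");
const digitProps = emojiProps("3");
// U+2B1B is listed with Emoji_Presentation in the emoji-data.txt excerpt above.
const squareProps = emojiProps("\u2B1B");
// An ordinary letter has neither property.
const letterProps = emojiProps("a");

console.log({ arrowProps, digitProps, squareProps, letterProps });
```

The arrow and the digit report the same pair of values, which is why a rule driven purely by these two properties cannot tell them apart.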

@drott
Collaborator

drott commented Oct 9, 2024

Also, the unicode keyword changes the font-family.

It shouldn't, and this is likely covered by the fix for https://issues.chromium.org/issues/369781730, which should be coming to Canary soon. Please check again once we have a release with the fix, and let's discuss implementation issues on the Chromium tracker, not here.

@jfkthame
Contributor

jfkthame commented Oct 9, 2024

It seems to me that if the author wants an emoji-style up-arrow here, but does not want the rendering of the following digits to be affected, the right approach would be to encode the arrow as <U+2B06, U+FE0F>. That's how you control the presentation of a specific character. Applying font-variant-emoji to a wider range of text inherently risks affecting other characters besides the intended one.
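
As a hedged sketch of that approach for content that is processed before display: the function below appends U+FE0F only after code points the author explicitly lists, so the digits are left untouched. `requestEmojiPresentation` and its `targets` parameter are hypothetical names, not an existing API.

```javascript
const VS15 = "\uFE0E"; // VARIATION SELECTOR-15 (text presentation)
const VS16 = "\uFE0F"; // VARIATION SELECTOR-16 (emoji presentation)

// Append VS16 after each occurrence of a listed code point, unless a
// variation selector already follows it. `targets` is chosen by the author;
// nothing here is derived from Unicode data files.
function requestEmojiPresentation(text, targets) {
  const chars = Array.from(text); // split by code point
  let out = "";
  for (let i = 0; i < chars.length; i++) {
    out += chars[i];
    const next = chars[i + 1];
    if (targets.has(chars[i]) && next !== VS15 && next !== VS16) {
      out += VS16;
    }
  }
  return out;
}

const arrowOnly = new Set(["\u2B06"]);
const withVs16 = requestEmojiPresentation("\u2B06379 kB/s", arrowOnly);
console.log(Array.from(withVs16, (c) => c.codePointAt(0).toString(16)));
```

Because an existing selector is respected, running the function twice gives the same result, and text that already asks for text presentation with VS15 is not overridden.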

@yisibl
Contributor Author

yisibl commented Oct 9, 2024

We should distinguish between Unicode properties and what is actually needed in CSS. Unicode may have many legacy compatibility constraints that make it impossible to adjust the Emoji property, but CSS has a chance to fix them.

From the user's point of view, standalone digits should not be in the scope of emoji.

For example, the popular npm package emoji-regex does not match standalone digits.

@tursunova

My point is that this scenario makes no sense for Digits to render as emoji fonts, and I've never seen a web page use it that way.

You are correct that in this scenario using coloured (emoji style) glyphs for digits doesn't make sense, but some emoji fonts have coloured glyphs for ASCII digits, so I assume there might be some use cases when authors might want to use emoji digits, don't you think so? If you want to make standalone digits as not emojis, then, I assume, you won't be able to use coloured emoji digits at all since you would not be able to even apply VS16 to change the presentation to emoji, isn't that right?

@yisibl
Contributor Author

yisibl commented Oct 9, 2024

but some emoji fonts have coloured glyphs for ASCII digits, so I assume there might be some use cases when authors might want to use emoji digits, don't you think so?

Usually, we have the following ways to solve it:

  1. Use @font-face (suppose the font Nabla-Regular is available locally).
    More often, though, we use a web font, which ensures consistent styling across all operating systems.
    There are many such icon libraries on iconfont:
    https://www.iconfont.cn/collections/detail?spm=a313x.collections_index.i1.d9df05512.4aaa3a81DONmCk&cid=7450
    https://www.iconfont.cn/collections/detail?spm=a313x.collections_index.i1.d9df05512.4aaa3a81DONmCk&cid=21232
@font-face {
  font-family: "color_digits";
  src: local("Nabla-Regular");
  unicode-range: U+30-39;
}
.test {
  font-family: "color_digits", Arial;
  font-size: 36px;
}

image

  2. Use VS15 or VS16 (e.g. <U+0039, U+FE0E> or <U+0039, U+FE0F>)

Overall, I think the usage you're talking about is almost non-existent in actual websites. There seems to be no reason not to use webfont.

you won't be able to use coloured emoji digits at all since you would not be able to even apply VS16 to change the presentation to emoji, isn't that right?

Sorry, I don't understand what that means.

@Crissov
Contributor

Crissov commented Oct 10, 2024

The special thing with digits is that VS16 alone does not make them RGI emojis, because they also need U+20E3 COMBINING ENCLOSING KEYCAP at the end of the sequence.
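
The difference between the two sequences can be spelled out in code. This is only an illustration of the code point layout; `keycap` and `toHex` are made-up helper names.

```javascript
const EMOJI_VS = "\uFE0F";         // U+FE0F VARIATION SELECTOR-16
const ENCLOSING_KEYCAP = "\u20E3"; // U+20E3 COMBINING ENCLOSING KEYCAP

// Build the keycap sequence for a base character (0-9, # or *).
function keycap(base) {
  if (!/^[0-9#*]$/.test(base)) {
    throw new RangeError("keycap base must be one of 0-9, # or *");
  }
  return base + EMOJI_VS + ENCLOSING_KEYCAP;
}

// Format a string as space-separated, four-digit hex code points.
function toHex(s) {
  return Array.from(s, (c) =>
    c.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  ).join(" ");
}

const nineWithVs16 = "9" + EMOJI_VS; // emoji presentation sequence only
const nineKeycap = keycap("9");      // full keycap sequence

console.log(toHex(nineWithVs16)); // 0039 FE0F
console.log(toHex(nineKeycap));   // 0039 FE0F 20E3
```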

@yisibl
Contributor Author

yisibl commented Oct 23, 2024

@jfkthame

It seems to me that if the author wants an emoji-style up-arrow here, but does not want the rendering of the following digits to be affected, the right approach would be to encode the arrow as <U+2B06, U+FE0F>. That's how you control the presentation of a specific character.

Often the text entered is not under the control of the CSS author, such as when it is entered by a third party user from a CMS.

We can't expect third-party users to learn how to enter a Variation Selector, or even just copy and paste an emoji character from somewhere else.

That's why CSS authors need to change it to emoji style with font-variant-emoji: emoji, which I believe is one of the most common scenarios where this property is used.

Applying font-variant-emoji to a wider range of text inherently risks affecting other characters besides the intended one.

The purpose of this issue is to minimize such risks, and digits are the most obvious ones. CSS should not give authors unexpected negative surprises.

@yisibl
Contributor Author

yisibl commented Nov 6, 2024

Is this worth adding to the agenda? I'd quite like to see this progress.

@yisibl
Contributor Author

yisibl commented Nov 27, 2024

Yes, if an emoji upgrade is not intended or desired, font-variant-emoji: still has values unicode and normal and I think the value unicode matches what you want here.

The unicode keyword doesn't turn all emoji into color style, that's why we need to set it to the emoji keyword.

In general, when digits themselves need a colored font, CSS authors usually use a web font or VS16 instead of font-variant-emoji (see #11014 (comment)). font-variant-emoji: emoji should not change the font-family of a digit at any time.

@drott
Collaborator

drott commented Nov 27, 2024

The unicode keyword doesn't turn all emoji into color style, that's why we need to set it to the emoji keyword.

You can use the emoji keyword if you use the font-variant-emoji property correctly and with the right granularity. Or you can use VS15/VS16 with the unicode mode of the property.

From your original report, and as @jfkthame explains above, one correct way to upgrade the arrow in ⬆379 kB/s is to place a span around the arrow and not apply font-variant-emoji: emoji to the full <div> element; another correct way is to use VS16 for the arrow and set font-variant-emoji to unicode.

font-variant-emoji: emoji should not change the font-family of a digit at any time.

Why not? Several emoji fonts contain color digits, not only for the keycap sequences. Emoji 2.0 in Unicode introduced emoji digits.

@yisibl Are you advocating for specific spec text with exceptions for code point ranges in how the font-variant-emoji property is applied? In that case we would end up with something roughly saying: "use Unicode properties, except for... 0-9"?

The request to exclude digits from font-variant-emoji would be an arbitrary Unicode exception for the web. This is a distinction that, from the Chrome side, we oppose making in the CSS Fonts spec. What would we do with * and #? What would we do with the trademark sign, the heavy plus sign, etc.? This would set a really odd precedent.

From the Chrome side, we have strict implementation constraints to not deviate from Unicode for character properties. We include ICU, and we use emoji-segmenter, which are general-purpose libraries for Unicode behavior. If we special-cased numbers for the web, this would cause additional complications in using these libraries, and we do not want to override their behavior.

  • Digits have the Emoji=Yes property and so with a VS16 they can be upgraded to emoji presentation.
  • Multiple emoji fonts support colored digits, not only colored digits with keycaps.

@yisibl If you strongly believe standalone numbers should not be emoji, I believe this would instead be a discussion for Unicode to remove the Emoji character class from the digits, or for requests to emoji font producers to remove the colored digits from them.

@css-meeting-bot
Member

The CSS Working Group just discussed [css-fonts] Clarification font-variant-emoji should not affect characters `0-9#*` .

The full IRC log of that discussion
<chrishtr> fantasai: this issue was opened by someone pointing out that font-variant-emoji currently has values to say do-default or change emoji to more text-like or more emoji presentation-like
<chrishtr> fantasai: this makes it easy for people to ask for emoji to look the way they want
<chrishtr> fantasai: problem is digits have emoji versions, and authors are usually not asking for those be emojified
<chrishtr> fantasai: request is to accept the digits, # and & to be excluded from font-variant-emoji
<chrishtr> s/&/*/
<chrishtr> fantasai: we could also add a keyword saying include everything, but they can already do that via variation selectors
<chrishtr> florian: possible in content, not styling
<chrishtr> fantasai: correct
<moonira> q+
<chrishtr> fantasai: think it's reasonable to emojify things that aren't digits. Marking up all digits is annoying.
<Rossen1> ack moonira
<chrishtr> moonira: Elika, you said we only can do that in content, not styles, but I'm not sure I understood that properly.
<chrishtr> moonira: dominik mentioned in his last comment we can also use span elements on those digits to achieve the desired outcome
<chrishtr> fantasai: yes, but the commenter is saying that digits are commonly used and rarely do they want emoji styling. Forcing the author to put spans around every digits is a lot of extra work.
<fantasai> (and might not even be possible in their system)
<chrishtr> moonira: also, there are other code points that are defined by unicode as emojis, like the hash sign, asterisk, that are commonly used as text and not emoji
<chrishtr> fantasai: we should have a value that makes exceptions for these characters, so they can request extra
<chrishtr> florian: the point is interesting because there is stuff in-between.
<chrishtr> florian: for digits you almost always want to exclude, but less often for these other ones
<chrishtr> fantasai: request includes digits, # and *
<chrishtr> moonira: I don't understand users want to use digits and other symbols mentioned mostly as text form, the point was made that some emojis are more ambiguous. For example, we can use font-variant-emoji Unicode, but digits in text presentation and Unicode presentation for others?
<chrishtr> moonira: there is an option to do that with Unicode keywords...
<chrishtr> fantasai: the problem is that the Unicode keywords use the Unicode defaults, which are oriented around backwards compatibility in text.
<chrishtr> fantasai: e.g. to avoid emoji starting to show up in math and science textbooks
<moonira> q+
<chrishtr> fantasai: in cases where you want to emojify your text font-variant-emoji does that, but the commenter is saying that this is too aggressive and a better default is to exclude some of those symbols
<chrishtr> fantasai: think it makes sense to accommodate this request, but in CSS instead of Unicode
<Rossen1> ack moonira
<fantasai> The 'unicode' value is a good default, but it is necessarily conservative.
<fantasai> This is a request to be more aggressive in emojification, but the 'emoji' value as currently defined is too aggressive.
<chrishtr> moonira: also, implementation-wise we use commonly used libraries like ICU that follow unicode standards. It makes more sense to raise the same issue in the Unicode standard. That would allow us to avoid performance problems due to these exclusions.
<jfkthame> q+
<chrishtr> moonira: should we raise it in the Unicode group instead?
<Rossen1> ack joshtumath
<Rossen1> ack jfkthame
<fantasai> s/aggressive/aggressive for these common uses/
<chrishtr> jfkthame: wanted to comment that while I am sympathetic to the request, I am sympathetic to Dominik's comment in the issue expressing an unwillingness to create exceptions to Unicode.
<chrishtr> jfkthame: I'm uneasy about that, and where to draw the line
<Rossen1> joshtumath 😁
<chrishtr> jfkthame: there are other symbols used in text that have the emoji setting, such as trademarks, copyrights, male/female symbols. not sure we want to be in that business.
<fantasai> s/not sure/it's a difficult line to draw, and not sure/
<chrishtr> rossen: let's continue the discussion in the issue
<chrishtr> florian: do you mean that therefore it's an insoluble problem (or best solved in Unicode as Munira suggests)?
<chrishtr> florian: is it possible for Unicode to solve this or impossible for them too?
<chrishtr> jfkthame: I would be happier to see it solved in Unicode than patched in CSS. Not sure any solution would be perfect, but there could be a Unicode property to represent this.

@yisibl
Contributor Author

yisibl commented Nov 27, 2024

I'm staying up late waiting for the results of this meeting, and I'm glad some progress has been made.

Thank you all for taking the time to discuss this! 🤝🤝

I'll give a summary of some of the possible doubts:

Why not use the unicode keyword?

Because it doesn't allow common emoji like ⬆︎, ❤︎ to be in color style.

Why not wrap emoji characters in <span>?

Again, I want to emphasize that there is a lot of content on the web that is not under the control of CSS authors and that is generated entirely by user input (e.g., text typed into a comment box). The cost of adding spans around the emoji or the numbers in this content is very high.

In the meeting above, @fantasai also expressed the same idea as mine.

moonira: dominik mentioned in his last comment we can also use span elements on those digits to achieve the desired outcome

fantasai: yes, but the commenter is saying that digits are commonly used and rarely do they want emoji styling. Forcing the author to put spans around every digits is a lot of extra work.
(and might not even be possible in their system)

Looking for a Unicode solution?

From the Chrome side, we have strict implementation constraints to not deviate from Unicode for character properties.

@drott So, am I to understand that as long as there is a relevant property in Unicode to exclude digits, Chrome is happy to implement it?

If I understand correctly, RGI_Emoji_Qualification in UTS 51 seems to be a solid solution, where 0-9*# is not included.

See:

@drott
Collaborator

drott commented Nov 28, 2024

Why not use the unicode keyword?
Because it doesn't allow common emoji like ⬆︎, ❤︎ to be in color style.

It does, with VS16.

If I understand correctly, RGI_Emoji_Qualification in UTS 51 seems to be a solid solution, where 0-9*# is not included.

RGI_Emoji_Qualification is not suitable, because

  • it is not a character property but an enumeration of strings (we need the former in emoji segmentation);
  • it is too restrictive, as it enumerates the RGI Emoji set - which is different from what can occur in fonts and, as you point out, misses certain used sequences.
