Works or B0rks?
```
/\uD83D�/u.test('💩')
```
Surrogate pairs are trouble for regex character class ranges, with and without the u flag. You can have 💩, which is two (16-bit) code units forming the code point U+1F4A9. You can encode it as \u{1f4a9}, or even as \uD83D\uDCA9. But can you mix them?
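A minimal sketch of the question, runnable in any engine with ES2015 `u`-flag support. The variable names are illustrative only, and the "mixed" pattern below is one possible way of mixing the two escape styles, not necessarily the one in the snippet above.

```javascript
// U+1F4A9 built three ways; all yield the same two UTF-16 code units.
const poo = String.fromCodePoint(0x1f4a9);
const sameString = poo === '\u{1f4a9}' && poo === '\uD83D\uDCA9';

const codeUnits = poo.length;        // 2: length counts code units
const codePoints = [...poo].length;  // 1: string iteration is by code point

// With the u flag, adjacent \uD83D\uDCA9 escapes are read as ONE code point,
// so the class has a single member that matches the whole character.
const classWithU = /^[\uD83D\uDCA9]$/u.test(poo);    // true

// Without the u flag, the class holds two separate code units and can only
// ever consume one of them, so the anchored match fails.
const classWithoutU = /^[\uD83D\uDCA9]$/.test(poo);  // false

// Mixing styles: \u{DCA9} does not pair up with a preceding \uD83D in a
// u-flag pattern, so this is two lone surrogates, not U+1F4A9.
const mixedWithU = /^\uD83D\u{DCA9}$/u.test(poo);    // false

console.log({ sameString, codeUnits, codePoints, classWithU, classWithoutU, mixedWithU });
```

So under these assumptions, the two escapes only combine when both halves use the four-digit `\uXXXX` form.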
\p{ASCII} (or anything else that’s already supported in JS) currently always matches a single code point at a time. But the newly proposed sequence properties could match multiple code points! Thus, the main point of discussion is: re-use \p{…} syntax, or add new syntax? https://github.com/tc39/proposal-regexp-unicode-sequence-properties/issues/10
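To illustrate the "one code point at a time" behaviour, here is a small sketch. The flag pattern is a hand-rolled approximation (any two regional indicators), used only as an assumption to show what a multi-code-point match looks like today; it is not the proposal's syntax or semantics.

```javascript
// An existing property escape consumes exactly one code point per match.
const oneAscii = /^\p{ASCII}$/u.test('a');   // true
const twoAscii = /^\p{ASCII}$/u.test('ab');  // false: two code points

// A flag emoji is a sequence of two regional indicator code points.
// Here: U+1F1F3 (N) followed by U+1F1F1 (L).
const flag = '\u{1F1F3}\u{1F1F1}';
const flagCodePoints = [...flag].length;     // 2

// Hypothetical stand-in for a sequence property: spell the sequence out by hand.
const regionalIndicator = '[\\u{1F1E6}-\\u{1F1FF}]';
const singleMatches = new RegExp(`^${regionalIndicator}$`, 'u').test(flag);   // false
const pairMatches = new RegExp(`^${regionalIndicator}{2}$`, 'u').test(flag);  // true

console.log({ oneAscii, twoAscii, flagCodePoints, singleMatches, pairMatches });
```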
Another fun monkey wrench: Unicode had fallen into the habit of doing a single major Unicode + Emoji release once a year, in June (moved to March in 2019), so updates to data files could be scheduled somewhat predictably. But... pic.twitter.com/VgZ1PU66RL
The Emoji 12.1 release came out of the blue last month, with only 3 weeks' warning. https://emojipedia.org/emoji-12.1/ Apple's Emoji 12.0 coverage included new gender-neutral ZWJ sequences (like Adult+Astronaut 🧑‍🚀) and gender-neutral bases, and Unicode had to scramble to document them. pic.twitter.com/THHjKwewmT
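For readers unfamiliar with ZWJ sequences, a short sketch of how the gender-neutral astronaut is composed: the Adult base (U+1F9D1), a zero-width joiner (U+200D), and the Rocket (U+1F680). The variable names are illustrative.

```javascript
const ZWJ = '\u200D';        // ZERO WIDTH JOINER
const adult = '\u{1F9D1}';   // gender-neutral Adult base
const rocket = '\u{1F680}';  // Rocket
const astronaut = adult + ZWJ + rocket;

// Array.from iterates by code point, so surrogate pairs stay together.
const parts = Array.from(astronaut, (ch) =>
  'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
);

console.log(parts);             // [ 'U+1F9D1', 'U+200D', 'U+1F680' ]
console.log(astronaut.length);  // 5 code units for one visible emoji
```

Whether this renders as a single glyph depends on the font and platform supporting the sequence.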
