Works or B0rks?
```
/\uD83D�/u.test('💩')
```
Surrogate pairs are trouble for regex character class ranges, with and without the u flag. You can have 💩, which is two (16-bit) code units forming the code point U+1F4A9. You can encode it as \u{1F4A9}, or even as \uD83D\uDCA9. But can you mix them?
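A small sketch of the code unit vs. code point distinction described above, using only standard string and RegExp behavior. The variable names are made up for illustration, and the range example is an assumption about what kind of "class range" trouble is meant; it does not try to answer the mixed-escape question from the tweet.

```javascript
// U+1F4A9 written three equivalent ways.
const viaCodePointEscape = '\u{1F4A9}';
const viaSurrogateEscapes = '\uD83D\uDCA9';
const viaFromCodePoint = String.fromCodePoint(0x1F4A9);
const sameString =
  viaCodePointEscape === viaSurrogateEscapes &&
  viaSurrogateEscapes === viaFromCodePoint;

// One code point, but two UTF-16 code units.
const codeUnitCount = viaCodePointEscape.length;
const codePointCount = [...viaCodePointEscape].length;

// Without the u flag, `.` consumes a single code unit, so it cannot span the pair.
const dotWithoutU = /^.$/.test(viaCodePointEscape);
const dotWithU = /^.$/u.test(viaCodePointEscape);

// A class range over U+1F4A9..U+1F4AB, written with surrogate escapes.
// Without u, the pattern is read as code units, and the range part becomes
// \uDCA9-\uD83D, which is out of order and therefore a SyntaxError.
const rangeSource = '[\\uD83D\\uDCA9-\\uD83D\\uDCAB]';
let rangeWithoutUThrows = false;
try {
  new RegExp(rangeSource);
} catch (error) {
  rangeWithoutUThrows = error instanceof SyntaxError;
}

// With u, each escaped pair is one code point, so the range is valid
// and contains U+1F4AA.
const rangeWithUMatches = new RegExp(`^${rangeSource}$`, 'u').test('\u{1F4AA}');

console.log({
  sameString,
  codeUnitCount,
  codePointCount,
  dotWithoutU,
  dotWithU,
  rangeWithoutUThrows,
  rangeWithUMatches,
});
```

The pattern is built with `new RegExp` from a string so that the invalid non-u version fails at run time inside the `try` block instead of breaking the whole script at parse time.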
Extending the list with properties *is* a no-brainer, and we do it each year when the Unicode Standard updates. However, the properties in *this* proposal behave a little differently than anything we currently support. Hence the separate proposal. The readme should clarify it.
\p{ASCII} (or anything else that’s already supported in JS) currently always matches a single code point at a time. But these new properties could match multiple code points! Thus, the main point of discussion is: re-use \p{…} syntax, or add new syntax? https://github.com/tc39/proposal-regexp-unicode-sequence-properties/issues/10 …
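To make the "single code point at a time" point concrete, here is a rough sketch using property escapes that already exist in JS (`\p{ASCII}` and `\p{Emoji}` with the u flag). The family emoji is just a convenient example of a multi-code-point sequence chosen for this illustration; nothing here uses the proposed sequence properties.

```javascript
// An existing property escape matches exactly one code point.
const asciiOnly = /^\p{ASCII}$/u;
const oneAsciiChar = asciiOnly.test('a');
const twoAsciiChars = asciiOnly.test('ab');

// A family emoji is a sequence: man, ZWJ, woman, ZWJ, girl.
const family = '\u{1F468}\u200D\u{1F469}\u200D\u{1F467}';
const familyCodePoints = [...family].length;

// \p{Emoji} cannot match the sequence as a whole, only its individual
// emoji code points. The two ZWJ (U+200D) joiners are not Emoji.
const wholeFamily = /^\p{Emoji}$/u.test(family);
const emojiPieces = family.match(/\p{Emoji}/gu).length;

console.log({
  oneAsciiChar,
  twoAsciiChars,
  familyCodePoints,
  wholeFamily,
  emojiPieces,
});
```

A property that matched the whole family in one go would have to consume five code points, which is the behavioral difference behind the syntax question above.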
Another fun monkey wrench: Unicode fell into the habit of doing a single major Unicode + Emoji release in June (moved to March in 2019) once a year, so updates to data files could be somewhat predictably scheduled. But...
