Which is why .NET has UTF-8 strings...? At least I think. https://twitter.com/BrendanEich/status/879372251547656193 …
Replying to @milquetoastable
UTF-8 is a Unicode Transfer Format, or whatever it is called now. Problem in Java & JS was abstraction leak: s.length counts 16-bit units.
1 reply 1 retweet 3 likes -
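To make the "s.length counts 16-bit units" point concrete, here is a minimal sketch. The helper name `toSurrogatePair` and the sample strings are illustrative assumptions, not anything from the thread; the arithmetic is the standard UTF-16 surrogate-pair encoding.

```javascript
// Hypothetical helper: split a code point above U+FFFF into its two 16-bit units.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;
  const high = 0xd800 + (offset >> 10);   // lead surrogate
  const low = 0xdc00 + (offset & 0x3ff);  // trail surrogate
  return [high, low];
}

const bmp = "\u00e9";       // U+00E9, fits in one 16-bit unit
const astral = "\u{1F4A9}"; // U+1F4A9, needs two 16-bit units

// .length reports 16-bit units, not "characters".
const lengths = [bmp.length, astral.length];
const pair = toSurrogatePair(0x1f4a9);

console.log(lengths);                                  // [ 1, 2 ]
console.log(pair.map((u) => u.toString(16)));          // [ 'd83d', 'dca9' ]
console.log(astral === String.fromCharCode(...pair));  // true
```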
Replying to @BrendanEich @milquetoastable
So "retconning" JS/Java '90s-Unicode as UTF-16 doesn't work: .length cannot change meaning w/o compat break. See https://www.ecma-international.org/ecma-262/7.0/index.html#sec-properties-of-the-string-prototype-object … +
1 reply 1 retweet 0 likes -
Replying to @BrendanEich @milquetoastable
where charCodeAt and codePointAt (more recent), index to .length vs. @@iterator (also ofc recent), differ: former from '90s, latter UTF-16.
1 reply 1 retweet 1 like -
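The difference between indexing up to .length and using the string iterator can be sketched roughly as follows. The sample string is an assumption chosen for illustration; `charCodeAt`, `codePointAt`, and iteration via `Array.from` are standard.

```javascript
const s = "a\u{1F4A9}b"; // 'a', U+1F4A9 (a surrogate pair), 'b'

// Index-based view: positions 0 .. s.length - 1 address 16-bit units.
const units = [];
for (let i = 0; i < s.length; i++) {
  units.push(s.charCodeAt(i).toString(16));
}

// codePointAt is still indexed by 16-bit unit: at a lead surrogate it
// returns the whole code point, at a trail surrogate only that unit.
const atIndex1 = s.codePointAt(1).toString(16);
const atIndex2 = s.codePointAt(2).toString(16);

// Iterator-based view: String.prototype[Symbol.iterator] steps by code point.
const points = Array.from(s, (ch) => ch.codePointAt(0).toString(16));

console.log(s.length); // 4
console.log(units);    // [ '61', 'd83d', 'dca9', '62' ]
console.log(atIndex1); // 1f4a9
console.log(atIndex2); // dca9
console.log(points);   // [ '61', '1f4a9', '62' ]
```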
Still problems with "Extended grapheme clusters" which change depending on date and humidity. " " was length 2, then length 1, now length 2
2 replies 0 retweets 1 like -
Cc: @awbjs and lol humidity - but how is this ECMA-262's problem?
1 reply 0 retweets 0 likes -
Replying to @BrendanEich @milquetoastable and
Not their responsibility, but it is everyone's problem. Hard to count "characters" by Unicode codepoints (still, better than UCS-2 units!)
1 reply 0 retweets 0 likes -
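A rough sketch of why counting by code points still falls short of counting "characters". It assumes an engine that provides `Intl.Segmenter`, an API that postdates this thread, and falls back to `null` where it is missing. The combining-accent sample is chosen because its segmentation is stable; emoji sequences can segment differently depending on the Unicode data the engine ships.

```javascript
const s = "e\u0301"; // 'e' followed by U+0301 COMBINING ACUTE ACCENT

const codeUnits = s.length;       // 16-bit units
const codePoints = [...s].length; // spread uses the code point iterator

// Assumption: Intl.Segmenter exists in this engine; otherwise leave null.
let graphemes = null;
if (typeof Intl !== "undefined" && typeof Intl.Segmenter === "function") {
  const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
  graphemes = [...segmenter.segment(s)].length;
}

// With Intl.Segmenter available: codeUnits 2, codePoints 2, graphemes 1.
console.log({ codeUnits, codePoints, graphemes });
```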
Replying to @FakeUnicode @BrendanEich and
(Best solution we've seen is continually updated dependency, like https://github.com/mathiasbynens/emoji-regex … for emoji, by @mathias)
1 reply 0 retweets 0 likes
Once Unicode property escapes land in the spec, I plan on formally proposing an extension that makes it easier to match sequences of symbols
Replying to @mathias @FakeUnicode and
That would eventually make emoji-regex redundant. But first things first. Let’s get JSC/SpiderMonkey/Chakra to implement \p.
0 replies 0 retweets 2 likes
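For reference, a minimal sketch of what `\p` looks like in an engine that implements Unicode property escapes (they require the `u` flag). The patterns and sample characters are illustrative assumptions; this matches single symbols only and is not the sequence-matching extension mentioned above.

```javascript
// Unicode property escapes are only valid with the `u` flag.
const greek = /\p{Script=Greek}/u;
const upper = /^\p{Lu}+$/u;
const emojiPresentation = /\p{Emoji_Presentation}/u;

const results = {
  alphaIsGreek: greek.test("\u03b1"),              // GREEK SMALL LETTER ALPHA
  aIsGreek: greek.test("a"),                       // LATIN SMALL LETTER A
  allUpper: upper.test("\u00c9\u0394"),            // É and Δ, both category Lu
  pooIsEmoji: emojiPresentation.test("\u{1F4A9}"), // U+1F4A9 PILE OF POO
};

console.log(results);
// { alphaIsGreek: true, aIsGreek: false, allUpper: true, pooIsEmoji: true }
```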