Conversation

Dart, C#, Java, and JavaScript use UTF-16 internally, and they won't be able to change to UTF-8 even in the future. Doing so would require a complete redo of the String API and would lose compatibility. Also, UTF-8 does not solve the security problems, like I said.
2
1
This isn't accurate. JavaScript doesn't have UTF-16 strings. If it did, every JavaScript string could be represented with standard UTF-8. JavaScript implements strings as arrays of UCS2 characters which is not standard Unicode or UTF-16. That's why an extended UTF-8 is required.
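A minimal TypeScript sketch of the point above, assuming a standard ECMAScript runtime. The helper name `canEncodeAsUtf8` is hypothetical. It relies on the built-in `encodeURIComponent`, which percent-encodes the UTF-8 bytes of a string and throws `URIError` when the string contains a lone surrogate:

```typescript
// Hypothetical helper: true if the string can be encoded as standard UTF-8.
function canEncodeAsUtf8(s: string): boolean {
  try {
    encodeURIComponent(s); // throws URIError on an unpaired surrogate
    return true;
  } catch (e) {
    if (e instanceof URIError) return false;
    throw e;
  }
}

const pairedSurrogates = "\uD83D\uDE00"; // U+1F600 as a valid surrogate pair
const loneSurrogate = "\uD83D";          // high surrogate with no partner

const pairedOk = canEncodeAsUtf8(pairedSurrogates); // well-formed UTF-16
const loneOk = canEncodeAsUtf8(loneSurrogate);      // valid JS string, not valid UTF-16
```

Both values are legal JavaScript strings, but only the first has a standard UTF-8 encoding.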
JavaScript has a mixed UCS2 and UTF16 API. Some methods, like codePointAt, fromCodePoint, toUpperCase/toLowerCase, for..of str, Array.from(str), normalize, and regex's test (with the u flag), are Unicode-aware. The rest are UCS2, including substr and length. But it doesn't really make much difference...
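A short TypeScript sketch of that split, assuming an ES2015 or later target. The variable names are illustrative only. The same string gives different answers depending on whether the API counts code units or code points:

```typescript
// "a" followed by U+1F600, written as an explicit surrogate pair.
const sample = "a\uD83D\uDE00";

// Code-unit view: length and charCodeAt see the two surrogates separately.
const unitCount = sample.length;          // counts 16-bit code units
const secondUnit = sample.charCodeAt(1);  // the high surrogate alone

// Code-point view: Array.from and codePointAt join the pair.
const pointCount = Array.from(sample).length;
const secondPoint = sample.codePointAt(1);

// Slicing by code units can cut a pair in half.
const cut = sample.substring(0, 2);
const cutEndsInLoneSurrogate = cut.charCodeAt(1) === 0xd83d;
```

Here `unitCount` is 3 while `pointCount` is 2, and `cut` ends in an unpaired high surrogate.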
If you have UCS2 arrays and bolt on broken APIs for using them as Unicode strings, they're still UCS2 arrays. Even if the APIs worked properly... they're still UCS2 strings. They could fix it, but they lack the will to switch to working Unicode and stop supporting broken stuff.
All I mean is that UTF-8 and UTF-16 encoding for strings can and should be an implementation detail where code units aren't heavily exposed. Code points aren't glyphs but they at least aren't an implementation detail of encoding and are something with actual meaning to people.
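To illustrate "code points aren't glyphs", here is a small TypeScript sketch, again assuming an ES2015 or later target and using made-up variable names. An "e" with a combining acute accent renders as one glyph but is two code points until it is normalized:

```typescript
// "e" + U+0301 COMBINING ACUTE ACCENT: one glyph on screen, two code points.
const decomposed = "e\u0301";
const decomposedPoints = Array.from(decomposed).length;

// NFC normalization composes the pair into the single code point U+00E9.
const composed = decomposed.normalize("NFC");
const composedPoints = Array.from(composed).length;
const isPrecomposed = composed === "\u00E9";
```

Neither count depends on whether the engine stores the string as UTF-8 or UTF-16.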