Dart, C#, Java, and JavaScript use UTF-16 internally, and they won't be able to switch to UTF-8 even in the future. That would require a complete redo of the String API and lose compatibility. Also, UTF-8 does not solve the security problems, as I said.
This isn't accurate. JavaScript doesn't have UTF-16 strings. If it did, every JavaScript string could be represented in standard UTF-8. JavaScript implements strings as arrays of UCS-2 code units, which is neither standard Unicode nor UTF-16. That's why an extended UTF-8 is required.
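A rough sketch of that point in TypeScript: a string holding an unpaired surrogate is a legal JavaScript string but has no standard UTF-8 encoding. `encodeURIComponent` encodes through UTF-8 and throws `URIError` on such input, so it serves as a probe here. The helper name `hasUtf8Encoding` is made up for illustration.

```typescript
// Hypothetical helper: true if the string can be encoded as standard UTF-8.
// encodeURIComponent percent-encodes the UTF-8 bytes of its argument and
// throws URIError when the string contains an unpaired surrogate.
function hasUtf8Encoding(s: string): boolean {
  try {
    encodeURIComponent(s);
    return true;
  } catch (e) {
    if (e instanceof URIError) return false;
    throw e;
  }
}

const wellFormedEmoji = "\uD83D\uDE00"; // U+1F600 as a proper surrogate pair
const loneHighSurrogate = "\uD83D";     // high surrogate with no partner

console.log(hasUtf8Encoding(wellFormedEmoji));   // true
console.log(hasUtf8Encoding(loneHighSurrogate)); // false
console.log(loneHighSurrogate.length);           // 1, still a legal JS string
```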
JavaScript has a mixed UCS-2 and UTF-16 API. Some methods, like codePointAt, fromCodePoint, toUpperCase/toLowerCase, for..of over a string, Array.from(str), normalize, and regex test (with the u flag), are Unicode-aware. The rest, including substr and length, work on UCS-2 code units. But it doesn't really make much difference...
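To make the split concrete, here is a small TypeScript sketch comparing the code-unit view and the code-point view of the same string. The variable names are illustrative only.

```typescript
// "a" followed by U+1F600: two code points, three 16-bit code units.
const sample = "a\uD83D\uDE00";

// Code-unit (UCS-2 style) views
const unitLength = sample.length;             // 3
const secondUnit = sample.charCodeAt(1);      // 0xD83D, the high surrogate alone
const sliced = sample.substring(0, 2);        // "a" plus a lone high surrogate

// Code-point (Unicode-aware) views
const pointLength = Array.from(sample).length; // 2
const secondPoint = sample.codePointAt(1);     // 0x1F600

console.log(unitLength, pointLength);
console.log(secondUnit.toString(16), secondPoint?.toString(16));
console.log(sliced.length);
```

Note that `substring` happily cuts the surrogate pair in half, which is how ill-formed strings tend to appear in practice.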
The strings are inherently not UTF-16 because they can contain invalid Unicode. UTF-16 doesn't support any string that isn't also supported by UTF-8. Conversion between UTF-8 and UTF-16 is lossless in both directions. They're different encodings of the same thing, and each can represent all of it.
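A minimal sketch of the round trip, assuming well-formed input: `toUtf8` and `fromUtf8` are hand-rolled illustrative helpers, not a library API, and the decoder assumes valid UTF-8 and does no error checking. Well-formed strings survive the trip unchanged, while an unpaired surrogate is rejected because it has no UTF-8 form.

```typescript
// Illustrative helper: encode a well-formed JS string as UTF-8 bytes.
function toUtf8(s: string): number[] {
  const bytes: number[] = [];
  for (const ch of Array.from(s)) {
    const cp = ch.codePointAt(0) as number;
    if (cp >= 0xD800 && cp <= 0xDFFF) {
      throw new RangeError("unpaired surrogate has no UTF-8 encoding");
    }
    if (cp < 0x80) {
      bytes.push(cp);
    } else if (cp < 0x800) {
      bytes.push(0xC0 | (cp >> 6), 0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      bytes.push(0xE0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F));
    } else {
      bytes.push(
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
      );
    }
  }
  return bytes;
}

// Illustrative helper: decode UTF-8 bytes (assumed valid) back to a string.
function fromUtf8(bytes: number[]): string {
  let out = "";
  let i = 0;
  while (i < bytes.length) {
    const b = bytes[i];
    let cp: number;
    if (b < 0x80) {
      cp = b;
      i += 1;
    } else if (b < 0xE0) {
      cp = ((b & 0x1F) << 6) | (bytes[i + 1] & 0x3F);
      i += 2;
    } else if (b < 0xF0) {
      cp = ((b & 0x0F) << 12) | ((bytes[i + 1] & 0x3F) << 6) | (bytes[i + 2] & 0x3F);
      i += 3;
    } else {
      cp = ((b & 0x07) << 18) | ((bytes[i + 1] & 0x3F) << 12) |
           ((bytes[i + 2] & 0x3F) << 6) | (bytes[i + 3] & 0x3F);
      i += 4;
    }
    out += String.fromCodePoint(cp);
  }
  return out;
}

const roundTripText = "a\u00E9\u20AC\uD83D\uDE00"; // a, é, €, U+1F600
console.log(fromUtf8(toUtf8(roundTripText)) === roundTripText); // true
```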
std::string is just a dynamic array of bytes able to manage a terminating NUL for C string compatibility. It doesn't have any specific encoding. Being able to stick UTF-8 in std::string doesn't mean it provides any form of UTF-8 support or that it's a UTF-8 string type.
There's no point in making this disingenuous strawman argument. The fact is that JavaScript has legacy, broken strings and hasn't been fixed to support what Unicode has been for the past 21 years. JavaScript can easily be fixed rather than propagating the brokenness further.
There are people born after UCS-2 was obsoleted who are now 20. 21 years is a long time to migrate to real UTF-16. WebAssembly is not inherently tied to JavaScript and shouldn't be harmed to accommodate JavaScript not moving on from the 90s. Fix JavaScript instead.