Conversation

This should ideally be solved in JavaScript itself rather than becoming a problem elsewhere. They could add a document- or program-wide mode where valid Unicode strings are enforced, so people can opt out of the problem. They could require it for access to new features, like they often do with TLS.
I don't really think they need new APIs or an API redesign to fix JavaScript's Unicode issues. They need a way to opt out of legacy strings. In practice, not much would break, and the breakage would be a nice way of uncovering a lot of latent bugs that are potentially already serious.
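To make the "legacy strings" problem concrete: a JavaScript string is a sequence of 16-bit code units, and nothing stops it from holding an unpaired surrogate, which is not valid Unicode. The sketch below is a minimal hand-written check; the name `isWellFormed` is chosen here for illustration (recent engines also ship a built-in `String.prototype.isWellFormed()`), and it shows how ordinary slicing produces an invalid string.

```javascript
// Returns true when `s` contains no unpaired surrogate code units,
// i.e. when it is well-formed UTF-16 and therefore valid Unicode.
function isWellFormed(s) {
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: must be followed by a low surrogate.
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next < 0xdc00 || next > 0xdfff) return false;
      i++; // skip the low half of the valid pair
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      // Low surrogate with no preceding high surrogate.
      return false;
    }
  }
  return true;
}

const emoji = "\u{1F600}";        // one code point, two UTF-16 code units
const broken = emoji.slice(0, 1); // slicing by code unit splits the pair

console.log(isWellFormed(emoji));  // true
console.log(isWellFormed(broken)); // false
```

An enforced Unicode mode would presumably reject or repair `broken` at the point it is created, which is where the latent bug actually lives.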
It would then be using UTF-16, which is still unfortunate due to wasted memory, engine complexity from optimizations to avoid wasting as much memory, conversion overhead, etc. A separate feature could be adding a nice new string type using UTF-8 and requiring the Unicode mode.
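A rough way to see the memory point is to compare the size of the same text as plain UTF-16 and as UTF-8. This is only a sketch: the `sizes` helper is hypothetical, and the UTF-16 figure assumes a naive two bytes per code unit, not what an engine actually allocates after its internal optimizations.

```javascript
const encoder = new TextEncoder(); // TextEncoder always produces UTF-8

// Compare naive UTF-16 storage with the UTF-8 encoding of the same string.
function sizes(s) {
  return {
    utf16Bytes: s.length * 2,            // two bytes per UTF-16 code unit
    utf8Bytes: encoder.encode(s).length, // actual UTF-8 byte count
  };
}

console.log(sizes("hello"));     // { utf16Bytes: 10, utf8Bytes: 5 }
console.log(sizes("\u{1F600}")); // { utf16Bytes: 4, utf8Bytes: 4 }
console.log(sizes("日本語"));     // { utf16Bytes: 6, utf8Bytes: 9 }
```

ASCII-heavy text halves in size under UTF-8, while CJK text grows, so the saving depends on the content.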
It really wouldn't be that hard to just turn JavaScript strings into valid UTF-16 wherever you opt in, with either an equivalent to "use strict" (globally or not at all) or via document metadata like a header. It'd be nice to give it a modern immutable UTF-8 string type, but that's not needed.
It shouldn't be possible for anything involving strings to end up containing invalid Unicode. With that properly enforced, most of the problems go away. When converting from arbitrary bytes, you can choose between multiple approaches, and that's already an existing choice.
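The existing choice when converting from arbitrary bytes can be illustrated with the standard `TextDecoder`, which offers both a lossy and a strict mode. The byte values below are made up for the example.

```javascript
// "hi" followed by 0xFF, a byte that can never appear in valid UTF-8.
const bytes = new Uint8Array([0x68, 0x69, 0xff]);

// Approach 1: lossy decoding replaces invalid sequences with U+FFFD.
const lossy = new TextDecoder("utf-8").decode(bytes);
console.log(lossy === "hi\uFFFD"); // true

// Approach 2: strict decoding rejects the input by throwing a TypeError.
let rejected = false;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(bytes);
} catch (err) {
  rejected = true;
}
console.log(rejected); // true
```

Either way the result is never an invalid string, which is the property the thread argues strings should always have.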
This is already a serious problem whenever JavaScript has to interact with the outside world. Anything reasonable enforces valid Unicode for strings. Interoperability issues are entirely on the JS side and it's not everyone else's problem to support the screwed up legacy strings.
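A small sketch of what happens today when an invalid string crosses a boundary, using only standard built-ins: UTF-8 encoding silently changes the data, and URL encoding throws.

```javascript
const lone = "\uD83D"; // unpaired high surrogate: a legal JS string, not valid Unicode

// UTF-8 cannot represent it, so TextEncoder substitutes U+FFFD (bytes EF BF BD).
const encoded = new TextEncoder().encode(lone);
const roundTripped = new TextDecoder().decode(encoded);
console.log(roundTripped === lone); // false: the data changed in transit

// URL encoding refuses outright with a URIError.
let threw = false;
try {
  encodeURIComponent(lone);
} catch (err) {
  threw = err instanceof URIError;
}
console.log(threw); // true
```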
In the long term, JavaScript should have a correct string implementation. It wouldn't be that hard to make it an option today. It would be difficult to design a nice, modern string type for it but not to retrofit the existing one into an actual implementation of Unicode strings.