I'm optimistic JavaScript will get proper strings because WebAssembly is UTF-8 focused and they're making good decisions about how interoperability works with JavaScript despite it upsetting people invested in UCS2. Python switched from UCS2 to UCS4 which was a dumb decision...
Python took that same multiple string representation approach and browsers did the same for JavaScript. The strings are always fixed size units in Java / JavaScript / Python and they bloat themselves up to a larger unit if you add a single character from the range requiring it.
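A rough way to see that bloat in CPython 3.3 or later (which stores a string at 1, 2, or 4 bytes per code point) is to append one wider character to an otherwise ASCII string and compare sizes. This is only a sketch: the variable names are made up for illustration, and exact byte counts vary by interpreter version, so only the relative growth matters.

```python
import sys

# 1000 ASCII characters: CPython can store these at 1 byte each.
ascii_s = "a" * 1000

# One extra character outside Latin-1 (the euro sign) pushes the
# whole string to 2 bytes per code point.
bmp_s = ascii_s + "\u20ac"

# One extra character outside the BMP (an emoji) pushes the whole
# string to 4 bytes per code point.
astral_s = ascii_s + "\U0001F600"

sizes = [sys.getsizeof(s) for s in (ascii_s, bmp_s, astral_s)]
for label, size in zip(("ascii", "bmp", "astral"), sizes):
    print(f"{label:>6}: {size} bytes")
```

Each string differs from the previous one by a single character, yet the storage roughly doubles each time.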
The funny thing is I think I've never actually used code that truly needs to know that a given index in a string really is the beginning of a real character. strcmp doesn't care. strlen doesn't care. strdup doesn't care. What cares, aside maybe from font rendering code?
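To make that concrete, here is a small sketch using Python `bytes` as a stand-in for C strings holding UTF-8. Length, copy, and comparison all work on raw bytes, and byte-wise ordering of UTF-8 happens to match code point ordering. The word list and the `is_char_boundary` helper are made up for illustration; the helper shows the one check that code would need if it did care about character starts.

```python
words = ["zebra", "éclair", "apple", "日本語", "😀"]
encoded = [w.encode("utf-8") for w in words]

# strlen analogue: counts bytes, never looks at character boundaries.
byte_lengths = [len(e) for e in encoded]

# strdup analogue: a plain byte copy.
copies = [bytes(bytearray(e)) for e in encoded]

# strcmp analogue: sorting the raw UTF-8 bytes gives the same order
# as sorting the decoded strings by code point.
by_bytes = [e.decode("utf-8") for e in sorted(encoded)]
by_code_points = sorted(words)

def is_char_boundary(data: bytes, i: int) -> bool:
    """Hypothetical helper: True if index i starts a UTF-8 sequence.

    Continuation bytes always look like 0b10xxxxxx.
    """
    return i == len(data) or (data[i] & 0xC0) != 0x80

print(byte_lengths)
print(by_bytes == by_code_points)
print(is_char_boundary("é".encode("utf-8"), 1))
```

Only code that slices or truncates at an arbitrary index would need something like `is_char_boundary`.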
A lot of the common functions treat the string as nothing more than an array of elements (uint8_t, uint16_t, uint32_t) and don't really need to know what those elements correspond to individually. Maybe one exception would be uppercasing/lowercasing, because it needs the real "character".
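A quick sketch of why case mapping is the exception, again in Python: `bytes.upper()` only touches ASCII letters, while `str.upper()` works on real characters and can even change the length of the string. The sample words are arbitrary.

```python
word = "café"
raw = word.encode("utf-8")

# Element-wise uppercasing: only ASCII bytes change, so the
# two-byte "é" sequence is left alone.
upper_bytes = raw.upper().decode("utf-8")

# Character-aware uppercasing: "é" becomes "É".
upper_text = word.upper()

# Case mapping is not even one-to-one: German "ß" uppercases to "SS",
# so the result is one character longer than the input.
german = "straße"
upper_german = german.upper()

print(upper_bytes, upper_text, upper_german)
```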