strncasecmp("ß", "ẞ", 2) = ?
This is a concrete example of the ambiguity that led to the Austin Group disallowing a UTF-8 (multibyte) C locale.
Replying to @RichFelker
Is multibyte the main issue or is it that ß is one code point but SS is two? It seems like you’d have the same issue if you had 32-bit bytes.
Replying to @stevecheckoway
I chose ß just because it's the first char that came to mind where the UTF-8 length in bytes of the uppercase form ẞ differs (2 bytes for ß, 3 for ẞ). "SS" is a distraction, not relevant here. Choosing an example from some IPA-block form where the opposite case was added later would be less confusing.
The fundamental problem is that strncasecmp is specified in terms of n bytes of s1, but that could require inspecting considerably more than n bytes of s2, which is highly counterintuitive and can't be antisymmetric in s1,s2.
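To make the byte counts concrete, here is a minimal sketch in C. It assumes UTF-8 encodings written out as byte escapes and the default "C" locale, where bytes above 0x7F are not case-mapped. The names lower_sz, upper_sz, and equal_in_n_bytes are hypothetical, introduced only for illustration.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* UTF-8 encodings as byte escapes, so the source file's encoding is irrelevant. */
static const char lower_sz[] = "\xC3\x9F";      /* U+00DF ß: 2 bytes */
static const char upper_sz[] = "\xE1\xBA\x9E";  /* U+1E9E ẞ: 3 bytes */

/* Hypothetical helper: returns 1 if strncasecmp reports the first n bytes
 * of a and b as equal ignoring case, 0 otherwise. */
int equal_in_n_bytes(const char *a, const char *b, size_t n)
{
    return strncasecmp(a, b, n) == 0;
}
```

With n = 2, the limit covers all of ß but stops partway through ẞ, so a case-insensitive match in a multibyte locale would have to read a third byte of s2. In the default "C" locale the call is a plain byte comparison for these inputs and the two strings simply differ at the first byte (0xC3 vs 0xE1).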