I'm also considering disallowing newlines in file names. I can't think of a good use for them. Their existence in file names is why some tools produce/consume \0-delimited records, but many tools just use newlines (especially for portability). Forbidding them makes things a lot simpler.
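To make the delimiter problem concrete, here is a minimal sketch of what goes wrong. The helper `split_records` and the sample names are made up for illustration and don't come from any particular tool.

```python
def split_records(data: bytes, delimiter: bytes) -> list[bytes]:
    """Split a stream of delimiter-terminated records into file names."""
    parts = data.split(delimiter)
    # A properly terminated stream ends with the delimiter,
    # which leaves one empty element at the tail.
    if parts and parts[-1] == b"":
        parts.pop()
    return parts


# Two file names; the second one contains a newline.
names = [b"plain.txt", b"odd\nname.txt"]

newline_stream = b"".join(n + b"\n" for n in names)
nul_stream = b"".join(n + b"\0" for n in names)

newline_result = split_records(newline_stream, b"\n")
nul_result = split_records(nul_stream, b"\0")

print(newline_result)  # three records: the odd name was split in two
print(nul_result)      # two records, identical to the input names
```

With newline records the consumer sees three names instead of two. With \0 records the list survives intact, because \0 can't appear in a file name in the first place.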
-
Interoperability is a key concern. What if I mount a filesystem that contains such file names? I could try to 'correct' the file name (drop or fix chars), or have a fallback representation (hex-encode). I could also move the inode to /lost+found as corrupted, or return EIO.
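The fallback-representation option could look roughly like this. It's only a sketch under assumptions: the `%hex:` prefix, the function names, and the rule "valid UTF-8 without newlines passes through" are all hypothetical choices, not something a real filesystem does.

```python
# Hypothetical marker for names that could not be shown as-is.
FALLBACK_PREFIX = "%hex:"


def presentable_name(raw: bytes) -> str:
    """Map an on-disk name to the name shown to userspace."""
    try:
        text = raw.decode("utf-8")  # strict: raises on invalid sequences
    except UnicodeDecodeError:
        return FALLBACK_PREFIX + raw.hex()
    # Names with newlines, or names that would be mistaken for a
    # fallback name, get hex-encoded too so the mapping stays reversible.
    if "\n" in text or text.startswith(FALLBACK_PREFIX):
        return FALLBACK_PREFIX + raw.hex()
    return text


def raw_name(shown: str) -> bytes:
    """Inverse of presentable_name: recover the on-disk bytes."""
    if shown.startswith(FALLBACK_PREFIX):
        return bytes.fromhex(shown[len(FALLBACK_PREFIX):])
    return shown.encode("utf-8")


for raw in (b"ok.txt", b"a\nb", b"\xff\xfe", b"%hex:41"):
    shown = presentable_name(raw)
    print(raw, "->", shown)
```

The point of hex-encoding names that already start with the prefix is that every on-disk name still gets exactly one presented name, so nothing becomes unreachable. The cost is that the presented names are ugly and the prefix is taken out of the usable namespace.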
-
Replying to @sortiecat
IMO, the best approach is to have a fallback representation, and maybe do charset transcoding on mounted filesystems.
-
Replying to @pikhq @sortiecat
You already need to handle something like this if you want to mount FAT, after all, since the only charsets there are legacy and UTF-16.
-
Replying to @pikhq @sortiecat
s/UTF-16/UCS-2 with illegal code values in the range D800-DFFF allowed/ (the FS makes no requirement that they be in pairs that are valid as UTF-16).
-
Replying to @RichFelker @sortiecat
Oh, right. Either UCS-2 or potentially-invalid UTF-16, depending on how you describe it.
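For what "potentially-invalid UTF-16" means in practice, here is a small sketch that checks a sequence of 16-bit code units for surrogates that aren't in a valid high/low pair. The function name is made up; this is an illustration, not code from any FAT driver.

```python
def has_unpaired_surrogate(units: list[int]) -> bool:
    """Return True if the 16-bit code units are not valid UTF-16."""
    i = 0
    while i < len(units):
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # High surrogate: must be followed directly by a low surrogate.
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2
                continue
            return True
        if 0xDC00 <= u <= 0xDFFF:
            # Low surrogate that was not preceded by a high surrogate.
            return True
        i += 1
    return False


print(has_unpaired_surrogate([0x0041, 0xD83D, 0xDE00]))  # valid pair
print(has_unpaired_surrogate([0x0041, 0xD83D]))          # lone high surrogate
print(has_unpaired_surrogate([0xDE00, 0xD83D]))          # pair in wrong order
```

A name for which this returns True has no UTF-8 equivalent, so it's exactly the kind of name that needs one of the fallback strategies discussed above.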
-
The fun part is that countries like Japan widely adopted multibyte encodings long before UTF-8 was invented, and have stuck with them.
-
Japan is pretty much the only partial holdout, and decreasingly so. Shift_JIS presence on the web dropped by ~50% between 2014 and 2018.
-
Because of the Great Firewall of China, you mean?
-
Shift-JIS isn't used in China; GBK, GB2312 and UTF-8 are. GB2312 is a Unicode transformation that does to GBK what UTF-8 does to ASCII.
-
You mean GB18030. GB2312 is the standard base that the GBK (nonstandard Windows extensions) was built on.
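The relationship between the three is easy to poke at with Python's built-in codecs (`gb2312`, `gbk`, `gb18030`). This is just a sketch; `can_encode` is a made-up helper and the sample characters are my own picks.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character in text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


samples = {
    "simplified hanzi": "中",   # in GB2312, so also in GBK and GB18030
    "traditional hanzi": "國",  # not in GB2312, added by GBK
    "emoji": "😀",              # outside the BMP, only GB18030 covers it
}

for label, ch in samples.items():
    coverage = {c: can_encode(ch, c) for c in ("gb2312", "gbk", "gb18030")}
    print(label, coverage)

# ASCII is unchanged in all three, like UTF-8.
print("A".encode("gb18030"))
```

Each encoding is a superset of the previous one, and only GB18030 covers all of Unicode, which is why it's the one comparable to UTF-8.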