What scripts break our Latin-1 assumptions in Unicode? I usually check against (Arabic|Hebrew), (some Indic script), Korean, Han, emoji.
Have you seen https://github.com/jagracey/Awesome-Unicode … ? It's one of the best references I know of.
-
-
Ah, the Turkish I's (dotted İ and dotless ı) and ß are casing cases I forgot.
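For anyone wanting to see those casing cases concretely, here is a minimal Python sketch. The helper `casing_report` and the variable names are made up for illustration. Python's `str` methods use the locale-independent Unicode mappings, so this shows the default behaviour, not Turkish-tailored casing.

```python
# Hypothetical helper: collect the three case mappings Python offers for a string.
def casing_report(s: str) -> dict:
    return {"upper": s.upper(), "lower": s.lower(), "casefold": s.casefold()}

# U+00DF LATIN SMALL LETTER SHARP S: uppercasing changes the string length.
sharp_s = casing_report("\u00df")

# U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE: lowercasing yields
# "i" followed by U+0307 COMBINING DOT ABOVE (two code points).
dotted_capital_i = casing_report("\u0130")

# U+0131 LATIN SMALL LETTER DOTLESS I: uppercases to plain ASCII "I".
dotless_small_i = casing_report("\u0131")

if __name__ == "__main__":
    for name, report in [
        ("sharp s", sharp_s),
        ("dotted capital I", dotted_capital_i),
        ("dotless small i", dotless_small_i),
    ]:
        # ascii() makes combining marks and non-ASCII letters visible as escapes.
        print(name, {k: ascii(v) for k, v in report.items()})
```

Because the default mappings are not locale-aware, "I".lower() gives "i" here, whereas Turkish expects "ı". Round-tripping upper/lower is not safe for these strings, and lengths can change.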
-
casefolding-to-ASCII is a good one to remember.
-
JS had a bug where [\w\W] in regexes failed to match some characters because of casefolding-to-ASCII mistakes.
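A rough Python sketch of what "casefolding-to-ASCII" means: some non-ASCII characters fold to plain ASCII letters, so ASCII-only assumptions stop holding once case-insensitivity is involved. This does not reproduce the JS regex bug. The function name `folds_to_ascii` and the scan range are arbitrary choices for illustration, and I'm assuming U+017F (long s) and U+212A (Kelvin sign) are the characters usually cited for this class of problem.

```python
LONG_S = "\u017f"   # LATIN SMALL LETTER LONG S, casefolds to "s"
KELVIN = "\u212a"   # KELVIN SIGN, casefolds to "k"

def folds_to_ascii(ch: str) -> bool:
    """True if a non-ASCII character casefolds to a purely ASCII string."""
    if ord(ch) < 0x80:
        return False
    return all(ord(c) < 0x80 for c in ch.casefold())

# Scan an arbitrary slice of the BMP (illustrative range, not exhaustive).
surprises = [chr(cp) for cp in range(0x80, 0x3000) if folds_to_ascii(chr(cp))]

if __name__ == "__main__":
    for ch in surprises:
        print(f"U+{ord(ch):04X} -> {ch.casefold()!r}")
```

Any code that builds a character class from ASCII letters and then applies case-insensitive matching has to decide whether characters like these are in or out.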