Today's adventure: parsing Unicode's awful XML from CLDR with sed, because XML is unusably awful to deal with.
Probably not. There are actually (or at least there used to be) some already-done "POSIX" versions of the CLDR data, but they're in POSIX localedef format, which is even worse than XML.
They're full of crap like <LATIN_SMALL_LETTER_E_WITH_ACUTE> (rather than literal text), which is equivalent to XML entity defs (custom-defined per file).
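For illustration, a rough Python sketch of turning those symbolic names back into literal text. It assumes each name is just the Unicode character name with underscores in place of spaces; real files define their own symbols per file, so that mapping won't always hold. The `expand_symbolic_names` helper is made up for this example and is not part of any CLDR tooling.

```python
import re
import unicodedata

# Matches symbolic names such as <LATIN_SMALL_LETTER_E_WITH_ACUTE>.
SYMBOL = re.compile(r"<([A-Z0-9_-]+)>")

def expand_symbolic_names(text):
    """Replace <SYMBOLIC_NAME> tokens with the literal character.

    Assumption: the symbol is the Unicode character name with
    underscores instead of spaces. Unknown symbols are left as-is.
    """
    def repl(match):
        name = match.group(1).replace("_", " ")
        try:
            return unicodedata.lookup(name)
        except KeyError:
            # Not a Unicode character name; keep the original token.
            return match.group(0)
    return SYMBOL.sub(repl, text)

print(expand_symbolic_names("caf<LATIN_SMALL_LETTER_E_WITH_ACUTE>"))
```

Anything that isn't a known character name passes through untouched, so per-file custom symbols would still need their own lookup table.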