Today's adventure: parsing Unicode's awful XML from CLDR with sed, because XML is unusably awful to deal with.
/<calendar.*gregorian/,/<\/calendar/{ /<monthWidth.*wide/,/<\/monthWidth/{ /<month /p } }
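For anyone wanting to try the script above, here is a minimal sketch of running it with `sed -n`. The sample input is a hand-written, heavily trimmed fragment in the general shape of a CLDR locale file, not real CLDR data, and the variable names `sample` and `months` are made up for illustration. The script is split across lines so it does not rely on any particular sed accepting `}` directly after `p`.

```shell
# Hand-written, trimmed sample shaped like a CLDR locale file (assumption: not real data).
sample='<ldml>
  <calendars>
    <calendar type="buddhist">
      <monthWidth type="wide">
        <month type="1">ignored</month>
      </monthWidth>
    </calendar>
    <calendar type="gregorian">
      <monthWidth type="abbreviated">
        <month type="1">Jan</month>
      </monthWidth>
      <monthWidth type="wide">
        <month type="1">January</month>
        <month type="2">February</month>
      </monthWidth>
    </calendar>
  </calendars>
</ldml>'

# -n suppresses default output, so only lines hit by the explicit p are printed.
# Outer range: from the gregorian <calendar ...> line to the next </calendar>.
# Inner range: from the wide <monthWidth ...> line to the next </monthWidth>.
# The trailing space in /<month / keeps <monthWidth ...> itself from matching.
months=$(printf '%s\n' "$sample" | sed -n '
/<calendar.*gregorian/,/<\/calendar/{
  /<monthWidth.*wide/,/<\/monthWidth/{
    /<month /p
  }
}')

printf '%s\n' "$months"
```

On this sample only the two wide Gregorian `<month>` lines come out, still wrapped in their tags and leading whitespace. The approach depends on each element sitting on its own line, since sed ranges are line-based.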
Replying to @RichFelker
i can write some python scripts to dump it in a more reasonable format if that would make your life easier
Replying to @nenolod
Probably not. There're actually (or at least there used to be) some already-done "posix" versions of the CLDR data, but they're in POSIX localedef format which is even worse than XML.
Replying to @RichFelker @nenolod
They're full of crap like <LATIN_SMALL_LETTER_E_WITH_ACUTE> (rather than literal text) that's equivalent to XML entity defs (defined custom per file).
11:21 AM - 2 Mar 2018