@mathias How do you recommend taking a Unicode-aware substring of a Tweet using Twitter's display_text_range? Twitter counts every character (including emoji) as one character, so a simple String.substring won't work with non-BMP characters.
-
Segmenting Twitter text is a special class of nightmare. Code points U+0000..U+10FF count 1/280, most of U+1100..U+10FFFF count 2/280, and RGI emoji sequences count 2/280 each. Their emoji segmentation is basically the entire RGI list turned into a regex. https://github.com/twitter/twitter-text …
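A rough sketch of both halves of this, not the twitter-text implementation. It assumes display_text_range holds code point indices (inclusive start, exclusive end) and uses only the weights quoted in the reply. It deliberately ignores the exceptions hidden behind "most" and does not collapse RGI emoji sequences into a single weight of 2, so multi-code-point emoji are overcounted. The function names sliceByCodePoints and roughWeightedLength are made up for illustration.

```javascript
// Slice by Unicode code points instead of UTF-16 code units.
// Array.from(string) iterates by code point, so a non-BMP character
// such as U+1F600 occupies one slot rather than two.
function sliceByCodePoints(text, start, end) {
  return Array.from(text).slice(start, end).join('');
}

// Approximate weighted length using the ranges quoted above:
// U+0000..U+10FF weigh 1, everything else weighs 2.
// Assumption: exceptions and emoji sequences are NOT handled here.
function roughWeightedLength(text) {
  let total = 0;
  for (const ch of text) {
    total += ch.codePointAt(0) <= 0x10ff ? 1 : 2;
  }
  return total;
}

// Hypothetical usage with a tweet object carrying the two fields:
// const [start, end] = tweet.display_text_range;
// const visible = sliceByCodePoints(tweet.full_text, start, end);
```

For anything that has to match Twitter's own count exactly, the linked twitter-text library is the thing to use; this only shows why String.substring is the wrong tool.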
-
, and [...string] doesn't account for that. It's very surprising that twitter-text doesn't account for this. Is there any way to leverage Intl.Segmenter today?
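On the Intl.Segmenter question, a minimal sketch, assuming a runtime that ships it (recent Node.js and current browsers do) and falling back to plain code point splitting where it is missing. The helpers splitGraphemes and sliceGraphemes are hypothetical names. Grapheme clusters are not guaranteed to line up with twitter-text's emoji regex, so treat this as an approximation rather than a replacement for it.

```javascript
// Split a string into user-perceived characters (grapheme clusters).
// Falls back to code points when Intl.Segmenter is unavailable, which
// is the same behaviour as [...text].
function splitGraphemes(text) {
  if (typeof Intl !== 'undefined' && typeof Intl.Segmenter === 'function') {
    const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
    // Each item yielded by segment() has a `segment` string property.
    return Array.from(segmenter.segment(text), (s) => s.segment);
  }
  return Array.from(text);
}

// Substring measured in grapheme clusters rather than code units.
function sliceGraphemes(text, start, end) {
  return splitGraphemes(text).slice(start, end).join('');
}
```

With the segmenter available, a ZWJ sequence such as man + ZWJ + woman + ZWJ + girl comes back as a single element, whereas [...string] splits it into five code points.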