Can someone quickly estimate the max information content of a tweet in bits? (assuming unicode encoding and no links/images)
.@ronwood007 Not a trick question, that seems right. Just wanted to confirm that's the way to estimate it.
Twitter formerly used UCS-2, but now counts each codepoint as a "character". There are 1,114,112 codepoints, however...
Most are tweetable, but not:
* The 2048 surrogates
* FFFE FFFF FEFF
* \x00-\x1F
And NFC (normalization) prevents some...
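A rough sketch of the arithmetic this implies, assuming a 140-character limit and that each of the 140 positions can independently hold any tweetable codepoint. The constant and function names are made up for illustration, and the further reduction from NFC normalization is not modeled, so the result is an upper bound rather than an exact figure.

```python
import math

# Figures taken from the thread; the names are illustrative only.
TOTAL_CODEPOINTS = 0x110000   # 1,114,112 codepoints in Unicode
SURROGATES = 2048             # U+D800..U+DFFF
SPECIALS = 3                  # FFFE, FFFF, FEFF
C0_CONTROLS = 0x20            # \x00-\x1F, 32 values
TWEET_LENGTH = 140            # assumed character limit


def usable_codepoints() -> int:
    """Codepoints left after removing the exclusions listed above."""
    return TOTAL_CODEPOINTS - SURROGATES - SPECIALS - C0_CONTROLS


def max_tweet_bits(length: int = TWEET_LENGTH) -> float:
    """Upper bound on information content: length * log2(alphabet size)."""
    return length * math.log2(usable_codepoints())


if __name__ == "__main__":
    print(usable_codepoints())
    print(round(max_tweet_bits(), 1))
```

Under these assumptions the alphabet has 1,112,029 symbols, a little over 20 bits per character, which puts a full 140-character tweet at roughly 2,812 bits.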
Wait - a char isn't simply 16 bits - it's UTF-8 which keeps expanding! "Codepoints" FTW. https://dev.twitter.com/basics/counting-characters …
But yeah you want max theoretical bits per UTF-8 (which is around 1 billion IIRC, which I almost certainly don't) * 140.
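For what it's worth, a small sketch of the difference between counting codepoints and counting UTF-8 bytes. It assumes, following the linked counting-characters page, that tweet length is the number of codepoints after NFC normalization; `tweet_length` and `utf8_bytes` are hypothetical helper names, not anything from Twitter's API. Modern UTF-8 stops at U+10FFFF, so a codepoint takes 1 to 4 bytes and the number of distinct codepoints stays at 1,114,112 rather than anywhere near a billion.

```python
import unicodedata


def tweet_length(text: str) -> int:
    """Count codepoints after NFC normalization (assumed counting rule)."""
    return len(unicodedata.normalize("NFC", text))


def utf8_bytes(text: str) -> int:
    """Count bytes of the NFC-normalized text when encoded as UTF-8."""
    return len(unicodedata.normalize("NFC", text).encode("utf-8"))


if __name__ == "__main__":
    samples = [
        "A",            # 1 byte in UTF-8
        "\u00e9",       # e with acute accent, 2 bytes
        "\u20ac",       # euro sign, 3 bytes
        "\U0001F600",   # emoji outside the BMP, 4 bytes
        "cafe\u0301",   # 'e' + combining accent, composed to one codepoint by NFC
    ]
    for s in samples:
        print(repr(s), tweet_length(s), utf8_bytes(s))
```

The byte count grows with the codepoint value, but the character count does not, which is why the estimate depends on the number of allowed codepoints and not on the encoded size.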