I liked UTF-8 more before I started to think hard about efficient RISC-V BitManip code for encoding/decoding it.
-
Just finished the decoder. I'm reasonably happy with the result. Also 18 instructions per code-point, but no tables needed for this one. pic.twitter.com/b0qOcYmED3
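For readers wondering how a decoder can avoid lookup tables: one common approach (not necessarily the one in the picture above) is to derive the sequence length from the number of leading 1 bits in the lead byte, which is the kind of thing a count-leading-zeros instruction gives cheaply on the inverted byte. A minimal Python sketch of that idea, with hypothetical helper names `leading_ones` and `sequence_length`:

```python
def leading_ones(byte: int) -> int:
    """Count the leading 1 bits of an 8-bit value.

    This emulates 'count leading zeros' applied to the inverted byte.
    """
    inverted = ~byte & 0xFF
    # bit_length() gives the position of the highest set bit of the
    # inverted byte, i.e. of the first 0 bit in the original byte.
    return 8 - inverted.bit_length()


def sequence_length(lead: int) -> int:
    """Return the UTF-8 sequence length implied by a lead byte, or 0 if invalid."""
    n = leading_ones(lead)
    if n == 0:
        return 1          # 0xxxxxxx: ASCII
    if 2 <= n <= 4:
        return n          # 110xxxxx, 1110xxxx, 11110xxx
    return 0              # 10xxxxxx (continuation) or 5+ leading ones


if __name__ == "__main__":
    for lead in (0x41, 0xC3, 0xE2, 0xF0, 0x80, 0xF8):
        print(f"{lead:02x} -> {sequence_length(lead)}")
```

This only classifies the lead byte; checking continuation bytes, overlong forms, and surrogates still has to happen separately.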
-
How does it handle edge cases?
overlong: c0 80, e0 80 80, f0 80 80 80, f8 80 80 80 80, fc 80 80 80 80 80
CESU-8: ed a0 bd ed b1 86
>10FFFF: f4 90 80 80
5-byte nonsense: f9 84 80 80 80
split surrogates: ed a0 80, ed bf bf
overlong split surrogates: f0 8d a0 80, f0 8d bf bf
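To make the question concrete, here is a rough reference check in Python showing why a strict decoder would reject each of the byte sequences listed above. It is a plain, branchy sketch for testing purposes, not the RISC-V decoder from the thread; `decode_strict` and `is_well_formed` are hypothetical names.

```python
def decode_strict(data: bytes) -> list[int]:
    """Decode UTF-8 into code points, raising ValueError on ill-formed input."""
    out = []
    i = 0
    while i < len(data):
        b0 = data[i]
        # Pick length, initial payload bits, and the smallest code point
        # that legitimately needs this many bytes.
        if b0 < 0x80:
            length, cp, minimum = 1, b0, 0
        elif 0xC0 <= b0 < 0xE0:
            length, cp, minimum = 2, b0 & 0x1F, 0x80
        elif 0xE0 <= b0 < 0xF0:
            length, cp, minimum = 3, b0 & 0x0F, 0x800
        elif 0xF0 <= b0 < 0xF8:
            length, cp, minimum = 4, b0 & 0x07, 0x10000
        else:
            # Stray continuation byte, or an f8..ff lead (5/6-byte forms).
            raise ValueError(f"invalid lead byte {b0:#04x}")
        if i + length > len(data):
            raise ValueError("truncated sequence")
        for b in data[i + 1:i + length]:
            if b & 0xC0 != 0x80:
                raise ValueError(f"invalid continuation byte {b:#04x}")
            cp = (cp << 6) | (b & 0x3F)
        if cp < minimum:
            raise ValueError(f"overlong encoding of U+{cp:04X}")
        if 0xD800 <= cp <= 0xDFFF:
            raise ValueError(f"surrogate U+{cp:04X}")
        if cp > 0x10FFFF:
            raise ValueError(f"code point {cp:#x} above U+10FFFF")
        out.append(cp)
        i += length
    return out


def is_well_formed(data: bytes) -> bool:
    """Return True if data decodes cleanly under decode_strict."""
    try:
        decode_strict(data)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    cases = ["c0 80", "e0 80 80", "f0 80 80 80", "f8 80 80 80 80",
             "fc 80 80 80 80 80", "ed a0 bd ed b1 86", "f4 90 80 80",
             "f9 84 80 80 80", "ed a0 80", "ed bf bf",
             "f0 8d a0 80", "f0 8d bf bf"]
    for hex_bytes in cases:
        print(hex_bytes, "->", is_well_formed(bytes.fromhex(hex_bytes)))
```

Every sequence in the list should come back as ill-formed under these rules, so a fast decoder can be compared against this check byte for byte.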