Periodic reminder that if you need a standalone UTF-8 decoder in C without all sorts of bugs and conformance problems, you want this: https://bjoern.hoehrmann.de/utf-8/decoder/dfa/ …
Most of the time with byte- or character-at-a-time UTF-8 decoding, the performance bottleneck is the interface to the decoder core, so ease of changing that is a big feature.
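To make the "interface to the decoder core" point concrete, here is a minimal sketch of a byte-at-a-time decoder whose core is one small function that the caller drives. This is not Hoehrmann's table-driven DFA. It is a simpler range-tracking variant along the lines of the WHATWG Encoding Standard's UTF-8 decoder, and the names (utf8_state, utf8_feed, utf8_count, and the UTF8_* result codes) are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch. */
enum { UTF8_ACCEPT = 0, UTF8_REJECT = 1, UTF8_MORE = 2 };

typedef struct {
    uint32_t cp;   /* code point being assembled */
    uint8_t need;  /* continuation bytes still expected */
    uint8_t lo;    /* lowest value allowed for the next continuation byte */
    uint8_t hi;    /* highest value allowed for the next continuation byte */
} utf8_state;

/* Feed one byte. On UTF8_ACCEPT, s->cp holds a complete code point.
 * On UTF8_REJECT the state is reset so the caller can resynchronize. */
static int utf8_feed(utf8_state *s, uint8_t b)
{
    if (s->need == 0) {
        s->lo = 0x80;
        s->hi = 0xBF;
        if (b <= 0x7F) {
            s->cp = b;
            return UTF8_ACCEPT;
        }
        if (b >= 0xC2 && b <= 0xDF) {          /* 2-byte lead; C0/C1 are overlong */
            s->need = 1;
            s->cp = b & 0x1Fu;
        } else if (b >= 0xE0 && b <= 0xEF) {   /* 3-byte lead */
            if (b == 0xE0) s->lo = 0xA0;       /* exclude overlong forms */
            if (b == 0xED) s->hi = 0x9F;       /* exclude surrogates */
            s->need = 2;
            s->cp = b & 0x0Fu;
        } else if (b >= 0xF0 && b <= 0xF4) {   /* 4-byte lead */
            if (b == 0xF0) s->lo = 0x90;       /* exclude overlong forms */
            if (b == 0xF4) s->hi = 0x8F;       /* exclude values above U+10FFFF */
            s->need = 3;
            s->cp = b & 0x07u;
        } else {
            return UTF8_REJECT;                /* stray continuation or invalid lead */
        }
        return UTF8_MORE;
    }
    if (b < s->lo || b > s->hi) {
        s->need = 0;
        return UTF8_REJECT;
    }
    s->lo = 0x80;
    s->hi = 0xBF;
    s->cp = (s->cp << 6) | (b & 0x3Fu);
    s->need--;
    return s->need ? UTF8_MORE : UTF8_ACCEPT;
}

/* One possible wrapper: validate a buffer and count its code points.
 * Returns 0 on success, -1 on invalid or truncated input. */
static int utf8_count(const uint8_t *p, size_t len, size_t *count)
{
    utf8_state s = {0};
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        int r = utf8_feed(&s, p[i]);
        if (r == UTF8_REJECT) return -1;
        if (r == UTF8_ACCEPT) n++;
    }
    if (s.need != 0) return -1;                /* input ended mid-sequence */
    *count = n;
    return 0;
}
```

Because the core only sees one byte and a small state struct, swapping utf8_count for a different wrapper (a validator, a transcoder, an iterator) doesn't touch the decoding logic, which is the kind of flexibility the tweet is pointing at.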
I took a look at @rustlang's implementation https://doc.rust-lang.org/src/core/str/mod.rs.html#336 … and I suspect it could be improved by this approach, but I haven't implemented or tested that. (Also, I think it has two bugs in next_code_point.)