Is there a way to use the Unicode tables in the Rust regex crate to match a single char without turning it into a String or &str? It seems like the internals must have a way to do this, but perhaps not exposed? Is there a trick?
I would never have thought to turn a char into a &str that way! I wonder if it means we should add `char::as_str(&self) -> &str`.
-
Hmm. Don't think that would work. You would need to return an array, but we don't have a type for "fixed size array whose contents are guaranteed to be UTF-8."
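A rough sketch of what such a type might look like, in case it helps make the objection concrete. `Utf8Char` is a hypothetical name, not something in std or the regex crate; it relies only on `char::encode_utf8` and `std::str::from_utf8`, both of which are in the standard library.

```rust
/// Hypothetical type (not in std): holds the UTF-8 encoding of exactly one `char`.
#[derive(Clone, Copy)]
struct Utf8Char {
    bytes: [u8; 4], // a `char` never needs more than four UTF-8 bytes
    len: u8,        // number of bytes in `bytes` that are actually used
}

impl Utf8Char {
    /// Encodes `c` into the inline buffer.
    fn new(c: char) -> Self {
        let mut bytes = [0u8; 4];
        let len = c.encode_utf8(&mut bytes).len() as u8;
        Utf8Char { bytes, len }
    }

    /// Borrows the encoded bytes as a `&str` tied to `self`.
    fn as_str(&self) -> &str {
        // Safe re-validation; it cannot fail as long as `new` is the only constructor.
        std::str::from_utf8(&self.bytes[..self.len as usize])
            .expect("bytes were written by encode_utf8")
    }
}
```

The `&str` here borrows from the owned `Utf8Char`, not from the original `char`, which is why the value has to be owned somewhere.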
-
I mean, once you have the &[u8], can't you read it into a &str? Where would the unsafety come from?
-
I don't think you can get a &[u8] for anything other than ASCII, since the in-memory representation is different between `char` and UTF-8. It would have to be owned.
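To illustrate the representation mismatch, here is a small sketch. `scalar_and_utf8` is a hypothetical helper name; it only uses `char::encode_utf8` from std.

```rust
/// Hypothetical helper: returns the scalar value stored in a `char`
/// alongside the bytes of its UTF-8 encoding.
fn scalar_and_utf8(c: char) -> (u32, Vec<u8>) {
    let mut buf = [0u8; 4];
    let encoded = c.encode_utf8(&mut buf);
    // `c as u32` is the Unicode scalar value the `char` holds in memory;
    // `encoded` is the variable-length UTF-8 form of the same scalar.
    (c as u32, encoded.as_bytes().to_vec())
}
```

For U+00E9 (é) the scalar value is 0xE9, while the UTF-8 form is the two bytes 0xC3 0xA9. Only ASCII scalars have a UTF-8 encoding equal to their low byte, so reinterpreting the bytes of a `char` as UTF-8 would not work in general.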
-
You could read the data into a new fixed-size UTF-8 stack array (doing the conversion) and then read it into the &str.
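A minimal sketch of that idea, assuming the caller supplies the stack buffer. `char_as_str` is a hypothetical helper, not an existing std or regex API; the conversion itself is done by std's `char::encode_utf8`, so no unsafe code is involved.

```rust
/// Hypothetical helper: encodes `c` into the caller's four-byte buffer and
/// returns the encoded text. The returned `&str` borrows from `buf`, so it
/// cannot outlive the buffer.
fn char_as_str(c: char, buf: &mut [u8; 4]) -> &str {
    // `encode_utf8` writes the UTF-8 bytes into `buf` and hands back the
    // written prefix as a string slice.
    c.encode_utf8(buf)
}
```

The caller would declare something like `let mut buf = [0u8; 4];` and pass `&mut buf`; the resulting `&str` can then go to any API that takes a string slice, such as the regex crate's `Regex::is_match`. Because the slice points into the buffer rather than into the `char`, the buffer has to live in the caller's frame.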
-
What is the lifetime of your &str? Where does it point to?
-
Couldn't you stack allocate another four bytes (or 8 if needed) and unsafely write into it after validating?
-
I would suggest trying to write the code. The nature of Twitter prevents me from understanding where you've gone wrong. :-)
-
It would have to be &mut char -> &str and overwrite the four bytes of char storage with the UTF-8 encoding … which is only safe while UTF-8 is prevented from having a fifth or sixth byte. But then you'd have a char binding in an invalid state that can still be accessed…