Create a random 500 MB file, read its contents using std::fs::File, and then do the same with tokio::fs::File. 2/9

dd if=/dev/urandom of=bigfile bs=1M count=500
On my machine, std takes 0.238 sec and tokio takes 120 sec, meaning tokio is 500 times slower! Note that this is not some kind of obscure edge case - we're literally just reading a big file. 3/9
So why are async files so inefficient? Well... because of reasons that are too boring for this twitter thread. The more important question to ask is: how can we be comfortable using async files if they can be 500 times slower without us even realizing it? 4/9
And performance is not the only pitfall of async files. Here's another - write some bytes into a tokio::fs::File. If you forget to flush before dropping the file, some buffered data will be lost! In this example, the created file is empty. 5/9
Perhaps most surprisingly, async-std/tokio files are *always* slower than std files. They don't improve performance at all, quite the opposite - their only utility is that they move blocking file I/O onto a dedicated thread pool. But you can also do that yourself anyway. 6/9
If you do want fast async files, consider using rio by @sadisticsystems instead. That is really the only way! 7/9
https://github.com/spacejam/rio
https://docs.rs/rio
My advice is to avoid async files in Rust altogether because they're inefficient and full of pitfalls (unless you use rio). Use them only in situations where you know what you're doing, if ever. Things might change in the future, but that is what I advise *today*. 8/9
If you want to read or write files inside async programs, spawn a blocking task using async_std::task::spawn_blocking() or tokio::task::spawn_blocking() and do synchronous I/O in there. That's the only easy, efficient, and reliable way of doing file I/O today. 9/9
Is it inherent, or potentially fixable in the future?
io_uring might help, but that only covers Linux. AsyncDrop would help with flushing on drop, but that's nowhere near the horizon. The 500x slowdown can be fixed with larger buffers at the cost of higher memory use. It's all fixable in theory, but I'm not holding my breath for it.