It'd be kinda awesome if we could have an io_uring with multiple submission queues/rings but only one completion queue/ring, with the ability to map a submission queue in only one process but the completion queue in multiple.
CC: @axboe
The slowdown for buffered IO is noticeable both when the data is in the page cache (because the submission lock is held while copying all the data) and when it is not (the number of io-wq workers causes CPU overhead, partially due to contention).
For the read side, it should be no slower than e.g. read(2) if cached, but if you need locking held over the copy, then that's definitely a concern. For read-side buffered IO, my goal is to make this work through page waitqueue callbacks and eliminate the need for io-wq.
Buffered reads via io_uring that need to actually execute IO cause kernel-side contention, particularly around bringing the page into the page cache. Re waitqueue callbacks: that sounds awesome.