Diagnosed a bunch of task threading problems from stacked multithreaded reactors in a Rust process. Dropped CPU usage by two orders of magnitude (doing the exact same application-level work). Rust really needs a portable, unified model for fd wakers and for sync job pooling.
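To make "sync job pooling" concrete, here is a minimal std-only sketch of a single pool that every component in a process could share for blocking work, instead of each reactor bringing its own threads. This is not the fix from the thread, which doesn't say what was changed; `SyncJobPool`, `submit`, and `shutdown` are hypothetical names made up for illustration.

```rust
use std::sync::mpsc::{self, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// A blocking job: any closure that can be sent to another thread.
type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical shared pool: one fixed set of threads for all blocking jobs.
struct SyncJobPool {
    sender: Option<Sender<Job>>,
    workers: Vec<JoinHandle<()>>,
}

impl SyncJobPool {
    /// Start `threads` workers (at least one) that pull jobs from one queue.
    fn new(threads: usize) -> Self {
        let (sender, receiver) = mpsc::channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        let workers = (0..threads.max(1))
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                thread::spawn(move || loop {
                    // The lock guard is dropped at the end of this statement,
                    // so the job itself runs without holding the lock.
                    let job = receiver.lock().unwrap().recv();
                    match job {
                        Ok(job) => job(),
                        Err(_) => break, // sender dropped and queue drained
                    }
                })
            })
            .collect();
        SyncJobPool { sender: Some(sender), workers }
    }

    /// Queue a job; it runs on whichever worker picks it up first.
    fn submit<F: FnOnce() + Send + 'static>(&self, job: F) {
        self.sender
            .as_ref()
            .expect("pool is shut down")
            .send(Box::new(job))
            .expect("workers exited");
    }

    /// Close the queue, let queued jobs finish, then join every worker.
    fn shutdown(mut self) {
        drop(self.sender.take());
        for worker in self.workers.drain(..) {
            worker.join().expect("worker panicked");
        }
    }
}
```

The point of the sketch is only that the thread count is decided once, in one place, where it is visible. A real pool would also need backpressure and panic handling.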
The big issue here is not that this can happen, it's that the abstraction is so tall that it requires expertise not to end up here, and yet more expertise to identify it and get out. It's all very well roflscaling a microbenchmark, but what are users deploying in the field?
This is of course not a Rust-specific problem. Expensive resources (things that can make things much slower) need to be visible and preferably explicit in high-performance scenarios.
This isn't even a high-performance scenario, and that's another thing here: one of the reactors always spawns a thread per core, which is fine on your MacBook, but not OK on my NUMA system with >64 cores.
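One way to keep a thread-per-core default from exploding on a big machine is to bound it with an explicit cap. A minimal std-only sketch follows; `capped_worker_count` and `spawn_workers` are hypothetical helpers, not the API of any particular reactor, and the cap is something the deployer would have to choose.

```rust
use std::num::NonZeroUsize;
use std::thread::{self, JoinHandle};

/// Hypothetical helper: use the available cores, but never more than `cap`
/// and never fewer than one.
fn capped_worker_count(cap: usize) -> usize {
    let cores = thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1); // fall back to one worker if the count is unknown
    cores.min(cap).max(1)
}

/// Spawn the capped number of workers; each one just returns its id here,
/// standing in for a real event loop.
fn spawn_workers(cap: usize) -> Vec<JoinHandle<usize>> {
    (0..capped_worker_count(cap))
        .map(|id| thread::spawn(move || id))
        .collect()
}
```

With a cap of 4, a laptop and a >64-core NUMA box would both start at most four workers. Whether four is the right number is a deployment decision, which is the point: the choice is explicit instead of inherited from the core count.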
And it's true that NUMA scheduling problems aren't "a Rust problem" per se either, but these default choices matter. Fundamentally epoll doesn't scale, and its scaling cliff is rapidly approaching common deployment scenarios. Too much of this ecosystem derives from epoll architecture
I totally agree, scaling your average socket server is a relatively well known problem (at least there are lots of docs and experts now). The waker scheduling domain in this example is more messy though, as the events are spread across multiple multi-threaded reactors...
... so you can get some of the biasing back if you write a scheduler yourself (as most of the super modern scalable storage engines do, for example), and that solves part of the problem, but you're taking on the OS's job, and you're down on tools - at some point you fight balancing