Conversation

There's a rule of thumb that past two-digit core counts, the performance gains from additional cores do not materialize. This hints at OS-level bottlenecks, or at issues around cache and memory bandwidth, as it seems to hold true even for nominally horizontally scalable workloads.
Replying to
On Linux, SCHED_BATCH genuinely helps a lot for non-interactive workloads, especially if there are more jobs than cores. It gets rid of a lot of the context switches and migrations. It also helps to tune the scheduler to stop caring so much about latency via /sys/kernel/debug/sched/.
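
For anyone who wants to try this, here is a minimal sketch of launching a job under SCHED_BATCH from Python on Linux. The helper names `use_sched_batch` and `run_batch` are made up for illustration; the only real APIs used are `os.sched_setscheduler`, `os.sched_getscheduler`, `os.sched_param`, and `subprocess.run`. It assumes the child keeps the policy across exec and that the sandbox or container allows the syscall, so treat it as a starting point rather than a recommended setup.

```python
import os
import subprocess
import sys


def use_sched_batch(pid: int = 0) -> bool:
    """Try to switch `pid` (0 = calling process) to SCHED_BATCH.

    Returns True if the policy is SCHED_BATCH afterwards, False if the
    platform does not expose it or the kernel refused the request.
    """
    if not hasattr(os, "SCHED_BATCH"):
        return False  # not Linux, or Python built without this constant
    try:
        # SCHED_BATCH is a non-realtime policy, so the priority must be 0.
        os.sched_setscheduler(pid, os.SCHED_BATCH, os.sched_param(0))
    except OSError:
        return False  # e.g. blocked by a sandbox or seccomp filter
    return os.sched_getscheduler(pid) == os.SCHED_BATCH


def run_batch(cmd: list[str]) -> int:
    """Run `cmd` with SCHED_BATCH applied to the child only.

    The policy is set between fork and exec, so the caller's own
    scheduling policy is left unchanged. If the switch fails, the
    command still runs under the default policy.
    """
    result = subprocess.run(cmd, preexec_fn=use_sched_batch)
    return result.returncode
```

Usage would look like `run_batch(["ninja", "-j", "64"])`; the build tool and job count here are just placeholders. The rough shell equivalent is `chrt --batch 0 <command>` from util-linux.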
Replying to and
I don't understand why github.com/GrapheneOS/pla isn't simply done by default in ninja/bazel. It's genuinely really nice. You can measure a decent little performance boost especially with 32+ cores and more jobs than cores. Feels noticeably more responsive on a workstation too.