The bottleneck in fine-grained multithreading is the cost of interlocked operations like LOCK XADD, LOCK OR, etc. In the case where code doesn't depend on the opcode's result or flags, couldn't these be made free by extending CPU write buffers to support GPU-style ROPs?
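For concreteness, here is a minimal C++ sketch of the case described above: an interlocked add whose result and flags are never used. The function name `count_events` and its parameters are made up for illustration. On x86, compilers typically lower this `fetch_add` to a `LOCK ADD` or `LOCK XADD`, so the core still performs a locked read-modify-write even though nothing consumes the value.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper: every thread bumps one shared counter and ignores the
// value returned by the atomic add, so no later code depends on the result.
std::uint64_t count_events(unsigned num_threads, std::uint64_t per_thread) {
    std::atomic<std::uint64_t> total{0};
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        threads.emplace_back([&total, per_thread] {
            for (std::uint64_t i = 0; i < per_thread; ++i) {
                // Result discarded; relaxed ordering asks for atomicity only.
                total.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : threads) th.join();
    const std::uint64_t result = total.load(std::memory_order_relaxed);
    // No increment is lost, even though no thread ever read a result.
    assert(result == static_cast<std::uint64_t>(num_threads) * per_thread);
    return result;
}
```

Because every increment hits the same cache line, the threads contend on that line for the whole loop, which is the cost the question is asking about.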
-
This is where per-thread work queues are a win, along with task stealing when a thread is out of tasks.
-
Unless you’re talking about the tasks themselves, in which case there’s not much to do other than change the algorithm.
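A rough sketch of the per-thread work queues with task stealing mentioned above, in C++. All names here (`WorkQueue`, `run_all`) are hypothetical, and two simplifying assumptions apply: each queue is guarded by a plain mutex rather than a lock-free deque, and tasks are loaded before the workers start and never spawn new tasks, so a worker can stop once every queue looks empty.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

using Task = std::function<void()>;

// One queue per worker. A mutex keeps the sketch short; real schedulers
// often use a lock-free deque here instead.
struct WorkQueue {
    std::mutex m;
    std::deque<Task> tasks;

    void push(Task t) {
        std::lock_guard<std::mutex> lock(m);
        tasks.push_back(std::move(t));
    }
    // The owning worker takes the newest task from the back.
    bool pop(Task& out) {
        std::lock_guard<std::mutex> lock(m);
        if (tasks.empty()) return false;
        out = std::move(tasks.back());
        tasks.pop_back();
        return true;
    }
    // A thief takes the oldest task from the front.
    bool steal(Task& out) {
        std::lock_guard<std::mutex> lock(m);
        if (tasks.empty()) return false;
        out = std::move(tasks.front());
        tasks.pop_front();
        return true;
    }
};

// Runs all preloaded tasks with one worker per queue and returns how many
// tasks were obtained by stealing from another worker's queue.
std::size_t run_all(std::vector<WorkQueue>& queues) {
    assert(!queues.empty());
    std::atomic<std::size_t> stolen{0};
    std::vector<std::thread> workers;
    const std::size_t n = queues.size();
    for (std::size_t id = 0; id < n; ++id) {
        workers.emplace_back([&queues, &stolen, id, n] {
            for (;;) {
                Task task;
                bool found = queues[id].pop(task);
                // Own queue is empty: try the other queues in order.
                for (std::size_t k = 1; !found && k < n; ++k) {
                    found = queues[(id + k) % n].steal(task);
                    if (found) stolen.fetch_add(1, std::memory_order_relaxed);
                }
                if (!found) return;  // Every queue looked empty: stop.
                task();
            }
        });
    }
    for (auto& w : workers) w.join();
    return stolen.load();
}
```

In the common case a worker only touches its own queue, so its lock is uncontended; cross-thread traffic happens only when a worker runs dry and has to steal.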
-
But we do not really need locks. They are such a power draw. Clock-based UUIDs and op queues may be better.