Wrote lots of SIMD for only a 10% overall speedup. Curse you, Amdahl's Law! (I'm not despairing--the code in question parallelizes across cores beautifully--just mildly annoyed.) :)
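Amdahl's Law is why this happens: if only a fraction p of total runtime is vectorizable and that part speeds up by a factor s, the overall speedup is capped at 1/((1-p) + p/s). A minimal sketch (the numbers below are illustrative, not the author's measurements, chosen to roughly match the reported ~10% overall gain):

```python
def amdahl_speedup(p, s):
    """Overall speedup when fraction p of runtime gets a factor-s speedup."""
    return 1.0 / ((1.0 - p) + p / s)

# Illustrative: even a 4x SIMD speedup applied to ~13% of total runtime
# yields only about a 1.1x (~10%) overall improvement.
print(round(amdahl_speedup(0.13, 4.0), 2))  # → 1.11
```

Note the asymptote: as s grows without bound, the overall speedup is still limited to 1/(1-p), so the serial fraction dominates once the vectorized part is fast.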
-
-
Right -- with a GPU predication layer, the inefficiency can be worth it, but SIMD alone isn't parallel enough.
-
It’s parallelizable across paths, which is sufficient to get 100% CPU utilization. I’m planning on trying GPU compute later. I suspect whether it’ll be a win over the CPU will be hardware-dependent. It’s really not an ideal problem for a GPU either.