Trying to optimize my OpenCL code for GPU. My "GPU-optimized" version runs a bit faster on CPU and much slower on GPU. Fail.
@marcan42 stuff like this: total_pixels = bh * bw * ((h+bh-pad-1)//(bh-pad)) * ((w+bw-pad-1)//(bw-pad)) [...]
@marcan42 and the variable trip count loops make me as a compiler person REALLY wish those could be constants...
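For anyone reading along, the quoted total_pixels expression looks like two ceiling divisions multiplied by a block area. Here is a rough Python sketch of that reading. It assumes h and w are the image height and width, bh and bw are the block height and width, and pad is the overlap between neighbouring blocks; none of that is stated in the thread, and ceil_div is a helper name made up for this example.

```python
def ceil_div(n, d):
    """Integer ceiling of n / d for positive d (hypothetical helper)."""
    return (n + d - 1) // d


def total_pixels(h, w, bh, bw, pad):
    """Pixels processed when an h x w image is covered by bh x bw blocks.

    Assumption: each block advances by (bh - pad) rows and (bw - pad)
    columns, so neighbouring blocks overlap by `pad` pixels.
    """
    step_h = bh - pad
    step_w = bw - pad
    if step_h <= 0 or step_w <= 0:
        raise ValueError("pad must be smaller than the block size")
    blocks_y = ceil_div(h, step_h)  # same as (h+bh-pad-1)//(bh-pad)
    blocks_x = ceil_div(w, step_w)  # same as (w+bw-pad-1)//(bw-pad)
    return bh * bw * blocks_y * blocks_x


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the thread.
    print(total_pixels(1080, 1920, 16, 16, 2))
```

With those illustrative numbers the grid is 78 x 138 blocks, so the block grid touches more pixels than the 1080 x 1920 image actually contains, which is the cost of the overlap and the rounding up.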