I’ve now implemented an all-GPU-compute pipeline in a Pathfinder branch, avoiding the rasterizer entirely (the CPU is still used for tiling). However, so far it seems that using a compute shader to create vector tiles and the standard GPU rasterizer to composite them is fastest…
-
What does the HW rasterizer have to do with blending? Is it producing a launch order of threads that you believe results in a better access pattern?
-
My understanding is that on desktop it has some kind of internal HW queue of fragments coming from the shading units to reduce global memory traffic?
-
I don't share your faith in hardware blending performance. It's a very challenging thing to implement even in hardware. In WR it makes a world of difference whether a primitive has blending enabled or not.
-
Well, right now I haven’t seen any improvements from avoiding the HW blender. In fact it’s a significant regression…
-
Something to do with caches perhaps? Or just memory latency dominating since blending is mostly memory bound and not doing much actual computing.
-
But if that were the case then I would be seeing similar improvements, since I implemented a rasterization-free pipeline.
-
What GPU/vendor are you running on? This is the sort of thing that I'd expect to vary wildly across different archs. I think it's worth a test on other vendors' hardware. It's also possible you are hitting a driver special-case for the HW ver. Could be worth working on compute more.
-
My MacBook Pro’s Radeon Pro 560 and Intel HD 630.