I’ve now implemented an all-GPU-compute pipeline in a Pathfinder branch, avoiding the rasterizer entirely (the CPU is still used for tiling). However, so far it seems that using a compute shader to create vector tiles and the standard GPU rasterizer to composite them is fastest…
-
This makes sense to me: hard to beat the HW rasterizer for blending. But what’s strange is that I’ve seen numbers that show compute being 40x faster than what I’m seeing on comparable hardware, and I can’t reproduce them…
-
Replying to @pcwalton
What does the HW rasterizer have to do with blending? Producing a launch order of threads that you believe results in a better access pattern?
-
Replying to @jhaberstro
My understanding is that on desktop it has some kind of internal HW queue of fragments coming from the shading units to reduce global memory traffic?
-
Replying to @pcwalton @jhaberstro
I just saw your code. You’re doing the blending for each tile in a single thread. When you use the rasterizer, the pixel shader runs computeCoverage() in different threads, and then the final blending stage is performed serially. [1/2]
-
If you want to beat the ROP, you need N threads to call computeCoverage() and store the results in shared memory, then a barrier, then merge the results using a parallel sum algorithm and have one thread store the final combined value. [2/2]
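A minimal sketch of that scheme, written as a CUDA kernel for concreteness (Pathfinder’s actual shaders are GLSL compute, not CUDA; computeCoverage(), PathSegment, and the one-block-per-pixel launch setup are placeholders, not Pathfinder’s real API):

#include <cuda_runtime.h>

struct PathSegment { float2 from, to; };

// Placeholder coverage function: a crude winding-style contribution of one
// segment at a pixel center. A real implementation would compute proper
// trapezoidal area coverage.
__device__ float computeCoverage(PathSegment seg, float2 p)
{
    float y0 = seg.from.y - p.y, y1 = seg.to.y - p.y;
    if ((y0 > 0.0f) == (y1 > 0.0f)) return 0.0f;           // no scanline crossing
    float t = y0 / (y0 - y1);                               // crossing parameter
    float x = seg.from.x + t * (seg.to.x - seg.from.x);
    return (x > p.x) ? ((y1 > y0) ? 1.0f : -1.0f) : 0.0f;   // signed crossing
}

// One block per pixel, one thread per path segment; blockDim.x must be a
// power of two, and shared memory is sized blockDim.x * sizeof(float) at launch.
__global__ void resolvePixel(const PathSegment* segments,
                             int segmentCount,
                             const float2* pixelCenters,    // one per block
                             float* outAlpha)               // one per block
{
    extern __shared__ float coverage[];
    int tid = threadIdx.x;
    float2 p = pixelCenters[blockIdx.x];

    // Step 1: N threads each compute coverage and store it in shared memory.
    coverage[tid] = (tid < segmentCount) ? computeCoverage(segments[tid], p)
                                         : 0.0f;
    __syncthreads();                                        // barrier

    // Step 2: parallel sum reduction over the shared-memory results.
    for (int stride = blockDim.x >> 1; stride > 0; stride >>= 1) {
        if (tid < stride)
            coverage[tid] += coverage[tid + stride];
        __syncthreads();
    }

    // Step 3: a single thread stores the final combined value.
    if (tid == 0)
        outAlpha[blockIdx.x] = fminf(fabsf(coverage[0]), 1.0f);
}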
-
Replying to @matiasgoldberg @jhaberstro
Yeah, I’ve thought about doing it that way. But other compute-based vector rendering solutions I’ve seen don’t work this way; they interpret command lists sequentially per tile.
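For contrast, a sketch of that per-tile sequential scheme, again in CUDA with made-up types (Command, the fill-only command set, and TILE_W/TILE_H are assumptions for illustration): one thread owns a tile, walks its command list in draw order, blends locally, and writes the tile out once at the end.

#include <cuda_runtime.h>

#define TILE_W 16
#define TILE_H 16

// Simplified command: fill the whole tile with a premultiplied-alpha color.
struct Command { float4 color; };

__global__ void renderTiles(const Command* commands,
                            const int* cmdStart,     // first command per tile
                            const int* cmdCount,     // command count per tile
                            float4* framebuffer,     // tilesPerRow*TILE_W wide
                            int tilesPerRow,
                            int tileCount)
{
    int tile = blockIdx.x * blockDim.x + threadIdx.x;
    if (tile >= tileCount) return;

    // One thread interprets this tile's command list sequentially, in order.
    float4 dst = make_float4(0.0f, 0.0f, 0.0f, 0.0f);
    int start = cmdStart[tile], count = cmdCount[tile];
    for (int i = 0; i < count; i++) {
        float4 src = commands[start + i].color;
        // Premultiplied source-over blend, serialized per command.
        dst.x = src.x + dst.x * (1.0f - src.w);
        dst.y = src.y + dst.y * (1.0f - src.w);
        dst.z = src.z + dst.z * (1.0f - src.w);
        dst.w = src.w + dst.w * (1.0f - src.w);
    }

    // Write the tile's pixels once, after all commands have been applied.
    int tx = (tile % tilesPerRow) * TILE_W;
    int ty = (tile / tilesPerRow) * TILE_H;
    int rowStride = tilesPerRow * TILE_W;
    for (int y = 0; y < TILE_H; y++)
        for (int x = 0; x < TILE_W; x++)
            framebuffer[(ty + y) * rowStride + (tx + x)] = dst;
}

The serialization here is the point of contention above: each overlapping command is blended one after another by a single thread, rather than coverage being computed in parallel and merged.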
-
To be clear: I’m definitely inclined to believe that you’re correct and that you have to process overlapping pixels in parallel if you want to win against the rasterizer. It matches what I’ve seen. I’ve just heard conflicting evidence, that’s all.