GPU twitter: on Metal, Iris Plus 640, I'm seeing roughly 4x bandwidth for fragment shader writes to render target (~10Gpix/s) vs texture2d.write (~2.5Gpix/s). Is there a good explanation for this?
Replying to @raphlinus
Random guess: maybe it's related to the fact that a fragment shader runs in blocks of 2x2 pixels for its derivatives? Maybe there's some kind of special datapath there?
Replying to @rikarends
I doubt it; my compute shader is running with full SIMD utilization (I checked). I also tried byte writes (doing unorm conversion in ALU), same bandwidth.
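For readers unfamiliar with the "unorm conversion in ALU" mentioned in the reply above, here is a minimal sketch of what that typically means: clamping each float channel to [0, 1], scaling to 0..255, and packing four channels into one 32-bit word. The helper names `float_to_unorm8` and `pack_rgba8`, the round-to-nearest step, and the channel order (red in the low byte) are assumptions for illustration; the thread does not show the actual shader code.

```python
def float_to_unorm8(x):
    """Convert a float to an 8-bit unorm value (0..255)."""
    # Clamp to the representable range first.
    x = min(max(x, 0.0), 1.0)
    # Scale and round to nearest (assumed rounding mode).
    return int(x * 255.0 + 0.5)


def pack_rgba8(r, g, b, a):
    """Pack four float channels into one 32-bit word, red in the low byte (assumed order)."""
    return (
        float_to_unorm8(r)
        | (float_to_unorm8(g) << 8)
        | (float_to_unorm8(b) << 16)
        | (float_to_unorm8(a) << 24)
    )


if __name__ == "__main__":
    # Opaque orange-ish pixel, printed as hex.
    print(hex(pack_rgba8(1.0, 0.0, 0.5, 1.0)))
```

Doing this in the shader means each pixel is stored as 4 bytes rather than 16, which is why it is a useful way to rule out store width as the cause.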
Replying to @raphlinus
It's a clear factor of 4, so it must be something with a factor of 4. Like a float32 vec4 vs a u8 vec4.
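The arithmetic behind the factor-of-4 guess above can be sketched as follows. This only shows that the two quoted pixel rates would correspond to the same byte rate if the render target path stored 4 bytes per pixel and the texture write path stored 16. The formats assigned to each path are assumptions, not something the thread confirms, and the earlier reply about byte writes giving the same bandwidth argues against this explanation.

```python
# Bytes per pixel for the two formats in the guess (assumed mapping to each path).
BYTES_PER_PIXEL = {
    "u8 vec4": 4,        # 4 channels x 1 byte
    "float32 vec4": 16,  # 4 channels x 4 bytes
}


def byte_rate_gb_per_s(gpix_per_s, bytes_per_pixel):
    """Convert a pixel rate in Gpix/s to a byte rate in GB/s."""
    return gpix_per_s * bytes_per_pixel


# Pixel rates quoted in the original tweet.
render_target = byte_rate_gb_per_s(10.0, BYTES_PER_PIXEL["u8 vec4"])
texture_write = byte_rate_gb_per_s(2.5, BYTES_PER_PIXEL["float32 vec4"])

if __name__ == "__main__":
    print(render_target, texture_write)
```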
Replying to @rikarends
New hypothesis: Metal is allocating the render target buffer in eDRAM, but not the MTLTexture objects I request. A way to test this hypothesis would be to try on a machine that doesn't have eDRAM.
12:22 PM - 14 Jan 2020