GPU Twitter: on Metal, Iris Plus 640, I'm seeing roughly 4x bandwidth for fragment shader writes to a render target (~10 Gpix/s) vs texture2d.write (~2.5 Gpix/s). Is there a good explanation for this?
I doubt it; my compute shader is running with full SIMD utilization (I checked). I also tried byte writes (doing unorm conversion in ALU), same bandwidth.
It's a clear factor of 4, so it must be something with a factor of 4. Like a float32 vec4 vs a u8 vec4.
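To make the factor-of-4 guess concrete, here is a rough back-of-the-envelope sketch. The pixel rates are the ones quoted in the thread; the pixel formats (4-byte RGBA8 vs 16-byte RGBA32F) are assumptions for illustration only, since the thread never states the actual formats, and the function name is hypothetical.

```python
# Pixel rates reported in the thread, in pixels per second.
RENDER_TARGET_RATE = 10e9    # fragment shader writes to a render target
TEXTURE_WRITE_RATE = 2.5e9   # texture2d.write from a compute shader

# Assumed pixel sizes in bytes (hypothetical; formats are not given in the thread).
BYTES_U8_VEC4 = 4     # four 8-bit channels
BYTES_F32_VEC4 = 16   # four 32-bit float channels


def bandwidth_gb_per_s(pixels_per_second, bytes_per_pixel):
    """Convert a pixel rate into a byte rate in GB/s (1 GB = 1e9 bytes)."""
    return pixels_per_second * bytes_per_pixel / 1e9


# If both paths wrote 4-byte pixels, the byte rates would differ by 4x.
same_format = (
    bandwidth_gb_per_s(RENDER_TARGET_RATE, BYTES_U8_VEC4),
    bandwidth_gb_per_s(TEXTURE_WRITE_RATE, BYTES_U8_VEC4),
)

# If the compute path were actually writing 16-byte pixels,
# both paths would be moving the same number of bytes per second.
mixed_format = (
    bandwidth_gb_per_s(RENDER_TARGET_RATE, BYTES_U8_VEC4),
    bandwidth_gb_per_s(TEXTURE_WRITE_RATE, BYTES_F32_VEC4),
)

print(same_format, mixed_format)
```

Under these assumptions the two paths would move the same number of bytes per second only if the compute path were writing pixels four times as wide. The byte-write experiment reported elsewhere in the thread (same bandwidth with unorm conversion done in ALU) argues against that being the whole story.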
New hypothesis: Metal is allocating the render target buffer in eDRAM, but not the MTLTexture objects I request. A way to test this hypothesis would be to try on a machine that doesn't have eDRAM.