Ok so a n00b at compute shader optimization, like me, would ask: are optimization/profiling tools really, uhm, "a bit lacking" there? E.g. here I have a PC with an NVIDIA GPU. Nsight shows "yeah you're running at 30% of speed-of-light, good luck".
The trick is that you still have to use the same values, or else it will optimize out too much and give incorrect deltas. It can be useful to think of your shader like a node tree of calculations (think UE4 material editor).
-
-
Then you prune a branch and replace it with a reasonable constant. Then you know the cost of that branch. It’s messy, inexact, and you have to write your shader a certain way, but it’s one method I’ve used in the past with reasonable results.
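As a rough illustration of the prune-and-measure trick above (a hypothetical GLSL compute shader, not one from this thread), you comment out the expensive "branch" and substitute a constant close to its typical output, then diff the GPU timings. The constant has to be plausible so the compiler can't fold away the downstream math:

```glsl
#version 450
layout(local_size_x = 64) in;
layout(std430, binding = 0) buffer Data { vec4 color[]; };

// The "branch" under test (hypothetical stand-in for some costly sub-tree).
vec3 expensiveScattering(vec3 p) {
    vec3 s = vec3(0.0);
    for (int k = 1; k <= 32; ++k)
        s += sin(p * float(k)) * 0.01;
    return s;
}

void main() {
    uint i = gl_GlobalInvocationID.x;
    vec3 p = color[i].xyz;
    // Original:
    // vec3 scatter = expensiveScattering(p);
    // Pruned: a constant near the function's typical output, so deltas stay honest
    // and the compiler can't dead-code-eliminate everything that consumes it.
    vec3 scatter = vec3(0.03, 0.05, 0.08);
    color[i] = vec4(p + scatter, 1.0);
}
```

Timing the dispatch before and after (e.g. with GPU timestamp queries) gives an estimate of what that one branch of the calculation tree costs.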
-
This likely won’t give you a good view, since SGPRs and VGPRs (+ LDS with compute) affect occupancy. E.g., at the low-occupancy end, moving from 129 to 128 VGPRs can nearly double the performance (halve the execution time) of a program.
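The 129-vs-128 cliff falls out of the register-file arithmetic. A minimal sketch, assuming a GCN-style SIMD with a 256-register file per lane, VGPR allocation rounded up to a granularity of 4, and a 10-wave cap (the exact numbers vary by architecture):

```python
def waves_per_simd(vgprs, reg_file=256, granularity=4, max_waves=10):
    """Estimate wave occupancy per SIMD from per-thread VGPR usage.

    Assumes a GCN-style 256-entry VGPR file per lane and allocation
    rounded up to `granularity` registers (illustrative values only).
    """
    alloc = -(-vgprs // granularity) * granularity  # round up to granularity
    return min(max_waves, reg_file // alloc)

# At 128 VGPRs two waves fit per SIMD; at 129 the allocation rounds up
# past the halfway point and only one wave fits, halving occupancy.
print(waves_per_simd(128))  # 2
print(waves_per_simd(129))  # 1
```

With only one wave in flight there is nothing to switch to while memory latency is being hidden, which is why shaving that single VGPR can nearly halve execution time.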