yikes! one of my all-time favorite computer stories (which I tell here at every opportunity) was attending a talk by a Cell architect. super impressive perf numbers of course. someone in the audience asked how people were supposed to program it.
he paused and said "any way you want"
While the PS3's tools weren't too great at the beginning, they became *fantastic* as time progressed. The debugger is still my favorite of all time, the performance tools were great (IMO, preferable to PIX).
Did you ever do the thing with the Cell superoptimizer where you submitted code at early hours of the morning so the servers wouldn’t be as loaded and it would have more time to run and would produce better code? That’s one of my favorite stories.
Sounds like something we were doing in the twilight days of the Larrabee graphics team, ~2011: we made an exhaustive-search instruction scheduler that would run on a server farm to try and find the best schedule for critical basic blocks (like rasterizers)
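For a sense of what "exhaustive-search instruction scheduler" can mean at toy scale, here is a minimal sketch. It is not the Larrabee team's tool: the basic block, the latencies, and the single-issue in-order machine model are all made up for illustration, and the names `BLOCK`, `schedule_cost`, and `best_schedule` are hypothetical. It simply tries every ordering of the block, rejects the ones that break a dependency, and keeps the cheapest.

```python
from itertools import permutations

# Hypothetical basic block: name -> (latency in cycles, names it depends on).
# These instructions and latencies are illustrative assumptions only.
BLOCK = {
    "load_a": (3, []),
    "load_b": (3, []),
    "mul":    (2, ["load_a", "load_b"]),
    "add":    (1, ["load_a"]),
    "store":  (1, ["mul", "add"]),
}

def schedule_cost(order, block):
    """Cycles to finish `order` on a toy single-issue, in-order machine.

    Returns None if the order uses a result before it has been issued.
    """
    ready = {}   # instruction -> cycle at which its result is available
    cycle = 0    # next free issue slot
    for name in order:
        latency, deps = block[name]
        if any(d not in ready for d in deps):
            return None  # dependency violated, not a legal schedule
        # Stall until the issue slot is free and every input is ready.
        issue = max([cycle] + [ready[d] for d in deps])
        ready[name] = issue + latency
        cycle = issue + 1
    return max(ready.values())

def best_schedule(block):
    """Try every permutation and keep the legal one with the lowest cost."""
    best_order, best_cost = None, None
    for order in permutations(block):
        cost = schedule_cost(order, block)
        if cost is not None and (best_cost is None or cost < best_cost):
            best_order, best_cost = order, cost
    return best_order, best_cost

if __name__ == "__main__":
    order, cost = best_schedule(BLOCK)
    print(order, cost)
```

The search is factorial in the block size, which is presumably why something like this only makes sense for a handful of critical basic blocks and a farm of machines to run it on.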
Seems like there’s enough material here between the PS3 GPU and Larrabee for a short “survey of superoptimizers in practice” article. Would be really interesting.
I'm going to get a little salty at this idea that random search is "superoptimization". The whole point of someone inventing that silly word was to distinguish the bowdlerized "optimizers" (really improvers) from things that are going for actually optimal.
I always thought "superoptimizer" was kind of a misnomer already, since it's usually just "the shortest local instruction sequence", without even considering stuff like execution ports, etc. Adding more global effects like cache pressure or common input data makes "optimizing" hard.
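To make the "shortest local instruction sequence" notion concrete, here is a rough sketch of that style of search. Everything in it is an assumption for illustration: a hypothetical 8-bit, one-register machine with five made-up ops (`OPS`), and helper names `run` and `superoptimize`. It enumerates programs by increasing length and returns the first one that matches the target on every 8-bit input. Note that the only cost it knows about is length, with no execution ports, latencies, or cache effects, which is exactly the limitation described above.

```python
from itertools import product

MASK = 0xFF  # toy 8-bit accumulator machine (an assumption for this sketch)

# Hypothetical one-register instruction set: each op maps x -> new x.
OPS = {
    "neg": lambda x: (-x) & MASK,
    "not": lambda x: (~x) & MASK,
    "inc": lambda x: (x + 1) & MASK,
    "dec": lambda x: (x - 1) & MASK,
    "shl": lambda x: (x << 1) & MASK,
}

def run(program, x):
    """Apply each op in `program` to x, left to right."""
    for op in program:
        x = OPS[op](x)
    return x

def superoptimize(target, max_len=4):
    """Return the first shortest op sequence equal to `target` on all
    8-bit inputs, or None if nothing of length <= max_len matches."""
    for length in range(max_len + 1):
        for program in product(OPS, repeat=length):
            # Exhaustive check over the whole 8-bit input space.
            if all(run(program, x) == target(x) for x in range(MASK + 1)):
                return program
    return None

if __name__ == "__main__":
    # 2*x + 2 is found as "inc" followed by "shl".
    print(superoptimize(lambda x: (2 * x + 2) & MASK))
```

Because the search goes by length and checks every input, the result is actually the shortest sequence in this toy model rather than just an improved one, which is the distinction the thread is arguing about.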
Yup, that's for the pixel shaders (vertex shaders were very different). One important thing to note is that RSX pixel shaders weren't really running shader code on shader cores in the way that modern GPUs do; it was more like the last hurrah of register combiners.
Or rather, a weird hybrid thing between those generations (it did have a proper register file for temps, not just a few slots). But RSX shaders ran in "passes". One pass = one arithmetic op (e.g. for tex coord generation), then one texture access, then arithmetic on the result.