@egonelbre That said, even hardware technically does gamma wrong in premultiplied alpha :/
Replying to @cmuratori
@egonelbre At least as far as I understand it. I'm not exactly the world's leading gamma expert :)
Replying to @cmuratori
@cmuratori @egonelbre Do you think it is a compute or a storage/precision issue? Could approximate, e.g. gamma=2
Replying to @won3d
@won3d @egonelbre Well, it depends whether we're float or S16 for compute. Obviously we'll be U8 for storage.
Replying to @cmuratori
@won3d @egonelbre But yeah, if we do a CLUT for gamma, then with AVX that's easy but with SSE it sucks (no wide lookups).
Replying to @cmuratori
@won3d @egonelbre If we do compute for gamma, then we can stay wide but we've gotta do a bunch of approximations, and is that a win?
Replying to @cmuratori
@cmuratori @egonelbre It is a win. Are you thinking of using gather for AVX CLUT? Because...don't.
Replying to @won3d
@won3d @egonelbre Well, I'm not thinking about doing anything in AVX, since we're SSE2 :) But I was just saying you could do that.
Replying to @cmuratori
@cmuratori @won3d @egonelbre Gather is currently like dpps. It looks like a cool instruction until you profile it.
Replying to @tom_forsyth
@tom_forsyth @cmuratori @egonelbre Was able to get AVX gather to help in one case, but it was tricky. Hopefully faster in Broadwell?
@won3d @tom_forsyth @egonelbre It will be fixed in Gatherwell.