This GPU also stores all values as floats. Half-precision lets it use the f16 dot product unit, among other things.
-
-
yes. what i mean is that doing the naive compilation will not match the spec; you have to do extra steps to match it
-
Erm, if I understand it right you just implement double+double=double_t, flush those to stack, and only round when storing to double?
-
Conforming compiler has to spill doubles as double_t (long double for x86), convert to double only via explicit cast or assignment.
-
Eh? Opposite problem; it has to round to double after each op.
-
The bug can occur even if you never spill once.
-
-ffloat-store is a buggy workaround for old gcc behavior that made stuff deterministic but worse/wrong.
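
For anyone following along, a minimal C sketch of the two strategies argued about above: keep intermediates in double_t and round once, versus round to double after every op. The function names are hypothetical and made up for illustration; only FLT_EVAL_METHOD (from <float.h>) and double_t (from <math.h>) come from the C standard. It assumes a conforming C99 or later compiler, where a cast or assignment discards excess precision.

```c
#include <float.h>   /* FLT_EVAL_METHOD */
#include <math.h>    /* double_t */

/* Reports how the implementation evaluates floating-point expressions:
   0 = in the operand's own type, 2 = everything in long double (classic x87). */
int eval_method(void)
{
    return FLT_EVAL_METHOD;
}

/* Keeps intermediates in double_t, the type double expressions are
   actually evaluated in, and rounds to double only once at the end. */
double sum_round_once(double a, double b, double c)
{
    double_t t = (double_t)a + b;  /* may carry excess precision */
    t = t + c;
    return (double)t;              /* the single rounding to double */
}

/* Rounds to double after every operation. In a conforming compiler the
   explicit cast (or an assignment) must drop any excess precision. */
double sum_round_each_op(double a, double b, double c)
{
    double t = (double)(a + b);
    return (double)(t + c);
}
```

With inputs 1.0, 0x1p-53, 0x1p-53, sum_round_each_op returns 1.0 because each partial sum is a tie that rounds to even. Where double_t is wider than double (FLT_EVAL_METHOD 2 with the x87 in extended-precision mode), sum_round_once would typically return 1.0 + 0x1p-52 instead, which is the kind of mismatch the thread is about.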