so I learned that x64 specifically doubles the number of XMM registers (from SSE), from 8 to 16. and AVX-512 doubles that again to 32, but only in x64 mode, I think? so I assume that if you want, you can write AVX-512 code in 32-bit x86, which is cursed because it has a mere 8 registers to play with
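For reference, the register counts described above can be written down as a tiny lookup. This is only a sketch: `simd_register_count` and its two parameters are hypothetical names made up for illustration, not part of any real API, and it assumes the CPU supports at least SSE.

```c
#include <stdbool.h>

/* Hypothetical helper: number of architecturally visible vector registers
   (XMM/YMM/ZMM names) for a given mode.
   Assumes SSE is available at all. */
int simd_register_count(bool long_mode, bool avx512) {
    if (!long_mode) {
        return 8;            /* 32-bit mode: registers 0-7 only, even with AVX-512 */
    }
    return avx512 ? 32 : 16; /* 64-bit mode: 16 with SSE/AVX, 32 with AVX-512 */
}
```

So `simd_register_count(false, true)` is the "cursed" case: AVX-512 instructions, but still only eight registers.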
-
MMX doesn't do F32. Only 8-, 16-, and 32-bit integers.
-
Have we forgotten 3DNow! so quickly?
-
huh, even for SSE registers?
-
There are no f32x2 ops in SSE except for load/store. You can simply ignore the high half to simulate it, however (so long as you don't care about exception flags raised by the unused lanes, e.g. if they divide by zero).
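A rough sketch of that trick with the standard SSE intrinsics: `_mm_loadl_pi` and `_mm_storel_pi` are the real 64-bit load/store, while the `f32x2_*` helper names are made up here for illustration. The division variant pads the unused high lanes with 1.0 instead of 0.0, which is one way (an assumption, not the only option) to keep those lanes from computing 0/0 and raising flags.

```c
#include <assert.h>
#include <xmmintrin.h>  /* SSE intrinsics; x86 / x86-64 only */

/* Load two floats into the low half of an XMM register.
   The two high lanes are copied from `fill`. */
static __m128 f32x2_load(const float *p, __m128 fill) {
    assert(p);
    return _mm_loadl_pi(fill, (const __m64 *)p);
}

/* Store only the low two lanes; the high half is ignored. */
static void f32x2_store(float *p, __m128 v) {
    assert(p);
    _mm_storel_pi((__m64 *)p, v);
}

/* Two-wide add: the 4-wide add runs on all lanes, we keep the low two. */
void f32x2_add(const float a[2], const float b[2], float out[2]) {
    __m128 zero = _mm_setzero_ps();
    f32x2_store(out, _mm_add_ps(f32x2_load(a, zero), f32x2_load(b, zero)));
}

/* Two-wide divide: high lanes are padded with 1.0 so they compute 1/1
   rather than 0/0, which would set the invalid-operation flag. */
void f32x2_div(const float a[2], const float b[2], float out[2]) {
    __m128 ones = _mm_set1_ps(1.0f);
    f32x2_store(out, _mm_div_ps(f32x2_load(a, ones), f32x2_load(b, ones)));
}
```

Calling `f32x2_div` on `{1, 6}` and `{2, 3}` writes `{0.5, 2}` to `out`; the two padding lanes are computed and thrown away.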
-
even funnier considering that the original SSE hardware was F32x2, double-pumped. (Exact same width as 3DNow!, very similar implementation.) But defining it as 4-wide even though the implementation was 2-wide turned out to be a really good idea.
-
if you look at how x86 FP SIMD datapaths are implemented it's this bizarre nesting doll situation