Periodic reminder to never use -O3 unless you've already inspected the assembly generated at -O2 or -Os and are solving a specific issue and are committing to continue verifying it for each new compiler. https://twitter.com/iblueconnection/status/1201485834828091393 …
-
If I do that it’s surprising how often my vector code ends up not only faster than the equivalent scalar code but also clearer.
-
FWIW floats are the easy case still; most of my SIMD work is on (narrow) ints. Autovect is almost completely useless with that, but it's also annoyingly library-resistant if you target multiple archs because there are substantial divergences (both in what exists and what's fast).
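To make the narrow-int point concrete, here's a rough sketch (not from the thread) of a saturating add on unsigned 8-bit lanes. The function names `sat_add_u8` and `sat_add_u8_scalar` are made up for illustration. The SSE2 path uses `_mm_adds_epu8` and the NEON path uses `vqaddq_u8`; this op happens to exist on both, which is the friendly case, and plenty of other narrow-int ops don't line up this neatly across archs.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#elif defined(__ARM_NEON)
#include <arm_neon.h>    /* NEON intrinsics */
#endif

/* Scalar reference: saturating add of unsigned 8-bit values.
   The widening to unsigned and the clamp are what a compiler
   has to see through in order to vectorize this loop. */
static void sat_add_u8_scalar(uint8_t *dst, const uint8_t *a,
                              const uint8_t *b, size_t n) {
    for (size_t i = 0; i < n; i++) {
        unsigned sum = (unsigned)a[i] + b[i];
        dst[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
}

/* Hypothetical wrapper: 16 lanes per iteration where an intrinsic
   path is available, with the scalar loop handling the tail (or
   everything, on other targets). */
static void sat_add_u8(uint8_t *dst, const uint8_t *a,
                       const uint8_t *b, size_t n) {
    size_t i = 0;
#if defined(__SSE2__)
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_adds_epu8(va, vb));
    }
#elif defined(__ARM_NEON)
    for (; i + 16 <= n; i += 16) {
        vst1q_u8(dst + i, vqaddq_u8(vld1q_u8(a + i), vld1q_u8(b + i)));
    }
#endif
    /* Remaining elements that do not fill a full 16-byte vector. */
    sat_add_u8_scalar(dst + i, a + i, b + i, n - i);
}
```

Whether the scalar loop gets autovectorized into the same instruction depends on the compiler and version, so per the first tweet you'd still want to look at the generated assembly.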