If your starting assumption is that you are smarter than the compiler, then your code should run the fastest as-is, and any compiler optimization flags could only make it slower.
-
Replying to @stefan_3d @dgmdavid and
That's not at all how compilers work. If you do not turn optimizations on, the compiler will store and load every value to and from the stack on every line. There is no way to write performance-oriented code without turning optimizations on unless you use inline ASM.
2 replies 0 retweets 18 likes -
Replying to @cmuratori @dgmdavid and
Then how would you compile code that you think can't be optimized any further? -O0 with a few select flags at best. -O1 (at least on gcc) already implies things like -freorder-blocks, which means letting the compiler make decisions for you.
1 reply 0 retweets 0 likes -
Replying to @stefan_3d @dgmdavid and
Does this help? https://godbolt.org/z/6DAVpR
1 reply 0 retweets 2 likes -
Replying to @cmuratori @dgmdavid and
I know godbolt, yes. My question is though, how would you compile hand-optimized C if you wanted the compiler to not touch any bits of it other than faithfully translating it line by line to assembly?
1 reply 0 retweets 0 likes -
Replying to @stefan_3d @dgmdavid and
Did you read the actual specific godbolt I sent?
1 reply 0 retweets 0 likes -
Replying to @stefan_3d @dgmdavid and
I was trying to explain that what you're saying doesn't make any sense. There is no such thing as "translate line by line into ASM". There is no direct mapping from C to ASM.
2 replies 0 retweets 7 likes -
Replying to @cmuratori @stefan_3d and
-O2 _is_ the way you tell the compiler "try to produce the correct ASM for this C". Sometimes it can't figure it out (and it's frustrating). But if you write the C carefully enough, sometimes it can. -O0 _never_ figures out the correct ASM for the C code, pretty much ever.
1 reply 0 retweets 5 likes
So in the godbolt I showed a -O0 and a -O2 of an inner product, and hopefully you can see that -O0 is _not at all_ similar to the C code - it's not even usable. -O2 produces the _actual_ closest translation to the input C code!
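For readers who cannot open the link, here is a minimal sketch of the kind of function being discussed. The exact code in the godbolt is not reproduced here; the name inner_product, its signature, and the file name inner.c are assumptions made for illustration only.

```c
#include <stddef.h>

/* Hypothetical inner product, NOT the code from the linked godbolt.
 *
 * To compare the generated assembly yourself (inner.c is an assumed name):
 *   gcc -O0 -S inner.c -o inner_O0.s
 *   gcc -O2 -S inner.c -o inner_O2.s
 */
float inner_product(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;

    /* Multiply matching elements and accumulate the products. */
    for (size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }

    return sum;
}
```

At -O0, GCC and Clang typically give sum and i their own stack slots and reload and store them on every iteration, which is the behavior described earlier in the thread. At -O2 both normally stay in registers for the whole loop, so the output reads much closer to what the C source says.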