So in the godbolt I showed a -O0 and a -O2 of an inner product, and hopefully you can see that -O0 is _not at all_ similar to the C code - it's not even usable. -O2 produces the _actual_ closest translation to the input C code!
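For readers without the original link, here is a minimal sketch of the kind of inner product being discussed. The function name and signature are assumptions for illustration, not the code from the original Godbolt. Pasting it into Compiler Explorer at -O0 and then at -O2 should show the difference described: unoptimized builds typically keep `sum` and `i` in stack memory and reload them on every iteration, while optimized builds usually keep them in registers.

```cpp
#include <cstddef>

// Hypothetical stand-in for the inner product in the thread;
// the original code is not reproduced here.
float inner_product(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;                 // running total
    for (std::size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];           // one multiply and one add per element
    }
    return sum;
}
```

The exact instructions emitted depend on the compiler, its version, and the target, so treat any particular listing as one example rather than the definitive output.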
-
Replying to @Jonathan_Blow @cmuratori and
That’s a straw man argument because that was never my assumption. My assumption is that the compiler authors know everything about CPUs and nothing about my algorithm.
1 reply 0 retweets 0 likes -
Replying to @Jonathan_Blow @cmuratori and
You’re not really trying to tell me that getting rid of vector is going to get me an order of magnitude speed up, are you? What is your advice on where I should start to get that 10x speed up?
4 replies 0 retweets 1 like -
Replying to @stefan_3d @Jonathan_Blow and
Well we are telling you that removing std::vector is _part_ of the process of getting a 10x speedup. It certainly _could_ result in a 10x speedup alone, in a rare situation, but usually it's just one step of many that you must do.
2 replies 0 retweets 6 likes -
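The thread does not say which use of std::vector is meant, so the following is only a hypothetical illustration of one way a container choice can interact with memory layout. Both function names are made up for this sketch. The first walks a vector of vectors, where each row is a separate heap allocation. The second walks one contiguous buffer holding the same values.

```cpp
#include <cstddef>
#include <vector>

// Nested layout (hypothetical example): each inner vector owns its own
// heap allocation, so consecutive rows may be far apart in memory.
float sum_nested(const std::vector<std::vector<float>>& rows) {
    float sum = 0.0f;
    for (const auto& row : rows) {
        for (float v : row) {
            sum += v;
        }
    }
    return sum;
}

// Flat layout (hypothetical example): all rows * cols values sit in one
// contiguous buffer and are read in address order.
float sum_flat(const float* data, std::size_t rows, std::size_t cols) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < rows * cols; ++i) {
        sum += data[i];
    }
    return sum;
}
```

Both functions return the same result for the same values. Whether the flat version is measurably faster depends on the data size, the allocator, and the hardware, so it would need to be profiled rather than assumed.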
The cache problems were already discussed by @maxliani on this thread.