[1/2] Does Alexandrescu have an explanation somewhere? That slide set doesn't really talk about it. I am having a very hard time convincing myself that there is a practical scenario where a[++i] compiles to a pipeline stall but a[i++] does not...
Replying to @cmuratori @jqgregory
[2/2] I can believe some compilers would better handle a[0] a[1] a+=2 than a[i++] a[i++], but a[++i] vs. a[i++] I can't really think of how the compiler would reliably produce more stalls with one than the other.
Replying to @cmuratori
Preincrement introduces a data hazard, because you need to wait for ++a to be calculated before it can be used. Whereas the value of a++ can be used immediately, while the increment op makes its way thru the pipeline sans data hazard.
Replying to @jqgregory @cmuratori
Sorry, ++i vs i++ (not ++a/a++). It matters because the very next thing you do is to use i to index into a[]. No useful work can be done in between. So with preinc you have to wait for the result of the inc to pop out the other end of the pipeline before indexing into a[].
Replying to @jqgregory
Sorry to keep beating this particular horse, but, I still don't get it. What platform are we talking about? This is not true on x64, right, for several reasons, not the least of which being that memory addresses have built-in offsets which execute in the same cycle as the load.
Replying to @cmuratori @jqgregory
Separately, the add instruction is single-cycle and on at least three ports (for Intel anyways), so if the compiler does decide to preincrement the value, it's hidden by the load of the base address anyway.
Replying to @cmuratori @jqgregory
So if somebody has an example of some actual C code someone might write in the real world, where a pre-increment actually compiles to a cycle stall that would disappear if you switch to post-increment, I would like to see it so I can see what they're talking about.
Replying to @cmuratori @jqgregory
My best guess is that this advice wasn't written with modern processors in mind - maybe it's based on older processors with lower IPC?
Replying to @cmuratori @jqgregory
Here's the talk @incomputable gave associated with that presentation. Minute 18 is where he talks about it. So, yeah, he also stated data dependencies. This was back in 2012 btw.
It's clear you "get it" perfectly

but yeah it does seem like it's a moot point on today's Intel CPUs. I'll verify, and if so I'll remove that section in the next edition. It's confusing at best, and the example is wrong regardless. Thanks for bringing it to my attention.
You could put something like "on in-order CPUs", or "could create a pipeline stall on some hardware", or something? Because I could easily see this sort of thing happening on two-pipe in order CPUs like Pentium and Cell (a[i++] is one cycle paired, a[++i] is two cycles).
Replying to @cmuratori @jqgregory and
Obviously those CPUs aren't particularly relevant currently, but it's an interesting thing for people to understand and it _could_ come back again, depending on how processors evolve :)