Given how conditional moves are kind of annoying uArch-wise, I'm wondering why integer min/max are still a rarity in ISAs. They cover a good fraction of the use cases and are definitely easier to implement.
Replying to @rygorous
Quite often cmp + jge (or whatever) is faster than cmov: the branch predictor can handle the former, but often not the cmov. min/max have the same issue: with no branch to predict, speculation stalls on the data dependency.
Replying to @fenruspdx
Branch predictors can't predict any CMOVs! They predict branches. A compare+CMOV sequence has a CMOV (often 2-cycle latency/multi-cycle issue) depending on a compare. An integer min/max is relatively easy to build at the same cost as a single compare (without the branch).
Replying to @rygorous @fenruspdx
[The main question here being whether you have enough picoseconds left in a single cycle for the fanout buffers and select mux, or whether it needs to be a 2-cycle op; but either way you can still fully pipeline it.]
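To make the "compare plus fanout plus select mux" picture concrete, here is a rough software model of it in C. This is only a sketch of the datapath idea: the names `min_i32` and `max_i32` are made up for illustration, and nothing here claims to match how any particular core builds the op.

```c
#include <stdint.h>

/* Model of a min unit: one compare yields a single bit, that bit is
   fanned out to all 32 lanes, and the resulting mask drives a 2:1 mux.
   Only two source operands are needed, and no flags register. */
static inline int32_t min_i32(int32_t a, int32_t b) {
    uint32_t lt   = (uint32_t)(a < b);  /* the compare: 1 if a < b, else 0 */
    uint32_t mask = 0u - lt;            /* fanout: all ones or all zeros   */
    /* select mux: take a where mask is set, b elsewhere
       (assumes the usual two's-complement conversion back to int32_t) */
    return (int32_t)(((uint32_t)a & mask) | ((uint32_t)b & ~mask));
}

/* Same structure for max, with the compare flipped. */
static inline int32_t max_i32(int32_t a, int32_t b) {
    uint32_t gt   = (uint32_t)(a > b);
    uint32_t mask = 0u - gt;
    return (int32_t)(((uint32_t)a & mask) | ((uint32_t)b & ~mask));
}
```

The select bit is computed from the same two operands being selected between, which is why this needs no third source, unlike a general conditional move.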
Replying to @rygorous @fenruspdx
Cmovs kinda suck: archs without flags need more than two sources (you're selecting between two things, but you still need at least one more source to tell you what to select), OoO archs with flags still need to track 3 dependencies for their uops (unlike other integer ops),
Replying to @rygorous @fenruspdx
Extremely ISA Designer Voice: "We'll give you a Positive Difference instruction; you can implement max as a + pd(a,b), and high-performance cores can do uOp fusion."
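For anyone who wants to see the joke's arithmetic spelled out, here is a small C sketch. The tweet does not define `pd`, so the operand order below is an assumption chosen so that `a + pd(a,b)` actually comes out as the max: `pd(a,b)` is taken to be `b - a` when `b > a`, and 0 otherwise. The function names are hypothetical, and unsigned types are used so the subtraction can't overflow.

```c
#include <stdint.h>

/* Hypothetical "positive difference" op. ASSUMPTION: pd(a, b) means
   "how far b is above a, or 0 if it isn't". With the opposite operand
   order you would compute b + pd(a, b) instead. */
static inline uint32_t pd_u32(uint32_t a, uint32_t b) {
    return b > a ? b - a : 0u;
}

/* max built from an add plus the positive difference:
   if b > a:  a + (b - a) == b
   otherwise: a + 0       == a */
static inline uint32_t max_via_pd_u32(uint32_t a, uint32_t b) {
    return a + pd_u32(a, b);
}
```

Note that this turns one operation into two dependent ones (pd, then add), which is the point of the "uOp fusion" punchline.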
I’m stealing that “Extremely ISA Designer Voice” phrase