How many times did you test a set of chips from a vendor and see that outcome? I'm guessing zero? The broader point is that testing with no model of the underlying process is going to lead to testing things that aren't informative at all.
For a conventional branch predictor on a semi-modern (P6+ era) out-of-order processor, it would be pretty surprising to have the critical path run through the branch predictor. Of course, a metal fix could change this, as could an unconventional design, among other things.
Maybe you get "lucky" with some chip and there's some extreme manufacturing variation that causes this failure mode, maybe the wafer was bad in some particular way, etc.
Testing a set of chips from one uarch from one vendor and having it pass some particular overclocked test (where passing means not corrupting the EIP while not crashing?) tells you as much about another uarch from the same vendor as it does about a chip from another vendor.
It doesn't even tell you much about another stepping from the same chip (and vendors will sometimes release revisions without updating the stepping since OEMs complain when you officially change the stepping).
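The stepping being discussed here is what shows up in bits 3:0 of EAX from CPUID leaf 1, which is what a CPUID-based whitelist would key on. A rough sketch of decoding that field layout (the layout follows the x86 CPUID leaf 1 spec; the 0x906EA example value is my illustration, which I believe corresponds to a Coffee Lake part):

```python
# Sketch: decode family/model/stepping from a raw CPUID leaf 1 EAX value.
# A whitelist keyed on these fields can't distinguish silicon revisions
# that ship without a stepping bump, which is the thread's point.
def decode_cpuid_leaf1_eax(eax: int):
    stepping = eax & 0xF
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF
    if family == 0xF:
        family += ext_family          # extended family only applies to family 0xF
    if family in (0x6, 0xF):
        model |= ext_model << 4       # extended model applies to families 6 and 0xF
    return family, model, stepping

# Example raw value (assumed here to be a Coffee Lake signature):
print(decode_cpuid_leaf1_eax(0x000906EA))  # (6, 158, 10)
```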
OK, so now you've tested maybe 10 chips across 10 different Intel models and none fail in this way. Well, that's the expected outcome. What does that tell you about another model from Intel? Same thing it tells you about a Transmeta chip you didn't test.
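To put a number on how little "zero failures" tells you, here's a rough sketch using the standard rule-of-three bound (my example, not from the thread). It also assumes independent samples, which is generous: chips of the same model and stepping fail in correlated ways, so the real information content is even lower.

```python
# Sketch: the "rule of three" gives an approximate 95% upper confidence
# bound of 3/N on the per-chip failure probability after observing zero
# failures in N independent trials.
def rule_of_three_upper_bound(n_tested: int, n_failed: int = 0) -> float:
    """Approximate 95% upper bound on failure rate given zero observed failures."""
    assert n_failed == 0, "rule of three only applies to zero observed failures"
    return 3.0 / n_tested

# Testing 10 chips with no failures only bounds the failure rate below ~30%,
# which is far too weak to rule out a rare corruption bug.
print(rule_of_three_upper_bound(10))    # 0.3
print(rule_of_three_upper_bound(1000))  # 0.003
```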
Some a priori knowledge you have is that Intel:
* has high complexity in their chips compared to every non-whitelisted vendor
* has cut back on verification effort in the past decade
So even after all that testing, you should still expect a greater chance of funny failures on Intel.
Someone who's following along on this discussion pointed out https://www.intel.com/content/www/us/en/support/articles/000055650/processors.html, which shows this isn't the only time Intel has had a bug like this. You couldn't even get a bug like this on a Centaur chip; the logic isn't sophisticated enough.
So after testing 1k chips from Intel, what do you know? If you have some knowledge of CPU internals and talk to Intel verification folks, your prior (IMO) should still be that they're the most likely to have this kind of issue.
After testing 1k chips, if all 1k failed under load, I would say we need to insist on a more specific cpuid whitelist, maybe consider design changes, and get feedback from the vendor? I really think this is pretty reasonable.
If 1k out of 1k failed under load, the chip never would've shipped in the first place; you're never going to see that. The linked bug is contemporary with the NaCl thread. How come the CPU vendor *most likely to have issues like the one you describe* is one of the two whitelisted?
If by "under load" you mean not overclocked, but under load. And as I'm sure you know, there was a fairly serious CPU correctness bug found by your employer during that timeframe, unsurprisingly against a whitelisted vendor.
What percentage of chips couldn't handle lock cmpxchg8b eax? We're not talking about well-formed code; we're talking about adversarial code from the web, designed to trigger edge cases or exercise obscure parts of the spec. Not sure Spectre was relevant to NaCl; Rowhammer was.
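"lock cmpxchg8b eax" is the Pentium F00F bug: cmpxchg8b requires a 64-bit memory operand, so the register encoding is architecturally invalid, and the original Pentium hung on the locked form instead of raising #UD. A small sketch decoding the ModRM byte of the classic byte sequence to show why the encoding is invalid:

```python
# Sketch: the F00F sequence is F0 0F C7 C8 (lock; cmpxchg8b eax).
# cmpxchg8b requires a memory operand, but a ModRM mod field of 0b11
# selects a register, so this encoding should raise #UD. On the original
# Pentium, the LOCK prefix on this invalid encoding hung the CPU instead.
F00F = bytes([0xF0, 0x0F, 0xC7, 0xC8])  # lock prefix, 0F C7 opcode, ModRM

def modrm_is_register_operand(modrm: int) -> bool:
    """mod field (top two bits) == 0b11 means r/m names a register, not memory."""
    return (modrm >> 6) == 0b11

modrm = F00F[3]
print(modrm_is_register_operand(modrm))  # True -> invalid for cmpxchg8b
```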