another random thought: I've had issues with bad RAM twice in my life. it takes me some time to get from "huh, my browser segfaults pretty often, that seems weird" to "maybe I have bad RAM?". if Linux had support for scanning for bad RAM on segfault, that'd be cool
Replying to @tehjh
Hrmmm, I like the idea but it also seems like a predictable response, using core resources, that could be exploited. Can I intentionally SEGV and impede other resources? Can I learn anything from imposing delays on other processes outside of my address space :)
Replying to @dotMudge
attack ideas I can come up with against a naive implementation:
- you could perform eviction from the page cache more efficiently - might be useful for rowhammer
- you could temporarily steal memory from the system (effectively a DoS)
but I think these are avoidable problems
you'd probably want to apply some ratelimiting - if a bunch of crashes happen and the first few aren't due to bad RAM, it's probably not very useful to keep scanning. and you could perhaps make sure that if the system is under memory pressure, unscanned memory is given back
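A minimal sketch of that ratelimiting idea, in userspace Python rather than kernel code. The class name `ScanRateLimiter`, its parameters, and the thresholds are all made up for illustration; nothing like this exists in Linux as far as this thread establishes. It also simplifies "give unscanned memory back under memory pressure" to "don't start a scan under memory pressure".

```python
import time
from collections import deque


class ScanRateLimiter:
    """Decide whether a crash should trigger a RAM scan (hypothetical policy)."""

    def __init__(self, max_clean_scans=3, window_seconds=3600.0, clock=time.monotonic):
        self.max_clean_scans = max_clean_scans  # clean scans tolerated per window
        self.window_seconds = window_seconds
        self.clock = clock                      # injectable so the policy can be tested
        self.clean_scans = deque()              # timestamps of scans that found nothing

    def should_scan(self, under_memory_pressure=False):
        # Simplification: never start a scan while the system is short on memory.
        if under_memory_pressure:
            return False
        now = self.clock()
        # Forget clean scans that have aged out of the window.
        while self.clean_scans and now - self.clean_scans[0] > self.window_seconds:
            self.clean_scans.popleft()
        # If the last few crashes were not caused by bad RAM, stop scanning for a while.
        return len(self.clean_scans) < self.max_clean_scans

    def record_result(self, found_bad_ram):
        # Only clean scans count against the budget.
        if not found_bad_ram:
            self.clean_scans.append(self.clock())
```

A crash handler would call `should_scan()` before scanning and `record_result(...)` afterwards; once the budget of clean scans is used up, further crashes inside the window are not scanned.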
Having worked for a while on memtest86 I can assure you that testing if modern RAM is bad is difficult. The easy cases rarely make it out of the factory. What you are left with are difficult cases like burst transfer errors. Using ECC is a much easier solution.
interesting. the cases I've had (well, at least the ones I noticed) were really blatant - one specific bit, or multiple bits, of RAM being stuck at a fixed value
If your concern is stuck bits I would suggest writing a boot time test to find and quarantine them. KISS. I am surprised you had systems that would even boot with a stuck bit.
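For the blatant stuck-bit case only, the usual approach is to write known patterns and read them back. The sketch below runs against a simulated memory (`StuckBitMemory` is a made-up class for illustration), because Python cannot address physical RAM and a real test also has to deal with CPU caches. It would not catch the harder faults mentioned above, such as burst transfer errors.

```python
class StuckBitMemory:
    """Simulated RAM in which some bits are stuck (illustration only)."""

    def __init__(self, size, stuck):
        # stuck maps byte offset -> (mask, value): bits in mask always read as in value
        self.data = bytearray(size)
        self.stuck = stuck

    def __len__(self):
        return len(self.data)

    def write(self, offset, byte):
        self.data[offset] = byte

    def read(self, offset):
        byte = self.data[offset]
        if offset in self.stuck:
            mask, value = self.stuck[offset]
            byte = (byte & ~mask & 0xFF) | (value & mask)
        return byte


def find_stuck_bits(mem, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Return {offset: mask of bits that read back wrong for at least one pattern}."""
    bad = {}
    for pattern in patterns:
        # Fill everything first, then verify, so each cell is read after all writes.
        for offset in range(len(mem)):
            mem.write(offset, pattern)
        for offset in range(len(mem)):
            diff = mem.read(offset) ^ pattern
            if diff:
                bad[offset] = bad.get(offset, 0) | diff
    return bad
```

The offsets returned are the ones a boot-time test would quarantine, meaning it would exclude them from the memory handed to the rest of the system.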
oh, actually, Linux has that already... I should probably turn that on. "memtest=2" on the cmdline
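For reference, the `memtest=` parameter takes an integer number of test passes (0 disables it) and only has an effect on kernels built with `CONFIG_MEMTEST`. A small sketch for checking whether the running system was booted with it; the helper names `memtest_passes` and `current_memtest_passes` are made up here, and taking the last occurrence when the parameter is repeated is an assumption about how to read the command line, not documented kernel behavior.

```python
def memtest_passes(cmdline):
    """Return N from the last memtest=N token on a kernel command line, or None."""
    passes = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "memtest" and sep and value.isdigit():
            passes = int(value)
    return passes


def current_memtest_passes(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only) and parse it."""
    with open(path) as f:
        return memtest_passes(f.read())
```

On a Linux box, `current_memtest_passes()` returning `None` means the boot-time test was not requested on the command line.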
but a boot-time memory test is probably not something a distro would turn on, because of the impact on boot time... which is why I was thinking that only doing it when a segfault occurred, and only on the process's memory, might be more likely to be turned on in default configs