This. I earned a reputation on my team at Google for debugging gnarly bugs. It was one of the most fun parts of the job. You've all seen that Go runtime bug story on my blog, which was after I left, but the stories inside were fun too. Highlights include: (thread) https://twitter.com/erincandescent/status/1281290723443294208 …
Replicated and cross-checked process randomly fails due to an *offset* (not data) mismatch between both replicas, on the order of once per petabyte. Cause: Google's internal optimized zlib implementation had a buffer overflow read that very rarely caused correct but *longer* output.
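To make the "offset, not data" distinction concrete, here is a minimal sketch using Python's standard `zlib` module. It does not reproduce the internal bug; it only assumes, for illustration, that two replicas end up with different but equally valid compressed encodings of the same bytes. The payload and variable names are hypothetical.

```python
import zlib

# Hypothetical payload; any compressible data would do.
payload = b"record-" * 1000

# Two valid encodings of the same bytes: level 9 (best compression)
# versus level 0 (stored blocks, no compression).
short_stream = zlib.compress(payload, 9)
long_stream = zlib.compress(payload, 0)

# Both streams decompress to identical data, so a data comparison passes...
assert zlib.decompress(short_stream) == payload
assert zlib.decompress(long_stream) == payload

# ...but whatever each replica writes next lands at a different offset.
next_offset_a = len(short_stream)
next_offset_b = len(long_stream)
print(next_offset_a, next_offset_b)
```

A cross-check that compares decompressed contents would see nothing wrong here, while one that compares write offsets would flag the replicas as diverged.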
Service crashes upon receiving a protobuf with wrong schema. Root cause: producer server had kernel panicked, data not flushed to disk (but metadata was). On reboot picked up garbage sectors, happened to contain valid data from another process, format is self-synchronizing.
Bonus points for debugging that one entirely from logs and records of a single occurrence (and concluding beyond reasonable doubt that that's what had happened based on stuff like suspect offsets lining up on 512-byte boundaries), plus sec impact (data was from another user).
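The alignment check mentioned above can be sketched in a few lines. This is only an illustration of the idea, not the actual analysis; the offsets and the helper name `sector_aligned` are hypothetical, and the 512-byte sector size is the one given in the thread.

```python
SECTOR_SIZE = 512  # bytes, as stated in the thread

def sector_aligned(offsets, sector_size=SECTOR_SIZE):
    """Return the offsets that fall exactly on a sector boundary."""
    return [off for off in offsets if off % sector_size == 0]

# Hypothetical offsets where foreign-looking data begins in a file.
suspect_offsets = [4096, 4608, 5000, 10240]

print(sector_aligned(suspect_offsets))  # [4096, 4608, 10240]
```

If corruption were introduced by the application, its start offsets would have no particular reason to cluster on sector boundaries; a strong clustering points at the storage layer instead.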
There was also another Go bug after I left, which someone said was breaking Google production. Cgo callbacks back into Go would transiently break stack tracing after returning, causing the garbage collector to explode if it ran with the right timing. Repro was tricky.
Can't get away from fixing Google production even after you leave
golang/go#7978 for the Cgo issue, #20427 for the vDSO stack issue (the one from my blog). I don't have internal Google bug numbers for the others if that's what you're asking for, I no longer work there.