This. I earned a reputation at my team at Google for debugging gnarly bugs. It was one of the most fun parts of the job. You've all seen that Go runtime bug story on my blog, which was after I left, but the stories inside were fun too. Highlights include: (thread)https://twitter.com/erincandescent/status/1281290723443294208 …
-
-
Service crashes upon receiving a protobuf with wrong schema. Root cause: producer server had kernel paniced, data not flushed to disk (but metadata was). On reboot picked up garbage sectors, happened to contain valid data from another process, format is self-synchronizing.
Show this thread -
Bonus points for debugging that one entirely from logs and records of a single occurrence (and concluding beyond reasonable doubt that that's what had happened based on stuff like suspect offsets lining up on 512 byte boundaries), plus sec impact (data was from another user).
Show this thread -
There was also another Go bug after I left, which someone said was breaking Google production. Cgo callbacks back into Go would transiently break stack tracing after returning, causing the garbage collector to explode if it ran with the right timing. Repro was tricky.
Show this thread -
Can't get away from fixing Google production even after you leave
Show this thread
End of conversation
New conversation -
Loading seems to be taking a while.
Twitter may be over capacity or experiencing a momentary hiccup. Try again or visit Twitter Status for more information.