Conversation

Infrastructure people: do you systematically track and investigate “glitch states”, or do you wait for them to actually turn into failures/error states? Glitch state as in: something feels vaguely troubling in your peripheral vision around X, but it’s working OK for now. A disturbance in the Force.
I maintain a list called “glitch list” and just monitor it. It’s mostly things that might break under some stressor. Like, say, an ambiguous tax thing that may or may not be an irregularity and may or may not break at tax time. Known fragilities/cracks that may propagate.
almost anyone with experience with computers does a fair amount of this, even superstitiously. interested in seeing responses, maybe really good infra people track this explicitly
I try more and more to pursue and understand these states when I get the vague feeling something is off. I've seen too many instances in software of ‘hmm, that seems wrong, could be a problem maybe?’ turning into ‘okay, well, that thing I didn’t fix/investigate broke the world.’
in aviation there are cyclic inspections to help surface and devote time to fixing such things. "every X days, you have to open this engine and do a top to bottom inspection, fixing anything you find asap", etc
Systematically? No. There are pre-errors, though: equipment in high (or low, or off-) normal, or something found during routine inspection that doesn't meet the reporting standard but could develop into a reportable incident.
I don't explicitly track them generally but do very much feel them. A lot of infrastructure work is sharply limiting the number of states code can be in so glitches become impossible (as only explicitly defined states/transitions are allowed).
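A rough sketch of what "only explicitly defined states/transitions are allowed" can look like in practice. Everything here is hypothetical and made up for illustration (the `State` names, the `ALLOWED` table, the `Service` class); it's not from any particular system, just one way to make an undefined in-between state raise an error instead of lingering as a glitch.

```python
from enum import Enum


class State(Enum):
    # Hypothetical lifecycle states for an imaginary service.
    STOPPED = "stopped"
    STARTING = "starting"
    RUNNING = "running"
    DRAINING = "draining"


# Hypothetical transition table: any move not listed here is illegal.
ALLOWED = {
    State.STOPPED: {State.STARTING},
    State.STARTING: {State.RUNNING, State.STOPPED},
    State.RUNNING: {State.DRAINING},
    State.DRAINING: {State.STOPPED},
}


class IllegalTransition(Exception):
    """Raised when code attempts a transition missing from ALLOWED."""


class Service:
    def __init__(self):
        self.state = State.STOPPED

    def transition(self, new_state):
        # Refuse anything outside the explicit table, leaving state untouched.
        if new_state not in ALLOWED[self.state]:
            raise IllegalTransition(f"{self.state.name} -> {new_state.name}")
        self.state = new_state
```

With this shape, something like jumping from RUNNING straight to STOPPED fails loudly at the moment it is attempted, rather than leaving the service in a state nobody planned for.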