Realized one of the things that has been a nagging doubt about generative AI for me is that we don’t know in what way a particular model is *untrustworthy*
99% correct does not matter if you have no idea where the 1% wrongness will be lurking
Like generated faces have… messed up ears
Text generators do small arithmetic ok but screw up larger arithmetic problems
Where does a coherent-sounding academic-paper summarizer tend to go wrong? Dangling prepositions are ok. A medical paper getting doses wrong, 😬😬
etc etc
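A minimal sketch of how the arithmetic example above could be probed: sample problems by operand size and record where the error rate climbs. `ask_model` is a hypothetical stand-in for whatever text generator is being tested, and the digit buckets are arbitrary choices, not anything from the thread.

```python
import random
from collections import defaultdict

def ask_model(prompt: str) -> str:
    """Hypothetical stand-in; replace with a real completion API call."""
    raise NotImplementedError

def probe_arithmetic(trials_per_bucket: int = 50) -> dict:
    accuracy = defaultdict(list)
    for digits in (2, 4, 8):  # small problems vs. progressively larger ones
        lo, hi = 10 ** (digits - 1), 10 ** digits - 1
        for _ in range(trials_per_bucket):
            a, b = random.randint(lo, hi), random.randint(lo, hi)
            reply = ask_model(f"What is {a} * {b}? Answer with only the number.")
            accuracy[digits].append(str(a * b) in reply.replace(",", ""))
    # The crude "wrongness map": error rate as a function of problem size.
    return {d: 1 - sum(v) / len(v) for d, v in accuracy.items()}
```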
With humans we very quickly form trust models; humans tend to be untrustworthy in predictable ways. With AIs we a) don't, and b) might never be able to, if the models are sensitive and fragile in their untrustworthiness distribution.
This is like a meta-trust problem.
ML models should come with wrongness distribution data sheets
Like material safety data sheets
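One possible shape such a data sheet could take, sketched as a data structure. The field names and example values below are invented purely for illustration (model cards are the nearest existing relative); they are not a real standard.

```python
from dataclasses import dataclass, field

@dataclass
class FailureMode:
    description: str      # e.g. "rounds or swaps numeric dosages"
    severity: str         # "cosmetic" | "misleading" | "dangerous"
    observed_rate: float  # fraction of sampled outputs exhibiting it
    detection_hint: str   # how a downstream user might catch it

@dataclass
class WrongnessDataSheet:
    model_id: str
    task_error_rates: dict[str, float] = field(default_factory=dict)
    known_failure_modes: list[FailureMode] = field(default_factory=list)
    untested_regimes: list[str] = field(default_factory=list)  # where the 1% might hide

# Illustrative instance; all numbers are made up.
sheet = WrongnessDataSheet(
    model_id="example-summarizer-v1",
    task_error_rates={"abstract summarization": 0.03, "dosage extraction": 0.11},
    known_failure_modes=[
        FailureMode("rounds or swaps numeric doses", "dangerous", 0.11,
                    "cross-check every number against the source"),
    ],
    untested_regimes=["non-English papers", "tables and figures"],
)
```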
Mostly this has been happening either via second-order plumbing or prompt stylings, like the "step by step" prompt. But I don't yet see a theory of error mapping and correction. It's actually the dual of explainability: if you can explain your reasoning, you can map your wrongness.
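A rough sketch of the two workarounds named above, again using a hypothetical `ask_model`: the "step by step" prompt styling, and second-order plumbing in which a separate pass audits the first answer. Neither maps the wrongness distribution; at best they shift it.

```python
def ask_model(prompt: str) -> str:
    """Hypothetical stand-in; replace with a real completion API call."""
    raise NotImplementedError

def step_by_step(question: str) -> str:
    # Prompt styling: nudge the model to expose intermediate reasoning.
    return ask_model(f"{question}\nLet's think step by step.")

def second_order_check(question: str) -> tuple[str, str]:
    # Second-order plumbing: a second pass asked to audit the first answer.
    answer = ask_model(question)
    critique = ask_model(
        f"Question: {question}\nAnswer: {answer}\n"
        "List any errors in this answer, or say 'none found'."
    )
    return answer, critique
```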
Replying to @vgr
They should. Though it's the same problem as detecting AI images: if you could make an AI that knows when it's wrong, you could also make an AI that isn't wrong.