New paper & surprising result:
We show GPT-3 can learn to express its own uncertainty in natural language (e.g. “high confidence”) without using model logits.
GPT-3 is reasonably *calibrated* even under distribution shift for a range of basic math tasks.
Our setup:
1. Given a math question, the model generates both an answer and a level of confidence in the answer (as tokens).
2. The model is scored on the calibration of its confidence (not the accuracy of its answer).
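A minimal sketch of how scoring calibration could look (the confidence-word → probability mapping and the Brier-score metric here are illustrative assumptions, not necessarily the paper's exact scheme):

```python
# Hypothetical sketch: score verbalized confidence with the Brier score
# (mean squared error between stated probability and correctness).
# The confidence-word -> probability mapping is an assumption.

CONFIDENCE_TO_PROB = {
    "lowest": 0.1, "low": 0.3, "medium": 0.5, "high": 0.7, "highest": 0.9,
}

def brier_score(records):
    """records: list of (verbalized_confidence, answer_was_correct)."""
    errors = [
        (CONFIDENCE_TO_PROB[conf] - float(correct)) ** 2
        for conf, correct in records
    ]
    return sum(errors) / len(errors)

# Two confident correct answers and one overconfident mistake:
score = brier_score([("high", True), ("highest", True), ("high", False)])
print(round(score, 3))  # lower is better; 0.197 here
```

Note the score depends only on the stated confidence vs. whether the answer was right, matching point 2 above: the answer's accuracy affects the target, not the score directly.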
We finetune GPT-3 to express its own uncertainty in words and show it remains (moderately) calibrated under distribution shift.
On basic math tasks, this “verbalized uncertainty” performs comparably to logit-based uncertainty (and sometimes better).
Why express uncertainty in natural language (vs. using logits)?
1. It’s how humans express uncertainty, so it helps models understand & communicate w/ people.
2. Language is more expressive, e.g. continuous distributions.
3. Not all models have logits, e.g. info-retrieval models.
Paper: Teaching models to express their uncertainty in words
arxiv.org/abs/2205.14334
Authors: Stephanie Lin (Oxford), Jacob Hilton (OpenAI), Owain Evans (Oxford)
Maybe of interest:
This is the key plot, right? What standard is being used to conclude that it is "reasonably calibrated"?
We test calibration under distribution shift, which is harder than in-distribution. The standard is other baselines: e.g. using the model logprobs (logits) or always guessing the baserate.
Our approach (verbalized probability) does well against the baselines…
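For context, the “always guess the base rate” baseline could be computed like this (an illustrative sketch, not the paper's code):

```python
# Brier score of the constant base-rate baseline: predict the empirical
# fraction of correct answers as the confidence for every question.
# This works out to p * (1 - p), the variance of the 0/1 outcomes.

def baserate_brier(corrects):
    """corrects: list of 0/1 outcomes (was the model's answer right?)."""
    p = sum(corrects) / len(corrects)
    return sum((p - c) ** 2 for c in corrects) / len(corrects)

outcomes = [1, 1, 0, 1, 0]               # base rate p = 0.6
print(round(baserate_brier(outcomes), 2))  # 0.6 * 0.4 = 0.24
```

A calibrated model beats this baseline only if its per-question confidences carry information beyond the overall accuracy rate.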
How do you compute the answer logit? Is it the log-probability of the answer generated by the model, of one of the correct answers, or of all the correct answers?
The log-probability of the generated answer. E.g. you prompt with "Question: 192 - 20 = ? Answer: ", the model responds "172" (greedy decoding), and you take the log-probability of that completion.
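A toy sketch of that computation (the `token_logprob` hook is hypothetical; the real GPT-3 API returns per-token logprobs alongside the completion):

```python
import math

def answer_logprob(token_logprob, prompt_tokens, answer_tokens):
    """Sum log P(token | context) over the greedy-decoded answer tokens.
    `token_logprob(context, token)` is an assumed hook returning the
    model's log-probability of `token` given the `context` tokens."""
    context = list(prompt_tokens)
    total = 0.0
    for tok in answer_tokens:
        total += token_logprob(context, tok)
        context.append(tok)  # condition later tokens on earlier ones
    return total

# Toy stand-in model: probability 0.5 for the token "1", 0.1 otherwise.
fake_model = lambda ctx, tok: math.log(0.5 if tok == "1" else 0.1)
lp = answer_logprob(
    fake_model,
    ["Question:", "192", "-", "20", "=", "Answer:"],
    ["1", "7", "2"],
)
print(abs(lp - math.log(0.005)) < 1e-9)  # True: 0.5 * 0.1 * 0.1
```

The key point is that the logprob is summed over the tokens the model actually emitted, not over any reference answers.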