So I don't want to sound alarms prematurely, here, but we could possibly be looking at the first case of an AI pretending to be stupider than it is. In this example, GPT-3 apparently fails to learn/understand how to detect balanced sets of parentheses. (1/10.)
Quote Tweet
My half serious attempt at testing if GPT-3 can detect if a set of parentheses is balanced. I don't have access to the beta, so I'm using ai dungeon as a backdoor which makes the prompting a bit... peculiar. pastebin.com/694Hy7tU
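For reference, the task John is being asked to learn is mechanically simple. Here is a minimal Python sketch of a balanced-parentheses check, assuming the strings contain only "(" and ")" plus characters that can be ignored. The function name `is_balanced` is my own label for illustration, not anything taken from the pastebin or from GPT-3.

```python
def is_balanced(s: str) -> bool:
    """Return True if every '(' in s is closed by a later ')'."""
    depth = 0  # number of currently open, unclosed parentheses
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A ')' appeared with nothing open to close.
                return False
        # Any other character is ignored (an assumption of this sketch).
    # Balanced only if nothing is left open at the end.
    return depth == 0


print(is_balanced("(()())"))  # a balanced example
print(is_balanced("(()"))     # one '(' is never closed
print(is_balanced(")("))      # closes before it opens
```

A single counter is enough here because there is only one bracket type; mixing in "[" or "{" would need a stack instead.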
Now, it's possible that GPT-3 "legitimately" did not understand this concept, even though GPT-3 can, in other contexts, seemingly write code or multiply 5-digit numbers. But it's also possible that GPT-3, playing the role of John, predicted that *John* wouldn't learn it.
It's tempting to anthropomorphize GPT-3 as trying its hardest to make John smart. That's what we want GPT-3 to do, right? But what GPT-3 actually does is predict text continuations. If *you* saw John say all that - would you *predict* the next lines would show John succeeding?
So it could be that GPT-3 straight-up can't recognize balanced parentheses. Or it could be that GPT-3 could recognize them given a different prompt. Or it could be that the cognition inside GPT-3 does see the pattern, but play-acts the part of 'John' getting it wrong.
The scariest feature of this whole incident? We have no idea if that happened. Nobody has any idea what GPT-3 is 'thinking'. We have no idea whether this run of GPT-3 contained a more intelligent cognition that faked a less intelligent cognition.
Now, I *could* be wrong about that last part! could be storing a record of all inputs and randseeds used in GPT-3 instances, so that they can reconstruct any interesting runs. And though it seems less likely, could somehow have any idea what a GPT-3 is thinking.
...answering merely by showing that, despite a lot of other attempts at prompting under more flexible circumstances, GPT-3 could not learn to balance parentheses as complicated as those tried by Breitman. (Which does answer the question, but in a less interesting way.)
If can't claim that bounty, I encourage them to develop tools for recording inputs, recording randseeds, and making sure all runs of GPTs are exactly reproducible; and much more importantly and difficultly, getting greater internal transparency into future AI processes.
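To make "recording inputs, recording randseeds" concrete, here is a rough Python sketch of the bookkeeping involved. Everything in it is hypothetical: `toy_sample` is a stand-in for a stochastic text generator and has nothing to do with how GPT-3 is actually run, and `record_run` and `replay` are names I made up. It only covers reproducibility, not the much harder problem of internal transparency.

```python
import random


def toy_sample(prompt: str, seed: int, n_tokens: int = 5) -> list:
    """Hypothetical stand-in for a sampler: output depends only on (prompt, seed)."""
    # Seeding with a string is deterministic across Python runs.
    rng = random.Random(f"{seed}:{prompt}")
    vocab = ["(", ")", "yes", "no", "balanced"]
    return [rng.choice(vocab) for _ in range(n_tokens)]


def record_run(log: list, prompt: str, seed: int) -> list:
    """Run the sampler and store what is needed to reconstruct the run later."""
    output = toy_sample(prompt, seed)
    log.append({"prompt": prompt, "seed": seed, "output": output})
    return output


def replay(entry: dict) -> bool:
    """Re-run a logged entry and check the output matches exactly."""
    return toy_sample(entry["prompt"], entry["seed"]) == entry["output"]


log = []
record_run(log, "Is (()()) balanced?", seed=1234)
print(replay(log[0]))  # True when the run is exactly reproducible
```

A real system would also have to pin the model weights, the sampling settings, and any nondeterminism in the hardware; the sketch assumes all of that away.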
Regardless, I unironically congratulate on demonstrating something that could plausibly be an alignment failure of this extremely-important-in-general type, thereby sharply highlighting the also-important fact that now we have no idea whether that really happened. (END.)
ADDED: Dear people: OpenAI did not do a bad thing by showing us this. They did not make this be a problem for advanced AI. It was always going to be a problem. They demoed it so you could see it, and showed that it happens surprisingly early. Don't shoot the messenger.