The first reason GPT-3 is bad at reversing words is tokenization: it doesn't see letters and words the way humans do. Instead, it sees "tokens."
Tokens are chunks of characters. For example, the word "alphabet" gets broken up into the tokens "alph" and "abet". Short and common words like "the" are a single token.
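The split comes from the tokenizer's vocabulary, not from anything visible in the text. Here's a minimal sketch of the idea using a tiny made-up vocabulary and greedy longest-match. (GPT-3's real tokenizer is a byte-pair encoding over a vocabulary of roughly 50,000 entries, so this is only an illustration of the effect.)

```python
# Toy subword tokenizer: greedy longest-match over a hypothetical vocabulary.
VOCAB = {"the", "alph", "abet", "a", "l", "p", "h", "b", "e", "t"}

def tokenize(text):
    """Greedily match the longest vocabulary entry at each position."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest candidate first
            if text[i:j] in VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            raise ValueError(f"no token matches at position {i}")
    return tokens

print(tokenize("alphabet"))  # ['alph', 'abet']
print(tokenize("the"))       # ['the']
```

The model only ever sees the token IDs for "alph" and "abet", which is why letter-level tasks like reversal are hard for it.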
We can make GPT-3 "see" the letters by having it add spaces between them. Then each letter becomes its own token: " a", " l", " p" and so on.
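In prompt terms, the trick is nothing more than inserting a space between every letter before asking the model to work on them:

```python
def space_out(word):
    """Put a space between letters so each one becomes its own token."""
    return " ".join(word)

print(space_out("alphabet"))  # a l p h a b e t
```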
But it's still not great at reversing words, because it gets "confused" by the ordering in long sequences. This is the second reason GPT-3 is bad at reversing words.
We can teach it to order characters by having it add numbers to keep track of the letters in the sequence. This simple trick makes the ordering explicit, and GPT-3 now manages to reverse the letters correctly.
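The exact prompt format is in the screenshot, but the idea can be sketched like this. (The numbering scheme below is one plausible encoding, not necessarily the exact one used in the thread.)

```python
def number_letters(spaced):
    """Label each letter with its position: 'a l p' -> '1 a 2 l 3 p'."""
    letters = spaced.split(" ")
    return " ".join(f"{i} {c}" for i, c in enumerate(letters, 1))

def reverse_numbered(numbered):
    """Reverse by (number, letter) pairs so labels travel with their letters."""
    parts = numbered.split(" ")
    pairs = [parts[i:i + 2] for i in range(0, len(parts), 2)]
    return " ".join(" ".join(p) for p in reversed(pairs))

numbered = number_letters("a l p h a b e t")
print(numbered)                    # 1 a 2 l 3 p 4 h 5 a 6 b 7 e 8 t
print(reverse_numbered(numbered))  # 8 t 7 e 6 b 5 a 4 h 3 p 2 l 1 a
```

With the positions written out explicitly, the model no longer has to track ordering implicitly; it just has to emit the pairs in descending order.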
So close! It manages to reverse the letter sequence, but fails to remove the spaces in the last step. That's tokenization striking again.
GPT-3 has trouble removing the spaces and merging tokens. We can teach it to break the task down into steps. In each step, it removes every other space: "a l p h a b e t" becomes "al ph ab et", which becomes "alph abet", and finally "alphabet".
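That pairwise merging is an algorithm in its own right. A sketch of what we're asking the model to do at each step:

```python
def merge_step(spaced):
    """Merge each adjacent pair of chunks, halving the number of spaces."""
    parts = spaced.split(" ")
    merged = [a + b for a, b in zip(parts[::2], parts[1::2])]
    if len(parts) % 2:          # odd chunk count: keep the trailing chunk
        merged.append(parts[-1])
    return " ".join(merged)

s = "a l p h a b e t"
while " " in s:
    s = merge_step(s)
    print(s)
# al ph ab et
# alph abet
# alphabet
```

Each step is easy for the model because it never has to merge more than two tokens at a time.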
And that's it! 🎉 GPT-3 correctly reverses long words! But to get there, we had to teach GPT-3 the algorithm to use to get around its limitations. 💪
Compare this to humans: We learn similar techniques to get around our limitations. 👨🏻‍🎓 I'm pretty bad at adding large numbers. But I can do it with pen and paper. By writing it out, and reasoning through the steps, I can get around the limitations of my brain.
The takeaway here is that large language models like GPT-3 might be smarter than we think. You wouldn't expect humans to solve hard problems without breaking them down and reasoning through the steps. We're just getting started at seeing what these models are capable of.