On transformers:
Transformers are successful because they combine several ideas at once: attention, the ability to run quickly on a GPU, and non-recurrence (which makes the network less deep and therefore easier to optimize).
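To make the non-recurrence point concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration only, not how any real library implements it: the function names are hypothetical, and it assumes identity query/key/value projections (a real transformer learns these as weight matrices). The thing to notice is that each position's output depends only on the inputs, never on a hidden state carried over from the previous position, so all positions could be computed in parallel.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention over a list of token vectors.

    Assumption for illustration: queries, keys, and values are the raw
    inputs themselves (identity projections), not learned projections.
    """
    d = len(x[0])
    out = []
    # Each iteration is independent of the others: no state is passed
    # from one position to the next, unlike in a recurrent network.
    for q in x:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out

tokens = [[1.0, 0.0], [0.0, 1.0]]
attended = self_attention(tokens)
```

On a GPU the per-position loop collapses into a few matrix multiplications, which is where the speed comes from.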
On self-play:
“Self-play has a property that it can surprise us in truly novel ways.” (e.g., creative solutions to problems that weren’t anticipated, and that are actually useful!)
On the goodness of people:
“I don’t know if most people are good, but I think that when it really counts people can be better than you think.” ♥️