@OpenAI: AI and Compute: Our analysis showing that the amount of compute used in the largest AI training runs has had a doubling period of 3.5 months since 2012, a net increase of 300,000x: https://blog.openai.com/ai-and-compute/
-
@OpenAI: We've analyzed compute used in major AI results over the past decades and identified two eras in AI: 1) Prior to 2012, AI results closely tracked Moore's Law, with compute doubling every two years. 2) Post-2012, compute has been doubling every 3.4 months. https://openai.com/blog/ai-and-compute/#addendum pic.twitter.com/ILN5MRrWYH
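For readers who want to check the headline arithmetic (the thread quotes both 3.5 and 3.4 months; 3.4 is the addendum's revised figure), here is a minimal sketch in Python. The endpoint values and dates are my assumptions, read off OpenAI's chart rather than stated in this thread:

```python
from math import log2

# Rough check that the headline numbers are mutually consistent:
# AlexNet (mid-2012) to AlphaGo Zero (late 2017) on OpenAI's chart.
alexnet = 5.8e-3       # petaflop/s-days (assumed, read off the chart)
alphago_zero = 1.86e3  # petaflop/s-days (assumed, read off the chart)
months = 5.3 * 12      # ~5.3 years between the two runs (assumed)

doublings = log2(alphago_zero / alexnet)
print(f"net increase: {alphago_zero / alexnet:,.0f}x")      # ~320,000x
print(f"doubling period: {months / doublings:.1f} months")  # ~3.5 months
```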
-
Replying to @OpenAI
Where would Deep Blue and Watson fit on this chart? They are - surprisingly - missing.
-
Replying to @Grady_Booch @OpenAI
We plotted Deep Blue early on in the analysis, but took it out of the chart because the results shown are ones that predominantly use ML. I'm not sure there was enough info in the Watson papers to do a compute calc.
-
Here is the Deep Blue number & method, hope it's helpful! pic.twitter.com/iQVqN4GKhW
-
Yep, note that this was an initial cut at Deep Blue and makes some assumptions re: utilization and "thinking time". As Jack said, this particular analysis looked at learning systems specifically.
-
Replying to @girishsastry @jackclarkSF and others
Where did the 11 GFLOP/s number come from? IIRC, the paper says they could evaluate ~200 million positions per second and were calculating ~8,000 features per position. That makes it at least 1 TFLOP/s, and probably at least an order of magnitude higher.
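A quick check of that arithmetic, as a minimal Python sketch; treating one feature evaluation as a single operation is my assumption, which is also why the resulting figure is only a lower bound:

```python
# Lower-bound estimate of Deep Blue's throughput from the figures
# quoted above (IBM's reported numbers, per the tweet).
positions_per_sec = 200e6       # ~200 million positions per second
features_per_position = 8_000   # ~8,000 features per position

# Assumption: one feature evaluation costs roughly one operation.
ops_per_sec = positions_per_sec * features_per_position
print(f"~{ops_per_sec / 1e12:.1f} TFLOP-equivalent/s")  # ~1.6
```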
-
Replying to @michael_nielsen @girishsastry and others
I checked the numbers. Deep Blue (1997) appears comparable to the 2012 systems: https://twitter.com/michael_nielsen/status/1192667888840142848

Quoting @michael_nielsen: Interesting to add Deep Blue to this graph. The 1997 system achieved ~8 teraflop/s, according to IBM's paper. I don't know how long it ran in the Kasparov match, but it was many hours, maybe tens of hours. So it's between 10^-3 and 10^-2 PF-days, above DQN, below AlexNet. https://twitter.com/OpenAI/status/1192481690741903360
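Converting that estimate into the chart's unit (petaflop/s-days) is a one-liner; the 4-to-40-hour bracket below is my reading of "many hours, maybe tens of hours":

```python
# ~8 TFLOP/s (per the quoted tweet) sustained over the match,
# expressed in petaflop/s-days: 1 PF-day = 1e15 ops/s * 86,400 s.
rate = 8e12             # operations per second
pf_day = 1e15 * 86_400  # operations in one petaflop/s-day

for hours in (4, 40):   # assumed bracket for total thinking time
    total_ops = rate * hours * 3_600
    print(f"{hours:>2} h -> {total_ops / pf_day:.1e} PF-days")
# 4 h -> 1.3e-03, 40 h -> 1.3e-02: the 10^-3 to 10^-2 range claimed.
```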
A caveat: as a commenter points out, the Deep Blue numbers aren't FLOPs, but a somewhat different kind of operation. Still, it's pretty clearly in the 2012 ballpark.
-
Replying to @michael_nielsen @Grady_Booch and others
Amusing: if you extrapolate back to Deep Blue's 1997 performance, then the doubling time becomes... roughly every 2 years.
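The calculation behind a quip like that is just the two-point doubling-time formula; the endpoint values below are my assumptions from earlier in the thread, shown only to illustrate how strongly the answer depends on which systems you pair:

```python
from math import log2

def doubling_time_years(c1, t1, c2, t2):
    """Years per doubling implied by compute c1 at year t1, c2 at t2."""
    return (t2 - t1) / log2(c2 / c1)

# Deep Blue (~3e-3 PF-days, 1997) paired with AlexNet (~6e-3, 2012):
print(doubling_time_years(3e-3, 1997, 6e-3, 2012))  # ~15 years/doubling
# ...or with an AlphaGo Zero-scale run (~2e3 PF-days, 2017):
print(doubling_time_years(3e-3, 1997, 2e3, 2017))   # ~1 year/doubling
```

Either way, anchoring the trend at Deep Blue stretches the implied doubling time far past 3.4 months, which is the point of the joke.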