AI and Compute: Our analysis showing that the amount of compute used in the largest AI training runs has had a doubling period of 3.5 months since 2012 (net increase of 300,000x): https://blog.openai.com/ai-and-compute/
We've analyzed compute used in major AI results for the past decades and identified two eras in AI: 1) Prior to 2012 - AI results closely tracked Moore's Law, w/ compute doubling every two years. 2) Post-2012 - compute has been doubling every 3.4 months https://openai.com/blog/ai-and-compute/#addendum pic.twitter.com/ILN5MRrWYH
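As a rough consistency check on those figures: a fixed doubling period implies a fold increase of 2^(months elapsed / doubling period). A minimal Python sketch (the ~62-month span below is derived from the quoted 300,000x and 3.4-month numbers, not from the analysis's exact dates):

    import math

    def months_for_growth(factor, doubling_period_months):
        # Months needed to grow by `factor` at a fixed doubling period.
        return math.log2(factor) * doubling_period_months

    def growth_over(months, doubling_period_months):
        # Fold increase over `months` at a fixed doubling period.
        return 2 ** (months / doubling_period_months)

    # A 300,000x increase at a 3.4-month doubling implies roughly 62 months (~5 years):
    span = months_for_growth(300_000, 3.4)
    print(round(span, 1))                   # ~61.9

    # Over that same span, a Moore's-Law-style 2-year doubling would give only ~6x:
    print(round(growth_over(span, 24), 1))  # ~6.0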
Replying to @OpenAI
Where would Deep Blue and Watson fit on this chart? They are - surprisingly - missing.
Replying to @Grady_Booch @OpenAI
We plotted Deep Blue on this early on in the analysis, but took it out of the chart, as the results shown are ones that predominantly use ML. I'm not sure there was enough info in the Watson papers to do a compute calc.
Here is the Deep Blue number & method, hope helpful! pic.twitter.com/iQVqN4GKhW
Yep, note that this was an initial cut at Deep Blue and makes some assumptions re: utilization and "thinking time". As Jack said, for this analysis we looked at learning systems in particular.
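For context, an estimate of that kind amounts to a machine's peak FLOP/s times an assumed utilization times an assumed amount of compute time. A minimal sketch with hypothetical placeholder inputs (the actual figures are in the attached image, which isn't reproduced here):

    def total_compute_flops(peak_flops_per_sec, utilization, seconds_of_compute):
        # Back-of-envelope total compute: peak rate x assumed utilization x time.
        # All three inputs are assumptions of the kind mentioned above.
        return peak_flops_per_sec * utilization * seconds_of_compute

    # Hypothetical example values, not the numbers from the attached chart:
    estimate = total_compute_flops(
        peak_flops_per_sec=11e9,   # e.g. a ~11 GFLOP/s benchmark figure
        utilization=0.3,           # assumed fraction of peak actually used
        seconds_of_compute=3600,   # assumed "thinking time" in seconds
    )
    print(f"{estimate:.2e} FLOPs")  # ~1.19e13 with these placeholder inputs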
Replying to @girishsastry @jackclarkSF and others
Where did the 11 GF / sec number come from? IIRC, the paper says they could evaluate ~200 million positions per second, and were calculating ~8,000 features per position. That makes it at least 1 TF / sec, and probably at least an order of magnitude higher.
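The arithmetic behind that estimate, as a small sketch (counting operations rather than strictly floating-point ops, and using the figures as recalled in the tweet):

    positions_per_sec = 200e6      # ~200 million positions evaluated per second (as recalled)
    features_per_position = 8_000  # ~8,000 features computed per position (as recalled)

    ops_per_sec = positions_per_sec * features_per_position
    print(f"{ops_per_sec:.1e} ops/sec")  # 1.6e+12, i.e. above 1 T ops/sec

    # Versus the ~11 GFLOP/s LINPACK figure cited below, that's roughly two
    # orders of magnitude higher:
    print(round(ops_per_sec / 11e9))     # ~145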
Replying to @michael_nielsen @jackclarkSF and others
It comes from the LINPACK benchmark in 1997: https://www.top500.org/site/48052
Replying to @girishsastry @michael_nielsen and others
The quoted calculation does ignore the specialized chips for position evaluation, though.
The number doesn't seem relevant then.
Replying to @michael_nielsen @jackclarkSF and others
Yeah, sorry, I was imprecise: I'm uncertain whether the LINPACK benchmark number exercises the chess chips, but probably not? (IIRC I was going to look into this more, but we decided to focus on training models.)
Replying to @girishsastry @jackclarkSF and others
Yes, the LINPACK benchmark seems in the right ballpark for what you'd get if you ignored the chess chips.
End of conversation