A new benchmark for human-level concept learning and reasoning. Humans beat #AI hands down! Shows gaps with current #DeepLearning meta/few-shot learning.
@NeurIPSConf @NVIDIAAI @wn8_nie @yukez @ZhidingYu @abp4_ankit
Blog: https://developer.nvidia.com/blog/building-a-benchmark-for-human-level-concept-learning-and-reasoning/
Paper: https://papers.nips.cc/paper/2020/file/bf15e9bbff22c7719020f9df4badc20a-Paper.pdf
-
The Bongard analogy challenge uses 6 positive and 6 negative examples to convey a concept (e.g. convexity). The original set from the 1960s had only a few hundred hand-engineered problem instances. Instead, we use LOGO to programmatically generate concepts at scale => no data scarcity for #DeepLearning.
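To make the "generate at scale" idea concrete, here is a minimal sketch, assuming a turtle-style action program of (turn, move) strokes; the action format and helper names are my assumptions for illustration, not the released generator:

```python
# Minimal sketch of LOGO-style shape generation (illustrative only, not the
# released generator; the action format and helper names are assumptions).
import math
import random

def run_program(actions, step=1.0):
    """Execute a list of (turn_degrees, move_length) turtle actions and
    return the sequence of 2D points the pen visits."""
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for turn, length in actions:
        heading += math.radians(turn)
        x += length * step * math.cos(heading)
        y += length * step * math.sin(heading)
        points.append((x, y))
    return points

def sample_random_program(n_strokes=6):
    """Sample a fresh random action program -- each new program is,
    in effect, a new concept instance."""
    return [(random.uniform(-120.0, 120.0), random.uniform(0.5, 2.0))
            for _ in range(n_strokes)]

if __name__ == "__main__":
    print(run_program(sample_random_program()))
```
-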
Human evaluation accuracy on our benchmark was as high as 99% for careful evaluators, yet all current meta/few-shot/self-supervised methods score below 70%. Even the best #DeepLearning methods cannot learn the underlying simple concepts at human level.
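For reference, a hedged sketch of how that accuracy number can be computed: each problem gives a model the positive and negative support images, and the model must say whether each held-out query image satisfies the hidden concept. The problem structure and `predict` interface below are assumptions, not the released evaluation code.

```python
# Sketch of the accuracy computation only; the problem structure and the
# `predict` interface are assumptions, not the released evaluation code.
from typing import Any, Callable, List, Tuple

# A problem: (positive support images, negative support images,
#             held-out (query image, true label) pairs) -- structure assumed.
Problem = Tuple[List[Any], List[Any], List[Tuple[Any, bool]]]

def evaluate(problems: List[Problem],
             predict: Callable[[List[Any], List[Any], Any], bool]) -> float:
    """Fraction of held-out query images classified correctly."""
    correct, total = 0, 0
    for positives, negatives, queries in problems:
        for query, label in queries:
            correct += int(predict(positives, negatives, query) == label)
            total += 1
    return correct / max(total, 1)
```
-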
We carefully designed the benchmark so that it does not reward overfitting by #DeepLearning methods, which is a problem with previous few-shot learning benchmarks. E.g. the birds dataset uses very similar birds that differ only in simplistic attributes like beak color.
-
We designed our benchmark to reward qualities seen in human vision: (1) context dependence, (2) analogy making, (3) few-shot learning with an infinite vocabulary. Simple overfitting and memorization by #DeepLearning will not be enough to do well on our benchmark.
-
Context dependence: the same shape can have different meanings in different contexts. In the left figure, the highlighted shape shares a different property with the other 5 positive examples than it does in the right figure. Hence simple pattern matching, which #DeepLearning is good at, WILL FAIL. pic.twitter.com/i8E0dOeydk
-
Analogy making, or compositionality: simple shapes compose together. E.g. the small circles in the highlighted figure are arranged to form a "meta" shape. Humans are amazing at creating abstractions, and we want to test this ability in #AI. pic.twitter.com/Lz4ouU4Y9a
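A toy sketch of that kind of composition (illustrative only, not the dataset generator): copies of a base stroke, here a small circle, placed at the vertices of a larger polygon so that together they form a "meta" shape.

```python
# Toy sketch of compositional ("meta") shapes: copies of a base stroke (a
# small circle) placed at the vertices of a larger polygon. Illustrative
# only -- not the dataset generator.
import math

def circle_points(cx, cy, r, n=12):
    """Approximate a small circle by n points -- the base shape."""
    return [(cx + r * math.cos(2 * math.pi * k / n),
             cy + r * math.sin(2 * math.pi * k / n)) for k in range(n)]

def meta_shape(n_vertices=5, radius=10.0, base_radius=1.0):
    """Arrange small circles on the vertices of a regular polygon so they
    jointly trace out a larger, abstract shape."""
    strokes = []
    for k in range(n_vertices):
        angle = 2 * math.pi * k / n_vertices
        strokes.append(circle_points(radius * math.cos(angle),
                                     radius * math.sin(angle),
                                     base_radius))
    return strokes

if __name__ == "__main__":
    print(len(meta_shape()), "base circles composed into one meta shape")
```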
-
Infinite vocabulary: previous benchmarks are limited to finite categories, which are easy for #DeepLearning to memorize. To prevent this, we programmatically generate new concepts, grouped into three subcategories: free-form, basic and abstract. pic.twitter.com/vVVi8rTCFR
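Purely as an illustration of why the concept vocabulary is effectively unbounded, a sketch of sampling a fresh concept per subcategory; the specific shape and attribute names are hypothetical placeholders, not the dataset's actual vocabulary or API.

```python
# Purely illustrative sketch of why the vocabulary is unbounded: a concept is
# sampled on the fly rather than drawn from a fixed label set. The shape and
# attribute names below are hypothetical placeholders, not the dataset's
# actual vocabulary or API.
import random

BASIC_SHAPES = ["circle", "square", "triangle", "zigzag", "fan"]       # assumed
ABSTRACT_ATTRIBUTES = ["convex", "symmetric", "has_obtuse_angle"]      # assumed

def sample_concept(category):
    """Return a (category, description) pair for a freshly sampled concept."""
    if category == "free-form":
        # a newly sampled random action program is itself a new concept
        return category, [random.uniform(-120.0, 120.0)
                          for _ in range(random.randint(4, 9))]
    if category == "basic":
        # a composition of two named base shapes
        return category, tuple(random.sample(BASIC_SHAPES, 2))
    if category == "abstract":
        # a high-level attribute that all positive shapes must share
        return category, random.choice(ABSTRACT_ATTRIBUTES)
    raise ValueError(category)

if __name__ == "__main__":
    print(sample_concept(random.choice(["free-form", "basic", "abstract"])))
```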
-
Our dataset looks simple and toy-like, yet it is deeply challenging for #DeepLearning due to (1) context dependence, (2) abstractions, (3) infinite vocabulary. We resorted to synthetic data because real datasets for few-shot learning suffer from severe data imbalance and scarcity.
-
On our benchmark, a basic neuro-symbolic method beat all neural approaches consistently and significantly, including the latest meta/few-shot/self-supervised approaches. This shows that symbol grounding is fundamental to concept learning.
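To give a flavor of what "neuro-symbolic" can mean here, a schematic sketch, not the paper's actual model: a neural perception module (assumed here as an already-trained image_to_program function) maps each image to a symbolic action program, and the few-shot decision is then made over simple program-level features.

```python
# Schematic sketch of a neuro-symbolic pipeline in this spirit (NOT the
# paper's exact model): a neural perception module -- assumed here as an
# already-trained `image_to_program` function -- maps each image to a
# symbolic action program, and the few-shot decision is made on simple
# program-level features.
import math
from typing import Any, Callable, List, Sequence, Tuple

Program = List[Tuple[float, float]]  # (turn_degrees, move_length) strokes

def program_features(prog: Program) -> Tuple[float, float, float]:
    """Hand-picked symbolic features: stroke count, total turning, total length."""
    return (float(len(prog)),
            sum(abs(t) for t, _ in prog),
            sum(l for _, l in prog))

def centroid(programs: Sequence[Program]) -> Tuple[float, ...]:
    """Mean feature vector of a set of inferred programs."""
    feats = [program_features(p) for p in programs]
    return tuple(sum(f[i] for f in feats) / len(feats) for i in range(3))

def classify(query_img: Any,
             pos_imgs: Sequence[Any],
             neg_imgs: Sequence[Any],
             image_to_program: Callable[[Any], Program]) -> bool:
    """True if the query's program features are closer to the positive set."""
    q = program_features(image_to_program(query_img))
    pos_c = centroid([image_to_program(x) for x in pos_imgs])
    neg_c = centroid([image_to_program(x) for x in neg_imgs])
    return math.dist(q, pos_c) < math.dist(q, neg_c)
```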
-
Project page for our @NeurIPSConf Bongard-LOGO paper, with all resources including the dataset generation code: https://research.nvidia.com/publication/2020-12_Bongard-LOGO
@yukez @ZhidingYu @wn8_nie @abp4_ankit @NVIDIAAI