If you build the data set, I bet a neural network can learn the distribution.
life is not a Kaggle competition. my kids didn’t demand i give them IID data sets that matched the tests in the world. people will be free to use massive online corpora for my benchmark, but the hand-holding has to stop.
New conversation
For someone not deeply involved with NLP, what is a world state change benchmark about? Or is it a more general "changing the language but keeping the underlying laws" idea, for zero or few shot learning?
when we read or hear a story, or just look around, we build an internal cognitive model of what’s going on, and we update that model as we learn more. eg we track who has done what to whom, when, where, & why. language neural nets like GPT-2 don’t. see http://rebooting.ai
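The tweet above describes the idea informally rather than specifying a system. Purely as illustration, here is a hypothetical toy sketch (all names and structure invented, not from any benchmark) of an explicit world state that records who has done what to whom and where, and can be queried as a story unfolds:

```python
# Hypothetical toy "world state" (illustrative only): a record of
# who did what to whom, updated as a story unfolds.
world = {}

def update(agent, action, patient, location=None):
    """Record one event for `agent` and return that agent's event history."""
    world.setdefault(agent, []).append(
        {"action": action, "patient": patient, "location": location}
    )
    return world[agent]

update("Alice", "gave", "the key to Bob", location="the kitchen")
update("Bob", "hid", "the key")

# Query the model: what has Bob done?
print(world["Bob"])
```

The point of the contrast is that a purely statistical language model has no such explicit, queryable record of entities and events to update.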
New conversation
ALBERT seems like the best implementation right now for pretrained language models: https://github.com/google-research/google-research/tree/master/albert
I wonder whether CTRL (https://arxiv.org/abs/1909.05858) could be appropriate; available on GitHub (https://github.com/salesforce/ctrl) and in Hugging Face Transformers (https://github.com/huggingface/transformers/releases/tag/v2.1.1).
Let’s say D1 is the distribution the network we are transferring from was trained on, and D2 is the new distribution we are trying to learn. My intuition tells me that D1 and D2 are a lot closer in images than in natural language processing. Why is this? Let me explain.
In the image case we are mostly dealing with static data, whereas language unfolds sequentially, like a time series. So the distance between D1 and D2 can be much more drastic, which could be one reason transfer learning in NLP is harder than in the image case.
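One way to make the D1-vs-D2 intuition concrete is to measure the gap between two empirical distributions directly. A minimal sketch (the toy token samples below are invented for illustration) using total variation distance:

```python
from collections import Counter

def total_variation(samples1, samples2):
    """Total variation distance between two empirical distributions:
    0.0 means identical, 1.0 means disjoint support."""
    c1, c2 = Counter(samples1), Counter(samples2)
    n1, n2 = len(samples1), len(samples2)
    support = set(c1) | set(c2)
    return 0.5 * sum(abs(c1[x] / n1 - c2[x] / n2) for x in support)

# Toy example: token frequencies from a "source" and a "target" corpus.
d1 = ["cat", "dog", "cat", "bird"]
d2 = ["cat", "dog", "dog", "fish"]
print(total_variation(d1, d2))  # 0.5
```

On this toy data the distance is 0.5; the larger this gap between the pretraining distribution D1 and the target distribution D2, the less we should expect transfer to help.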
End of conversation
New conversation
@HenaffMikael would know
Statistical language models will never model anything like intelligence. They are to AGI what homeopathy is to medicine.