AI doesn’t replicate. Having worked in the field, I can usually see why a paper’s result is nonsense, but the public can’t, and many researchers can’t. https://twitter.com/stephaniemlee/status/964612382650646529
I’ve been skeptical about DL results because 25 years ago I reran the key experiments that were hyped as showing backprop (the underlying tech) was incredible. In each case I found that the researchers were fooling themselves. Not deliberate fraud, but sloppy work.
The only one I wrote up was this one. The then-most-hyped version of RL+backprop turned out to work less well than RL+perceptron. https://www.ijcai.org/Proceedings/91-2/Papers/018.pdf
The one that got me really annoyed was XOR. The narrative was that Minsky & Papert unfairly killed perceptrons with that, and that you could learn XOR if you added hidden layers. 1/
This turned out not to be true in any interesting sense. You can compute XOR with a feedforward network, but backprop won’t learn it reliably, nor in a reasonable length of time. It has to get lucky. 2/
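For illustration, here is a minimal sketch (not from the original thread) of the kind of rerun being described: a 2-2-1 sigmoid network trained on XOR with plain batch backprop, repeated across random initializations. The helper name `train_xor`, the network size, the init scale, the learning rate, and the epoch count are all assumptions chosen for the demo, not the author's original setup.

```python
import numpy as np

def train_xor(seed, hidden=2, lr=0.5, epochs=5000):
    """Train a tiny 2-hidden-unit sigmoid net on XOR with plain backprop.
    Returns True if all four inputs end up classified correctly."""
    rng = np.random.default_rng(seed)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)
    # Small random initial weights (scale is an illustrative choice).
    W1 = rng.normal(0, 0.5, (2, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, (hidden, 1)); b2 = np.zeros(1)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for _ in range(epochs):
        # Forward pass.
        h = sig(X @ W1 + b1)
        out = sig(h @ W2 + b2)
        # Backward pass: gradients of squared error through the sigmoids.
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(0)
        W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(0)
    return bool(np.all((out > 0.5) == (y > 0.5)))

runs = [train_xor(seed) for seed in range(100)]
print(f"converged on {sum(runs)}/100 random initializations")
```

With a hidden layer this small and plain gradient descent, some fraction of random initializations typically gets stuck on a plateau or local minimum and never solves XOR, which is the "has to get lucky" behavior the tweet describes; exact failure rates depend on the hyperparameters assumed above.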
My recollection is that other people figured this out a few years later, and that mostly killed off backprop research until ~2012. My memory of the details is vague, however. There were a few others, but XOR and RL were the ones that seemed most significant. 3/3
There’s a new paper that came out just a couple of days ago and is making the rounds, called “Deep Reinforcement Learning Doesn’t Work” or something like that. I haven’t had time to look at it, but it didn’t work in 1992, so I’m not surprised.
David Chapman retweeted Adam Strandberg (@The_Lagrangian):

Deep reinforcement learning:
- is usually worse than other approaches
- requires difficult reward function definition (and hides magic)
- finds pathological solutions (reward hacking, overfitting, local optima)
- takes forever to train when it does work
https://www.alexirpan.com/2018/02/14/rl-hard.html