Looking forward to that blog post. I'd add a third variety: people who spend plenty of time designing and conducting replications and control experiments to make sure they are not mining noise before attempting to publish. I'm not saying you can't fool yourself with this strategy, but still.
Replying to @fedeadolfi @Sam_D_Parsons and
Gah, how dull. Where's the flair in that?
1 reply 0 retweets 0 likes -
Replying to @chrisdc77 @Sam_D_Parsons and
I know, what a bunch of losers.
1 reply 0 retweets 0 likes -
Replying to @fedeadolfi @chrisdc77 and
If you read my post it should be clear that I am NOT saying that what the field is doing now is fine. What I said is that standard practice is horrible, but that registered reports (RR) won't fix it. The first thing to do is to kill the idea that point-null hypothesis testing is sufficient.
1 reply 0 retweets 1 like -
Replying to @gallantlab @fedeadolfi and
RR seems to be primarily about reducing Type I error. But if you view point-null hypothesis testing as insufficient, merely a weak, poorly reasoned pretest of data quality, then it becomes obvious that the focus should be elsewhere. We need a revolution, not more paperwork.
4 replies 0 retweets 0 likes -
Replying to @gallantlab @fedeadolfi and
I don't get this at all. RRs just make it so that papers are accepted based on question and methods, and not based on results and post-hoc storytelling. I'm surprised you're not in favor of that. They don't add much (if any) work, and they're completely orthogonal to the question of NHST.
1 reply 0 retweets 1 like -
Replying to @talyarkoni @gallantlab and
If anything, it should be much *easier* to criticize studies for using NHST when an RR is first submitted for review than to wait until after the authors are happily trumpeting their p < .001 result and can say "but look, it's strong!"
1 reply 0 retweets 0 likes -
Replying to @talyarkoni @fedeadolfi and
The appropriate answer to "look, my p < .001 is strong" is to explain the difference between significance and out-of-sample prediction. RR could perversely give people cover to do what they shouldn't do to begin with (that is, to equate significance with effect size or predictive power).
1 reply 0 retweets 0 likes -
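To make that distinction concrete, here is a minimal simulation sketch. This is an editorial illustration, not part of the thread; the effect size (d = 0.05), sample size (n = 100,000), and the simple nearest-mean classifier are all assumptions chosen for illustration. With a large enough sample, a negligible effect clears p < .001 on a point-null test, yet out-of-sample prediction of group membership stays essentially at chance.

```python
# Illustrative sketch (not from the thread): a tiny effect can be "highly
# significant" under a point-null test while being almost useless for
# out-of-sample prediction. All numbers below are assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 100_000                          # large sample, chosen for illustration
x = rng.integers(0, 2, size=n)       # binary group label
y = 0.05 * x + rng.normal(size=n)    # true group difference is tiny (d = 0.05)

# Point-null significance test: decisively rejects the null.
t, p = stats.ttest_ind(y[x == 1], y[x == 0])
print(f"t = {t:.2f}, p = {p:.1e}")   # p is far below .001

# Out-of-sample prediction: estimate group means on one half of the data,
# then predict group membership from y in the held-out half.
half = n // 2
mu1 = y[:half][x[:half] == 1].mean()
mu0 = y[:half][x[:half] == 0].mean()
pred = (np.abs(y[half:] - mu1) < np.abs(y[half:] - mu0)).astype(int)
acc = (pred == x[half:]).mean()
print(f"held-out accuracy = {acc:.3f}")  # barely above 0.5 (chance level)
```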
Replying to @gallantlab @fedeadolfi and
But you can always do that after the fact either way (even for RR papers). The point is that under the RR model, at least some of the time, a reviewer like you would catch it *before* the paper is published!
2 replies 0 retweets 1 like -
Replying to @talyarkoni @fedeadolfi and
Again, my point is that we shouldn't even have to argue about this. If no one were allowed to publish directional hypotheses without the other things required to do real science (quantitative out-of-sample predictions), this would be an irrelevant sideshow.
2 replies 0 retweets 0 likes
Replacing p-values with out-of-sample prediction will just shift the problem from p-hacking to out-of-sample-hacking. My feeling is that many machine learning papers also do not replicate.
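A small sketch of what that "out-of-sample-hacking" could look like. Again, this is an editorial illustration with made-up data and a deliberately simple nearest-centroid classifier, not anything from the thread: when labels are pure noise, repeatedly tuning against the same held-out test set and reporting only the best run pushes apparent test accuracy well above chance, just as running many analyses pushes p-values below .05.

```python
# Illustrative sketch (assumption, not from the thread): reusing one test set
# for many tuning attempts inflates "out-of-sample" accuracy even on noise.
import numpy as np

rng = np.random.default_rng(1)
n_train, n_test, n_feat = 200, 100, 50
X_train = rng.normal(size=(n_train, n_feat))
y_train = rng.integers(0, 2, size=n_train)   # labels are pure noise
X_test = rng.normal(size=(n_test, n_feat))
y_test = rng.integers(0, 2, size=n_test)

def fit_and_score(features):
    """Nearest-centroid classifier on a feature subset, scored on the test set."""
    c0 = X_train[y_train == 0][:, features].mean(axis=0)
    c1 = X_train[y_train == 1][:, features].mean(axis=0)
    d0 = ((X_test[:, features] - c0) ** 2).sum(axis=1)
    d1 = ((X_test[:, features] - c1) ** 2).sum(axis=1)
    return ((d1 < d0).astype(int) == y_test).mean()

# "Hacked" result: try many feature subsets, report only the best test score.
scores = [fit_and_score(rng.choice(n_feat, size=5, replace=False))
          for _ in range(500)]
print(f"honest single run : {scores[0]:.2f}")   # ~0.50 (chance)
print(f"best of 500 tries : {max(scores):.2f}") # clearly above chance
```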
Replying to @bttyeo @talyarkoni and
Well, I agree with that, because people will always try to game the system. But that is a better problem than the one we have now, for multiple reasons (e.g. generalization advantages). This is precisely why contests are so popular in computer science: they minimize opportunities for cheating.
1 reply 0 retweets 2 likes -
Replying to @gallantlab @bttyeo and
Competitions are great and we should have more of them, but imagine being mandated to create a competition for every single paper you publish. So much more "paperwork"!
0 replies 0 retweets 1 like
End of conversation