Machine learning is too often perceived as a kind of oracle-like power capable of predicting the unknowable and working out of distribution.
-
-
What makes his proposal dubious is the notion of training on articles from past decades, which means the classifier would suffer badly from concept drift. The proper setup is actually to create a training dataset by asking people to rate *current* articles as evergreen or not.
-
But the “proper setup” would be using a less useful target variable (i.e., article is predicted to be evergreen rather than article actually withstood the test of time). So there’s a tradeoff between these two possible setups, right?
- Show replies
New conversation -
Loading seems to be taking a while.
Twitter may be over capacity or experiencing a momentary hiccup. Try again or visit Twitter Status for more information.