The problem isn't just time variables. It's *any* "out of domain" data. So you need a good validation set. If your model isn't generalizing to the validation set, try deep learning instead, or even boosting a deep learning model with a random forest.
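One way to build a validation set that exposes this kind of out-of-domain failure, when the shift is over time, is to hold out the most recent rows rather than a random sample. This is only a minimal sketch: the `time_based_split` helper, the `saledate` and `price` field names, and the toy records are all hypothetical and are not taken from the article or the competition data.

```python
from datetime import date


def time_based_split(rows, date_key, valid_fraction=0.2):
    """Sort rows by date and hold out the most recent fraction for validation."""
    ordered = sorted(rows, key=lambda r: r[date_key])
    n_valid = max(1, int(len(ordered) * valid_fraction))
    return ordered[:-n_valid], ordered[-n_valid:]


# Hypothetical toy records; the field names are illustrative assumptions.
rows = [
    {"saledate": date(2010, month, 1), "price": 1000 + 50 * month}
    for month in range(1, 11)
]

train, valid = time_based_split(rows, "saledate", valid_fraction=0.2)

# Every validation row is strictly later than every training row,
# so the validation score reflects performance on "future" data.
latest_train = max(r["saledate"] for r in train)
earliest_valid = min(r["saledate"] for r in valid)
print(len(train), len(valid), latest_train < earliest_valid)
```

A random split would mix late rows into training and could hide the generalization gap the tweet is warning about.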
-
-
-
Also, GBMs can often do better on these kinds of problems. The winner of the competition discussed in the article found GBMs better than RFs, and factorization machines better still. pic.twitter.com/KjsanRXcyb
-
The linked post mentions neural nets as a good solution. Using neural nets on the Bulldozers dataset discussed in the post does indeed give significantly better results than random forests.
-
-
I was hoping that the article was going to cover a more complex time series. Sometimes it's worth taking the hit on non-stationary series if the forecast horizon is sufficiently short and the algorithm is trained sufficiently frequently. E.g. https://colab.research.google.com/drive/1WJjty6Q87WZeqBZVy6--cgrnuW4p50PP …
-
The Bulldozers dataset discussed is a very complex time series.
-
-
Sometimes. But how would you do that in the bulldozers auction dataset example he discusses?
-
-
What about local linear random forests from Wager et al.?
-
There have been many algorithms that combine linear and tree models over the years. None have really had much practical impact yet, although I like the idea. Local linear forests don't have a mature implementation yet, but hopefully it'll work out well.
-
-
Intuitively, stacking random forests with neural networks would work because one is good at interpolation and the other at extrapolating. Never tried that though.
-
Yes exactly.
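The interpolation/extrapolation intuition can be illustrated with a toy sketch. Note the stand-ins, which are assumptions made to keep the example dependency-free: a least-squares line plays the role of the extrapolating model (the neural net in the thread), and a piecewise-constant "mean per bin" model plays the role of the tree ensemble. The synthetic trend-plus-seasonal data and all function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def fit_bins(keys, ys):
    """Piecewise-constant model: mean target per key (crude stand-in for a tree)."""
    sums, counts = {}, {}
    for k, y in zip(keys, ys):
        sums[k] = sums.get(k, 0.0) + y
        counts[k] = counts.get(k, 0) + 1
    return {k: sums[k] / counts[k] for k in sums}


def mae(pred, actual):
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)


# Synthetic data: linear trend plus a repeating period-4 pattern.
season = [0.0, 3.0, -1.0, -2.0]


def truth(x):
    return 2.0 * x + season[x % 4]


train_x = list(range(40))
train_y = [truth(x) for x in train_x]
test_x = list(range(40, 48))  # beyond the training range
test_y = [truth(x) for x in test_x]

# Tree-like model alone on raw x: unseen x falls into the last known bin,
# so predictions flatten out past the training range.
tree_only = fit_bins([x // 4 for x in train_x], train_y)
last_bin = max(tree_only)
tree_pred = [tree_only.get(x // 4, tree_only[last_bin]) for x in test_x]

# Stack: the line extrapolates the trend, the bins model its residuals.
slope, intercept = fit_line(train_x, train_y)
resid = [y - (slope * x + intercept) for x, y in zip(train_x, train_y)]
resid_model = fit_bins([x % 4 for x in train_x], resid)
stack_pred = [slope * x + intercept + resid_model[x % 4] for x in test_x]

tree_mae = mae(tree_pred, test_y)
stack_mae = mae(stack_pred, test_y)
print(f"tree-like only MAE: {tree_mae:.2f}, stacked MAE: {stack_mae:.2f}")
```

On this toy data the tree-like model alone cannot follow the trend past the last training point, while the stacked version can. Real data will be messier, so treat this as an illustration of the idea rather than evidence that it works in practice.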