Fair point, but it's worth noting that there is a trade-off between # seeds, # baselines, and # and complexity of environment(s). Personally, I'd prefer a method w/ 3 seeds eval'd on 57 tasks with 5 baselines to one with 10 seeds eval'd on 1 task with 2 baselines. https://twitter.com/pyoudeyer/status/1222911050028285954
Tasks are not a straightforward replacement for seeds because seeds can be randomised, and hand-made tasks cannot. Unless you're procedurally generating the tasks, in which case you effectively have more seeds.
-
I agree, not a replacement -- but they do compete for computing resources. And generally I've found that it's easier to overfit to a single task than it is to be misled by too few seeds. Both are problematic, but without infinite compute hard decisions must be made.
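To make the competition for compute concrete, here is a rough sketch that counts training runs for the two setups mentioned at the top of the thread. It assumes every algorithm (the method plus each baseline) is run on every task with every seed, and that all runs cost the same, which is a simplification; `total_runs` is a hypothetical helper, not anything from the thread.

```python
def total_runs(seeds: int, tasks: int, baselines: int) -> int:
    """Count training runs, assuming a full grid of algorithms x tasks x seeds."""
    algorithms = baselines + 1  # assumption: the proposed method plus its baselines
    return seeds * tasks * algorithms


# The two setups compared in the thread.
broad = total_runs(seeds=3, tasks=57, baselines=5)
narrow = total_runs(seeds=10, tasks=1, baselines=2)

print(f"3 seeds, 57 tasks, 5 baselines: {broad} runs")
print(f"10 seeds, 1 task, 2 baselines: {narrow} runs")
```

Under these assumptions the two setups are not equal-budget, so the count only illustrates how the three factors multiply; per-run cost can also differ a lot between tasks.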
-
It's more that seeds are not a replacement for tasks though, since the practical question we're trying to evaluate is 'will this algorithm be better on a new, unseen task?'. So we get more evidence of that when varying the task.
-
More of *weaker* evidence.