We use tools from control: the predictive form of LQG to update estimates of the system parameters in epochs, combined with online learning to get optimal regret bounds. Poster session tomorrow at 1pm PDT: https://wensun.github.io/rl_theory_workshop_2020_ICML.github.io/ …