Tweets
Sid Reddy retweeted
How do you train an RL agent in the presence of unknown, unsafe states without visiting them even once? New algorithm by our intern @sidgreddy synthesizes trajectories with a generative model and asks a human to label them for safety. https://deepmind.com/blog/article/learning-human-objectives-by-evaluating-hypothetical-behaviours …
-
Sid Reddy retweeted
We present ReQueST: a method for training RL agents from human feedback in the presence of unknown unsafe states. By @sidgreddy, @ancadianadragan, @svlevine, @ShaneLegg, @janleike. Paper: https://arxiv.org/abs/1912.05652 Code: https://github.com/rddy/ReQueST pic.twitter.com/qM8GSaMcr2
-
Scaled Autonomy: Enabling Human Operators to Control Robot Fleets https://arxiv.org/abs/1910.02910 Imitation learning for mimicking expert actions, and imitation learning for deciding which robot the expert should teleop right now. w/ @g_k_swamy, @svlevine, @ancadianadragan
-
Sid Reddy retweeted
SQIL: an imitation learning method so simple I can summarize it in a tweet: drop demonstrations into buffer, set their reward to +1, set reward for all other data to 0, run Q-learning or SAC to train. Why does it work? Find out: https://arxiv.org/abs/1905.11108 w/ @sidgreddy @ancadianadragan pic.twitter.com/R2sY3H8m50
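The recipe in the tweet above can be sketched in a few lines of tabular Q-learning. This is only an illustration under stated assumptions, not the paper's implementation: the six-state chain environment, the always-move-right expert, the fixed set of reward-0 transitions standing in for the agent's own experience, and every name and hyperparameter below (`step`, `build_buffer`, `q_learning`, `greedy_policy`, `sweeps`, `gamma`, `lr`) are hypothetical choices made for the example.

```python
N_STATES = 6  # hypothetical chain of states 0..5 (not from the tweet)
LEFT, RIGHT = 0, 1


def step(state, action):
    """Deterministic chain dynamics, clipped at both ends."""
    delta = 1 if action == RIGHT else -1
    return min(max(state + delta, 0), N_STATES - 1)


def build_buffer():
    """Demonstrations get reward +1; all other data gets reward 0."""
    # Assumed expert for this sketch: always moves right.
    demos = [(s, RIGHT, 1.0, step(s, RIGHT)) for s in range(N_STATES)]
    # Stand-in for the agent's own experience: every (state, action) once.
    other = [(s, a, 0.0, step(s, a))
             for s in range(N_STATES) for a in (LEFT, RIGHT)]
    return demos + other


def q_learning(buffer, sweeps=2000, gamma=0.9, lr=0.1):
    """Tabular Q-learning that replays the whole buffer on each sweep."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for s, a, r, s_next in buffer:
            target = r + gamma * max(q[s_next])
            q[s][a] += lr * (target - q[s][a])
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each state (ties go to LEFT)."""
    return [max((LEFT, RIGHT), key=lambda a: q[s][a])
            for s in range(N_STATES)]


if __name__ == "__main__":
    q_values = q_learning(build_buffer())
    print(greedy_policy(q_values))
```

In this toy setting the only transitions that ever carry reward are the demonstrated ones, so the learned values favor the expert's action in every state and the greedy policy imitates the expert. Details that the linked paper may add, such as how new data is collected and how the two kinds of data are sampled, are deliberately left out here.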
-
Sid Reddy retweeted
Where Do You Think You’re Going? New paper on inferring how humans believe an MDP (e.g., a game) works https://sites.google.com/view/inferring-internal-dynamics … With @elbowproxy and A. Dragan
-
Sid Reddy retweeted
"Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior," Reddy et al.: https://arxiv.org/abs/1805.08010
-
Sid Reddy retweeted
Interested in semi-autonomous robots and human-machine centaur teams? Read @elbowproxy’s new BAIR blog post on augmenting human control with deep reinforcement learning for flexible human-robot shared autonomy! http://bair.berkeley.edu/blog/2018/04/18/shared-autonomy/ …