More on the limits of curve fitting: I suspect the AI safety/alignment community overvalues the potential of inverse reinforcement learning, mainly because we only approximate the expert's (human's) value function on the distribution of observed trajectories.
Replying to @zacharylipton
I never held hope for IRL, at least any versions I ever heard about.
10:47 AM - 23 May 2018