If you run a policy gradient method on a complex environment and get a different result each time, that is not weird at all; it is just normal. Please stop expecting these methods to do what they never promised you, lol
#DeepRL #nonconvex #stochasticOptimization
Replying to @kazizzad
Yes, because the objective is nonconvex. That's why doing many runs is important, yet it rarely happens in deep reinforcement learning #deepRL
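To make the "many runs" point concrete, here is a minimal sketch, not an actual policy gradient implementation: noisy gradient ascent on a toy nonconvex function, repeated over several seeds. The objective, the helper names (`objective`, `grad`, `noisy_ascent`, `summarize`), and all hyperparameters are assumptions made up for illustration. The idea is only that different seeds can settle in different local optima, so a single run says little and the spread across runs is worth reporting.

```python
import random
import statistics

def objective(x):
    # Hypothetical toy objective: nonconvex, with two local maxima
    # of different height (near x = -1 and x = 1).
    return -(x**2 - 1) ** 2 + 0.3 * x

def grad(x):
    # Exact derivative of the toy objective above.
    return -4 * x * (x**2 - 1) + 0.3

def noisy_ascent(seed, steps=500, lr=0.01, noise=0.5):
    # One "training run": random initialization plus noisy gradient steps,
    # standing in for the stochastic gradient estimates used in deep RL.
    rng = random.Random(seed)
    x = rng.uniform(-2, 2)
    for _ in range(steps):
        x += lr * (grad(x) + rng.gauss(0, noise))
    return objective(x)

def summarize(seeds):
    # Run once per seed and report mean, standard deviation, min, and max
    # of the final objective values.
    finals = [noisy_ascent(s) for s in seeds]
    return (statistics.mean(finals), statistics.stdev(finals),
            min(finals), max(finals))

if __name__ == "__main__":
    mean, std, lo, hi = summarize(range(20))
    print(f"mean={mean:.3f} std={std:.3f} min={lo:.3f} max={hi:.3f}")
```

Reporting the mean together with the spread over seeds, rather than the best single run, is the kind of practice the reply is asking for.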
10:18 PM - 25 Mar 2019
from Pasadena, CA