But if, in the toy example, the agent "knows all of the above facts" and the programmers know this, there wouldn't be any need for the training phase.
-
Good question! You are correct that initially, the model won't know anything about the base optimizer. The critical question, therefore, is which will come first: modeling of the base optimizer or internalization of the base objective? We discuss this question in section 4.4.