Facebook AI Research. (joint work with: ... Easy to evaluate – how good a deal did an agent get? Self-play ... RL+Rollouts: Train and decode to maximize reward ...