Discovery of the reward function for embodied reinforcement learning agents
Preliminaries

Notations

Throughout the article, the following symbols are used. \(\mathcal{S}\): state space. \(\mathcal{A}\): action space. π: policy. θ: parameters of π....
