"Reward learning from human preferences and demonstrations in Atari."

Borja Ibarz et al. (2018)