Shimon Whiteson
2020 – today
- 2022
- [c134]Mingfei Sun, Sam Devlin, Katja Hofmann, Shimon Whiteson:
Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency. AAAI 2022: 8378-8385 - [c133]Shangtong Zhang, Romain Laroche, Harm van Seijen, Shimon Whiteson, Remi Tachet des Combes:
A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms. AAMAS 2022: 1491-1499 - [c132]Darius Muglich, Luisa M. Zintgraf, Christian A. Schröder de Witt, Shimon Whiteson, Jakob N. Foerster:
Generalized Beliefs for Cooperative AI. ICML 2022: 16062-16082 - [c131]Samuel Sokota, Christian A. Schröder de Witt, Maximilian Igl, Luisa M. Zintgraf, Philip H. S. Torr, Martin Strohmeier, J. Zico Kolter, Shimon Whiteson, Jakob N. Foerster:
Communicating via Markov Decision Processes. ICML 2022: 20314-20328 - [c130]Maximilian Igl, Daewoo Kim, Alex Kuefler, Paul Mougin, Punit Shah, Kyriacos Shiarlis, Dragomir Anguelov, Mark Palatucci, Brandyn White, Shimon Whiteson:
Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation. ICRA 2022: 2445-2451 - [i90]Vitaly Kurin, Alessandro De Palma, Ilya Kostrikov, Shimon Whiteson, M. Pawan Kumar:
In Defense of the Unitary Scalarization for Deep Multi-Task Learning. CoRR abs/2201.04122 (2022) - [i89]Mingfei Sun, Vitaly Kurin, Guoqing Liu, Sam Devlin, Tao Qin, Katja Hofmann, Shimon Whiteson:
You May Not Need Ratio Clipping in PPO. CoRR abs/2202.00079 (2022) - [i88]Mingfei Sun, Sam Devlin, Katja Hofmann, Shimon Whiteson:
Monotonic Improvement Guarantees under Non-stationarity for Decentralized PPO. CoRR abs/2202.00082 (2022) - [i87]Anuj Mahajan, Mikayel Samvelyan, Tarun Gupta, Benjamin Ellis, Mingfei Sun, Tim Rocktäschel, Shimon Whiteson:
Generalization in Cooperative Multi-Agent Systems. CoRR abs/2202.00104 (2022) - [i86]Maximilian Igl, Daewoo Kim, Alex Kuefler, Paul Mougin, Punit Shah, Kyriacos Shiarlis, Dragomir Anguelov, Mark Palatucci, Brandyn White, Shimon Whiteson:
Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation. CoRR abs/2205.03195 (2022) - [i85]Darius Muglich, Luisa M. Zintgraf, Christian Schröder de Witt, Shimon Whiteson, Jakob N. Foerster:
Generalized Beliefs for Cooperative AI. CoRR abs/2206.12765 (2022)
- 2021
- [j28]Jacopo Castellini, Frans A. Oliehoek, Rahul Savani, Shimon Whiteson:
Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning. Auton. Agents Multi Agent Syst. 35(2): 25 (2021) - [j27]Luisa M. Zintgraf, Sebastian Schulze, Cong Lu, Leo Feng, Maximilian Igl, Kyriacos Shiarlis, Yarin Gal, Katja Hofmann, Shimon Whiteson:
VariBAD: Variational Bayes-Adaptive Deep RL via Meta-Learning. J. Mach. Learn. Res. 22: 289:1-289:39 (2021) - [j26]Dmitrii Beloborodov, Alexander E. Ulanov, Jakob N. Foerster, Shimon Whiteson, A. I. Lvovsky:
Reinforcement learning enhanced quantum-inspired algorithm for combinatorial optimization. Mach. Learn. Sci. Technol. 2(2): 25009 (2021) - [c129]Shangtong Zhang, Bo Liu, Shimon Whiteson:
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning. AAAI 2021: 10905-10913 - [c128]Luisa M. Zintgraf, Sam Devlin, Kamil Ciosek, Shimon Whiteson, Katja Hofmann:
Deep Interactive Bayesian Reinforcement Learning via Meta-Learning. AAMAS 2021: 1712-1714 - [c127]Guangliang Li, Hamdi Dibeklioglu, Shimon Whiteson, Hayley Hung:
Facial Feedback for Reinforcement Learning: A Case Study and Offline Analysis Using the TAMER Framework. AAMAS 2021: 1735-1737 - [c126]Tonghan Wang, Tarun Gupta, Anuj Mahajan, Bei Peng, Shimon Whiteson, Chongjie Zhang:
RODE: Learning Roles to Decompose Multi-Agent Tasks. ICLR 2021 - [c125]Maximilian Igl, Gregory Farquhar, Jelena Luketina, Wendelin Boehmer, Shimon Whiteson:
Transient Non-stationarity and Generalisation in Deep Reinforcement Learning. ICLR 2021 - [c124]Vitaly Kurin, Maximilian Igl, Tim Rocktäschel, Wendelin Boehmer, Shimon Whiteson:
My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control. ICLR 2021 - [c123]Tarun Gupta, Anuj Mahajan, Bei Peng, Wendelin Boehmer, Shimon Whiteson:
UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning. ICML 2021: 3930-3941 - [c122]Shariq Iqbal, Christian A. Schröder de Witt, Bei Peng, Wendelin Boehmer, Shimon Whiteson, Fei Sha:
Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning. ICML 2021: 4596-4606 - [c121]Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Animashree Anandkumar:
Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning. ICML 2021: 7301-7312 - [c120]Shangtong Zhang, Yi Wan, Richard S. Sutton, Shimon Whiteson:
Average-Reward Off-Policy Policy Evaluation with Function Approximation. ICML 2021: 12578-12588 - [c119]Shangtong Zhang, Hengshuai Yao, Shimon Whiteson:
Breaking the Deadly Triad with a Target Network. ICML 2021: 12621-12631 - [c118]Luisa M. Zintgraf, Leo Feng, Cong Lu, Maximilian Igl, Kristian Hartikainen, Katja Hofmann, Shimon Whiteson:
Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning. ICML 2021: 12991-13001 - [c117]Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson:
Deep Residual Reinforcement Learning (Extended Abstract). IJCAI 2021: 4869-4873 - [c116]Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson:
Regularized Softmax Deep Multi-Agent Q-Learning. NeurIPS 2021: 1365-1377 - [c115]Bei Peng, Tabish Rashid, Christian Schröder de Witt, Pierre-Alexandre Kamienny, Philip H. S. Torr, Wendelin Boehmer, Shimon Whiteson:
FACMAC: Factored Multi-Agent Centralised Policy Gradients. NeurIPS 2021: 12208-12221 - [c114]Mattie Fellows, Kristian Hartikainen, Shimon Whiteson:
Bayesian Bellman Operators. NeurIPS 2021: 13641-13656 - [c113]Charles Blake, Vitaly Kurin, Maximilian Igl, Shimon Whiteson:
Snowflake: Scaling GNNs to high-dimensional continuous control via parameter freezing. NeurIPS 2021: 23983-23992 - [i84]Shangtong Zhang, Yi Wan, Richard S. Sutton, Shimon Whiteson:
Average-Reward Off-Policy Policy Evaluation with Function Approximation. CoRR abs/2101.02808 (2021) - [i83]Luisa M. Zintgraf, Sam Devlin, Kamil Ciosek, Shimon Whiteson, Katja Hofmann:
Deep Interactive Bayesian Reinforcement Learning via Meta-Learning. CoRR abs/2101.03864 (2021) - [i82]Shangtong Zhang, Hengshuai Yao, Shimon Whiteson:
Breaking the Deadly Triad with a Target Network. CoRR abs/2101.08862 (2021) - [i81]Charlie Blake, Vitaly Kurin, Maximilian Igl, Shimon Whiteson:
Snowflake: Scaling GNNs to High-Dimensional Continuous Control via Parameter Freezing. CoRR abs/2103.01009 (2021) - [i80]Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson:
Softmax with Regularization: Better Value Estimation in Multi-Agent Reinforcement Learning. CoRR abs/2103.11883 (2021) - [i79]Bozhidar Vasilev, Tarun Gupta, Bei Peng, Shimon Whiteson:
Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients. CoRR abs/2104.13446 (2021) - [i78]Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Animashree Anandkumar:
Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning. CoRR abs/2106.00136 (2021) - [i77]Mingfei Sun, Anuj Mahajan, Katja Hofmann, Shimon Whiteson:
SoftDICE for Imitation Learning: Rethinking Off-policy Distribution Matching. CoRR abs/2106.03155 (2021) - [i76]Matthew Fellows, Kristian Hartikainen, Shimon Whiteson:
Bayesian Bellman Operators. CoRR abs/2106.05012 (2021) - [i75]Samuel Sokota, Christian Schröder de Witt, Maximilian Igl, Luisa M. Zintgraf, Philip H. S. Torr, Shimon Whiteson, Jakob N. Foerster:
Implicit Communication as Minimum Entropy Coupling. CoRR abs/2107.08295 (2021) - [i74]Shangtong Zhang, Shimon Whiteson:
Truncated Emphatic Temporal Difference Methods for Prediction and Control. CoRR abs/2108.05338 (2021) - [i73]Pascal Van Der Vaart, Anuj Mahajan, Shimon Whiteson:
Model based Multi-agent Reinforcement Learning with Tensor Decompositions. CoRR abs/2110.14524 (2021) - [i72]Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Animashree Anandkumar:
Reinforcement Learning in Factored Action Spaces using Tensor Decompositions. CoRR abs/2110.14538 (2021) - [i71]Zheng Xiong, Luisa M. Zintgraf, Jacob Beck, Risto Vuorio, Shimon Whiteson:
On the Practical Consistency of Meta-Reinforcement Learning Algorithms. CoRR abs/2112.00478 (2021) - [i70]Mingfei Sun, Sam Devlin, Katja Hofmann, Shimon Whiteson:
Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency. CoRR abs/2112.06054 (2021)
- 2020
- [j25]Guangliang Li, Hamdi Dibeklioglu, Shimon Whiteson, Hayley Hung:
Facial feedback for reinforcement learning: a case study and offline analysis using the TAMER framework. Auton. Agents Multi Agent Syst. 34(1): 22 (2020) - [j24]Kamil Ciosek, Shimon Whiteson:
Expected Policy Gradients for Reinforcement Learning. J. Mach. Learn. Res. 21: 52:1-52:51 (2020) - [j23]Supratik Paul, Konstantinos I. Chatzilygeroudis, Kamil Ciosek, Jean-Baptiste Mouret, Michael A. Osborne, Shimon Whiteson:
Robust Reinforcement Learning with Bayesian Optimisation and Quadrature. J. Mach. Learn. Res. 21: 151:1-151:31 (2020) - [j22]Tabish Rashid, Mikayel Samvelyan, Christian Schröder de Witt, Gregory Farquhar, Jakob N. Foerster, Shimon Whiteson:
Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. J. Mach. Learn. Res. 21: 178:1-178:51 (2020) - [c112]Yash Satsangi, Sungsu Lim, Shimon Whiteson, Frans A. Oliehoek, Martha White:
Maximizing Information Gain in Partially Observable Environments via Prediction Rewards. AAMAS 2020: 1215-1223 - [c111]Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson:
Deep Residual Reinforcement Learning. AAMAS 2020: 1611-1619 - [c110]Tabish Rashid, Bei Peng, Wendelin Boehmer, Shimon Whiteson:
Optimistic Exploration even with a Pessimistic Initialisation. ICLR 2020 - [c109]Luisa M. Zintgraf, Kyriacos Shiarlis, Maximilian Igl, Sebastian Schulze, Yarin Gal, Katja Hofmann, Shimon Whiteson:
VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning. ICLR 2020 - [c108]Wendelin Boehmer, Vitaly Kurin, Shimon Whiteson:
Deep Coordination Graphs. ICML 2020: 980-991 - [c107]Gregory Farquhar, Laura Gustafson, Zeming Lin, Shimon Whiteson, Nicolas Usunier, Gabriel Synnaeve:
Growing Action Spaces. ICML 2020: 3040-3051 - [c106]Shangtong Zhang, Bo Liu, Shimon Whiteson:
GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values. ICML 2020: 11194-11203 - [c105]Shangtong Zhang, Bo Liu, Hengshuai Yao, Shimon Whiteson:
Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation. ICML 2020: 11204-11213 - [c104]Vitaly Kurin, Saad Godil, Shimon Whiteson, Bryan Catanzaro:
Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? NeurIPS 2020 - [c103]Tabish Rashid, Gregory Farquhar, Bei Peng, Shimon Whiteson:
Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. NeurIPS 2020 - [c102]Shangtong Zhang, Vivek Veeriah, Shimon Whiteson:
Learning Retrospective Knowledge with Reverse Reinforcement Learning. NeurIPS 2020 - [c101]Maximilian Igl, Andrew Gambardella, Jinke He, Nantas Nardelli, N. Siddharth, Wendelin Boehmer, Shimon Whiteson:
Multitask Soft Option Learning. UAI 2020: 969-978 - [i69]Guangliang Li, Hamdi Dibeklioglu, Shimon Whiteson, Hayley Hung:
Facial Feedback for Reinforcement Learning: A Case Study and Offline Analysis Using the TAMER Framework. CoRR abs/2001.08703 (2020) - [i68]Shangtong Zhang, Bo Liu, Shimon Whiteson:
GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values. CoRR abs/2001.11113 (2020) - [i67]Dmitrii Beloborodov, Alexander E. Ulanov, Jakob N. Foerster, Shimon Whiteson, A. I. Lvovsky:
Reinforcement Learning Enhanced Quantum-inspired Algorithm for Combinatorial Optimization. CoRR abs/2002.04676 (2020) - [i66]Tabish Rashid, Bei Peng, Wendelin Böhmer, Shimon Whiteson:
Optimistic Exploration even with a Pessimistic Initialisation. CoRR abs/2002.12174 (2020) - [i65]Christian Schröder de Witt, Bei Peng, Pierre-Alexandre Kamienny, Philip H. S. Torr, Wendelin Böhmer, Shimon Whiteson:
Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control. CoRR abs/2003.06709 (2020) - [i64]Tabish Rashid, Mikayel Samvelyan, Christian Schröder de Witt, Gregory Farquhar, Jakob N. Foerster, Shimon Whiteson:
Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. CoRR abs/2003.08839 (2020) - [i63]Shangtong Zhang, Bo Liu, Shimon Whiteson:
Per-Step Reward: A New Perspective for Risk-Averse Reinforcement Learning. CoRR abs/2004.10888 (2020) - [i62]Yash Satsangi, Sungsu Lim, Shimon Whiteson, Frans A. Oliehoek, Martha White:
Maximizing Information Gain in Partially Observable Environments via Prediction Reward. CoRR abs/2005.04912 (2020) - [i61]Pierre-Alexandre Kamienny, Kai Arulkumaran, Feryal Behbahani, Wendelin Boehmer, Shimon Whiteson:
Privileged Information Dropout in Reinforcement Learning. CoRR abs/2005.09220 (2020) - [i60]Shariq Iqbal, Christian A. Schröder de Witt, Bei Peng, Wendelin Böhmer, Shimon Whiteson, Fei Sha:
AI-QMIX: Attention and Imagination for Dynamic Multi-Agent Reinforcement Learning. CoRR abs/2006.04222 (2020) - [i59]Maximilian Igl, Gregory Farquhar, Jelena Luketina, Wendelin Boehmer, Shimon Whiteson:
The Impact of Non-stationarity on Generalisation in Deep Reinforcement Learning. CoRR abs/2006.05826 (2020) - [i58]Tabish Rashid, Gregory Farquhar, Bei Peng, Shimon Whiteson:
Weighted QMIX: Expanding Monotonic Value Function Factorisation. CoRR abs/2006.10800 (2020) - [i57]Shangtong Zhang, Vivek Veeriah, Shimon Whiteson:
Learning Retrospective Knowledge with Reverse Reinforcement Learning. CoRR abs/2007.06703 (2020) - [i56]Minqi Jiang, Jelena Luketina, Nantas Nardelli, Pasquale Minervini, Philip H. S. Torr, Shimon Whiteson, Tim Rocktäschel:
WordCraft: An Environment for Benchmarking Commonsense Agents. CoRR abs/2007.09185 (2020) - [i55]Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek, Matthijs T. J. Spaan:
Exploiting Submodular Value Functions For Scaling Up Active Perception. CoRR abs/2009.09696 (2020) - [i54]Luisa M. Zintgraf, Leo Feng, Maximilian Igl, Kristian Hartikainen, Katja Hofmann, Shimon Whiteson:
Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning. CoRR abs/2010.01062 (2020) - [i53]Shangtong Zhang, Romain Laroche, Harm van Seijen, Shimon Whiteson, Remi Tachet des Combes:
A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms. CoRR abs/2010.01069 (2020) - [i52]Tonghan Wang, Tarun Gupta, Anuj Mahajan, Bei Peng, Shimon Whiteson, Chongjie Zhang:
RODE: Learning Roles to Decompose Multi-Agent Tasks. CoRR abs/2010.01523 (2020) - [i51]Vitaly Kurin, Maximilian Igl, Tim Rocktäschel, Wendelin Boehmer, Shimon Whiteson:
My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control. CoRR abs/2010.01856 (2020) - [i50]Tarun Gupta, Anuj Mahajan, Bei Peng, Wendelin Böhmer, Shimon Whiteson:
UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning. CoRR abs/2010.02974 (2020) - [i49]Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek, Henri Bouma:
Real-Time Resource Allocation for Tracking Systems. CoRR abs/2010.03024 (2020) - [i48]Christian Schröder de Witt, Tarun Gupta, Denys Makoviichuk, Viktor Makoviychuk, Philip H. S. Torr, Mingfei Sun, Shimon Whiteson:
Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge? CoRR abs/2011.09533 (2020)
2010 – 2019
- 2019
- [c100]Jacopo Castellini, Frans A. Oliehoek, Rahul Savani, Shimon Whiteson:
The Representational Capacity of Action-Value Networks for Multi-Agent Reinforcement Learning. AAMAS 2019: 1862-1864 - [c99]Mikayel Samvelyan, Tabish Rashid, Christian Schröder de Witt, Gregory Farquhar, Nantas Nardelli, Tim G. J. Rudner, Chia-Man Hung, Philip H. S. Torr, Jakob N. Foerster, Shimon Whiteson:
The StarCraft Multi-Agent Challenge. AAMAS 2019: 2186-2188 - [c98]Alistair Letcher, Jakob N. Foerster, David Balduzzi, Tim Rocktäschel, Shimon Whiteson:
Stable Opponent Shaping in Differentiable Games. ICLR (Poster) 2019 - [c97]Jakob N. Foerster, H. Francis Song, Edward Hughes, Neil Burch, Iain Dunning, Shimon Whiteson, Matthew M. Botvinick, Michael Bowling:
Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning. ICML 2019: 1942-1951 - [c96]Jingkai Mao, Jakob N. Foerster, Tim Rocktäschel, Maruan Al-Shedivat, Gregory Farquhar, Shimon Whiteson:
A Baseline for Any Order Gradient Estimation in Stochastic Computation Graphs. ICML 2019: 4343-4351 - [c95]Supratik Paul, Michael A. Osborne, Shimon Whiteson:
Fingerprint Policy Optimisation for Robust Reinforcement Learning. ICML 2019: 5082-5091 - [c94]Luisa M. Zintgraf, Kyriacos Shiarlis, Vitaly Kurin, Katja Hofmann, Shimon Whiteson:
Fast Context Adaptation via Meta-Learning. ICML 2019: 7693-7702 - [c93]Feryal Behbahani, Kyriacos Shiarlis, Xi Chen, Vitaly Kurin, Sudhanshu Kasewa, Ciprian Stirbu, João Gomes, Supratik Paul, Frans A. Oliehoek, João V. Messias, Shimon Whiteson:
Learning From Demonstration in the Wild. ICRA 2019: 775-781 - [c92]Jelena Luketina, Nantas Nardelli, Gregory Farquhar, Jakob N. Foerster, Jacob Andreas, Edward Grefenstette, Shimon Whiteson, Tim Rocktäschel:
A Survey of Reinforcement Learning Informed by Natural Language. IJCAI 2019: 6309-6317 - [c91]Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson:
Generalized Off-Policy Actor-Critic. NeurIPS 2019: 1999-2009 - [c90]Shangtong Zhang, Shimon Whiteson:
DAC: The Double Actor-Critic Architecture for Learning Options. NeurIPS 2019: 2010-2020 - [c89]Supratik Paul, Vitaly Kurin, Shimon Whiteson:
Fast Efficient Hyperparameter Tuning for Policy Gradient Methods. NeurIPS 2019: 4618-4628 - [c88]Matthew Fellows, Anuj Mahajan, Tim G. J. Rudner, Shimon Whiteson:
VIREL: A Variational Inference Framework for Reinforcement Learning. NeurIPS 2019: 7120-7134 - [c87]Anuj Mahajan, Tabish Rashid, Mikayel Samvelyan, Shimon Whiteson:
MAVEN: Multi-Agent Variational Exploration. NeurIPS 2019: 7611-7622 - [c86]Gregory Farquhar, Shimon Whiteson, Jakob N. Foerster:
Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Gradient Estimators for Reinforcement Learning. NeurIPS 2019: 8149-8160 - [c85]Christian Schröder de Witt, Jakob N. Foerster, Gregory Farquhar, Philip H. S. Torr, Wendelin Boehmer, Shimon Whiteson:
Multi-Agent Common Knowledge Reinforcement Learning. NeurIPS 2019: 9924-9935 - [i47]Mikayel Samvelyan, Tabish Rashid, Christian Schröder de Witt, Gregory Farquhar, Nantas Nardelli, Tim G. J. Rudner, Chia-Man Hung, Philip H. S. Torr, Jakob N. Foerster, Shimon Whiteson:
The StarCraft Multi-Agent Challenge. CoRR abs/1902.04043 (2019) - [i46]Supratik Paul, Vitaly Kurin, Shimon Whiteson:
Fast Efficient Hyperparameter Tuning for Policy Gradients. CoRR abs/1902.06583 (2019) - [i45]Jacopo Castellini, Frans A. Oliehoek, Rahul Savani, Shimon Whiteson:
The Representational Capacity of Action-Value Networks for Multi-Agent Reinforcement Learning. CoRR abs/1902.07497 (2019) - [i44]Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson:
Generalized Off-Policy Actor-Critic. CoRR abs/1903.11329 (2019) - [i43]Maximilian Igl, Andrew Gambardella, Nantas Nardelli, N. Siddharth, Wendelin Böhmer, Shimon Whiteson:
Multitask Soft Option Learning. CoRR abs/1904.01033 (2019) - [i42]Shangtong Zhang, Shimon Whiteson:
DAC: The Double Actor-Critic Architecture for Learning Options. CoRR abs/1904.12691 (2019) - [i41]Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson:
Deep Residual Reinforcement Learning. CoRR abs/1905.01072 (2019) - [i40]Wendelin Böhmer, Tabish Rashid, Shimon Whiteson:
Exploration with Unreliable Intrinsic Reward in Multi-Agent Reinforcement Learning. CoRR abs/1906.02138 (2019) - [i39]Jelena Luketina, Nantas Nardelli, Gregory Farquhar, Jakob N. Foerster, Jacob Andreas, Edward Grefenstette, Shimon Whiteson, Tim Rocktäschel:
A Survey of Reinforcement Learning Informed by Natural Language. CoRR abs/1906.03926 (2019) - [i38]Gregory Farquhar, Laura Gustafson, Zeming Lin, Shimon Whiteson, Nicolas Usunier, Gabriel Synnaeve:
Growing Action Spaces. CoRR abs/1906.12266 (2019) - [i37]Gregory Farquhar, Shimon Whiteson, Jakob N. Foerster:
Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Estimators for Reinforcement Learning. CoRR abs/1909.10549 (2019) - [i36]Vitaly Kurin, Saad Godil, Shimon Whiteson, Bryan Catanzaro:
Improving SAT Solver Heuristics with Graph Networks and Reinforcement Learning. CoRR abs/1909.11830 (2019) - [i35]Wendelin Böhmer, Vitaly Kurin, Shimon Whiteson:
Deep Coordination Graphs. CoRR abs/1910.00091 (2019) - [i34]Anuj Mahajan, Tabish Rashid, Mikayel Samvelyan, Shimon Whiteson:
MAVEN: Multi-Agent Variational Exploration. CoRR abs/1910.07483 (2019) - [i33]Luisa M. Zintgraf, Kyriacos Shiarlis, Maximilian Igl, Sebastian Schulze, Yarin Gal, Katja Hofmann, Shimon Whiteson:
VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning. CoRR abs/1910.08348 (2019) - [i32]Shangtong Zhang, Bo Liu, Hengshuai Yao, Shimon Whiteson:
Provably Convergent Off-Policy Actor-Critic with Function Approximation. CoRR abs/1911.04384 (2019) - [i31]Leo Feng, Luisa M. Zintgraf, Bei Peng, Shimon Whiteson:
VIABLE: Fast Adaptation via Backpropagating Learned Loss. CoRR abs/1911.13159 (2019)
- 2018
- [j21]Guangliang Li, Shimon Whiteson, W. Bradley Knox, Hayley Hung:
Social interaction for efficient agent learning from human reward. Auton. Agents Multi Agent Syst. 32(1): 1-25 (2018) - [j20]Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek, Matthijs T. J. Spaan:
Exploiting submodular value functions for scaling up active perception. Auton. Robots 42(2): 209-233 (2018) - [c84]Kamil Ciosek, Shimon Whiteson:
Expected Policy Gradients. AAAI 2018: 2868-2875 - [c83]Jakob N. Foerster, Gregory Farquhar, Triantafyllos Afouras, Nantas Nardelli, Shimon Whiteson:
Counterfactual Multi-Agent Policy Gradients. AAAI 2018: 2974-2982 - [c82]Supratik Paul, Konstantinos I. Chatzilygeroudis, Kamil Ciosek, Jean-Baptiste Mouret, Michael A. Osborne, Shimon Whiteson:
Alternating Optimisation and Quadrature for Robust Control. AAAI 2018: 3925-3933 - [c81]Jakob N. Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch:
Learning with Opponent-Learning Awareness. AAMAS 2018: 122-130 - [c80]