Satinder Singh 0001
Satinder P. Singh – Satinder Pal Singh – Satinder (Baveja) Singh
Person information

- affiliation: DeepMind, London, UK
- affiliation: University of Michigan, Department of Electrical Engineering and Computer Science, Ann Arbor, MI, USA
- affiliation: Syntek Capital
- affiliation: AT&T Labs, Florham Park, NJ, USA
- affiliation: University of Colorado Boulder, Department of Computer Science, CO, USA
- affiliation: Massachusetts Institute of Technology (MIT), Brain and Cognitive Science Department, Cambridge, MA, USA
Other persons with the same name
- Satinder Singh (disambiguation page)
2020 – today
- 2023
  - [j30] Qi Zhang, Edmund H. Durfee, Satinder Singh: Risk-aware analysis for interpretations of probabilistic achievement and maintenance commitments. Artif. Intell. 317: 103864 (2023)
  - [i72] Sebastian Flennerhag, Tom Zahavy, Brendan O'Donoghue, Hado van Hasselt, András György, Satinder Singh: Optimistic Meta-Gradients. CoRR abs/2301.03236 (2023)
  - [i71] Wilka Carvalho, Angelos Filos, Richard L. Lewis, Honglak Lee, Satinder Singh: Composing Task Knowledge with Modular Successor Feature Approximators. CoRR abs/2301.12305 (2023)
  - [i70] Ted Moskovitz, Brendan O'Donoghue, Vivek Veeriah, Sebastian Flennerhag, Satinder Singh, Tom Zahavy: ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs. CoRR abs/2302.01275 (2023)
  - [i69] Bernardo Ávila Pires, Feryal M. P. Behbahani, Hubert Soyer, Kyriacos Nikiforou, Thomas Keck, Satinder Singh: Hierarchical Reinforcement Learning in Complex 3D Environments. CoRR abs/2302.14451 (2023)
  - [i68] Chris Lu, Yannick Schroecker, Albert Gu, Emilio Parisotto, Jakob N. Foerster, Satinder Singh, Feryal M. P. Behbahani: Structured State Space Models for In-Context Reinforcement Learning. CoRR abs/2303.03982 (2023)
- 2022
  - [c187] Zeyu Zheng, Risto Vuorio, Richard L. Lewis, Satinder Singh: Adaptive Pairwise Weights for Temporal Credit Assignment. AAAI 2022: 9225-9232
  - [c186] Jelena Luketina, Sebastian Flennerhag, Yannick Schroecker, David Abel, Tom Zahavy, Satinder Singh: Meta-Gradients in Non-Stationary Environments. CoLLAs 2022: 886-901
  - [c185] Sebastian Flennerhag, Yannick Schroecker, Tom Zahavy, Hado van Hasselt, David Silver, Satinder Singh: Bootstrapped Meta-Learning. ICLR 2022
  - [c184] David Abel, Will Dabney, Anna Harutyunyan, Mark K. Ho, Michael L. Littman, Doina Precup, Satinder Singh: On the Expressivity of Markov Reward (Extended Abstract). IJCAI 2022: 5254-5258
  - [i67] Vivek Veeriah, Zeyu Zheng, Richard L. Lewis, Satinder Singh: GrASP: Gradient-Based Affordance Selection for Planning. CoRR abs/2202.04772 (2022)
  - [i66] Tom Zahavy, Yannick Schroecker, Feryal M. P. Behbahani, Kate Baumli, Sebastian Flennerhag, Shaobo Hou, Satinder Singh: Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality. CoRR abs/2205.13521 (2022)
  - [i65] Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas W. Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls: Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning. CoRR abs/2206.15378 (2022)
  - [i64] Jelena Luketina, Sebastian Flennerhag, Yannick Schroecker, David Abel, Tom Zahavy, Satinder Singh: Meta-Gradients in Non-Stationary Environments. CoRR abs/2209.06159 (2022)
  - [i63] Ethan A. Brooks, Logan Walls, Richard L. Lewis, Satinder Singh: In-Context Policy Iteration. CoRR abs/2210.03821 (2022)
  - [i62] Hao Liu, Tom Zahavy, Volodymyr Mnih, Satinder Singh: Palm up: Playing in the Latent Manifold for Unsupervised Pretraining. CoRR abs/2210.10913 (2022)
  - [i61] Michael Laskin, Luyu Wang, Junhyuk Oh, Emilio Parisotto, Stephen Spencer, Richie Steigerwald, DJ Strouse, Steven Hansen, Angelos Filos, Ethan A. Brooks, Maxime Gazeau, Himanshu Sahni, Satinder Singh, Volodymyr Mnih: In-context Reinforcement Learning with Algorithm Distillation. CoRR abs/2210.14215 (2022)
  - [i60] Robert Tjarko Lange, Tom Schaul, Yutian Chen, Tom Zahavy, Valentin Dallibard, Chris Lu, Satinder Singh, Sebastian Flennerhag: Discovering Evolution Strategies via Meta-Black-Box Optimization. CoRR abs/2211.11260 (2022)
  - [i59] Khimya Khetarpal, Claire Vernade, Brendan O'Donoghue, Satinder Singh, Tom Zahavy: POMRL: No-Regret Learning-to-Plan with Increasing Horizons. CoRR abs/2212.14530 (2022)
- 2021
  - [j29] David Silver, Satinder Singh, Doina Precup, Richard S. Sutton: Reward is enough. Artif. Intell. 299: 103535 (2021)
  - [c183] Qi Zhang, Edmund H. Durfee, Satinder Singh: Efficient Querying for Cooperative Probabilistic Commitments. AAAI 2021: 11378-11386
  - [c182] Tom Zahavy, André Barreto, Daniel J. Mankowitz, Shaobo Hou, Brendan O'Donoghue, Iurii Kemaev, Satinder Singh: Discovering a set of policies for the worst case reward. ICLR 2021
  - [c181] Ethan A. Brooks, Janarthanan Rajendran, Richard L. Lewis, Satinder Singh: Reinforcement Learning of Implicit and Explicit Control Flow Instructions. ICML 2021: 1082-1091
  - [c180] Wilka Carvalho, Anthony Liang, Kimin Lee, Sungryull Sohn, Honglak Lee, Richard L. Lewis, Satinder Singh: Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a First-person Simulated 3D Environment. IJCAI 2021: 2219-2226
  - [c179] Christopher Grimm, André Barreto, Gregory Farquhar, David Silver, Satinder Singh: Proper Value Equivalence. NeurIPS 2021: 7773-7786
  - [c178] David Abel, Will Dabney, Anna Harutyunyan, Mark K. Ho, Michael L. Littman, Doina Precup, Satinder Singh: On the Expressivity of Markov Reward. NeurIPS 2021: 7799-7812
  - [c177] Zeyu Zheng, Vivek Veeriah, Risto Vuorio, Richard L. Lewis, Satinder Singh: Learning State Representations from Random Deep Action-conditional Predictions. NeurIPS 2021: 23679-23691
  - [c176] Tom Zahavy, Brendan O'Donoghue, Guillaume Desjardins, Satinder Singh: Reward is enough for convex MDPs. NeurIPS 2021: 25746-25759
  - [c175] Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh: Discovery of Options via Meta-Learned Subgoals. NeurIPS 2021: 29861-29873
  - [i58] Tom Zahavy, André Barreto, Daniel J. Mankowitz, Shaobo Hou, Brendan O'Donoghue, Iurii Kemaev, Satinder Singh: Discovering a set of policies for the worst case reward. CoRR abs/2102.04323 (2021)
  - [i57] Zeyu Zheng, Vivek Veeriah, Risto Vuorio, Richard L. Lewis, Satinder Singh: Learning State Representations from Random Deep Action-conditional Predictions. CoRR abs/2102.04897 (2021)
  - [i56] Zeyu Zheng, Risto Vuorio, Richard L. Lewis, Satinder Singh: Pairwise Weights for Temporal Credit Assignment. CoRR abs/2102.04999 (2021)
  - [i55] Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh: Discovery of Options via Meta-Learned Subgoals. CoRR abs/2102.06741 (2021)
  - [i54] Ethan A. Brooks, Janarthanan Rajendran, Richard L. Lewis, Satinder Singh: Reinforcement Learning of Implicit and Explicit Control Flow in Instructions. CoRR abs/2102.13195 (2021)
  - [i53] Tom Zahavy, Brendan O'Donoghue, Guillaume Desjardins, Satinder Singh: Reward is enough for convex MDPs. CoRR abs/2106.00661 (2021)
  - [i52] Tom Zahavy, Brendan O'Donoghue, André Barreto, Volodymyr Mnih, Sebastian Flennerhag, Satinder Singh: Discovering Diverse Nearly Optimal Policies with Successor Features. CoRR abs/2106.00669 (2021)
  - [i51] Christopher Grimm, André Barreto, Gregory Farquhar, David Silver, Satinder Singh: Proper Value Equivalence. CoRR abs/2106.10316 (2021)
  - [i50] Sebastian Flennerhag, Yannick Schroecker, Tom Zahavy, Hado van Hasselt, David Silver, Satinder Singh: Bootstrapped Meta-Learning. CoRR abs/2109.04504 (2021)
  - [i49] Janarthanan Rajendran, Jonathan K. Kummerfeld, Satinder Singh: Learning to Learn End-to-End Goal-Oriented Dialog From Related Dialog Tasks. CoRR abs/2110.15724 (2021)
  - [i48] David Abel, Will Dabney, Anna Harutyunyan, Mark K. Ho, Michael L. Littman, Doina Precup, Satinder Singh: On the Expressivity of Markov Reward. CoRR abs/2111.00876 (2021)
- 2020
  - [j28] Qi Zhang, Edmund H. Durfee, Satinder Singh: Semantics and algorithms for trustworthy commitment achievement under model uncertainty. Auton. Agents Multi Agent Syst. 34(1): 19 (2020)
  - [c174] Shun Zhang, Edmund H. Durfee, Satinder Singh: Querying to Find a Safe Policy under Uncertain Safety Constraints in Markov Decision Processes. AAAI 2020: 2552-2559
  - [c173] Janarthanan Rajendran, Richard L. Lewis, Vivek Veeriah, Honglak Lee, Satinder Singh: How Should an Agent Practice? AAAI 2020: 5454-5461
  - [c172] Qi Zhang, Edmund H. Durfee, Satinder Singh: Modeling Probabilistic Commitments for Maintenance Is Inherently Harder than for Achievement. AAAI 2020: 10326-10333
  - [c171] Aditya Modi, Nan Jiang, Ambuj Tewari, Satinder Singh: Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles. AISTATS 2020: 2010-2020
  - [c170] Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvári, Satinder Singh, Benjamin Van Roy, Richard S. Sutton, David Silver, Hado van Hasselt: Behaviour Suite for Reinforcement Learning. ICLR 2020
  - [c169] Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh: What Can Learned Intrinsic Rewards Capture? ICML 2020: 11436-11446
  - [c168] Thomas W. Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian M. Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Satinder Singh, Thore Graepel, Yoram Bachrach: Learning to Play No-Press Diplomacy with Best Response Policy Iteration. NeurIPS 2020
  - [c167] Christopher Grimm, André Barreto, Satinder Singh, David Silver: The Value Equivalence Principle for Model-Based Reinforcement Learning. NeurIPS 2020
  - [c166] Junhyuk Oh, Matteo Hessel, Wojciech M. Czarnecki, Zhongwen Xu, Hado van Hasselt, Satinder Singh, David Silver: Discovering Reinforcement Learning Algorithms. NeurIPS 2020
  - [c165] Zheng Wen, Doina Precup, Morteza Ibrahimi, André Barreto, Benjamin Van Roy, Satinder Singh: On Efficiency in Hierarchical Reinforcement Learning. NeurIPS 2020
  - [c164] Zhongwen Xu, Hado Philip van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, David Silver: Meta-Gradient Reinforcement Learning with an Objective Discovered Online. NeurIPS 2020
  - [c163] Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh: A Self-Tuning Actor-Critic Algorithm. NeurIPS 2020
  - [i47] Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh: Self-Tuning Deep Reinforcement Learning. CoRR abs/2002.12928 (2020)
  - [i46] Thomas W. Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian M. Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Satinder Singh, Thore Graepel, Yoram Bachrach: Learning to Play No-Press Diplomacy with Best Response Policy Iteration. CoRR abs/2006.04635 (2020)
  - [i45] Zhongwen Xu, Hado van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, David Silver: Meta-Gradient Reinforcement Learning with an Objective Discovered Online. CoRR abs/2007.08433 (2020)
  - [i44] Junhyuk Oh, Matteo Hessel, Wojciech M. Czarnecki, Zhongwen Xu, Hado van Hasselt, Satinder Singh, David Silver: Discovering Reinforcement Learning Algorithms. CoRR abs/2007.08794 (2020)
  - [i43] Wilka Carvalho, Anthony Liang, Kimin Lee, Sungryull Sohn, Honglak Lee, Richard L. Lewis, Satinder Singh: Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in First-person Simulated 3D Environments. CoRR abs/2010.15195 (2020)
  - [i42] Christopher Grimm, André Barreto, Satinder Singh, David Silver: The Value Equivalence Principle for Model-Based Reinforcement Learning. CoRR abs/2011.03506 (2020)
  - [i41] Qi Zhang, Edmund H. Durfee, Satinder Singh: Efficient Querying for Cooperative Probabilistic Commitments. CoRR abs/2012.07195 (2020)
2010 – 2019
- 2019
  - [c162] Qi Zhang, Richard L. Lewis, Satinder Singh, Edmund H. Durfee: Learning to Communicate and Solve Visual Blocks-World Tasks. AAAI 2019: 5781-5788
  - [c161] John Holler, Risto Vuorio, Zhiwei (Tony) Qin, Xiaocheng Tang, Yan Jiao, Tiancheng Jin, Satinder Singh, Chenxi Wang, Jieping Ye: Deep Reinforcement Learning for Multi-driver Vehicle Dispatching and Repositioning Problem. ICDM 2019: 1090-1095
  - [c160] Qi Zhang, Edmund H. Durfee, Satinder Singh: Computational Strategies for the Trustworthy Pursuit and the Safe Modeling of Probabilistic Maintenance Commitments. AISafety@IJCAI 2019
  - [c159] Philip Paquette, Yuchen Lu, Steven Bocco, Max O. Smith, Satya Ortiz-Gagne, Jonathan K. Kummerfeld, Joelle Pineau, Satinder Singh, Aaron C. Courville: No-Press Diplomacy: Modeling Multi-Agent Gameplay. NeurIPS 2019: 4476-4487
  - [c158] Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Janarthanan Rajendran, Richard L. Lewis, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh: Discovery of Useful Questions as Auxiliary Tasks. NeurIPS 2019: 9306-9317
  - [c157] Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Gheshlaghi Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Gregory Wayne, Satinder Singh, Doina Precup, Rémi Munos: Hindsight Credit Assignment. NeurIPS 2019: 12467-12476
  - [c156] Janarthanan Rajendran, Jatin Ganhotra, Xiaoxiao Guo, Mo Yu, Satinder Singh, Lazaros Polymenakos: NE-Table: A Neural key-value table for Named Entities. RANLP 2019: 980-993
  - [p1] Benjamin W. Priest, George Cybenko, Satinder Singh, Massimiliano Albanese, Peng Liu: Online and Scalable Adaptive Cyber Defense. Adversarial and Uncertain Reasoning for Adaptive Cyber Defense 2019: 232-261
  - [i40] Christopher Grimm, Satinder Singh: Learning Independently-Obtainable Reward Functions. CoRR abs/1901.08649 (2019)
  - [i39] Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvári, Satinder Singh, Benjamin Van Roy, Richard S. Sutton, David Silver, Hado van Hasselt: Behaviour Suite for Reinforcement Learning. CoRR abs/1908.03568 (2019)
  - [i38] Philip Paquette, Yuchen Lu, Steven Bocco, Max O. Smith, Satya Ortiz-Gagne, Jonathan K. Kummerfeld, Satinder Singh, Joelle Pineau, Aaron C. Courville: No Press Diplomacy: Modeling Multi-Agent Gameplay. CoRR abs/1909.02128 (2019)
  - [i37] Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Richard L. Lewis, Janarthanan Rajendran, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh: Discovery of Useful Questions as Auxiliary Tasks. CoRR abs/1909.04607 (2019)
  - [i36] Aditya Modi, Nan Jiang, Ambuj Tewari, Satinder Singh: Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles. CoRR abs/1910.10597 (2019)
  - [i35] Victor Bapst, Alvaro Sanchez-Gonzalez, Omar Shams, Kimberly L. Stachenfeld, Peter W. Battaglia, Satinder Singh, Jessica B. Hamrick: Object-oriented state editing for HRL. CoRR abs/1910.14361 (2019)
  - [i34] Christopher Grimm, Irina Higgins, André Barreto, Denis Teplyashin, Markus Wulfmeier, Tim Hertweck, Raia Hadsell, Satinder Singh: Disentangled Cumulants Help Successor Representations Transfer to New Tasks. CoRR abs/1911.10866 (2019)
  - [i33] John Holler, Risto Vuorio, Zhiwei (Tony) Qin, Xiaocheng Tang, Yan Jiao, Tiancheng Jin, Satinder Singh, Chenxi Wang, Jieping Ye: Deep Reinforcement Learning for Multi-Driver Vehicle Dispatching and Repositioning Problem. CoRR abs/1911.11260 (2019)
  - [i32] Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Gheshlaghi Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Greg Wayne, Satinder Singh, Doina Precup, Rémi Munos: Hindsight Credit Assignment. CoRR abs/1912.02503 (2019)
  - [i31] Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh: What Can Learned Intrinsic Rewards Capture? CoRR abs/1912.05500 (2019)
  - [i30] Janarthanan Rajendran, Richard L. Lewis, Vivek Veeriah, Honglak Lee, Satinder Singh: How Should an Agent Practice? CoRR abs/1912.07045 (2019)
- 2018
  - [j27] Thanh Hai Nguyen, Mason Wright, Michael P. Wellman, Satinder Singh: Multistage Attack Graph Security Games: Heuristic Strategies, with Empirical Game-Theoretic Analysis. Secur. Commun. Networks 2018: 2864873:1-2864873:28 (2018)
  - [c155] Aditya Modi, Nan Jiang, Satinder Singh, Ambuj Tewari: Markov Decision Processes with Continuous Side Information. ALT 2018: 597-618
  - [c154] Qi Zhang, Edmund H. Durfee, Satinder Singh: Challenges in the Trustworthy Pursuit of Maintenance Commitments Under Uncertainty. TRUST@AAMAS 2018: 75-86
  - [c153] Shun Zhang, Edmund H. Durfee, Satinder Singh: On Querying for Safe Optimality in Factored Markov Decision Processes. AAMAS 2018: 2168-2170
  - [c152] Janarthanan Rajendran, Jatin Ganhotra, Satinder Singh, Lazaros Polymenakos: Learning End-to-End Goal-Oriented Dialog with Multiple Answers. EMNLP 2018: 3834-3843
  - [c151] Junhyuk Oh, Yijie Guo, Satinder Singh, Honglak Lee: Self-Imitation Learning. ICML 2018: 3875-3884
  - [c150] Shun Zhang, Edmund H. Durfee, Satinder Singh: Minimax-Regret Querying on Side Effects for Safe Optimality in Factored Markov Decision Processes. IJCAI 2018: 4867-4873
  - [c149] Nan Jiang, Alex Kulesza, Satinder Singh: Completing State Representations using Spectral Learning. NeurIPS 2018: 4333-4342
  - [c148] Zeyu Zheng, Junhyuk Oh, Satinder Singh: On Learning Intrinsic Rewards for Policy Gradient Methods. NeurIPS 2018: 4649-4659
  - [i29] Jiaxuan Wang, Ian Fox, Jonathan Skaza, Nick Linck, Satinder Singh, Jenna Wiens: The Advantage of Doubling: A Deep Reinforcement Learning Approach to Studying the Double Team in the NBA. CoRR abs/1803.02940 (2018)
  - [i28] Zeyu Zheng, Junhyuk Oh, Satinder Singh: On Learning Intrinsic Rewards for Policy Gradient Methods. CoRR abs/1804.06459 (2018)
  - [i27] Janarthanan Rajendran, Jatin Ganhotra, Xiaoxiao Guo, Mo Yu, Satinder Singh: Named Entities troubling your Neural Methods? Build NE-Table: A neural approach for handling Named Entities. CoRR abs/1804.09540 (2018)
  - [i26] Junhyuk Oh, Yijie Guo, Satinder Singh, Honglak Lee: Self-Imitation Learning. CoRR abs/1806.05635 (2018)
  - [i25] Vivek Veeriah, Junhyuk Oh, Satinder Singh: Many-Goals Reinforcement Learning. CoRR abs/1806.09605 (2018)
  - [i24] Janarthanan Rajendran, Jatin Ganhotra, Satinder Singh, Lazaros Polymenakos: Learning End-to-End Goal-Oriented Dialog with Multiple Answers. CoRR abs/1808.09996 (2018)
  - [i23] Yijie Guo, Junhyuk Oh, Satinder Singh, Honglak Lee: Generative Adversarial Self-Imitation Learning. CoRR abs/1812.00950 (2018)
- 2017
  - [c147] Thanh Hong Nguyen, Michael P. Wellman, Satinder Singh: A Stackelberg Game Model for Botnet Traffic Exfiltration. AAAI Workshops 2017
  - [c146] Verónica Pérez-Rosas, Rada Mihalcea, Kenneth Resnicow, Satinder Singh, Lawrence C. An: Understanding and Predicting Empathic Behavior in Counseling Therapy. ACL (1) 2017: 1426-1435
  - [c145] Shun Zhang, Edmund H. Durfee, Satinder Singh: Approximately-Optimal Queries for Planning in Reward-Uncertain Markov Decision Processes. ICAPS 2017: 339-347
  - [c144] Qi Zhang, Satinder Singh, Edmund H. Durfee: Minimizing Maximum Regret in Commitment Constrained Sequential Decision Making. ICAPS 2017: 348-357
  - [c143] Thanh Hai Nguyen, Mason Wright, Michael P. Wellman, Satinder Singh: Multi-Stage Attack Graph Security Games: Heuristic Strategies, with Empirical Game-Theoretic Analysis. MTD@CCS 2017: 87-97
  - [c142] Verónica Pérez-Rosas, Rada Mihalcea, Kenneth Resnicow, Satinder Singh, Lawrence C. An, Kathy J. Goggin, Delwyn Catley: Predicting Counselor Behaviors in Motivational Interviewing Encounters. EACL (1) 2017: 1128-1137
  - [c141] Thanh Hai Nguyen, Michael P. Wellman, Satinder Singh: A Stackelberg Game Model for Botnet Data Exfiltration. GameSec 2017: 151-170
  - [c140] Xiaoxiao Guo, Tim Klinger, Clemens Rosenbaum, Joseph P. Bigus, Murray Campbell, Ban Kawas, Kartik Talamadupula, Gerry Tesauro, Satinder Singh: Learning to Query, Reason, and Answer Questions On Ambiguous Texts. ICLR (Poster) 2017
  - [c139] Junhyuk Oh, Satinder Singh, Honglak Lee, Pushmeet Kohli: Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning. ICML 2017: 2661-2670
  - [c138] Kareem Amin, Nan Jiang, Satinder Singh: Repeated Inverse Reinforcement Learning. NIPS 2017: 1815-1824
  - [c137] Junhyuk Oh, Satinder Singh, Honglak Lee: Value Prediction Network. NIPS 2017: 6118-6128
  - [e1] Satinder Singh, Shaul Markovitch: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, February 4-9, 2017, San Francisco, California, USA. AAAI Press 2017
  - [i22] Qi Zhang, Satinder Singh, Edmund H. Durfee: Minimizing Maximum Regret in Commitment Constrained Sequential Decision Making. CoRR abs/1703.04587 (2017)
  - [i21] Kareem Amin, Nan Jiang, Satinder Singh: Repeated Inverse Reinforcement Learning. CoRR abs/1705.05427 (2017)
  - [i20] Junhyuk Oh, Satinder Singh, Honglak Lee, Pushmeet Kohli: Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning. CoRR abs/1706.05064 (2017)
  - [i19] Junhyuk Oh, Satinder Singh, Honglak Lee: Value Prediction Network. CoRR abs/1707.03497 (2017)
  - [i18] Aditya Modi, Nan Jiang, Satinder Singh, Ambuj Tewari: Markov Decision Processes with Continuous Side Information. CoRR abs/1711.05726 (2017)
- 2016
  - [j26] Alexander Van Esbroeck, Landon Smith, Zeeshan Syed, Satinder Singh, Zahi N. Karam: Multi-task seizure detection: addressing intra-patient variation in seizure morphologies. Mach. Learn. 102(3): 309-321 (2016)
  - [c136] Nan Jiang, Alex Kulesza, Satinder Singh: Improving Predictive State Representations via Gradient Descent. AAAI 2016: 1709-1715
  - [c135] Edmund H. Durfee, Satinder Singh: On the Trustworthy Fulfillment of Commitments. AAMAS Workshops (Selected Papers) 2016: 1-13
  - [c134] Edmund H. Durfee, Satinder Singh: On the Trustworthy Fulfillment of Commitments. TRUST@AAMAS 2016: 54-62
  - [c133] Junhyuk Oh, Valliappa Chockalingam, Satinder Singh, Honglak Lee: Control of Memory, Active Perception, and Action in Minecraft. ICML 2016: 2790-2799
  - [c132] Xiaoxiao Guo, Satinder Singh, Richard L. Lewis, Honglak Lee: Deep Learning for Reward Design to Improve Monte Carlo Tree Search in ATARI Games. IJCAI 2016: 1519-1525
  - [c131]