Tom Everitt
Person information
- affiliation: Australian National University
2020 – today
- 2024
- [c26] Lewis Hammond, James Fox, Tom Everitt, Ryan Carey, Alessandro Abate, Michael J. Wooldridge:
Reasoning about Causality in Games (Abstract Reprint). AAAI 2024: 22697
- [c25] Zachary Kenton, Ramana Kumar, Sebastian Farquhar, Jonathan Richens, Matt MacDermott, Tom Everitt:
Discovering Agents (Abstract Reprint). AAAI 2024: 22701
- [c24] Francis Rhys Ward, Matt MacDermott, Francesco Belardinelli, Francesca Toni, Tom Everitt:
The Reasons that Agents Act: Intention and Instrumental Goals. AAMAS 2024: 1901-1909
- [c23] Jonathan Richens, Tom Everitt:
Robust agents learn causal world models. ICLR 2024
- [i34] Francis Rhys Ward, Matt MacDermott, Francesco Belardinelli, Francesca Toni, Tom Everitt:
The Reasons that Agents Act: Intention and Instrumental Goals. CoRR abs/2402.07221 (2024)
- [i33] Jonathan Richens, Tom Everitt:
Robust agents learn causal world models. CoRR abs/2402.10877 (2024)
- [i32] Seliem El-Sayed, Canfer Akbulut, Amanda McCroskery, Geoff Keeling, Zachary Kenton, Zaria Jalan, Nahema Marchal, Arianna Manzini, Toby Shevlane, Shannon Vallor, Daniel Susser, Matija Franklin, Sophie Bridgers, Harry Law, Matthew Rahtz, Murray Shanahan, Michael Henry Tessler, Arthur Douillard, Tom Everitt, Sasha Brown:
A Mechanism-Based Approach to Mitigating Harms from Persuasive Generative AI. CoRR abs/2404.15058 (2024)
- 2023
- [j3] Lewis Hammond, James Fox, Tom Everitt, Ryan Carey, Alessandro Abate, Michael J. Wooldridge:
Reasoning about causality in games. Artif. Intell. 320: 103919 (2023)
- [j2] Zachary Kenton, Ramana Kumar, Sebastian Farquhar, Jonathan Richens, Matt MacDermott, Tom Everitt:
Discovering agents. Artif. Intell. 322: 103963 (2023)
- [c22] Francis Ward, Francesca Toni, Francesco Belardinelli, Tom Everitt:
Honesty Is the Best Policy: Defining and Mitigating AI Deception. NeurIPS 2023
- [c21] Ryan Carey, Tom Everitt:
Human Control: Definitions and Algorithms. UAI 2023: 271-281
- [i31] Lewis Hammond, James Fox, Tom Everitt, Ryan Carey, Alessandro Abate, Michael J. Wooldridge:
Reasoning about Causality in Games. CoRR abs/2301.02324 (2023)
- [i30] Ryan Carey, Tom Everitt:
Human Control: Definitions and Algorithms. CoRR abs/2305.19861 (2023)
- [i29] Matt MacDermott, Tom Everitt, Francesco Belardinelli:
Characterising Decision Theories with Mechanised Causal Graphs. CoRR abs/2307.10987 (2023)
- [i28] Francis Rhys Ward, Francesco Belardinelli, Francesca Toni, Tom Everitt:
Honesty Is the Best Policy: Defining and Mitigating AI Deception. CoRR abs/2312.01350 (2023)
- 2022
- [c20] Carolyn Ashurst, Ryan Carey, Silvia Chiappa, Tom Everitt:
Why Fair Labels Can Yield Unfair Predictions: Graphical Conditions for Introduced Unfairness. AAAI 2022: 9494-9503
- [c19] Sebastian Farquhar, Ryan Carey, Tom Everitt:
Path-Specific Objectives for Safer Agent Incentives. AAAI 2022: 9529-9538
- [c18] Chris van Merwijk, Ryan Carey, Tom Everitt:
A Complete Criterion for Value of Information in Soluble Influence Diagrams. AAAI 2022: 10034-10041
- [i27] Carolyn Ashurst, Ryan Carey, Silvia Chiappa, Tom Everitt:
Why Fair Labels Can Yield Unfair Predictions: Graphical Conditions for Introduced Unfairness. CoRR abs/2202.10816 (2022)
- [i26] Chris van Merwijk, Ryan Carey, Tom Everitt:
A Complete Criterion for Value of Information in Soluble Influence Diagrams. CoRR abs/2202.11629 (2022)
- [i25] Sebastian Farquhar, Ryan Carey, Tom Everitt:
Path-Specific Objectives for Safer Agent Incentives. CoRR abs/2204.10018 (2022)
- [i24] Zachary Kenton, Ramana Kumar, Sebastian Farquhar, Jonathan Richens, Matt MacDermott, Tom Everitt:
Discovering Agents. CoRR abs/2208.08345 (2022)
- 2021
- [j1] Tom Everitt, Marcus Hutter, Ramana Kumar, Victoria Krakovna:
Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective. Synth. 198(27): 6435-6467 (2021)
- [c17] Tom Everitt, Ryan Carey, Eric D. Langlois, Pedro A. Ortega, Shane Legg:
Agent Incentives: A Causal Perspective. AAAI 2021: 11487-11495
- [c16] Eric D. Langlois, Tom Everitt:
How RL Agents Behave When Their Actions Are Modified. AAAI 2021: 11586-11594
- [c15] Lewis Hammond, James Fox, Tom Everitt, Alessandro Abate, Michael J. Wooldridge:
Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice. AAMAS 2021: 574-582
- [c14] James Fox, Tom Everitt, Ryan Carey, Eric D. Langlois, Alessandro Abate, Michael J. Wooldridge:
PyCID: A Python Library for Causal Influence Diagrams. SciPy 2021: 65-73
- [i23] Tom Everitt, Ryan Carey, Eric D. Langlois, Pedro A. Ortega, Shane Legg:
Agent Incentives: A Causal Perspective. CoRR abs/2102.01685 (2021)
- [i22] Lewis Hammond, James Fox, Tom Everitt, Alessandro Abate, Michael J. Wooldridge:
Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice. CoRR abs/2102.05008 (2021)
- [i21] Eric D. Langlois, Tom Everitt:
How RL Agents Behave When Their Actions Are Modified. CoRR abs/2102.07716 (2021)
- [i20] Zachary Kenton, Tom Everitt, Laura Weidinger, Iason Gabriel, Vladimir Mikulik, Geoffrey Irving:
Alignment of Language Agents. CoRR abs/2103.14659 (2021)
- [i19] Pedro A. Ortega, Markus Kunesch, Grégoire Delétang, Tim Genewein, Jordi Grau-Moya, Joel Veness, Jonas Buchli, Jonas Degrave, Bilal Piot, Julien Pérolat, Tom Everitt, Corentin Tallec, Emilio Parisotto, Tom Erez, Yutian Chen, Scott E. Reed, Marcus Hutter, Nando de Freitas, Shane Legg:
Shaking the foundations: delusions in sequence models for interaction and control. CoRR abs/2110.10819 (2021)
- 2020
- [i18] Ryan Carey, Eric D. Langlois, Tom Everitt, Shane Legg:
The Incentives that Shape Behaviour. CoRR abs/2001.07118 (2020)
- [i17] Ramana Kumar, Jonathan Uesato, Richard Ngo, Tom Everitt, Victoria Krakovna, Shane Legg:
REALab: An Embedded Perspective on Tampering. CoRR abs/2011.08820 (2020)
- [i16] Jonathan Uesato, Ramana Kumar, Victoria Krakovna, Tom Everitt, Richard Ngo, Shane Legg:
Avoiding Tampering Incentives in Deep RL via Decoupled Approval. CoRR abs/2011.08827 (2020)
2010 – 2019
- 2019
- [c13] Tom Everitt, Ramana Kumar, Victoria Krakovna, Shane Legg:
Modeling AGI Safety Frameworks with Causal Influence Diagrams. AISafety@IJCAI 2019
- [i15] Tom Everitt, Pedro A. Ortega, Elizabeth Barnes, Shane Legg:
Understanding Agent Incentives using Causal Influence Diagrams. Part I: Single Action Settings. CoRR abs/1902.09980 (2019)
- [i14] Tom Everitt, Ramana Kumar, Victoria Krakovna, Shane Legg:
Modeling AGI Safety Frameworks with Causal Influence Diagrams. CoRR abs/1906.08663 (2019)
- [i13] Tom Everitt, Marcus Hutter:
Reward Tampering Problems and Solutions in Reinforcement Learning: A Causal Influence Diagram Perspective. CoRR abs/1908.04734 (2019)
- 2018
- [c12] Tom Everitt, Gary Lea, Marcus Hutter:
AGI Safety Literature Review. IJCAI 2018: 5441-5449
- [i12] Tom Everitt, Gary Lea, Marcus Hutter:
AGI Safety Literature Review. CoRR abs/1805.01109 (2018)
- [i11] Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg:
Scalable agent alignment via reward modeling: a research direction. CoRR abs/1811.07871 (2018)
- 2017
- [c11] Tobias Wängberg, Mikael Böörs, Elliot Catt, Tom Everitt, Marcus Hutter:
A Game-Theoretic Analysis of the Off-Switch Game. AGI 2017: 167-177
- [c10] Jarryd Martin, Suraj Narayanan Sasikumar, Tom Everitt, Marcus Hutter:
Count-Based Exploration in Feature Space for Reinforcement Learning. IJCAI 2017: 2471-2478
- [c9] Tom Everitt, Victoria Krakovna, Laurent Orseau, Shane Legg:
Reinforcement Learning with a Corrupted Reward Channel. IJCAI 2017: 4705-4713
- [e1] Tom Everitt, Ben Goertzel, Alexey Potapov:
Artificial General Intelligence - 10th International Conference, AGI 2017, Melbourne, VIC, Australia, August 15-18, 2017, Proceedings. Lecture Notes in Computer Science 10414, Springer 2017, ISBN 978-3-319-63702-0
- [i10] Tom Everitt, Victoria Krakovna, Laurent Orseau, Marcus Hutter, Shane Legg:
Reinforcement Learning with a Corrupted Reward Channel. CoRR abs/1705.08417 (2017)
- [i9] Jarryd Martin, Suraj Narayanan Sasikumar, Tom Everitt, Marcus Hutter:
Count-Based Exploration in Feature Space for Reinforcement Learning. CoRR abs/1706.08090 (2017)
- [i8] Tobias Wängberg, Mikael Böörs, Elliot Catt, Tom Everitt, Marcus Hutter:
A Game-Theoretic Analysis of the Off-Switch Game. CoRR abs/1708.03871 (2017)
- [i7] Jan Leike, Miljan Martic, Victoria Krakovna, Pedro A. Ortega, Tom Everitt, Andrew Lefrancq, Laurent Orseau, Shane Legg:
AI Safety Gridworlds. CoRR abs/1711.09883 (2017)
- 2016
- [c8] Tom Everitt, Daniel Filan, Mayank Daswani, Marcus Hutter:
Self-Modification of Policy and Utility Function in Rational Agents. AGI 2016: 1-11
- [c7] Tom Everitt, Marcus Hutter:
Avoiding Wireheading with Value Reinforcement Learning. AGI 2016: 12-22
- [c6] Jarryd Martin, Tom Everitt, Marcus Hutter:
Death and Suicide in Universal Artificial Intelligence. AGI 2016: 23-32
- [i6] Tom Everitt, Daniel Filan, Mayank Daswani, Marcus Hutter:
Self-Modification of Policy and Utility Function in Rational Agents. CoRR abs/1605.03142 (2016)
- [i5] Tom Everitt, Marcus Hutter:
Avoiding Wireheading with Value Reinforcement Learning. CoRR abs/1605.03143 (2016)
- [i4] Jarryd Martin, Tom Everitt, Marcus Hutter:
Death and Suicide in Universal Artificial Intelligence. CoRR abs/1606.00652 (2016)
- [i3] Tom Everitt, Tor Lattimore, Marcus Hutter:
Free Lunch for Optimisation under the Universal Distribution. CoRR abs/1608.04544 (2016)
- 2015
- [c5] Tom Everitt, Jan Leike, Marcus Hutter:
Sequential Extensions of Causal and Evidential Decision Theory. ADT 2015: 205-221
- [c4] Tom Everitt, Marcus Hutter:
Analytical Results on the BFS vs. DFS Algorithm Selection Problem. Part I: Tree Search. Australasian Conference on Artificial Intelligence 2015: 157-165
- [c3] Tom Everitt, Marcus Hutter:
Analytical Results on the BFS vs. DFS Algorithm Selection Problem: Part II: Graph Search. Australasian Conference on Artificial Intelligence 2015: 166-178
- [i2] Tom Everitt, Jan Leike, Marcus Hutter:
Sequential Extensions of Causal and Evidential Decision Theory. CoRR abs/1506.07359 (2015)
- [i1] Tom Everitt, Marcus Hutter:
A Topological Approach to Meta-heuristics: Analytical Results on the BFS vs. DFS Algorithm Selection Problem. CoRR abs/1509.02709 (2015)
- 2014
- [c2] Tom Everitt, Tor Lattimore, Marcus Hutter:
Free Lunch for optimisation under the universal distribution. IEEE Congress on Evolutionary Computation 2014: 167-174
- [c1] Tansu Alpcan, Tom Everitt, Marcus Hutter:
Can we measure the difficulty of an optimization problem? ITW 2014: 356-360
last updated on 2024-08-06 22:06 CEST by the dblp team
all metadata released as open data under CC0 1.0 license