Martha White
Person information
- affiliation: University of Alberta, Edmonton, Canada
2020 – today
- 2024
- [j21] Brett Daley, Marlos C. Machado, Martha White: Demystifying the Recency Heuristic in Temporal-Difference Learning. RLJ 3: 1019-1036 (2024)
- [j20] Parham Mohammad Panahi, Andrew Patterson, Martha White, Adam White: Investigating the Interplay of Prioritized Replay and Generalization. RLJ 5: 2041-2058 (2024)
- [j19] Andrew Patterson, Samuel Neumann, Raksha Kumaraswamy, Martha White, Adam White: Cross-environment Hyperparameter Tuning for Reinforcement Learning. RLJ 5: 2298-2319 (2024)
- [j18] Han Wang, Erfan Miahi, Martha White, Marlos C. Machado, Zaheer Abbas, Raksha Kumaraswamy, Vincent Liu, Adam White: Investigating the properties of neural network representations in reinforcement learning. Artif. Intell. 330: 104100 (2024)
- [j17] Farzane Aminmansour, Taher Jafferjee, Ehsan Imani, Erin J. Talvitie, Michael Bowling, Martha White: Mitigating Value Hallucination in Dyna-Style Planning via Multistep Predecessor Models. J. Artif. Intell. Res. 80: 441-473 (2024)
- [j16] Muhammad Kamran Janjua, Haseeb Shah, Martha White, Erfan Miahi, Marlos C. Machado, Adam White: GVFs in the real world: making predictions online for water treatment. Mach. Learn. 113(8): 5151-5181 (2024)
- [j15] Lingwei Zhu, Matthew Schlegel, Han Wang, Martha White: Offline Reinforcement Learning via Tsallis Regularization. Trans. Mach. Learn. Res. 2024 (2024)
- [c63] Vincent Liu, James R. Wright, Martha White: Exploiting Action Impact Regularity and Exogenous State Variables for Offline Reinforcement Learning (Abstract Reprint). AAAI 2024: 22706
- [c62] Brett Daley, Martha White, Marlos C. Machado: Averaging n-step Returns Reduces Variance in Reinforcement Learning. ICML 2024
- [c61] Scott M. Jordan, Adam White, Bruno Castro da Silva, Martha White, Philip S. Thomas: Position: Benchmarking is Limited in Reinforcement Learning Research. ICML 2024
- [i86] Brett Daley, Martha White, Marlos C. Machado: Compound Returns Reduce Variance in Reinforcement Learning. CoRR abs/2402.03903 (2024)
- [i85] Hugo Silva, Martha White: What to Do When Your Discrete Optimization Is the Size of a Neural Network? CoRR abs/2402.10339 (2024)
- [i84] Ehsan Imani, Kai Luedemann, Sam Scholnick-Hughes, Esraa Elelimy, Martha White: Investigating the Histogram Loss in Regression. CoRR abs/2402.13425 (2024)
- [i83] Golnaz Mesbahi, Olya Mastikhina, Parham Mohammad Panahi, Martha White, Adam White: Tuning for the Unknown: Revisiting Evaluation Strategies for Lifelong RL. CoRR abs/2404.02113 (2024)
- [i82] Kevin Roice, Parham Mohammad Panahi, Scott M. Jordan, Adam White, Martha White: A New View on Planning in Online Reinforcement Learning. CoRR abs/2406.01562 (2024)
- [i81] Brett Daley, Marlos C. Machado, Martha White: Demystifying the Recency Heuristic in Temporal-Difference Learning. CoRR abs/2406.12284 (2024)
- [i80] Scott M. Jordan, Adam White, Bruno Castro da Silva, Martha White, Philip S. Thomas: Position: Benchmarking is Limited in Reinforcement Learning Research. CoRR abs/2406.16241 (2024)
- [i79] Parham Mohammad Panahi, Andrew Patterson, Martha White, Adam White: Investigating the Interplay of Prioritized Replay and Generalization. CoRR abs/2407.09702 (2024)
- [i78] Andrew Patterson, Samuel Neumann, Raksha Kumaraswamy, Martha White, Adam White: The Cross-environment Hyperparameter Setting Benchmark for Reinforcement Learning. CoRR abs/2407.18840 (2024)
- [i77] Lingwei Zhu, Haseeb Shah, Han Wang, Martha White: q-exponential family for policy optimization. CoRR abs/2408.07245 (2024)
- [i76] Esraa Elelimy, Adam White, Michael Bowling, Martha White: Real-Time Recurrent Learning using Trace Units in Reinforcement Learning. CoRR abs/2409.01449 (2024)
- [i75] Maximilian Bloor, José Torraca, Ilya Orson Sandoval, Akhil Ahmed, Martha White, Mehmet Mercangöz, Calvin Tsay, Ehecatl Antonio del Rio-Chanona, Max Mowbray: PC-Gym: Benchmark Environments For Process Control Problems. CoRR abs/2410.22093 (2024)
- [i74] Gautham Vasan, Mohamed Elsayed, Alireza Azimi, Jiamin He, Fahim Shariar, Colin Bellinger, Martha White, A. Rupam Mahmood: Deep Policy Gradient Methods Without Batch Updates, Target Networks, or Replay Buffers. CoRR abs/2411.15370 (2024)
- 2023
- [j14] Vincent Liu, James R. Wright, Martha White: Exploiting Action Impact Regularity and Exogenous State Variables for Offline Reinforcement Learning. J. Artif. Intell. Res. 77: 71-101 (2023)
- [j13] Eric Graves, Ehsan Imani, Raksha Kumaraswamy, Martha White: Off-Policy Actor-Critic with Emphatic Weightings. J. Mach. Learn. Res. 24: 146:1-146:63 (2023)
- [j12] Khurram Javed, Haseeb Shah, Richard S. Sutton, Martha White: Scalable Real-Time Recurrent Learning Using Columnar-Constructive Networks. J. Mach. Learn. Res. 24: 256:1-256:34 (2023)
- [j11] Andrew Patterson, Victor Liao, Martha White: Robust Losses for Learning Value Functions. IEEE Trans. Pattern Anal. Mach. Intell. 45(5): 6157-6167 (2023)
- [j10] Erfan Miahi, Revan MacQueen, Alex Ayoub, Abbas Masoumzadeh, Martha White: Resmax: An Alternative Soft-Greedy Operator for Reinforcement Learning. Trans. Mach. Learn. Res. 2023 (2023)
- [j9] Matthew Schlegel, Volodymyr Tkachuk, Adam M. White, Martha White: Investigating Action Encodings in Recurrent Neural Networks in Reinforcement Learning. Trans. Mach. Learn. Res. 2023 (2023)
- [c60] Vincent Liu, Yash Chandak, Philip S. Thomas, Martha White: Asymptotically Unbiased Off-Policy Policy Evaluation when Reusing Old Data in Nonstationary Environments. AISTATS 2023: 5474-5492
- [c59] Vincent Liu, Han Wang, Ruo Yu Tao, Khurram Javed, Adam White, Martha White: Measuring and Mitigating Interference in Reinforcement Learning. CoLLAs 2023: 781-795
- [c58] Samuel Neumann, Sungsu Lim, Ajin George Joseph, Yangchen Pan, Adam White, Martha White: Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement. ICLR 2023
- [c57] Chenjun Xiao, Han Wang, Yangchen Pan, Adam White, Martha White: The In-Sample Softmax for Offline Reinforcement Learning. ICLR 2023
- [c56] Brett Daley, Martha White, Christopher Amato, Marlos C. Machado: Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning. ICML 2023: 6818-6835
- [c55] Lingwei Zhu, Zheng Chen, Matthew Schlegel, Martha White: General Munchausen Reinforcement Learning with Tsallis Kullback-Leibler Divergence. NeurIPS 2023
- [i73] Brett Daley, Martha White, Christopher Amato, Marlos C. Machado: Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning. CoRR abs/2301.11321 (2023)
- [i72] Lingwei Zhu, Zheng Chen, Takamitsu Matsubara, Martha White: Generalized Munchausen Reinforcement Learning using Tsallis KL Divergence. CoRR abs/2301.11476 (2023)
- [i71] Khurram Javed, Haseeb Shah, Richard S. Sutton, Martha White: Online Real-Time Recurrent Learning Using Sparse Connections and Selective Learning. CoRR abs/2302.05326 (2023)
- [i70] Vincent Liu, Yash Chandak, Philip S. Thomas, Martha White: Asymptotically Unbiased Off-Policy Policy Evaluation when Reusing Old Data in Nonstationary Environments. CoRR abs/2302.11725 (2023)
- [i69] Chenjun Xiao, Han Wang, Yangchen Pan, Adam White, Martha White: The In-Sample Softmax for Offline Reinforcement Learning. CoRR abs/2302.14372 (2023)
- [i68] Andrew Patterson, Samuel Neumann, Martha White, Adam White: Empirical Design in Reinforcement Learning. CoRR abs/2304.01315 (2023)
- [i67] James E. Kostas, Scott M. Jordan, Yash Chandak, Georgios Theocharous, Dhawal Gupta, Martha White, Bruno Castro da Silva, Philip S. Thomas: Coagent Networks: Generalized and Scaled. CoRR abs/2305.09838 (2023)
- [i66] Vincent Liu, Han Wang, Ruo Yu Tao, Khurram Javed, Adam White, Martha White: Measuring and Mitigating Interference in Reinforcement Learning. CoRR abs/2307.04887 (2023)
- [i65] Muhammad Kamran Janjua, Haseeb Shah, Martha White, Erfan Miahi, Marlos C. Machado, Adam White: GVFs in the Real World: Making Predictions Online for Water Treatment. CoRR abs/2312.01624 (2023)
- [i64] Vincent Liu, Prabhat Nagarajan, Andrew Patterson, Martha White: When is Offline Policy Selection Sample Efficient for Reinforcement Learning? CoRR abs/2312.02355 (2023)
- 2022
- [j8] Andrew Patterson, Adam White, Martha White: A Generalized Projected Bellman Error for Off-policy Value Estimation in Reinforcement Learning. J. Mach. Learn. Res. 23: 145:1-145:61 (2022)
- [j7] Alan Chan, Hugo Silva, Sungsu Lim, Tadashi Kozuno, A. Rupam Mahmood, Martha White: Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences. J. Mach. Learn. Res. 23: 253:1-253:79 (2022)
- [j6] Ehsan Imani, Wei Hu, Martha White: Representation Alignment in Neural Networks. Trans. Mach. Learn. Res. 2022 (2022)
- [j5] Han Wang, Archit Sakhadeo, Adam M. White, James Bell, Vincent Liu, Xutong Zhao, Puer Liu, Tadashi Kozuno, Alona Fyshe, Martha White: No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL. Trans. Mach. Learn. Res. 2022 (2022)
- [c54] Shivam Garg, Samuele Tosatto, Yangchen Pan, Martha White, Rupam Mahmood: An Alternate Policy Gradient Estimator for Softmax Policies. AISTATS 2022: 6630-6689
- [c53] Kirby Banman, Liam Peet-Pare, Nidhi Hegde, Alona Fyshe, Martha White: Resonance in Weight Space: Covariate Shift Can Drive Divergence of SGD with Momentum. ICLR 2022
- [c52] Samuele Tosatto, Andrew Patterson, Martha White, Rupam Mahmood: A Temporal-Difference Approach to Policy Gradient Estimation. ICML 2022: 21609-21632
- [c51] Yangchen Pan, Jincheng Mei, Amir-massoud Farahmand, Martha White, Hengshuai Yao, Mohsen Rohani, Jun Luo: Understanding and mitigating the limitations of prioritized experience replay. UAI 2022: 1561-1571
- [i63] Samuele Tosatto, Andrew Patterson, Martha White, A. Rupam Mahmood: A Temporal-Difference Approach to Policy Gradient Estimation. CoRR abs/2202.02396 (2022)
- [i62] Matthew McLeod, Chunlok Lo, Matthew Schlegel, Andrew Jacobsen, Raksha Kumaraswamy, Martha White, Adam White: Continual Auxiliary Task Learning. CoRR abs/2202.11133 (2022)
- [i61] Kirby Banman, Liam Peet-Pare, Nidhi Hegde, Alona Fyshe, Martha White: Resonance in Weight Space: Covariate Shift Can Drive Divergence of SGD with Momentum. CoRR abs/2203.11992 (2022)
- [i60] Han Wang, Erfan Miahi, Martha White, Marlos C. Machado, Zaheer Abbas, Raksha Kumaraswamy, Vincent Liu, Adam White: Investigating the Properties of Neural Network Representations in Reinforcement Learning. CoRR abs/2203.15955 (2022)
- [i59] Andrew Patterson, Victor Liao, Martha White: Robust Losses for Learning Value Functions. CoRR abs/2205.08464 (2022)
- [i58] Han Wang, Archit Sakhadeo, Adam White, James Bell, Vincent Liu, Xutong Zhao, Puer Liu, Tadashi Kozuno, Alona Fyshe, Martha White: No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL. CoRR abs/2205.08716 (2022)
- [i57] Chunlok Lo, Gabor Mihucz, Adam White, Farzane Aminmansour, Martha White: Goal-Space Planning with Subgoal Models. CoRR abs/2206.02902 (2022)
- 2021
- [j4] Matthew Schlegel, Andrew Jacobsen, Zaheer Abbas, Andrew Patterson, Adam White, Martha White: General Value Function Networks. J. Artif. Intell. Res. 70: 497-543 (2021)
- [j3] Sebastian Höfer, Kostas E. Bekris, Ankur Handa, Juan Camilo Gamboa, Melissa Mozifian, Florian Golemo, Christopher G. Atkeson, Dieter Fox, Ken Goldberg, John Leonard, C. Karen Liu, Jan Peters, Shuran Song, Peter Welinder, Martha White: Sim2Real in Robotics and Automation: Applications and Challenges. IEEE Trans. Autom. Sci. Eng. 18(2): 398-400 (2021)
- [c50] Yangchen Pan, Kirby Banman, Martha White: Fuzzy Tiling Activations: A Simple Approach to Learning Sparse Representations Online. ICLR 2021
- [c49] Matthew McLeod, Chunlok Lo, Matthew Schlegel, Andrew Jacobsen, Raksha Kumaraswamy, Martha White, Adam White: Continual Auxiliary Task Learning. NeurIPS 2021: 12549-12562
- [c48] Dhawal Gupta, Gabor Mihucz, Matthew Schlegel, James E. Kostas, Philip S. Thomas, Martha White: Structural Credit Assignment in Neural Networks using Reinforcement Learning. NeurIPS 2021: 30257-30270
- [i56] Khurram Javed, Martha White, Richard S. Sutton: Scalable Online Recurrent Learning Using Columnar Neural Networks. CoRR abs/2103.05787 (2021)
- [i55] Andrew Patterson, Adam White, Sina Ghiassian, Martha White: A Generalized Projected Bellman Error for Off-policy Value Estimation in Reinforcement Learning. CoRR abs/2104.13844 (2021)
- [i54] Qingfeng Lan, Luke Kumar, Martha White, Alona Fyshe: Predictive Representation Learning for Language Modeling. CoRR abs/2105.14214 (2021)
- [i53] Alan Chan, Hugo Silva, Sungsu Lim, Tadashi Kozuno, A. Rupam Mahmood, Martha White: Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences. CoRR abs/2107.08285 (2021)
- [i52] Vincent Liu, James R. Wright, Martha White: Exploiting Action Impact Regularity and Partially Known Models for Offline Reinforcement Learning. CoRR abs/2111.08066 (2021)
- [i51] Eric Graves, Ehsan Imani, Raksha Kumaraswamy, Martha White: Off-Policy Actor-Critic with Emphatic Weightings. CoRR abs/2111.08172 (2021)
- [i50] Ehsan Imani, Wei Hu, Martha White: Understanding Feature Transfer Through Representation Alignment. CoRR abs/2112.07806 (2021)
- [i49] Shivam Garg, Samuele Tosatto, Yangchen Pan, Martha White, A. Rupam Mahmood: An Alternate Policy Gradient Estimator for Softmax Policies. CoRR abs/2112.11622 (2021)
- 2020
- [j2] Cam Linke, Nadia M. Ady, Martha White, Thomas Degris, Adam White: Adapting Behavior via Intrinsic Reward: A Survey and Empirical Study. J. Artif. Intell. Res. 69: 1287-1332 (2020)
- [c47] Yash Satsangi, Sungsu Lim, Shimon Whiteson, Frans A. Oliehoek, Martha White: Maximizing Information Gain in Partially Observable Environments via Prediction Rewards. AAMAS 2020: 1215-1223
- [c46] Maryam Hashemzadeh, Greta Kaufeld, Martha White, Andrea E. Martin, Alona Fyshe: From Language to Language-ish: How Brain-Like is an LSTM's Representation of Atypical Language Stimuli? EMNLP (Findings) 2020: 645-656
- [c45] Qingfeng Lan, Yangchen Pan, Alona Fyshe, Martha White: Maxmin Q-learning: Controlling the Estimation Bias of Q-learning. ICLR 2020
- [c44] Somjit Nath, Vincent Liu, Alan Chan, Xin Li, Adam White, Martha White: Training Recurrent Neural Networks Online by Learning Explicit State Variables. ICLR 2020
- [c43] Zaheer Abbas, Samuel Sokota, Erin Talvitie, Martha White: Selective Dyna-Style Planning Under Limited Model Capacity. ICML 2020: 1-10
- [c42] Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip S. Thomas: Optimizing for the Future in Non-Stationary MDPs. ICML 2020: 1414-1425
- [c41] Sina Ghiassian, Andrew Patterson, Shivam Garg, Dhawal Gupta, Adam White, Martha White: Gradient Temporal-Difference Learning with Regularized Corrections. ICML 2020: 3524-3534
- [c40] Yash Chandak, Scott M. Jordan, Georgios Theocharous, Martha White, Philip S. Thomas: Towards Safe Policy Improvement for Non-Stationary MDPs. NeurIPS 2020
- [c39] Yangchen Pan, Ehsan Imani, Amir-massoud Farahmand, Martha White: An implicit function learning approach for parametric modal regression. NeurIPS 2020
- [i48] Yangchen Pan, Ehsan Imani, Martha White, Amir-massoud Farahmand: An implicit function learning approach for parametric modal regression. CoRR abs/2002.06195 (2020)
- [i47] Qingfeng Lan, Yangchen Pan, Alona Fyshe, Martha White: Maxmin Q-learning: Controlling the Estimation Bias of Q-learning. CoRR abs/2002.06487 (2020)
- [i46] Yash Satsangi, Sungsu Lim, Shimon Whiteson, Frans A. Oliehoek, Martha White: Maximizing Information Gain in Partially Observable Environments via Prediction Reward. CoRR abs/2005.04912 (2020)
- [i45] Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip S. Thomas: Optimizing for the Future in Non-Stationary MDPs. CoRR abs/2005.08158 (2020)
- [i44] Taher Jafferjee, Ehsan Imani, Erin Talvitie, Martha White, Michael Bowling: Hallucinating Value: A Pitfall of Dyna-style Planning with Imperfect Environment Models. CoRR abs/2006.04363 (2020)
- [i43] Khurram Javed, Martha White, Yoshua Bengio: Learning Causal Models Online. CoRR abs/2006.07461 (2020)
- [i42] Sina Ghiassian, Andrew Patterson, Shivam Garg, Dhawal Gupta, Adam White, Martha White: Gradient Temporal-Difference Learning with Regularized Corrections. CoRR abs/2007.00611 (2020)
- [i41] Zaheer Abbas, Samuel Sokota, Erin J. Talvitie, Martha White: Selective Dyna-style Planning Under Limited Model Capacity. CoRR abs/2007.02418 (2020)
- [i40] Vincent Liu, Adam White, Hengshuai Yao, Martha White: Towards a practical measure of interference for reinforcement learning. CoRR abs/2007.03807 (2020)
- [i39] Jincheng Mei, Yangchen Pan, Martha White, Amir-massoud Farahmand, Hengshuai Yao: Beyond Prioritized Replay: Sampling States in Model-Based RL via Simulated Priorities. CoRR abs/2007.09569 (2020)
- [i38] Maryam Hashemzadeh, Greta Kaufeld, Martha White, Andrea E. Martin, Alona Fyshe: From Language to Language-ish: How Brain-Like is an LSTM's Representation of Nonsensical Language Stimuli? CoRR abs/2010.07435 (2020)
- [i37] Yash Chandak, Scott M. Jordan, Georgios Theocharous, Martha White, Philip S. Thomas: Towards Safe Policy Improvement for Non-Stationary MDPs. CoRR abs/2010.12645 (2020)
- [i36] Sebastian Höfer, Kostas E. Bekris, Ankur Handa, Juan Camilo Gamboa Higuera, Florian Golemo, Melissa Mozifian, Christopher G. Atkeson, Dieter Fox, Ken Goldberg, John Leonard, C. Karen Liu, Jan Peters, Shuran Song, Peter Welinder, Martha White: Perspectives on Sim2Real Transfer for Robotics: A Summary of the R:SS 2020 Workshop. CoRR abs/2012.03806 (2020)
2010 – 2019
- 2019
- [c38] Andrew Jacobsen, Matthew Schlegel, Cameron Linke, Thomas Degris, Adam White, Martha White: Meta-Descent for Online, Continual Prediction. AAAI 2019: 3943-3950
- [c37] Vincent Liu, Raksha Kumaraswamy, Lei Le, Martha White: The Utility of Sparse Representations for Control in Reinforcement Learning. AAAI 2019: 4384-4391
- [c36] Wesley Chung, Somjit Nath, Ajin Joseph, Martha White: Two-Timescale Networks for Nonlinear Value Function Approximation. ICLR (Poster) 2019
- [c35] Yangchen Pan, Hengshuai Yao, Amir-massoud Farahmand, Martha White: Hill Climbing on Value Estimates for Search-control in Dyna. IJCAI 2019: 3209-3215
- [c34] Yi Wan, Muhammad Zaheer, Adam White, Martha White, Richard S. Sutton: Planning with Expectation Models. IJCAI 2019: 3649-3655
- [c33] Matthew Schlegel, Wesley Chung, Daniel Graves, Jian Qian, Martha White: Importance Resampling for Off-policy Prediction. NeurIPS 2019: 1797-1807
- [c32] Khurram Javed, Martha White: Meta-Learning Representations for Continual Learning. NeurIPS 2019: 1818-1828
- [c31] Farzane Aminmansour, Andrew Patterson, Lei Le, Yisu Peng, Daniel Mitchell, Franco Pestilli, Cesar F. Caiafa, Russell Greiner, Martha White: Learning Macroscopic Brain Connectomes via Group-Sparse Factorization. NeurIPS 2019: 8847-8857
- [i35] Yi Wan, Muhammad Zaheer, Adam White, Martha White, Richard S. Sutton: Planning with Expectation Models. CoRR abs/1904.01191 (2019)
- [i34] Khurram Javed, Martha White: Meta-Learning Representations for Continual Learning. CoRR abs/1905.12588 (2019)
- [i33] Matthew Schlegel, Wesley Chung, Daniel Graves, Jian Qian, Martha White: Importance Resampling for Off-policy Prediction. CoRR abs/1906.04328 (2019)
- [i32] Yangchen Pan, Hengshuai Yao, Amir-massoud Farahmand, Martha White: Hill Climbing on Value Estimates for Search-control in Dyna. CoRR abs/1906.07791 (2019)
- [i31] Cam Linke, Nadia M. Ady, Martha White, Thomas Degris, Adam White: Adapting Behaviour via Intrinsic Reward: A Survey and Empirical Study. CoRR abs/1906.07865 (2019)
- [i30] Andrew Jacobsen, Matthew Schlegel, Cameron Linke, Thomas Degris, Adam White, Martha White: Meta-descent for Online, Continual Prediction. CoRR abs/1907.07751 (2019)
- [i29] Khurram Javed, Hengshuai Yao, Martha White: Is Fast Adaptation All You Need? CoRR abs/1910.01705 (2019)
- 2018
- [c30] Ehsan Imani, Martha White: Improving Regression Performance with Distributional Losses. ICML 2018: 2162-2171
- [c29] Yangchen Pan, Amir-massoud Farahmand, Martha White, Saleh Nabi, Piyush Grover, Daniel Nikovski: Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control. ICML 2018: 3983-3992
- [c28] Yangchen Pan, Muhammad Zaheer, Adam White, Andrew Patterson, Martha White: Organizing Experience: a Deeper Look at Replay Mechanisms for Sample-Based Planning in Continuous State Domains. IJCAI 2018: 4794-4800
- [c27] Ehsan Imani, Eric Graves, Martha White: An Off-policy Policy Gradient Theorem Using Emphatic Weightings. NeurIPS 2018: 96-106
- [c26] Lei Le, Andrew Patterson, Martha White: Supervised autoencoders: Improving generalization performance with unsupervised regularizers. NeurIPS 2018: 107-117
- [c25] Raksha Kumaraswamy, Matthew Schlegel, Adam White, Martha White: Context-dependent upper-confidence bounds for directed exploration. NeurIPS 2018: 4784-4794
- [c24] Craig Sherstan, Dylan R. Ashley, Brendan Bennett, Kenny Young, Adam White, Martha White, Richard S. Sutton: Comparing Direct and Indirect Temporal-Difference Methods for Estimating the Variance of the Return. UAI 2018: 63-72
- [c23]