Alec Koppel
Publications
- 2024
- [j30]Amrit Singh Bedi, Anjaly Parayil, Junyu Zhang, Mengdi Wang, Alec Koppel:
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control. J. Mach. Learn. Res. 25: 39:1-39:58 (2024)
- [c76]Souradip Chakraborty, Amrit S. Bedi, Alec Koppel, Huazheng Wang, Dinesh Manocha, Mengdi Wang, Furong Huang:
PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback. ICLR 2024
- [c74]Souradip Chakraborty, Jiahao Qiu, Hui Yuan, Alec Koppel, Dinesh Manocha, Furong Huang, Amrit S. Bedi, Mengdi Wang:
MaxMin-RLHF: Alignment with Diverse Human Preferences. ICML 2024
- [c72]Bhrij Patel, Wesley A. Suttle, Alec Koppel, Vaneet Aggarwal, Brian M. Sadler, Dinesh Manocha, Amrit S. Bedi:
Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles. ICML 2024
- [i57]Souradip Chakraborty, Jiahao Qiu, Hui Yuan, Alec Koppel, Furong Huang, Dinesh Manocha, Amrit Singh Bedi, Mengdi Wang:
MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences. CoRR abs/2402.08925 (2024)
- [i56]Peihong Yu, Manav Mishra, Alec Koppel, Carl E. Busart, Priya Narayan, Dinesh Manocha, Amrit S. Bedi, Pratap Tokekar:
Beyond Joint Demonstrations: Personalized Expert Guidance for Efficient Multi-Agent Reinforcement Learning. CoRR abs/2403.08936 (2024)
- [i54]Bhrij Patel, Wesley A. Suttle, Alec Koppel, Vaneet Aggarwal, Brian M. Sadler, Amrit Singh Bedi, Dinesh Manocha:
Global Optimality without Mixing Time Oracles in Average-reward RL via Multi-level Actor-Critic. CoRR abs/2403.11925 (2024)
- [i51]Mucong Ding, Souradip Chakraborty, Vibhu Agrawal, Zora Che, Alec Koppel, Mengdi Wang, Amrit S. Bedi, Furong Huang:
SAIL: Self-Improving Efficient Online Alignment of Large Language Models. CoRR abs/2406.15567 (2024)
- 2023
- [c70]Souradip Chakraborty, Amrit Singh Bedi, Pratap Tokekar, Alec Koppel, Brian M. Sadler, Furong Huang, Dinesh Manocha:
Posterior Coreset Construction with Kernelized Stein Discrepancy for Model-Based Reinforcement Learning. AAAI 2023: 6980-6988
- [c67]Hans He, Alec Koppel, Amrit Singh Bedi, Mazen Farhood, Daniel J. Stilwell:
Bi-Level Nonstationary Kernels for Online Gaussian Process Regression. CASE 2023: 1-7
- [c65]Souradip Chakraborty, Amrit S. Bedi, Alec Koppel, Mengdi Wang, Furong Huang, Dinesh Manocha:
STEERING : Stein Information Directed Exploration for Model-Based Reinforcement Learning. ICML 2023: 3949-3978
- [c64]Wesley A. Suttle, Amrit S. Bedi, Bhrij Patel, Brian M. Sadler, Alec Koppel, Dinesh Manocha:
Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic. ICML 2023: 33240-33267
- [c63]Souradip Chakraborty, Amrit Singh Bedi, Kasun Weerakoon, Prithvi Poddar, Alec Koppel, Pratap Tokekar, Dinesh Manocha:
Dealing with Sparse Rewards in Continuous Control Robotics via Heavy-Tailed Policy Optimization. ICRA 2023: 989-995
- [c62]Hans He, Alec Koppel, Amrit Singh Bedi, Daniel J. Stilwell, Mazen Farhood, Benjamin Biggs:
Decentralized Multi-agent Exploration with Limited Inter-agent Communications. ICRA 2023: 5530-5536
- [i50]Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Mengdi Wang, Furong Huang, Dinesh Manocha:
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning. CoRR abs/2301.12038 (2023)
- [i49]Wesley A. Suttle, Amrit Singh Bedi, Bhrij Patel, Brian M. Sadler, Alec Koppel, Dinesh Manocha:
Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic. CoRR abs/2301.12083 (2023)
- [i44]Bhrij Patel, Kasun Weerakoon, Wesley A. Suttle, Alec Koppel, Brian M. Sadler, Amrit Singh Bedi, Dinesh Manocha:
Ada-NAV: Adaptive Trajectory-Based Sample Efficient Policy Learning for Robotic Navigation. CoRR abs/2306.06192 (2023)
- [i42]Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Dinesh Manocha, Huazheng Wang, Furong Huang, Mengdi Wang:
Aligning Agent Policy with Externalities: Reward Design via Bilevel RL. CoRR abs/2308.02585 (2023)
- 2022
- [j25]Amrit Singh Bedi, Ketan Rajawat, Vaneet Aggarwal, Alec Koppel:
Escaping Saddle Points for Successive Convex Approximation. IEEE Trans. Signal Process. 70: 307-321 (2022)
- [c60]Qinbo Bai, Amrit Singh Bedi, Mridul Agarwal, Alec Koppel, Vaneet Aggarwal:
Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Primal-Dual Approach. AAAI 2022: 3682-3689
- [c59]Junyu Zhang, Amrit Singh Bedi, Mengdi Wang, Alec Koppel:
Multi-Agent Reinforcement Learning with General Utilities via Decentralized Shadow Reward Actor-Critic. AAAI 2022: 9031-9039
- [c57]Alec Koppel, Amrit Singh Bedi, Bhargav Ganguly, Vaneet Aggarwal:
Convergence Rates of Average-Reward Multi-agent Reinforcement Learning via Randomized Linear Programming. CDC 2022: 4545-4552
- [c54]Amrit Singh Bedi, Souradip Chakraborty, Anjaly Parayil, Brian M. Sadler, Pratap Tokekar, Alec Koppel:
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces. ICML 2022: 1716-1731
- [c52]Yulun Tian, Amrit Singh Bedi, Alec Koppel, Miguel Calvo-Fullana, David M. Rosen, Jonathan P. How:
Distributed Riemannian Optimization with Lazy Communication for Collaborative Geometric Estimation. IROS 2022: 4391-4398
- [i37]Amrit Singh Bedi, Souradip Chakraborty, Anjaly Parayil, Brian M. Sadler, Pratap Tokekar, Alec Koppel:
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces. CoRR abs/2201.12332 (2022)
- [i36]Yulun Tian, Amrit Singh Bedi, Alec Koppel, Miguel Calvo-Fullana, David M. Rosen, Jonathan P. How:
Distributed Riemannian Optimization with Lazy Communication for Collaborative Geometric Estimation. CoRR abs/2203.00851 (2022)
- [i35]Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Brian M. Sadler, Furong Huang, Pratap Tokekar, Dinesh Manocha:
Posterior Coreset Construction with Kernelized Stein Discrepancy for Model-Based Reinforcement Learning. CoRR abs/2206.01162 (2022)
- [i34]Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Pratap Tokekar, Dinesh Manocha:
Dealing with Sparse Rewards in Continuous Control Robotics via Heavy-Tailed Policies. CoRR abs/2206.05652 (2022)
- [i33]Amrit Singh Bedi, Chen Fan, Alec Koppel, Anit Kumar Sahu, Brian M. Sadler, Furong Huang, Dinesh Manocha:
FedBC: Calibrating Global and Local Models via Federated Learning Beyond Consensus. CoRR abs/2206.10815 (2022)
- 2021
- [j22]Junyu Zhang, Amrit Singh Bedi, Mengdi Wang, Alec Koppel:
Cautious Reinforcement Learning via Distributional Risk in the Dual Domain. IEEE J. Sel. Areas Inf. Theory 2(2): 611-626 (2021)
- [j19]Hrusikesha Pradhan, Amrit Singh Bedi, Alec Koppel, Ketan Rajawat:
Adaptive Kernel Learning in Heterogeneous Networks. IEEE Trans. Signal Inf. Process. over Networks 7: 423-437 (2021)
- [j18]Amrit Singh Bedi, Alec Koppel, Ketan Rajawat, Panchajanya Sanyal:
Nonparametric Compositional Stochastic Optimization for Risk-Sensitive Kernel Learning. IEEE Trans. Signal Process. 69: 428-442 (2021)
- [j17]Deepak S. Kalhan, Amrit Singh Bedi, Alec Koppel, Ketan Rajawat, Hamed Hassani, Abhishek K. Gupta, Adrish Banerjee:
Dynamic Online Learning via Frank-Wolfe Algorithm. IEEE Trans. Signal Process. 69: 932-947 (2021)
- [j16]Alec Koppel, Amrit Singh Bedi, Brian M. Sadler, Víctor Elvira:
Nearly Consistent Finite Particle Estimates in Streaming Importance Sampling. IEEE Trans. Signal Process. 69: 6401-6415 (2021)
- [c49]Alec Koppel, Amrit Singh Bedi, Bhargav Ganguly, Vaneet Aggarwal:
Randomized Linear Programming for Tabular Average-Cost Multi-agent Reinforcement Learning. ACSCC 2021: 1023-1026
- [c48]Junyu Zhang, Amrit Singh Bedi, Mengdi Wang, Alec Koppel:
Beyond Cumulative Returns via Reinforcement Learning over State-Action Occupancy Measures. ACC 2021: 894-901
- [c47]Anjaly Parayil, Amrit Singh Bedi, Alec Koppel:
Joint Position and Beamforming Control via Alternating Nonlinear Least-Squares with a Hierarchical Gamma Prior. ACC 2021: 3513-3518
- [c46]Amrit Singh Bedi, Alec Koppel, Mengdi Wang, Junyu Zhang:
Intermittent Communications in Decentralized Shadow Reward Actor-Critic. CDC 2021: 2613-2620
- [c44]Alec Koppel, Amrit S. Bedi, Vikram Krishnamurthy:
A Dynamical Systems Perspective on Online Bayesian Nonparametric Estimators with Adaptive Hyperparameters. ICASSP 2021: 2975-2979
- [c43]Michael E. Kepler, Alec Koppel, Amrit Singh Bedi, Daniel J. Stilwell:
Wasserstein-Splitting Gaussian Process Regression for Heterogeneous Online Bayesian Inference. IROS 2021: 9833-9840
- [i29]Junyu Zhang, Amrit Singh Bedi, Mengdi Wang, Alec Koppel:
MARL with General Utilities via Decentralized Shadow Reward Actor-Critic. CoRR abs/2106.00543 (2021)
- [i28]Amrit Singh Bedi, Anjaly Parayil, Junyu Zhang, Mengdi Wang, Alec Koppel:
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control. CoRR abs/2106.08414 (2021)
- [i27]Michael E. Kepler, Alec Koppel, Amrit Singh Bedi, Daniel J. Stilwell:
Wasserstein-Splitting Gaussian Process Regression for Heterogeneous Online Bayesian Inference. CoRR abs/2107.12797 (2021)
- [i26]Qinbo Bai, Amrit Singh Bedi, Mridul Agarwal, Alec Koppel, Vaneet Aggarwal:
Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Primal-Dual Approach. CoRR abs/2109.06332 (2021)
- 2020
- [j14]Yulun Tian, Alec Koppel, Amrit Singh Bedi, Jonathan P. How:
Asynchronous and Parallel Distributed Pose Graph Optimization. IEEE Robotics Autom. Lett. 5(4): 5819-5826 (2020)
- [j12]Alec Koppel, Amrit Singh Bedi, Ketan Rajawat, Brian M. Sadler:
Optimally Compressed Nonparametric Online Learning: Tradeoffs between memory and consistency. IEEE Signal Process. Mag. 37(3): 61-70 (2020)
- [c42]Hrusikesha Pradhan, Amrit Singh Bedi, Alec Koppel, Ketan Rajawat:
Conservative Multi-agent Online Kernel Learning in Heterogeneous Networks. ACSSC 2020: 53-57
- [c41]Amrit Singh Bedi, Alec Koppel, Ketan Rajawat, Brian M. Sadler:
Trading Dynamic Regret for Model Complexity in Nonstationary Nonparametric Optimization. ACC 2020: 321-326
- [c39]Deepak S. Kalhan, Amrit S. Bedi, Alec Koppel, Ketan Rajawat, Abhishek K. Gupta, Adrish Banerjee:
Projection Free Dynamic Online Learning. ICASSP 2020: 3957-3961
- [c36]Amrit Singh Bedi, Dheeraj Peddireddy, Vaneet Aggarwal, Alec Koppel:
Efficient Large-Scale Gaussian Process Bandits by Believing only Informative Actions. L4DC 2020: 924-934
- [c35]Junyu Zhang, Alec Koppel, Amrit Singh Bedi, Csaba Szepesvári, Mengdi Wang:
Variational Policy Gradient Method for Reinforcement Learning with General Utilities. NeurIPS 2020
- [i24]Junyu Zhang, Amrit Singh Bedi, Mengdi Wang, Alec Koppel:
Cautious Reinforcement Learning via Distributional Risk in the Dual Domain. CoRR abs/2002.12475 (2020)
- [i23]Yulun Tian, Alec Koppel, Amrit Singh Bedi, Jonathan P. How:
Asynchronous and Parallel Distributed Pose Graph Optimization. CoRR abs/2003.03281 (2020)
- [i22]Amrit Singh Bedi, Dheeraj Peddireddy, Vaneet Aggarwal, Alec Koppel:
Efficient Gaussian Process Bandits by Believing only Informative Actions. CoRR abs/2003.10550 (2020)
- [i17]Junyu Zhang, Alec Koppel, Amrit Singh Bedi, Csaba Szepesvári, Mengdi Wang:
Variational Policy Gradient Method for Reinforcement Learning with General Utilities. CoRR abs/2007.02151 (2020)
- 2019
- [j9]Amrit Singh Bedi, Alec Koppel, Ketan Rajawat:
Asynchronous Online Learning in Multi-Agent Systems With Proximity Constraints. IEEE Trans. Signal Inf. Process. over Networks 5(3): 479-494 (2019)
- [j8]Amrit Singh Bedi, Alec Koppel, Ketan Rajawat:
Asynchronous Saddle Point Algorithm for Stochastic Optimization in Heterogeneous Networks. IEEE Trans. Signal Process. 67(7): 1742-1757 (2019)
- [c34]Amrit Singh Bedi, Alec Koppel, Brian M. Sadler, Víctor Elvira:
Compressed Streaming Importance Sampling for Efficient Representations of Localization Distributions. ACSSC 2019: 477-481
- [c32]Alec Koppel, Amrit S. Bedi, Ketan Rajawat:
Controlling the Bias-Variance Tradeoff via Coherent Risk for Robust Learning with Kernels. ACC 2019: 3519-3525
- [c31]Rishabh Dixit, Amrit Singh Bedi, Ketan Rajawat, Alec Koppel:
Distributed Online Learning over Time-varying Graphs via Proximal Gradient Descent. CDC 2019: 2745-2751
- [i13]Amrit Singh Bedi, Alec Koppel, Ketan Rajawat, Brian M. Sadler:
Nonstationary Nonparametric Online Learning: Balancing Dynamic Regret and Model Parsimony. CoRR abs/1909.05442 (2019)
- [i12]Alec Koppel, Amrit Singh Bedi, Victor Elvira, Brian M. Sadler:
Approximate Shannon Sampling in Importance Sampling: Nearly Consistent Finite Particle Estimates. CoRR abs/1909.10279 (2019)
- [i11]Alec Koppel, Amrit Singh Bedi, Ketan Rajawat, Brian M. Sadler:
Optimally Compressed Nonparametric Online Learning. CoRR abs/1909.11555 (2019)
- 2018
- [c23]Amrit Singh Bedi, Alec Koppel, Ketan Rajawat:
Asynchronous Saddle Point Method: Interference Management Through Pricing. CDC 2018: 3229-3235
- [c21]Hrusikesha Pradhan, Amrit Singh Bedi, Alec Koppel, Ketan Rajawat:
Exact Nonparametric Decentralized Online Optimization. GlobalSIP 2018: 643-647
- 2017
- [c18]Amrit Singh Bedi, Alec Koppel, Ketan Rajawat:
Beyond consensus and synchrony in decentralized online optimization using saddle point method. ACSSC 2017: 293-297
last updated on 2024-09-18 01:09 CEST by the dblp team
all metadata released as open data under CC0 1.0 license