Tengyang Xie
2020 – today
- 2024
- [c16] Jianrui Zhang, Mu Cai, Tengyang Xie, Yong Jae Lee:
  CounterCurate: Enhancing Physical and Semantic Visio-Linguistic Compositional Reasoning via Counterfactual Examples. ACL (Findings) 2024: 15481-15495
- [c15] Philip Amortila, Dylan J. Foster, Nan Jiang, Ayush Sekhari, Tengyang Xie:
  Harnessing Density Ratios for Online Reinforcement Learning. ICLR 2024
- [c14] Dipendra Misra, Akanksha Saran, Tengyang Xie, Alex Lamb, John Langford:
  Towards Principled Representation Learning from Videos for Reinforcement Learning. ICLR 2024
- [c13] Fahim Tajwar, Anikait Singh, Archit Sharma, Rafael Rafailov, Jeff Schneider, Tengyang Xie, Stefano Ermon, Chelsea Finn, Aviral Kumar:
  Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data. ICML 2024
- [i25] Philip Amortila, Dylan J. Foster, Nan Jiang, Ayush Sekhari, Tengyang Xie:
  Harnessing Density Ratios for Online Reinforcement Learning. CoRR abs/2401.09681 (2024)
- [i24] Jianrui Zhang, Mu Cai, Tengyang Xie, Yong Jae Lee:
  CounterCurate: Enhancing Physical and Semantic Visio-Linguistic Compositional Reasoning via Counterfactual Examples. CoRR abs/2402.13254 (2024)
- [i23] Dipendra Misra, Akanksha Saran, Tengyang Xie, Alex Lamb, John Langford:
  Towards Principled Representation Learning from Videos for Reinforcement Learning. CoRR abs/2403.13765 (2024)
- [i22] Corby Rosset, Ching-An Cheng, Arindam Mitra, Michael Santacroce, Ahmed Awadallah, Tengyang Xie:
  Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences. CoRR abs/2404.03715 (2024)
- [i21] Fahim Tajwar, Anikait Singh, Archit Sharma, Rafael Rafailov, Jeff Schneider, Tengyang Xie, Stefano Ermon, Chelsea Finn, Aviral Kumar:
  Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data. CoRR abs/2404.14367 (2024)
- [i20] Tengyang Xie, Dylan J. Foster, Akshay Krishnamurthy, Corby Rosset, Ahmed Awadallah, Alexander Rakhlin:
  Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF. CoRR abs/2405.21046 (2024)
- [i19] Xiang Ji, Sanjeev Kulkarni, Mengdi Wang, Tengyang Xie:
  Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models. CoRR abs/2406.04274 (2024)
- [i18] Haoxiang Wang, Wei Xiong, Tengyang Xie, Han Zhao, Tong Zhang:
  Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts. CoRR abs/2406.12845 (2024)
- [i17] Audrey Huang, Wenhao Zhan, Tengyang Xie, Jason D. Lee, Wen Sun, Akshay Krishnamurthy, Dylan J. Foster:
  Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization. CoRR abs/2407.13399 (2024)
- 2023
- [c12] Tengyang Xie, Dylan J. Foster, Yu Bai, Nan Jiang, Sham M. Kakade:
  The Role of Coverage in Online Reinforcement Learning. ICLR 2023
- [c11] Mohak Bhardwaj, Tengyang Xie, Byron Boots, Nan Jiang, Ching-An Cheng:
  Adversarial Model for Offline Reinforcement Learning. NeurIPS 2023
- [i16] Mohak Bhardwaj, Tengyang Xie, Byron Boots, Nan Jiang, Ching-An Cheng:
  Adversarial Model for Offline Reinforcement Learning. CoRR abs/2302.11048 (2023)
- 2022
- [c10] Ching-An Cheng, Tengyang Xie, Nan Jiang, Alekh Agarwal:
  Adversarially Trained Actor Critic for Offline Reinforcement Learning. ICML 2022: 3852-3878
- [c9] Tengyang Xie, Akanksha Saran, Dylan J. Foster, Lekan P. Molu, Ida Momennejad, Nan Jiang, Paul Mineiro, John Langford:
  Interaction-Grounded Learning with Action-Inclusive Feedback. NeurIPS 2022
- [i15] Ching-An Cheng, Tengyang Xie, Nan Jiang, Alekh Agarwal:
  Adversarially Trained Actor Critic for Offline Reinforcement Learning. CoRR abs/2202.02446 (2022)
- [i14] Tengyang Xie, Akanksha Saran, Dylan J. Foster, Lekan P. Molu, Ida Momennejad, Nan Jiang, Paul Mineiro, John Langford:
  Interaction-Grounded Learning with Action-inclusive Feedback. CoRR abs/2206.08364 (2022)
- [i13] Tengyang Xie, Dylan J. Foster, Yu Bai, Nan Jiang, Sham M. Kakade:
  The Role of Coverage in Online Reinforcement Learning. CoRR abs/2210.04157 (2022)
- [i12] Tengyang Xie, Mohak Bhardwaj, Nan Jiang, Ching-An Cheng:
  ARMOR: A Model-based Framework for Improving Arbitrary Baseline Policies with Offline Data. CoRR abs/2211.04538 (2022)
- 2021
- [c8] Tengyang Xie, Nan Jiang:
  Batch Value-function Approximation with Only Realizability. ICML 2021: 11404-11413
- [c7] Tengyang Xie, John Langford, Paul Mineiro, Ida Momennejad:
  Interaction-Grounded Learning. ICML 2021: 11414-11423
- [c6] Tengyang Xie, Ching-An Cheng, Nan Jiang, Paul Mineiro, Alekh Agarwal:
  Bellman-consistent Pessimism for Offline Reinforcement Learning. NeurIPS 2021: 6683-6694
- [c5] Tengyang Xie, Nan Jiang, Huan Wang, Caiming Xiong, Yu Bai:
  Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning. NeurIPS 2021: 27395-27407
- [i11] Masatoshi Uehara, Masaaki Imaizumi, Nan Jiang, Nathan Kallus, Wen Sun, Tengyang Xie:
  Finite Sample Analysis of Minimax Offline Reinforcement Learning: Completeness, Fast Rates and First-Order Efficiency. CoRR abs/2102.02981 (2021)
- [i10] Tengyang Xie, John Langford, Paul Mineiro, Ida Momennejad:
  Interaction-Grounded Learning. CoRR abs/2106.04887 (2021)
- [i9] Tengyang Xie, Nan Jiang, Huan Wang, Caiming Xiong, Yu Bai:
  Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning. CoRR abs/2106.04895 (2021)
- [i8] Tengyang Xie, Ching-An Cheng, Nan Jiang, Paul Mineiro, Alekh Agarwal:
  Bellman-consistent Pessimism for Offline Reinforcement Learning. CoRR abs/2106.06926 (2021)
- 2020
- [c4] Tengyang Xie, Nan Jiang:
  Q* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison. UAI 2020: 550-559
- [i7] Tengyang Xie, Nan Jiang:
  Q* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison. CoRR abs/2003.03924 (2020)
- [i6] Tengyang Xie, Nan Jiang:
  Batch Value-function Approximation with Only Realizability. CoRR abs/2008.04990 (2020)
- [i5] Philip Amortila, Nan Jiang, Tengyang Xie:
  A Variant of the Wang-Foster-Kakade Lower Bound for the Discounted Setting. CoRR abs/2011.01075 (2020)
2010 – 2019
- 2019
- [c3] Yu Bai, Tengyang Xie, Nan Jiang, Yu-Xiang Wang:
  Provably Efficient Q-Learning with Low Switching Cost. NeurIPS 2019: 8002-8011
- [c2] Tengyang Xie, Yifei Ma, Yu-Xiang Wang:
  Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling. NeurIPS 2019: 9665-9675
- [i4] Tengyang Xie, Philip S. Thomas, Gerome Miklau:
  Privacy Preserving Off-Policy Evaluation. CoRR abs/1902.00174 (2019)
- [i3] Yu Bai, Tengyang Xie, Nan Jiang, Yu-Xiang Wang:
  Provably Efficient Q-Learning with Low Switching Cost. CoRR abs/1905.12849 (2019)
- [i2] Tengyang Xie, Yifei Ma, Yu-Xiang Wang:
  Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling. CoRR abs/1906.03393 (2019)
- 2018
- [c1] Tengyang Xie, Bo Liu, Yangyang Xu, Mohammad Ghavamzadeh, Yinlam Chow, Daoming Lyu, Daesub Yoon:
  A Block Coordinate Ascent Algorithm for Mean-Variance Optimization. NeurIPS 2018: 1073-1083
- [i1] Bo Liu, Tengyang Xie, Yangyang Xu, Mohammad Ghavamzadeh, Yinlam Chow, Daoming Lyu, Daesub Yoon:
  A Block Coordinate Ascent Algorithm for Mean-Variance Optimization. CoRR abs/1809.02292 (2018)
last updated on 2024-09-26 01:53 CEST by the dblp team
all metadata released as open data under CC0 1.0 license