Yuanzhao Zhai
2020 – today
- 2024
- [j7] Hongda Jia, Zijian Gao, Cheng Yang, Bo Ding, Yuanzhao Zhai, Huaimin Wang: C3F: Constant Collaboration and Communication Framework for Graph-Representation Dynamic Multi-Robotic Systems. IEEE Robotics Autom. Lett. 9(1): 95-102 (2024)
- [j6] Chao Chen, Yuanzhao Zhai, Zijian Gao, Kele Xu, Sen Yang, Yiying Li, Bo Ding, Dawei Feng, Huaimin Wang: Nuclear Norm Maximization-Based Curiosity-Driven Reinforcement Learning. IEEE Trans. Artif. Intell. 5(5): 2410-2421 (2024)
- [j5] Zijian Gao, Yiying Li, Kele Xu, Yuanzhao Zhai, Bo Ding, Dawei Feng, Xinjun Mao, Huaimin Wang: Dynamic Memory-Based Curiosity: A Bootstrap Approach for Exploration in Reinforcement Learning. IEEE Trans. Emerg. Top. Comput. Intell. 8(2): 1181-1193 (2024)
- [c8] Yuanzhao Zhai, Yiying Li, Zijian Gao, Xudong Gong, Kele Xu, Dawei Feng, Bo Ding, Huaimin Wang: Optimistic Model Rollouts for Pessimistic Offline Policy Optimization. AAAI 2024: 16678-16686
- [c7] Huanxi Liu, Yuanzhao Zhai, Kele Xu, Dawei Feng, Yiying Li: Nuclear-Norm Maximization for Low-Rank Updates. ICASSP 2024: 11101-11105
- [c6] Xudong Gong, Dawei Feng, Kele Xu, Yuanzhao Zhai, Chengkang Yao, Weijia Wang, Bo Ding, Huaimin Wang: Iterative Regularized Policy Optimization with Imperfect Demonstrations. ICML 2024
- [i9] Yuanzhao Zhai, Han Zhang, Yu Lei, Yue Yu, Kele Xu, Dawei Feng, Bo Ding, Huaimin Wang: Uncertainty-Penalized Reinforcement Learning from Human Feedback with Diverse Reward LoRA Ensembles. CoRR abs/2401.00243 (2024)
- [i8] Yuanzhao Zhai, Yiying Li, Zijian Gao, Xudong Gong, Kele Xu, Dawei Feng, Bo Ding, Huaimin Wang: Optimistic Model Rollouts for Pessimistic Offline Policy Optimization. CoRR abs/2401.05899 (2024)
- [i7] Han Zhang, Lin Gui, Yu Lei, Yuanzhao Zhai, Yehong Zhang, Yulan He, Hui Wang, Yue Yu, Kam-Fai Wong, Bin Liang, Ruifeng Xu: COPR: Continual Human Preference Learning via Optimal Policy Regularization. CoRR abs/2402.14228 (2024)
- [i6] Yuanzhao Zhai, Zhuo Zhang, Kele Xu, Hanyang Peng, Yue Yu, Dawei Feng, Cheng Yang, Bo Ding, Huaimin Wang: Online Self-Preferring Language Models. CoRR abs/2405.14103 (2024)
- [i5] Yuanzhao Zhai, Tingkai Yang, Kele Xu, Dawei Feng, Cheng Yang, Bo Ding, Huaimin Wang: Enhancing Decision-Making for LLM Agents via Step-Level Q-Value Models. CoRR abs/2409.09345 (2024)
- 2023
- [c5] Shaoqi Sun, Yuanzhao Zhai, Kele Xu, Dawei Feng, Bo Ding: Progressive Diversifying Policy for Multi-Agent Reinforcement Learning. ICASSP 2023: 1-5
- [c4] Yuanzhao Zhai, Kele Xu, Bo Ding, Dawei Feng, Zijian Gao, Huaimin Wang: Diversifying Message Aggregation in Multi-Agent Communication Via Normalized Tensor Nuclear Norm Regularization. ICASSP 2023: 1-5
- [i4] Han Zhang, Lin Gui, Yuanzhao Zhai, Hui Wang, Yu Lei, Ruifeng Xu: COPF: Continual Learning Human Preference through Optimal Policy Fitting. CoRR abs/2310.15694 (2023)
- 2022
- [j4] Jie Luo, Xiao Shu, Yuanzhao Zhai, Xiang Fu, Bo Ding, Jie Xu: A Fast and Robust Solution for Common Knowledge Formation in Decentralized Swarm Robots. J. Intell. Robotic Syst. 106(4): 68 (2022)
- [j3] Hongda Jia, Yong Zhao, Yuanzhao Zhai, Bo Ding, Huaimin Wang, Qingtong Wu: CRMRL: Collaborative Relationship Meta Reinforcement Learning for Effectively Adapting to Type Changes in Multi-Robotic System. IEEE Robotics Autom. Lett. 7(4): 11362-11369 (2022)
- [c3] Qingtong Wu, Dawei Feng, Yuanzhao Zhai, Bo Ding, Jie Luo: Pseudo Reward and Action Importance Classification for Sparse Reward Problem. ICMLC 2022: 51-58
- [c2] Yanqiang Zhang, Yuanzhao Zhai, Gongqian Zhou, Bo Ding, Dawei Feng, Songwang Liu: Exploring Policy Diversity in Parallel Actor-Critic Learning. ICTAI 2022: 1196-1203
- [i3] Yuanzhao Zhai, Kele Xu, Bo Ding, Dawei Feng, Zijian Gao, Huaimin Wang: Diversifying Message Aggregation in Multi-Agent Communication via Normalized Tensor Nuclear Norm Regularization. CoRR abs/2208.05414 (2022)
- [i2] Zijian Gao, Kele Xu, Yiying Li, Yuanzhao Zhai, Dawei Feng, Bo Ding, XinJun Mao, Huaimin Wang: Dynamic Memory-based Curiosity: A Bootstrap Approach for Exploration. CoRR abs/2208.11349 (2022)
- [i1] Zijian Gao, Kele Xu, Hengxing Cai, Yuanzhao Zhai, Dawei Feng, Bo Ding, XinJun Mao, Huaimin Wang: Self-Supervised Exploration via Temporal Inconsistency in Reinforcement Learning. CoRR abs/2208.11361 (2022)
- 2021
- [j2] Yuanzhao Zhai, Bo Ding, Xuan Liu, Hongda Jia, Yong Zhao, Jie Luo: Decentralized Multi-Robot Collision Avoidance in Complex Scenarios With Selective Communication. IEEE Robotics Autom. Lett. 6(4): 8379-8386 (2021)
- [j1] Yuanzhao Zhai, Bo Ding, Pengfei Zhang, Jie Luo: Cloudroid Swarm: A QoS-Aware Framework for Multirobot Cooperation Offloading. Wirel. Commun. Mob. Comput. 2021: 6631111:1-6631111:18 (2021)
- [c1] Yong Zhao, Yuanzhao Zhai, Jie Luo, Dawei Feng, Bo Ding, Zhen Li: Accelerating Robot Reinforcement Learning with Samples of Different Simulation Precision. HPCC/DSS/SmartCity/DependSys 2021: 423-430
last updated on 2024-10-15 00:21 CEST by the dblp team
all metadata released as open data under CC0 1.0 license