Yuanzhao Zhai

2020 – today

2024
[j7]
  - https://dblp.org/rec/journals/ral/JiaGYDZW24
Hongda Jia, Zijian Gao, Cheng Yang, Bo Ding, Yuanzhao Zhai, Huaimin Wang:
C3F: Constant Collaboration and Communication Framework for Graph-Representation Dynamic Multi-Robotic Systems. IEEE Robotics Autom. Lett. 9(1): 95-102 (2024)
[j6]
  - https://dblp.org/rec/journals/tai/ChenZGXYLDFW24
Chao Chen, Yuanzhao Zhai, Zijian Gao, Kele Xu, Sen Yang, Yiying Li, Bo Ding, Dawei Feng, Huaimin Wang:
Nuclear Norm Maximization-Based Curiosity-Driven Reinforcement Learning. IEEE Trans. Artif. Intell. 5(5): 2410-2421 (2024)
[j5]
  - https://dblp.org/rec/journals/tetci/GaoLXZDFMW24
Zijian Gao, Yiying Li, Kele Xu, Yuanzhao Zhai, Bo Ding, Dawei Feng, Xinjun Mao, Huaimin Wang:
Dynamic Memory-Based Curiosity: A Bootstrap Approach for Exploration in Reinforcement Learning. IEEE Trans. Emerg. Top. Comput. Intell. 8(2): 1181-1193 (2024)
[c8]
  - https://dblp.org/rec/conf/aaai/ZhaiLGGXFDW24
Yuanzhao Zhai, Yiying Li, Zijian Gao, Xudong Gong, Kele Xu, Dawei Feng, Bo Ding, Huaimin Wang:
Optimistic Model Rollouts for Pessimistic Offline Policy Optimization. AAAI 2024: 16678-16686
[c7]
  - https://dblp.org/rec/conf/icassp/LiuZXFL24
Huanxi Liu, Yuanzhao Zhai, Kele Xu, Dawei Feng, Yiying Li:
Nuclear-Norm Maximization for Low-Rank Updates. ICASSP 2024: 11101-11105
[c6]
  - https://dblp.org/rec/conf/icml/GongFXZYWDW24
Xudong Gong, Dawei Feng, Kele Xu, Yuanzhao Zhai, Chengkang Yao, Weijia Wang, Bo Ding, Huaimin Wang:
Iterative Regularized Policy Optimization with Imperfect Demonstrations. ICML 2024
[i9]
  - https://dblp.org/rec/journals/corr/abs-2401-00243
Yuanzhao Zhai, Han Zhang, Yu Lei, Yue Yu, Kele Xu, Dawei Feng, Bo Ding, Huaimin Wang:
Uncertainty-Penalized Reinforcement Learning from Human Feedback with Diverse Reward LoRA Ensembles. CoRR abs/2401.00243 (2024)
[i8]
  - https://dblp.org/rec/journals/corr/abs-2401-05899
Yuanzhao Zhai, Yiying Li, Zijian Gao, Xudong Gong, Kele Xu, Dawei Feng, Bo Ding, Huaimin Wang:
Optimistic Model Rollouts for Pessimistic Offline Policy Optimization. CoRR abs/2401.05899 (2024)
[i7]
  - https://dblp.org/rec/journals/corr/abs-2402-14228
Han Zhang, Lin Gui, Yu Lei, Yuanzhao Zhai, Yehong Zhang, Yulan He, Hui Wang, Yue Yu, Kam-Fai Wong, Bin Liang, Ruifeng Xu:
COPR: Continual Human Preference Learning via Optimal Policy Regularization. CoRR abs/2402.14228 (2024)
[i6]
  - https://dblp.org/rec/journals/corr/abs-2405-14103
Yuanzhao Zhai, Zhuo Zhang, Kele Xu, Hanyang Peng, Yue Yu, Dawei Feng, Cheng Yang, Bo Ding, Huaimin Wang:
Online Self-Preferring Language Models. CoRR abs/2405.14103 (2024)
[i5]
  - https://dblp.org/rec/journals/corr/abs-2409-09345
Yuanzhao Zhai, Tingkai Yang, Kele Xu, Dawei Feng, Cheng Yang, Bo Ding, Huaimin Wang:
Enhancing Decision-Making for LLM Agents via Step-Level Q-Value Models. CoRR abs/2409.09345 (2024)
2023
[c5]
  - https://dblp.org/rec/conf/icassp/SunZXFD23
Shaoqi Sun, Yuanzhao Zhai, Kele Xu, Dawei Feng, Bo Ding:
Progressive Diversifying Policy for Multi-Agent Reinforcement Learning. ICASSP 2023: 1-5
[c4]
  - https://dblp.org/rec/conf/icassp/ZhaiXDFGW23
Yuanzhao Zhai, Kele Xu, Bo Ding, Dawei Feng, Zijian Gao, Huaimin Wang:
Diversifying Message Aggregation in Multi-Agent Communication Via Normalized Tensor Nuclear Norm Regularization. ICASSP 2023: 1-5
[i4]
  - https://dblp.org/rec/journals/corr/abs-2310-15694
Han Zhang, Lin Gui, Yuanzhao Zhai, Hui Wang, Yu Lei, Ruifeng Xu:
COPF: Continual Learning Human Preference through Optimal Policy Fitting. CoRR abs/2310.15694 (2023)
2022
[j4]
  - https://dblp.org/rec/journals/jirs/LuoSZFDX22
Jie Luo, Xiao Shu, Yuanzhao Zhai, Xiang Fu, Bo Ding, Jie Xu:
A Fast and Robust Solution for Common Knowledge Formation in Decentralized Swarm Robots. J. Intell. Robotic Syst. 106(4): 68 (2022)
[j3]
  - https://dblp.org/rec/journals/ral/JiaZZDWW22
Hongda Jia, Yong Zhao, Yuanzhao Zhai, Bo Ding, Huaimin Wang, Qingtong Wu:
CRMRL: Collaborative Relationship Meta Reinforcement Learning for Effectively Adapting to Type Changes in Multi-Robotic System. IEEE Robotics Autom. Lett. 7(4): 11362-11369 (2022)
[c3]
  - https://dblp.org/rec/conf/icmlc2/WuFZDL22
Qingtong Wu, Dawei Feng, Yuanzhao Zhai, Bo Ding, Jie Luo:
Pseudo Reward and Action Importance Classification for Sparse Reward Problem. ICMLC 2022: 51-58
[c2]
  - https://dblp.org/rec/conf/ictai/ZhangZZDFL22
Yanqiang Zhang, Yuanzhao Zhai, Gongqian Zhou, Bo Ding, Dawei Feng, Songwang Liu:
Exploring Policy Diversity in Parallel Actor-Critic Learning. ICTAI 2022: 1196-1203
[i3]
  - https://dblp.org/rec/journals/corr/abs-2208-05414
Yuanzhao Zhai, Kele Xu, Bo Ding, Dawei Feng, Zijian Gao, Huaimin Wang:
Diversifying Message Aggregation in Multi-Agent Communication via Normalized Tensor Nuclear Norm Regularization. CoRR abs/2208.05414 (2022)
[i2]
  - https://dblp.org/rec/journals/corr/abs-2208-11349
Zijian Gao, Kele Xu, Yiying Li, Yuanzhao Zhai, Dawei Feng, Bo Ding, XinJun Mao, Huaimin Wang:
Dynamic Memory-based Curiosity: A Bootstrap Approach for Exploration. CoRR abs/2208.11349 (2022)
[i1]
  - https://dblp.org/rec/journals/corr/abs-2208-11361
Zijian Gao, Kele Xu, Hengxing Cai, Yuanzhao Zhai, Dawei Feng, Bo Ding, XinJun Mao, Huaimin Wang:
Self-Supervised Exploration via Temporal Inconsistency in Reinforcement Learning. CoRR abs/2208.11361 (2022)
2021
[j2]
  - https://dblp.org/rec/journals/ral/ZhaiDLJZL21
Yuanzhao Zhai, Bo Ding, Xuan Liu, Hongda Jia, Yong Zhao, Jie Luo:
Decentralized Multi-Robot Collision Avoidance in Complex Scenarios With Selective Communication. IEEE Robotics Autom. Lett. 6(4): 8379-8386 (2021)
[j1]
  - https://dblp.org/rec/journals/wicomm/ZhaiDZL21
Yuanzhao Zhai, Bo Ding, Pengfei Zhang, Jie Luo:
Cloudroid Swarm: A QoS-Aware Framework for Multirobot Cooperation Offloading. Wirel. Commun. Mob. Comput. 2021: 6631111:1-6631111:18 (2021)
[c1]
  - https://dblp.org/rec/conf/hpcc/ZhaoZLFDL21
Yong Zhao, Yuanzhao Zhai, Jie Luo, Dawei Feng, Bo Ding, Zhen Li:
Accelerating Robot Reinforcement Learning with Samples of Different Simulation Precision. HPCC/DSS/SmartCity/DependSys 2021: 423-430
