Name: dblp XML data dump
Creator: Schloss Dagstuhl - Leibniz Center for Informatics
Published: 1993
License: https://creativecommons.org/publicdomain/zero/1.0/
Keywords: dblp, XML, computer science, scholarly publications, metadata

Aven Le Zhou, Yu-Ao Wang, Wei Wu, Kang Zhang:
Human Aesthetic Preference-Based Large Text-to-Image Model Personalization: Kandinsky Generation as an Example. CoRR abs/2402.06389 (2024)
William Muldrew, Peter Hayes, Mingtian Zhang, David Barber:
Active Preference Learning for Large Language Models. CoRR abs/2402.08114 (2024)
Jianing Wang, Junda Wu, Yupeng Hou, Yao Liu, Ming Gao, Julian J. McAuley:
InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment. CoRR abs/2402.08785 (2024)
Takeshi Kato:
Time preference, wealth and utility inequality: A microeconomic interaction and dynamic macroeconomic model connection approach. CoRR abs/2402.08905 (2024)
Souradip Chakraborty, Jiahao Qiu, Hui Yuan, Alec Koppel, Furong Huang, Dinesh Manocha, Amrit Singh Bedi, Mengdi Wang:
MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences. CoRR abs/2402.08925 (2024)
Dexun Li, Cong Zhang, Kuicai Dong, Derrick-Goh-Xin Deik, Ruiming Tang, Yong Liu:
Aligning Crowd Feedback via Distributional Preference Reward Modeling. CoRR abs/2402.09764 (2024)
Saeed Khaki, JinJin Li, Lan Ma, Liu Yang, Prathap Ramachandra:
RS-DPO: A Hybrid Rejection Sampling and Direct Preference Optimization Method for Alignment of Large Language Models. CoRR abs/2402.10038 (2024)
Rui Yang, Xiaoman Pan, Feng Luo, Shuang Qiu, Han Zhong, Dong Yu, Jianshu Chen:
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment. CoRR abs/2402.10207 (2024)
Shengzhi Li, Rongyu Lin, Shichao Pei:
Multi-modal preference alignment remedies regression of visual instruction tuning on language model. CoRR abs/2402.10884 (2024)
Yiyang Zhou, Chenhang Cui, Rafael Rafailov, Chelsea Finn, Huaxiu Yao:
Aligning Modalities in Vision Large Language Models via Preference Fine-tuning. CoRR abs/2402.11411 (2024)
Nuo Xu, Jun Zhao, Can Zu, Sixian Li, Lu Chen, Zhihao Zhang, Rui Zheng, Shihan Dou, Wenjuan Qin, Tao Gui, Qi Zhang, Xuanjing Huang:
Advancing Translation Preference Modeling with RLHF: A Step Towards Cost-Effective Solution. CoRR abs/2402.11525 (2024)
Chanwoong Yoon, Gangwoo Kim, Byeongguk Jeon, Sungdong Kim, Yohan Jo, Jaewoo Kang:
Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversational Search. CoRR abs/2402.11827 (2024)
Dongjin Kang, Sunghwan Kim, Taeyoon Kwon, Seungjun Moon, Hyunsouk Cho, Youngjae Yu, Dongha Lee, Jinyoung Yeo:
Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation. CoRR abs/2402.13211 (2024)
Chen Jia:
Generalizing Reward Modeling for Out-of-Distribution Preference Learning. CoRR abs/2402.14760 (2024)
Victoria Lin, Eli Ben-Michael, Louis-Philippe Morency:
Optimizing Language Models for Human Preferences is a Causal Inference Problem. CoRR abs/2402.14979 (2024)
Andi Nika, Debmalya Mandal, Parameswaran Kamalaruban, Georgios Tzannetos, Goran Radanovic, Adish Singla:
Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences. CoRR abs/2403.01857 (2024)
Dimitris Papadimitriou, Daniel S. Brown:
Bayesian Constraint Inference from User Demonstrations Based on Margin-Respecting Preference Models. CoRR abs/2403.02431 (2024)
Son The Nguyen, Uma-Naresh Niranjan, Theja Tulabandhula:
CURATRON: Complete Robust Preference Data for Robust Alignment of Large Language Models. CoRR abs/2403.02745 (2024)
Pouria Mistani, Venkatesh Mysore:
Preference optimization of protein language models as a multi-objective binder design paradigm. CoRR abs/2403.04187 (2024)
Kunal Handa, Yarin Gal, Ellie Pavlick, Noah D. Goodman, Jacob Andreas, Alex Tamkin, Belinda Z. Li:
Bayesian Preference Elicitation with Language Models. CoRR abs/2403.05534 (2024)
Jiwoo Hong, Noah Lee, James Thorne:
ORPO: Monolithic Preference Optimization without Reference Model. CoRR abs/2403.07691 (2024)
Yuting Liu, Yizhou Dang, Yuliang Liang, Qiang Liu, Guibing Guo, Jianzhe Zhao, Xingwei Wang:
Towards Unified Modeling for Positive and Negative Preferences in Sign-Aware Recommendation. CoRR abs/2403.08246 (2024)
Daniele Calandriello, Daniel Guo, Rémi Munos, Mark Rowland, Yunhao Tang, Bernardo Ávila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot:
Human Alignment of Large Language Models through Online Preference Optimisation. CoRR abs/2403.08635 (2024)
Renjie Pi, Tianyang Han, Wei Xiong, Jipeng Zhang, Runtao Liu, Rui Pan, Tong Zhang:
Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization. CoRR abs/2403.08730 (2024)
Martin Weyssow, Aton Kamanda, Houari A. Sahraoui:
CodeUltraFeedback: An LLM-as-a-Judge Dataset for Aligning Large Language Models to Coding Preferences. CoRR abs/2403.09032 (2024)
Yinhong Liu, Han Zhou, Zhijiang Guo, Ehsan Shareghi, Ivan Vulic, Anna Korhonen, Nigel Collier:
Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators. CoRR abs/2403.16950 (2024)
Lingdong Kong, Xiang Xu, Jun Cen, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu:
Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding. CoRR abs/2403.17010 (2024)
Hyunjun Ju, SeongKu Kang, Dongha Lee, Junyoung Hwang, Sanghwan Jang, Hwanjo Yu:
Multi-Domain Recommendation to Attract Users via Domain Preference Modeling. CoRR abs/2403.17374 (2024)
Dongfang Li, Zetian Sun, Baotian Hu, Zhenyu Liu, Xinshuo Hu, Xuebo Liu, Min Zhang:
Improving Attributed Text Generation of Large Language Models via Preference Learning. CoRR abs/2403.18381 (2024)
Qi Gou, Cam-Tu Nguyen:
Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model. CoRR abs/2403.19443 (2024)