Aven Le Zhou, Yu-Ao Wang, Wei Wu, Kang Zhang: Human Aesthetic Preference-Based Large Text-to-Image Model Personalization: Kandinsky Generation as an Example. CoRR abs/2402.06389 (2024)
William Muldrew, Peter Hayes, Mingtian Zhang, David Barber: Active Preference Learning for Large Language Models. CoRR abs/2402.08114 (2024)
Jianing Wang, Junda Wu, Yupeng Hou, Yao Liu, Ming Gao, Julian J. McAuley: InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment. CoRR abs/2402.08785 (2024)
Takeshi Kato: Time preference, wealth and utility inequality: A microeconomic interaction and dynamic macroeconomic model connection approach. CoRR abs/2402.08905 (2024)
Souradip Chakraborty, Jiahao Qiu, Hui Yuan, Alec Koppel, Furong Huang, Dinesh Manocha, Amrit Singh Bedi, Mengdi Wang: MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences. CoRR abs/2402.08925 (2024)
Dexun Li, Cong Zhang, Kuicai Dong, Derrick-Goh-Xin Deik, Ruiming Tang, Yong Liu: Aligning Crowd Feedback via Distributional Preference Reward Modeling. CoRR abs/2402.09764 (2024)
Saeed Khaki, JinJin Li, Lan Ma, Liu Yang, Prathap Ramachandra: RS-DPO: A Hybrid Rejection Sampling and Direct Preference Optimization Method for Alignment of Large Language Models. CoRR abs/2402.10038 (2024)
Rui Yang, Xiaoman Pan, Feng Luo, Shuang Qiu, Han Zhong, Dong Yu, Jianshu Chen: Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment. CoRR abs/2402.10207 (2024)
Shengzhi Li, Rongyu Lin, Shichao Pei: Multi-modal preference alignment remedies regression of visual instruction tuning on language model. CoRR abs/2402.10884 (2024)
Yiyang Zhou, Chenhang Cui, Rafael Rafailov, Chelsea Finn, Huaxiu Yao: Aligning Modalities in Vision Large Language Models via Preference Fine-tuning. CoRR abs/2402.11411 (2024)
Nuo Xu, Jun Zhao, Can Zu, Sixian Li, Lu Chen, Zhihao Zhang, Rui Zheng, Shihan Dou, Wenjuan Qin, Tao Gui, Qi Zhang, Xuanjing Huang: Advancing Translation Preference Modeling with RLHF: A Step Towards Cost-Effective Solution. CoRR abs/2402.11525 (2024)
Chanwoong Yoon, Gangwoo Kim, Byeongguk Jeon, Sungdong Kim, Yohan Jo, Jaewoo Kang: Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversational Search. CoRR abs/2402.11827 (2024)
Dongjin Kang, Sunghwan Kim, Taeyoon Kwon, Seungjun Moon, Hyunsouk Cho, Youngjae Yu, Dongha Lee, Jinyoung Yeo: Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation. CoRR abs/2402.13211 (2024)
Chen Jia: Generalizing Reward Modeling for Out-of-Distribution Preference Learning. CoRR abs/2402.14760 (2024)
Victoria Lin, Eli Ben-Michael, Louis-Philippe Morency: Optimizing Language Models for Human Preferences is a Causal Inference Problem. CoRR abs/2402.14979 (2024)
Andi Nika, Debmalya Mandal, Parameswaran Kamalaruban, Georgios Tzannetos, Goran Radanovic, Adish Singla: Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences. CoRR abs/2403.01857 (2024)
Dimitris Papadimitriou, Daniel S. Brown: Bayesian Constraint Inference from User Demonstrations Based on Margin-Respecting Preference Models. CoRR abs/2403.02431 (2024)
Son The Nguyen, Uma-Naresh Niranjan, Theja Tulabandhula: CURATRON: Complete Robust Preference Data for Robust Alignment of Large Language Models. CoRR abs/2403.02745 (2024)
Pouria Mistani, Venkatesh Mysore: Preference optimization of protein language models as a multi-objective binder design paradigm. CoRR abs/2403.04187 (2024)
Kunal Handa, Yarin Gal, Ellie Pavlick, Noah D. Goodman, Jacob Andreas, Alex Tamkin, Belinda Z. Li: Bayesian Preference Elicitation with Language Models. CoRR abs/2403.05534 (2024)
Jiwoo Hong, Noah Lee, James Thorne: ORPO: Monolithic Preference Optimization without Reference Model. CoRR abs/2403.07691 (2024)
Yuting Liu, Yizhou Dang, Yuliang Liang, Qiang Liu, Guibing Guo, Jianzhe Zhao, Xingwei Wang: Towards Unified Modeling for Positive and Negative Preferences in Sign-Aware Recommendation. CoRR abs/2403.08246 (2024)
Daniele Calandriello, Daniel Guo, Rémi Munos, Mark Rowland, Yunhao Tang, Bernardo Ávila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot: Human Alignment of Large Language Models through Online Preference Optimisation. CoRR abs/2403.08635 (2024)
Renjie Pi, Tianyang Han, Wei Xiong, Jipeng Zhang, Runtao Liu, Rui Pan, Tong Zhang: Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization. CoRR abs/2403.08730 (2024)
Martin Weyssow, Aton Kamanda, Houari A. Sahraoui: CodeUltraFeedback: An LLM-as-a-Judge Dataset for Aligning Large Language Models to Coding Preferences. CoRR abs/2403.09032 (2024)
Yinhong Liu, Han Zhou, Zhijiang Guo, Ehsan Shareghi, Ivan Vulic, Anna Korhonen, Nigel Collier: Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators. CoRR abs/2403.16950 (2024)
Lingdong Kong, Xiang Xu, Jun Cen, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu: Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding. CoRR abs/2403.17010 (2024)
Hyunjun Ju, SeongKu Kang, Dongha Lee, Junyoung Hwang, Sanghwan Jang, Hwanjo Yu: Multi-Domain Recommendation to Attract Users via Domain Preference Modeling. CoRR abs/2403.17374 (2024)
Dongfang Li, Zetian Sun, Baotian Hu, Zhenyu Liu, Xinshuo Hu, Xuebo Liu, Min Zhang: Improving Attributed Text Generation of Large Language Models via Preference Learning. CoRR abs/2403.18381 (2024)
Qi Gou, Cam-Tu Nguyen: Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model. CoRR abs/2403.19443 (2024)