BibTeX record journals/corr/abs-2310-05199
@article{DBLP:journals/corr/abs-2310-05199,
  author       = {Wei Shen and
                  Rui Zheng and
                  WenYu Zhan and
                  Jun Zhao and
                  Shihan Dou and
                  Tao Gui and
                  Qi Zhang and
                  Xuanjing Huang},
  title        = {Loose lips sink ships: Mitigating Length Bias in Reinforcement Learning
                  from Human Feedback},
  journal      = {CoRR},
  volume       = {abs/2310.05199},
  year         = {2023},
  url          = {https://doi.org/10.48550/arXiv.2310.05199},
  doi          = {10.48550/ARXIV.2310.05199},
  eprinttype   = {arXiv},
  eprint       = {2310.05199},
  timestamp    = {Fri, 22 Mar 2024 10:55:02 +0100},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2310-05199.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}