BibTeX record journals/corr/abs-2312-14436

@article{DBLP:journals/corr/abs-2312-14436,
  author       = {Souradip Chakraborty and
                  Amisha Bhaskar and
                  Anukriti Singh and
                  Pratap Tokekar and
                  Dinesh Manocha and
                  Amrit Singh Bedi},
  title        = {{REBEL:} {A} Regularization-Based Solution for Reward Overoptimization
                  in Reinforcement Learning from Human Feedback},
  journal      = {CoRR},
  volume       = {abs/2312.14436},
  year         = {2023},
  url          = {https://doi.org/10.48550/arXiv.2312.14436},
  doi          = {10.48550/ARXIV.2312.14436},
  eprinttype    = {arXiv},
  eprint       = {2312.14436},
  timestamp    = {Wed, 17 Jan 2024 17:14:38 +0100},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2312-14436.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}
