Search dblp for Publications

export results for "Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis."

@article{DBLP:journals/corr/abs-2406-07455,
  author       = {Qining Zhang and
                  Honghao Wei and
                  Lei Ying},
  title        = {Reinforcement Learning from Human Feedback without Reward Inference:
                  Model-Free Algorithm and Instance-Dependent Analysis},
  journal      = {CoRR},
  volume       = {abs/2406.07455},
  year         = {2024},
  url          = {https://doi.org/10.48550/arXiv.2406.07455},
  doi          = {10.48550/ARXIV.2406.07455},
  eprinttype   = {arXiv},
  eprint       = {2406.07455},
  timestamp    = {Mon, 08 Jul 2024 01:00:00 +0200},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2406-07455.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}
