
Search dblp
Full-text search
- case-insensitive prefix search: default
  e.g., sig matches "SIGIR" as well as "signal"
- exact word search: append dollar sign ($) to word
  e.g., graph$ matches "graph", but not "graphics"
- boolean and: separate words by space
  e.g., codd model
- boolean or: connect words by pipe symbol (|)
  e.g., graph|network
Update May 7, 2017: Please note that we had to disable the phrase search operator (.) and the boolean not operator (-) due to technical problems. For the time being, phrase search queries will yield regular prefix search results, and search terms preceded by a minus will be interpreted as regular (positive) search terms.
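The query rules above can be illustrated with a small matcher. This is only a sketch of the semantics as described on this page, not dblp's actual implementation: the function names (`tokenize`, `word_matches`, `matches`) are hypothetical, the tokenization on word characters is an assumption, and so is the precedence (each space-separated group is treated as an OR of its pipe-separated alternatives). The disabled phrase (.) and not (-) operators are not modeled.

```python
import re


def tokenize(text):
    """Split text into lowercase words (assumed tokenization: runs of word characters)."""
    return re.findall(r"\w+", text.lower())


def word_matches(term, words):
    """Return True if a single query word matches any word of the text."""
    term = term.lower()
    if term.endswith("$"):
        # Exact word search: graph$ matches "graph" but not "graphics".
        return term[:-1] in words
    # Default: case-insensitive prefix search.
    return any(word.startswith(term) for word in words)


def matches(query, text):
    """Space means AND; inside each space-separated group, pipe means OR (assumed precedence)."""
    words = tokenize(text)
    return all(
        any(word_matches(alt, words) for alt in group.split("|") if alt)
        for group in query.split()
    )


if __name__ == "__main__":
    print(matches("sig", "SIGIR 2020"))              # prefix search
    print(matches("graph$", "computer graphics"))    # exact word search
    print(matches("graph|network", "neural networks"))  # boolean or
```

Under these assumptions, a query such as `reward problem` would match a title like "Reward Tampering Problems and Solutions in Reinforcement Learning", because both words match as prefixes, while `reward$ problem` would not match a title that only contains "rewards".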
Author search results
no matches
Venue search results
no matches
Publication search results
found 85 matches
- 2021
- Oren Neumann, Claudius Gros:
Investment vs. reward in a competitive knapsack problem. CoRR abs/2101.10964 (2021)
- 2020
- Ricardo Grunitzki, Gabriel de Oliveira Ramos:
On the Role of Reward Functions for Reinforcement Learning in the Traffic Assignment Problem. IJCNN 2020: 1-9
- Hassam Ullah Sheikh, Ladislau Bölöni:
Multi-Agent Reinforcement Learning for Problems with Combined Individual and Team Reward. IJCNN 2020: 1-8
- Erick Lara-Cárdenas, Arturo Silva-Gálvez, José Carlos Ortiz-Bayliss, Iván Amaya, Jorge M. Cruz-Duarte, Hugo Terashima-Marín:
Exploring Reward-based Hyper-heuristics for the Job-shop Scheduling Problem. SSCI 2020: 3133-3140
- Bradly C. Stadie, Lunjun Zhang, Jimmy Ba:
Learning Intrinsic Rewards as a Bi-Level Optimization Problem. UAI 2020: 111-120
- Hassam Ullah Sheikh, Ladislau Bölöni:
Multi-Agent Reinforcement Learning for Problems with Combined Individual and Team Reward. CoRR abs/2003.10598 (2020)
- Ran Tian, Liting Sun, Masayoshi Tomizuka:
Bounded Risk-Sensitive Markov Game and Its Inverse Reward Learning Problem. CoRR abs/2009.01495 (2020)
- 2019
- Didier Sornette, Spencer Wheatley, Peter Cauwels:
The Fair Reward Problem: The Illusion of Success and How to Solve It. Adv. Complex Syst. 22(3): 1950005 (2019)
- Miguel Martín, Antonio Jiménez-Martín, Alfonso Mateos:
A numerical analysis of allocation strategies for the multi-armed bandit problem under delayed rewards conditions in digital campaign management. Neurocomputing 363: 99-113 (2019)
- Zheming Zhang, Will N. Browne, Dale A. Carnegie:
XCS with Combined Reward Method (XCSCR) for Policy Search in Multistep Problems. CEC 2019: 2982-2989
- Jia-Chi Chen, Tao-Hsing Chang:
Modified PPO-RND Method for Solving Sparse Reward Problem in ViZDoom. CoG 2019: 1-4
- Miguel Martín, Antonio Jiménez-Martín, Alfonso Mateos:
The Multi-Armed Bandit Problem under Delayed Rewards Conditions in Digital Campaign Management. CoDIT 2019: 952-957
- Francesco Bianchi, Francesco Lo Presti:
A Comparison of Markov Reward Based Resource-Latency Aware Heuristics for the Virtual Network Embedding Problem. Systems Modeling: Methodologies and Tools 2019: 135-149
- Tom Everitt, Marcus Hutter:
Reward Tampering Problems and Solutions in Reinforcement Learning: A Causal Influence Diagram Perspective. CoRR abs/1908.04734 (2019)
- Dattaraj Rao:
Leveraging human Domain Knowledge to model an empirical Reward function for a Reinforcement Learning problem. CoRR abs/1909.07116 (2019)
- Guillaume Matheron, Nicolas Perrin, Olivier Sigaud:
The problem with DDPG: understanding failures in deterministic environments with sparse rewards. CoRR abs/1911.11679 (2019)
- 2018
- Meltem Sevgi:
Mechanistic models of reward based learning and decision making for clinically motivated problems. Technische Universität Ilmenau, Germany, 2018
- Chuen-Tsai Sun, Li-Xian Chen, Hsiu-Mei Chu:
Associations among scaffold presentation, reward mechanisms and problem-solving behaviors in game play. Comput. Educ. 119: 95-111 (2018)
- Miguel Martín, Antonio Jiménez-Martín, Alfonso Mateos:
Possibilistic reward methods for the multi-armed bandit problem. Neurocomputing 310: 201-212 (2018)
- Zohar M. A. Strinka, H. Edwin Romeijn:
Approximation Algorithms for a Class of Stochastic Selection Problems with Reward and Cost Considerations. Oper. Res. 66(3): 834-848 (2018)
- Ricardo Grunitzki, Bruno Castro da Silva, Ana L. C. Bazzan:
Towards Designing Optimal Reward Functions in Multi-Agent Reinforcement Learning Problems. IJCNN 2018: 1-8
- Thiago Henrique Freire de Oliveira, Adrião Duarte Dória Neto, Jorge Dantas de Melo:
Q-Learning with Dynamic Rewards Table Applied to the SONET/SDH Ring Problem. IJCNN 2018: 1-6
- Pankaj Trivedi, Arvind Singh:
Stochastic Multi-path Routing Problem with Non-stationary Rewards: Building PayU's Dynamic Routing. WWW (Companion Volume) 2018: 1707-1712
- Makoto Naruse, Nicolas Chauvet, David Jegouso, Benoit Boulanger, Hayato Saigo, Kazuya Okamura, Hirokazu Hori, Aurélien Drezet, Serge Huant, Guillaume Bachelier:
Entangled photons for competitive multi-armed bandit problem: achievement of maximum social reward, equality, and deception prevention. CoRR abs/1804.04316 (2018)
- 2017
- Francesco Bianchi, Francesco Lo Presti:
A Markov Reward based Resource-Latency Aware Heuristic for the Virtual Network Embedding Problem. SIGMETRICS Perform. Evaluation Rev. 44(4): 57-68 (2017)
- Rayna Dimitrova, Ivan Gavran, Rupak Majumdar, Vinayak S. Prabhu, Sadegh Esmaeil Zadeh Soudjani:
The Robot Routing Problem for Collecting Aggregate Stochastic Rewards. CONCUR 2017: 13:1-13:17
- Xiaoli Chen, Weijun Xu:
A Risk-Reward Model for On-line Financial Leasing Problem with an Interest Rate. FAW 2017: 68-78
- Miguel Martín, Antonio Jiménez-Martín, Alfonso Mateos:
The Possibilistic Reward Method and a Dynamic Extension for the Multi-armed Bandit Problem: A Numerical Study. ICORES 2017: 75-84
- Rayna Dimitrova, Ivan Gavran, Rupak Majumdar, Vinayak S. Prabhu, Sadegh Esmaeil Zadeh Soudjani:
The Robot Routing Problem for Collecting Aggregate Stochastic Rewards. CoRR abs/1704.05303 (2017)
- Matej Vecerík, Todd Hester, Jonathan Scholz, Fumin Wang, Olivier Pietquin, Bilal Piot, Nicolas Heess, Thomas Rothörl, Thomas Lampe, Martin A. Riedmiller:
Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards. CoRR abs/1707.08817 (2017)
skipping 55 more matches
retrieved on 2021-03-05 12:45 CET from data curated by the dblp team
all metadata released as open data under CC0 1.0 license