Search dblp
Full-text search
- case-insensitive prefix search: default
e.g., sig matches "SIGIR" as well as "signal"
- exact word search: append dollar sign ($) to word
e.g., graph$ matches "graph", but not "graphics"
- boolean and: separate words by space
e.g., codd model
- boolean or: connect words by pipe symbol (|)
e.g., graph|network
Update May 7, 2017: Please note that we had to disable the phrase search operator (.) and the boolean not operator (-) due to technical problems. For the time being, phrase search queries will yield regular prefix search results, and search terms preceded by a minus sign will be interpreted as regular (positive) search terms.
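The query rules listed above can be illustrated with a small matcher. This is only a sketch of the semantics as described on this page, not dblp's actual implementation. The function names (`tokenize`, `term_matches`, `query_matches`) are hypothetical, and the tokenization (lowercase alphanumeric words) is an assumption.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words.

    Assumption: the real search engine may tokenize differently.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def term_matches(term, words):
    """Match a single term against a list of words.

    A trailing '$' requests an exact word match; otherwise the term
    is treated as a case-insensitive prefix.
    """
    term = term.lower()
    if term.endswith("$"):
        return term[:-1] in words
    return any(word.startswith(term) for word in words)


def query_matches(query, text):
    """Space-separated groups are ANDed; '|' inside a group ORs alternatives."""
    words = tokenize(text)
    return all(
        any(term_matches(alt, words) for alt in group.split("|") if alt)
        for group in query.split()
    )


if __name__ == "__main__":
    print(query_matches("sig", "SIGIR Forum"))
    print(query_matches("graph$", "computer graphics"))
    print(query_matches("graph|network", "neural networks"))
```

The disabled phrase (.) and not (-) operators are deliberately left out of the sketch.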
Author search results
no matches
Venue search results
no matches
Publication search results
found 112 matches
- 2024
- Baturay Saglam, Furkan B. Mutlu, Dogan C. Cicek, Suleyman Serdar Kozat:
Parameter-Free Reduction of the Estimation Bias in Deep Reinforcement Learning for Deterministic Policy Gradients. Neural Process. Lett. 56(2): 80 (2024)
- Washim Uddin Mondal, Vaneet Aggarwal:
Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes. AISTATS 2024: 3097-3105
- Qinbo Bai, Washim Uddin Mondal, Vaneet Aggarwal:
Learning General Parameterized Policies for Infinite Horizon Average Reward Constrained MDPs via Primal-Dual Policy Gradient Algorithm. CoRR abs/2402.02042 (2024)
- Yudong Luo, Yangchen Pan, Han Wang, Philip H. S. Torr, Pascal Poupart:
A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization. CoRR abs/2403.11062 (2024)
- 2023
- Andrea Visentin, Steven D. Prestwich, Roberto Rossi, S. Armagan Tarim:
Stochastic dynamic programming heuristic for the (R,s,S) policy parameters computation. Comput. Oper. Res. 158: 106289 (2023)
- Chun Shen, Sheng Zhu, Shuai Han, Xiaoyu Gong, Shuai Lü:
Guided deterministic policy optimization with gradient-free policy parameters information. Expert Syst. Appl. 231: 120693 (2023)
- Sajjad Pakkhesal, Saeed Shamaghdari:
SOS-based policy iteration for H∞ control of polynomial systems with uncertain parameters. Int. J. Control 96(4): 1053-1066 (2023)
- Mahmoud Mohammad Rezapour Tabari, Reihaneh Safari:
Development of water re-allocation policy under uncertainty conditions in the inflow to reservoir and demands parameters: a case study of Karaj AmirKabir dam. Soft Comput. 27(10): 6521-6547 (2023)
- Sarath Pattathil, Kaiqing Zhang, Asuman E. Ozdaglar:
Symmetric (Optimistic) Natural Policy Gradient for Multi-Agent Learning with Parameter Convergence. AISTATS 2023: 5641-5685
- Haotian Fu, Jiayu Yao, Omer Gottesman, Finale Doshi-Velez, George Konidaris:
Performance Bounds for Model and Policy Transfer in Hidden-parameter MDPs. ICLR 2023
- Carlo Alfano, Rui Yuan, Patrick Rebeschini:
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence. NeurIPS 2023
- Deyue Li:
Policy Gradient Methods for Discrete Time Linear Quadratic Regulator With Random Parameters. CoRR abs/2303.16548 (2023)
- Jiacai Liu, Jinchi Chen, Ke Wei:
On the Linear Convergence of Policy Gradient under Hadamard Parameterization. CoRR abs/2305.19575 (2023)
- So Kuroki, Jiaxian Guo, Tatsuya Matsushima, Takuya Okubo, Masato Kobayashi, Yuya Ikeda, Ryosuke Takanami, Paul Yoo, Yutaka Matsuo, Yusuke Iwasawa:
GenORM: Generalizable One-shot Rope Manipulation with Parameter-Aware Policy. CoRR abs/2306.09872 (2023)
- So Kuroki, Jiaxian Guo, Tatsuya Matsushima, Takuya Okubo, Masato Kobayashi, Yuya Ikeda, Ryosuke Takanami, Paul Yoo, Yutaka Matsuo, Yusuke Iwasawa:
GenDOM: Generalizable One-shot Deformable Object Manipulation with Parameter-Aware Policy. CoRR abs/2309.09051 (2023)
- Lang Feng, Dong Xing, Junru Zhang, Gang Pan:
FP3O: Enabling Proximal Policy Optimization in Multi-Agent Cooperation with Parameter-Sharing Versatility. CoRR abs/2310.05053 (2023)
- Washim Uddin Mondal, Vaneet Aggarwal:
Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes. CoRR abs/2310.11677 (2023)
- 2022
- Beatriz Cabrero Daniel:
Automating crowd simulation: from parameter tuning to dynamic context-to-policy adaptation. Pompeu Fabra University, Spain, 2022
- Wayne W. Wakeland, Jack B. Homer:
Addressing Parameter Uncertainty in a Health Policy Simulation Model Using Monte Carlo Sensitivity Methods. Syst. 10(6): 225 (2022)
- Kaichen Zhou, Lanqing Hong, Shoukang Hu, Fengwei Zhou, Binxin Ru, Jiashi Feng, Zhenguo Li:
DHA: End-to-End Joint Optimization of Data Augmentation Policy, Hyper-parameter and Architecture. Trans. Mach. Learn. Res. 2022 (2022)
- Juan Sepúlveda, Alejandro Angulo, Fernando Mancilla-David, Alexandre Street:
Robust Co-Optimization of Droop and Affine Policy Parameters in Active Distribution Systems With High Penetration of Photovoltaic Generation. IEEE Trans. Smart Grid 13(6): 4355-4366 (2022)
- Katherine A. Flanigan, Jerome P. Lynch:
Optimal Event-Based Policy for Remote Parameter Estimation in Wireless Sensing Architectures Under Resource Constraints. IEEE Trans. Wirel. Commun. 21(7): 5293-5304 (2022)
- Wesley A. Suttle, Alec Koppel, Ji Liu:
Occupancy Information Ratio: Infinite-Horizon, Information-Directed, Parameterized Policy Search. CoRR abs/2201.08832 (2022)
- Stephen James, Pieter Abbeel:
Bingham Policy Parameterization for 3D Rotations in Reinforcement Learning. CoRR abs/2202.03957 (2022)
- Trang H. Tran, Lam M. Nguyen, Katya Scheinberg:
Finding Optimal Policy for Queueing Models: New Parameterization. CoRR abs/2206.10073 (2022)
- Sarath Pattathil, Kaiqing Zhang, Asuman E. Ozdaglar:
Symmetric (Optimistic) Natural Policy Gradient for Multi-agent Learning with Parameter Convergence. CoRR abs/2210.12812 (2022)
- 2021
- Firoozeh Haghighi, Bruno Castanier, Hasan Misaii:
Rolling horizon optimal maintenance policy for a system subject to shocks and degradation under uncertain parameters. Comput. Ind. Eng. 157: 107298 (2021)
- Benjamin Gravell, Karthik Ganapathy, Tyler H. Summers:
Policy Iteration for Linear Quadratic Games With Stochastic Parameters. IEEE Control. Syst. Lett. 5(1): 307-312 (2021)
- Andrea Visentin, Steve Prestwich, Roberto Rossi, S. Armagan Tarim:
Computing optimal (R, s, S) policy parameters by a hybrid of branch-and-bound and stochastic dynamic programming. Eur. J. Oper. Res. 294(1): 91-99 (2021)
- Shilei Li, Meng Li, Jiongming Su, Shaofei Chen, Zhimin Yuan, Qing Ye:
PP-PG: Combining Parameter Perturbation with Policy Gradient Methods for Effective and Efficient Explorations in Deep Reinforcement Learning. ACM Trans. Intell. Syst. Technol. 12(3): 35:1-35:21 (2021)
skipping 82 more matches
retrieved on 2024-05-12 12:11 CEST from data curated by the dblp team
all metadata released as open data under CC0 1.0 license