Name: dblp XML data dump
Creator: Schloss Dagstuhl - Leibniz Center for Informatics
Published: 1993
License: https://creativecommons.org/publicdomain/zero/1.0/
Keywords: dblp, XML, computer science, scholarly publications, metadata

Publication search results

found 112 matches

2024
Baturay Saglam, Furkan B. Mutlu, Dogan C. Cicek, Suleyman Serdar Kozat:
Parameter-Free Reduction of the Estimation Bias in Deep Reinforcement Learning for Deterministic Policy Gradients. Neural Process. Lett. 56(2): 80 (2024)
  persistent URL: https://dblp.org/rec/journals/npl/SaglamMCK24
Washim Uddin Mondal, Vaneet Aggarwal:
Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes. AISTATS 2024: 3097-3105
  persistent URL: https://dblp.org/rec/conf/aistats/MondalA24
Qinbo Bai, Washim Uddin Mondal, Vaneet Aggarwal:
Learning General Parameterized Policies for Infinite Horizon Average Reward Constrained MDPs via Primal-Dual Policy Gradient Algorithm. CoRR abs/2402.02042 (2024)
  persistent URL: https://dblp.org/rec/journals/corr/abs-2402-02042
Yudong Luo, Yangchen Pan, Han Wang, Philip H. S. Torr, Pascal Poupart:
A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization. CoRR abs/2403.11062 (2024)
  persistent URL: https://dblp.org/rec/journals/corr/abs-2403-11062
2023
Andrea Visentin, Steven D. Prestwich, Roberto Rossi, S. Armagan Tarim:
Stochastic dynamic programming heuristic for the (R,s,S) policy parameters computation. Comput. Oper. Res. 158: 106289 (2023)
  persistent URL: https://dblp.org/rec/journals/cor/VisentinPRT23
Chun Shen, Sheng Zhu, Shuai Han, Xiaoyu Gong, Shuai Lü:
Guided deterministic policy optimization with gradient-free policy parameters information. Expert Syst. Appl. 231: 120693 (2023)
  persistent URL: https://dblp.org/rec/journals/eswa/ShenZHGL23
Sajjad Pakkhesal, Saeed Shamaghdari:
SOS-based policy iteration for H_∞ control of polynomial systems with uncertain parameters. Int. J. Control 96(4): 1053-1066 (2023)
  persistent URL: https://dblp.org/rec/journals/ijcon/PakkhesalS23
Mahmoud Mohammad Rezapour Tabari, Reihaneh Safari:
Development of water re-allocation policy under uncertainty conditions in the inflow to reservoir and demands parameters: a case study of Karaj AmirKabir dam. Soft Comput. 27(10): 6521-6547 (2023)
  persistent URL: https://dblp.org/rec/journals/soco/TabariS23
Sarath Pattathil, Kaiqing Zhang, Asuman E. Ozdaglar:
Symmetric (Optimistic) Natural Policy Gradient for Multi-Agent Learning with Parameter Convergence. AISTATS 2023: 5641-5685
  persistent URL: https://dblp.org/rec/conf/aistats/PattathilZO23
Haotian Fu, Jiayu Yao, Omer Gottesman, Finale Doshi-Velez, George Konidaris:
Performance Bounds for Model and Policy Transfer in Hidden-parameter MDPs. ICLR 2023
  persistent URL: https://dblp.org/rec/conf/iclr/FuYGD023
Carlo Alfano, Rui Yuan, Patrick Rebeschini:
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence. NeurIPS 2023
  persistent URL: https://dblp.org/rec/conf/nips/AlfanoYR23
Deyue Li:
Policy Gradient Methods for Discrete Time Linear Quadratic Regulator With Random Parameters. CoRR abs/2303.16548 (2023)
  persistent URL: https://dblp.org/rec/journals/corr/abs-2303-16548
Jiacai Liu, Jinchi Chen, Ke Wei:
On the Linear Convergence of Policy Gradient under Hadamard Parameterization. CoRR abs/2305.19575 (2023)
  persistent URL: https://dblp.org/rec/journals/corr/abs-2305-19575
So Kuroki, Jiaxian Guo, Tatsuya Matsushima, Takuya Okubo, Masato Kobayashi, Yuya Ikeda, Ryosuke Takanami, Paul Yoo, Yutaka Matsuo, Yusuke Iwasawa:
GenORM: Generalizable One-shot Rope Manipulation with Parameter-Aware Policy. CoRR abs/2306.09872 (2023)
  persistent URL: https://dblp.org/rec/journals/corr/abs-2306-09872
So Kuroki, Jiaxian Guo, Tatsuya Matsushima, Takuya Okubo, Masato Kobayashi, Yuya Ikeda, Ryosuke Takanami, Paul Yoo, Yutaka Matsuo, Yusuke Iwasawa:
GenDOM: Generalizable One-shot Deformable Object Manipulation with Parameter-Aware Policy. CoRR abs/2309.09051 (2023)
  persistent URL: https://dblp.org/rec/journals/corr/abs-2309-09051
Lang Feng, Dong Xing, Junru Zhang, Gang Pan:
FP3O: Enabling Proximal Policy Optimization in Multi-Agent Cooperation with Parameter-Sharing Versatility. CoRR abs/2310.05053 (2023)
  persistent URL: https://dblp.org/rec/journals/corr/abs-2310-05053
Washim Uddin Mondal, Vaneet Aggarwal:
Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes. CoRR abs/2310.11677 (2023)
  persistent URL: https://dblp.org/rec/journals/corr/abs-2310-11677
2022
Beatriz Cabrero Daniel:
Automating crowd simulation: from parameter tuning to dynamic context-to-policy adaptation. Pompeu Fabra University, Spain, 2022
  persistent URL: https://dblp.org/rec/phd/es/Daniel22
Wayne W. Wakeland, Jack B. Homer:
Addressing Parameter Uncertainty in a Health Policy Simulation Model Using Monte Carlo Sensitivity Methods. Syst. 10(6): 225 (2022)
  persistent URL: https://dblp.org/rec/journals/systems/WakelandH22
Kaichen Zhou, Lanqing Hong, Shoukang Hu, Fengwei Zhou, Binxin Ru, Jiashi Feng, Zhenguo Li:
DHA: End-to-End Joint Optimization of Data Augmentation Policy, Hyper-parameter and Architecture. Trans. Mach. Learn. Res. 2022 (2022)
  persistent URL: https://dblp.org/rec/journals/tmlr/ZhouHHZRFL22
Juan Sepúlveda, Alejandro Angulo, Fernando Mancilla-David, Alexandre Street:
Robust Co-Optimization of Droop and Affine Policy Parameters in Active Distribution Systems With High Penetration of Photovoltaic Generation. IEEE Trans. Smart Grid 13(6): 4355-4366 (2022)
  persistent URL: https://dblp.org/rec/journals/tsg/SepulvedaAMS22
Katherine A. Flanigan, Jerome P. Lynch:
Optimal Event-Based Policy for Remote Parameter Estimation in Wireless Sensing Architectures Under Resource Constraints. IEEE Trans. Wirel. Commun. 21(7): 5293-5304 (2022)
  persistent URL: https://dblp.org/rec/journals/twc/FlaniganL22
Wesley A. Suttle, Alec Koppel, Ji Liu:
Occupancy Information Ratio: Infinite-Horizon, Information-Directed, Parameterized Policy Search. CoRR abs/2201.08832 (2022)
  persistent URL: https://dblp.org/rec/journals/corr/abs-2201-08832
Stephen James, Pieter Abbeel:
Bingham Policy Parameterization for 3D Rotations in Reinforcement Learning. CoRR abs/2202.03957 (2022)
  persistent URL: https://dblp.org/rec/journals/corr/abs-2202-03957
Trang H. Tran, Lam M. Nguyen, Katya Scheinberg:
Finding Optimal Policy for Queueing Models: New Parameterization. CoRR abs/2206.10073 (2022)
  persistent URL: https://dblp.org/rec/journals/corr/abs-2206-10073
Sarath Pattathil, Kaiqing Zhang, Asuman E. Ozdaglar:
Symmetric (Optimistic) Natural Policy Gradient for Multi-agent Learning with Parameter Convergence. CoRR abs/2210.12812 (2022)
  persistent URL: https://dblp.org/rec/journals/corr/abs-2210-12812
2021
Firoozeh Haghighi, Bruno Castanier, Hasan Misaii:
Rolling horizon optimal maintenance policy for a system subject to shocks and degradation under uncertain parameters. Comput. Ind. Eng. 157: 107298 (2021)
  persistent URL: https://dblp.org/rec/journals/candie/HaghighiCM21
Benjamin Gravell, Karthik Ganapathy, Tyler H. Summers:
Policy Iteration for Linear Quadratic Games With Stochastic Parameters. IEEE Control. Syst. Lett. 5(1): 307-312 (2021)
  persistent URL: https://dblp.org/rec/journals/csysl/GravellGS21
Andrea Visentin, Steve Prestwich, Roberto Rossi, S. Armagan Tarim:
Computing optimal (R, s, S) policy parameters by a hybrid of branch-and-bound and stochastic dynamic programming. Eur. J. Oper. Res. 294(1): 91-99 (2021)
  persistent URL: https://dblp.org/rec/journals/eor/VisentinPRT21
Shilei Li, Meng Li, Jiongming Su, Shaofei Chen, Zhimin Yuan, Qing Ye:
PP-PG: Combining Parameter Perturbation with Policy Gradient Methods for Effective and Efficient Explorations in Deep Reinforcement Learning. ACM Trans. Intell. Syst. Technol. 12(3): 35:1-35:21 (2021)
  persistent URL: https://dblp.org/rec/journals/tist/LiLSCYY21

skipping 82 more matches
