


ADPRL 2009: Nashville, TN, USA
- IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009, Nashville, TN, USA, March 31 - April 1, 2009. IEEE 2009, ISBN 978-1-4244-2761-1

Keynote Lecture
- Dimitri P. Bertsekas: A unified framework for temporal difference methods. 1-7
Adaptive Dynamic Programming and Reinforcement Learning
- Hirotaka Hachiya, Takayuki Akiyama, Masashi Sugiyama, Jan Peters: Efficient data reuse in value function approximation. 8-15
- Lili Cui, Huaguang Zhang, Derong Liu, Yongsu Kim: Constrained optimal control of affine nonlinear discrete-time systems using GHJB method. 16-21
- Xuerui Bai, Dongbin Zhao, Jianqiang Yi: ADHDP(λ) strategies based coordinated ramps metering with queuing consideration. 22-27
ADP and RL for Controls
- Hongwei Zhang, Jie Huang, Frank L. Lewis: Algorithm and stability of ATC receding horizon control. 28-35
- Kyriakos G. Vamvoudakis, Draguna L. Vrabie, Frank L. Lewis: Online policy iteration based algorithms to solve the continuous-time infinite horizon optimal control problem. 36-41
- Dongsung Huh, Emanuel Todorov: Real-time motor control using recurrent neural networks. 42-49
- Dan Liu, Emanuel Todorov: Hierarchical optimal control of a 7-DOF arm model. 50-57
- Tom Erez, William D. Smart: Coupling perception and action using minimax optimal control. 58-65
Markov Decision Processes
- Jun Ma, Warren B. Powell: A convergent recursive least squares approximate policy iteration algorithm for multi-dimensional Markov decision process with continuous state and action spaces. 66-73
- Huizhen Yu, Dimitri P. Bertsekas: Basis function adaptation methods for cost approximation in MDP. 74-81
- Elva Corona-Xelhuantzi, Eduardo F. Morales, Luis Enrique Sucar: Executing concurrent actions with multiple Markov decision processes. 82-89
- Emanuel Todorov, Yuval Tassa: Iterative local dynamic programming. 90-95
- Eugene A. Feinberg: Adaptive computation of optimal nonrandomized policies in constrained average-reward MDPs. 96-100
Architecture of ADP and RL
- Marco A. Wiering, Hado van Hasselt: The QV family compared to other reinforcement learning algorithms. 101-108
- Philippe Preux, Sertan Girgin, Manuel Loth: Feature discovery in approximate dynamic programming. 109-116
- Raphaël Fonteneau, Susan A. Murphy, Louis Wehenkel, Damien Ernst: Inferring bounds on the performance of a control policy from a sample of trajectories. 117-123
- Xin Zhang, Huaguang Zhang, Derong Liu, Yongsu Kim: Neural-network-based reinforcement learning controller for nonlinear systems with non-symmetric dead-zone inputs. 124-129
- Yogesh P. Awate: Algorithms for variance reduction in a policy-gradient based actor-critic framework. 130-136
Policy Search in ADP and RL
- Ilya O. Ryzhov, Warren B. Powell: The knowledge gradient algorithm for online subset selection. 137-144
- Boris Defourny, Damien Ernst, Louis Wehenkel: Planning under uncertainty, ensembles of disturbance trees and kernelized discrete action spaces. 145-152
- Lucian Busoniu, Damien Ernst, Bart De Schutter, Robert Babuska: Policy search with cross-entropy optimization of basis functions. 153-160
- Emanuel Todorov: Eigenfunction approximation methods for linearly-solvable optimal control problems. 161-168
- Jason Pazis, Michail G. Lagoudakis: Learning continuous-action control policies. 169-176
Statistical and Multiagent RL
- Harm van Seijen, Hado van Hasselt, Shimon Whiteson, Marco A. Wiering: A theoretical and empirical analysis of Expected Sarsa. 177-184
- Matthieu Geist, Olivier Pietquin, Gabriel Fricout: Kalman Temporal Differences: The deterministic case. 185-192
- Willi Richert, Ulrich Scheller, Markus Koch, Bernd Kleinjohann, Claudius Stern: Integrating sporadic imitation in Reinforcement Learning robots. 193-198
- Roman V. Belavkin: Bounds of optimal learning. 199-204
- Ali Akramizadeh, Mohammad B. Menhaj, Ahmad Afshar: Multiagent reinforcement learning in extensive form games with complete information. 205-211
Applications of ADP and RL
- C. Alexander Simpkins, Emanuel Todorov: Practical numerical methods for stochastic optimal control of biological systems in continuous time and space. 212-218
- Evangelos A. Theodorou, Jonas Buchli, Stefan Schaal: Path integral-based stochastic optimal control for rigid body dynamics. 219-225
- Jan Peters, Jens Kober: Using reward-weighted imitation for robot Reinforcement Learning. 226-232
- H. Daniel Patiño, Santiago Tosetti, Flavio Capraro: Adaptive Critic Designs-based autonomous unmanned vehicles navigation: Application to robotic farm vehicles. 233-237
- Xiaofeng Lin, Tangbo Liu, Shaojian Song, Chunning Song: Neuro-controller of cement rotary kiln temperature with adaptive critic designs. 238-242

