| 2011 |
| 26 |  | Alexandra Carpentier,
Alessandro Lazaric,
Mohammad Ghavamzadeh,
Rémi Munos,
Peter Auer:
Upper-Confidence-Bound Algorithms for Active Learning in Multi-armed Bandits.
ALT 2011: 189-203 |
| 25 |  | Matthew W. Hoffman,
Alessandro Lazaric,
Mohammad Ghavamzadeh,
Rémi Munos:
Regularized Least Squares Temporal Difference Learning with Nested ℓ2 and ℓ1 Penalization.
EWRL 2011: 102-114 |
| 24 |  | Victor Gabillon,
Alessandro Lazaric,
Mohammad Ghavamzadeh,
Bruno Scherrer:
Classification-based Policy Iteration with a Critic.
ICML 2011: 1049-1056 |
| 23 |  | Mohammad Ghavamzadeh,
Alessandro Lazaric,
Rémi Munos,
Matthew W. Hoffman:
Finite-Sample Analysis of Lasso-TD.
ICML 2011: 1177-1184 |
| 22 |  | Victor Gabillon,
Mohammad Ghavamzadeh,
Alessandro Lazaric,
Sébastien Bubeck:
Multi-Bandit Best Arm Identification.
NIPS 2011: 2222-2230 |
| 21 |  | Mohammad Gheshlaghi Azar,
Rémi Munos,
Mohammad Ghavamzadeh,
Hilbert J. Kappen:
Speedy Q-Learning.
NIPS 2011: 2411-2419 |
| 2010 |
| 20 |  | Alessandro Lazaric,
Mohammad Ghavamzadeh:
Bayesian Multi-Task Reinforcement Learning.
ICML 2010: 599-606 |
| 19 |  | Alessandro Lazaric,
Mohammad Ghavamzadeh,
Rémi Munos:
Analysis of a Classification-based Policy Iteration Algorithm.
ICML 2010: 607-614 |
| 18 |  | Alessandro Lazaric,
Mohammad Ghavamzadeh,
Rémi Munos:
Finite-Sample Analysis of LSTD.
ICML 2010: 615-622 |
| 17 |  | Mohammad Ghavamzadeh,
Alessandro Lazaric,
Odalric-Ambrym Maillard,
Rémi Munos:
LSTD with Random Projections.
NIPS 2010: 721-729 |
| 16 |  | Odalric-Ambrym Maillard,
Rémi Munos,
Alessandro Lazaric,
Mohammad Ghavamzadeh:
Finite-sample Analysis of Bellman Residual Minimization.
Journal of Machine Learning Research - Proceedings Track 13: 299-314 (2010) |
| 2009 |
| 15 |  | Shalabh Bhatnagar,
Richard S. Sutton,
Mohammad Ghavamzadeh,
Mark Lee:
Natural actor-critic algorithms.
Automatica 45(11): 2471-2482 (2009) |
| 2008 |
| 14 |  | Amir Massoud Farahmand,
Mohammad Ghavamzadeh,
Csaba Szepesvári,
Shie Mannor:
Regularized Fitted Q-Iteration: Application to Planning.
EWRL 2008: 55-68 |
| 13 |  | Amir Massoud Farahmand,
Mohammad Ghavamzadeh,
Csaba Szepesvári,
Shie Mannor:
Regularized Policy Iteration.
NIPS 2008: 441-448 |
| 2007 |
| 12 |  | Mohammad Ghavamzadeh,
Yaakov Engel:
Bayesian actor-critic algorithms.
ICML 2007: 297-304 |
| 11 |  | Shalabh Bhatnagar,
Richard S. Sutton,
Mohammad Ghavamzadeh,
Mark Lee:
Incremental Natural Actor-Critic Algorithms.
NIPS 2007 |
| 10 |  | Mohammad Ghavamzadeh,
Sridhar Mahadevan:
Hierarchical Average Reward Reinforcement Learning.
Journal of Machine Learning Research 8: 2629-2669 (2007) |
| 2006 |
| 9 |  | Mohammad Ghavamzadeh,
Yaakov Engel:
Bayesian Policy Gradient Algorithms.
NIPS 2006: 457-464 |
| 8 |  | Mohammad Ghavamzadeh,
Sridhar Mahadevan,
Rajbala Makar:
Hierarchical multi-agent reinforcement learning.
Autonomous Agents and Multi-Agent Systems 13(2): 197-229 (2006) |
| 2005 |
| 7 |  | Ion Muslea,
Virginia Dignum,
Daniel D. Corkill,
Catholijn M. Jonker,
Frank Dignum,
Silvia Coradeschi,
Alessandro Saffiotti,
Dan Fu,
Jeff Orkin,
William Cheetham,
Kai Goebel,
Piero P. Bonissone,
Leen-Kiat Soh,
Randolph M. Jones,
Robert E. Wray III,
Matthias Scheutz,
Daniela Pucci de Farias,
Shie Mannor,
Georgios Theocharous,
Doina Precup,
Bamshad Mobasher,
Sarabjot S. Anand,
Bettina Berendt,
Andreas Hotho,
Hans W. Guesgen,
Michael T. Rosenstein,
Mohammad Ghavamzadeh:
The Workshop Program at the Nineteenth National Conference on Artificial Intelligence.
AI Magazine 26(1): 103-108 (2005) |
| 2004 |
| 6 |  | Mohammad Ghavamzadeh,
Sridhar Mahadevan:
Learning to Communicate and Act Using Hierarchical Reinforcement Learning.
AAMAS 2004: 1114-1121 |
| 2003 |
| 5 |  | Mohammad Ghavamzadeh,
Sridhar Mahadevan:
Hierarchical Policy Gradient Algorithms.
ICML 2003: 226-233 |
| 2002 |
| 4 |  | Mohammad Ghavamzadeh,
Sridhar Mahadevan:
A multiagent reinforcement learning algorithm by dynamically merging Markov decision processes.
AAMAS 2002: 845-846 |
| 3 |  | Mohammad Ghavamzadeh,
Sridhar Mahadevan:
Hierarchically Optimal Average Reward Reinforcement Learning.
ICML 2002: 195-202 |
| 2001 |
| 2 |  | Rajbala Makar,
Sridhar Mahadevan,
Mohammad Ghavamzadeh:
Hierarchical multi-agent reinforcement learning.
Agents 2001: 246-253 |
| 1 |  | Mohammad Ghavamzadeh,
Sridhar Mahadevan:
Continuous-Time Hierarchical Reinforcement Learning.
ICML 2001: 186-193 |