| 2012 | ||
|---|---|---|
| 66 | Emilie Kaufmann, Nathaniel Korda, Rémi Munos: Thompson Sampling: An Optimal Finite Time Analysis. CoRR abs/1205.4217 (2012) | |
| 65 | Lucian Busoniu, Rémi Munos: Optimistic planning for Markov decision processes. Journal of Machine Learning Research - Proceedings Track 22: 182-189 (2012) | |
| 64 | Alexandra Carpentier, Rémi Munos: Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit. Journal of Machine Learning Research - Proceedings Track 22: 190-198 (2012) | |
| 2011 | ||
| 63 | Alexandra Carpentier, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, Peter Auer: Upper-Confidence-Bound Algorithms for Active Learning in Multi-armed Bandits. ALT 2011: 189-203 | |
| 62 | Matthew W. Hoffman, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos: Regularized Least Squares Temporal Difference Learning with Nested ℓ2 and ℓ1 Penalization. EWRL 2011: 102-114 | |
| 61 | Mohammad Ghavamzadeh, Alessandro Lazaric, Rémi Munos, Matthew W. Hoffman: Finite-Sample Analysis of Lasso-TD. ICML 2011: 1177-1184 | |
| 60 | Alexandra Carpentier, Rémi Munos: Finite Time Analysis of Stratified Sampling for Monte Carlo. NIPS 2011: 1278-1286 | |
| 59 | Alexandra Carpentier, Odalric-Ambrym Maillard, Rémi Munos: Sparse Recovery with Brownian Sensing. NIPS 2011: 1782-1790 | |
| 58 | Mohammad Gheshlaghi Azar, Rémi Munos, Mohammad Ghavamzadeh, Hilbert J. Kappen: Speedy Q-Learning. NIPS 2011: 2411-2419 | |
| 57 | Odalric-Ambrym Maillard, Rémi Munos, Daniil Ryabko: Selecting the State-Representation in Reinforcement Learning. NIPS 2011: 2627-2635 | |
| 56 | Rémi Munos: Optimistic Optimization of a Deterministic Function without the Knowledge of its Smoothness. NIPS 2011: 783-791 | |
| 55 | Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári: X-Armed Bandits. Journal of Machine Learning Research 12: 1655-1695 (2011) | |
| 54 | Odalric-Ambrym Maillard, Rémi Munos: Adaptive Bandits: Towards the best history-dependent strategy. Journal of Machine Learning Research - Proceedings Track 15: 570-578 (2011) | |
| 53 | Odalric-Ambrym Maillard, Rémi Munos, Gilles Stoltz: A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences. Journal of Machine Learning Research - Proceedings Track 19: 497-514 (2011) | |
| 52 | Sébastien Bubeck, Rémi Munos, Gilles Stoltz: Pure exploration in finitely-armed and continuous-armed bandits. Theor. Comput. Sci. 412(19): 1832-1852 (2011) | |
| 2010 | ||
| 51 | Jean-Yves Audibert, Sébastien Bubeck, Rémi Munos: Best Arm Identification in Multi-Armed Bandits. COLT 2010: 41-53 | |
| 50 | Sébastien Bubeck, Rémi Munos: Open Loop Optimistic Planning. COLT 2010: 477-489 | |
| 49 | Odalric-Ambrym Maillard, Rémi Munos: Online Learning in Adversarial Lipschitz Environments. ECML/PKDD (2) 2010: 305-320 | |
| 48 | Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos: Analysis of a Classification-based Policy Iteration Algorithm. ICML 2010: 607-614 | |
| 47 | Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos: Finite-Sample Analysis of LSTD. ICML 2010: 615-622 | |
| 46 | Odalric-Ambrym Maillard, Rémi Munos: Scrambled Objects for Least-Squares Regression. NIPS 2010: 1549-1557 | |
| 45 | Amir Massoud Farahmand, Rémi Munos, Csaba Szepesvári: Error Propagation for Approximate Policy and Value Iteration. NIPS 2010: 568-576 | |
| 44 | Mohammad Ghavamzadeh, Alessandro Lazaric, Odalric-Ambrym Maillard, Rémi Munos: LSTD with Random Projections. NIPS 2010: 721-729 | |
| 43 | Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári: X-Armed Bandits. CoRR abs/1001.4475 (2010) | |
| 42 | Odalric-Ambrym Maillard, Rémi Munos, Alessandro Lazaric, Mohammad Ghavamzadeh: Finite-sample Analysis of Bellman Residual Minimization. Journal of Machine Learning Research - Proceedings Track 13: 299-314 (2010) | |
| 2009 | ||
| 41 | Sébastien Bubeck, Rémi Munos, Gilles Stoltz: Pure Exploration in Multi-armed Bandits Problems. ALT 2009: 23-37 | |
| 40 | Alessandro Lazaric, Rémi Munos: Hybrid Stochastic-Adversarial On-line Learning. COLT 2009 | |
| 39 | Jean-Yves Audibert, Peter Auer, Alessandro Lazaric, Rémi Munos, Daniil Ryabko, Csaba Szepesvári: Workshop summary: On-line learning with limited feedback. ICML 2009: 168 | |
| 38 | Odalric-Ambrym Maillard, Rémi Munos: Compressed Least-Squares Regression. NIPS 2009: 1213-1221 | |
| 37 | Pierre-Arnaud Coquelin, Romain Deguest, Rémi Munos: Sensitivity analysis in HMMs with application to likelihood maximization. NIPS 2009: 387-395 | |
| 36 | Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári: Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Theor. Comput. Sci. 410(19): 1876-1902 (2009) | |
| 2008 | ||
| 35 | Sertan Girgin, Manuel Loth, Rémi Munos, Philippe Preux, Daniil Ryabko: Recent Advances in Reinforcement Learning, 8th European Workshop, EWRL 2008, Villeneuve d'Ascq, France, June 30 - July 3, 2008, Revised and Selected Papers. Springer 2008 | |
| 34 | Raphaël Maîtrepierre, Jérémie Mary, Rémi Munos: Adaptive play in Texas Hold'em Poker. ECAI 2008: 458-462 | |
| 33 | Jean-François Hren, Rémi Munos: Optimistic Planning of Deterministic Systems. EWRL 2008: 151-164 | |
| 32 | Yizao Wang, Jean-Yves Audibert, Rémi Munos: Algorithms for Infinitely Many-Armed Bandits. NIPS 2008: 1729-1736 | |
| 31 | Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári: Online Optimization in X-Armed Bandits. NIPS 2008: 201-208 | |
| 30 | Pierre-Arnaud Coquelin, Romain Deguest, Rémi Munos: Particle Filter-based Policy Gradient in POMDPs. NIPS 2008: 337-344 | |
| 29 | Sébastien Bubeck, Rémi Munos, Gilles Stoltz: Pure Exploration for Multi-Armed Bandit Problems. CoRR abs/0802.2655 (2008) | |
| 28 | Rémi Munos, Csaba Szepesvári: Finite-Time Bounds for Fitted Value Iteration. Journal of Machine Learning Research 9: 815-857 (2008) | |
| 27 | András Antos, Csaba Szepesvári, Rémi Munos: Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning 71(1): 89-129 (2008) | |
| 2007 | ||
| 26 | Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári: Tuning Bandit Algorithms in Stochastic Environments. ALT 2007: 150-165 | |
| 25 | András Antos, Rémi Munos, Csaba Szepesvári: Fitted Q-iteration in continuous action-space MDPs. NIPS 2007 | |
| 24 | Pierre-Arnaud Coquelin, Rémi Munos: Bandit Algorithms for Tree Search. UAI 2007: 67-74 | |
| 23 | Pierre-Arnaud Coquelin, Rémi Munos: Bandit Algorithms for Tree Search. CoRR abs/cs/0703062 (2007) | |
| 22 | Rémi Munos: Analyse en norme Lp de l'algorithme d'itérations sur les valeurs avec approximations. Revue d'Intelligence Artificielle 21(1): 53-74 (2007) | |
| 21 | Rémi Munos: Performance Bounds in Lp-norm for Approximate Value Iteration. SIAM J. Control and Optimization 46(2): 541-561 (2007) | |
| 2006 | ||
| 20 | András Antos, Csaba Szepesvári, Rémi Munos: Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path. COLT 2006: 574-588 | |
| 19 | Rémi Munos: Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation. Journal of Machine Learning Research 7: 413-427 (2006) | |
| 18 | Rémi Munos: Policy Gradient in Continuous Time. Journal of Machine Learning Research 7: 771-791 (2006) | |
| 2005 | ||
| 17 | Rémi Munos: Error Bounds for Approximate Value Iteration. AAAI 2005: 1006-1011 | |
| 16 | Rémi Munos: Geometric Variance Reduction in Markov Chains. Application to Value Function and Gradient Estimation. AAAI 2005: 1012-1017 | |
| 15 | Rémi Munos: Policy gradient in continuous time. CAP 2005: 201-216 | |
| 14 | Csaba Szepesvári, Rémi Munos: Finite time bounds for sampling based fitted value iteration. ICML 2005: 880-887 | |
| 13 | Emmanuel Gobet, Rémi Munos: Sensitivity Analysis Using Itô-Malliavin Calculus and Martingales, and Application to Stochastic Optimal Control. SIAM J. Control and Optimization 43(5): 1676-1713 (2005) | |
| 2003 | ||
| 12 | Rémi Munos: Error Bounds for Approximate Policy Iteration. ICML 2003: 560-567 | |
| 2002 | ||
| 11 | Rémi Munos, Andrew W. Moore: Variable Resolution Discretization in Optimal Control. Machine Learning 49(2-3): 291-323 (2002) | |
| 2001 | ||
| 10 | Rémi Munos: Efficient Resources Allocation for Markov Decision Processes. NIPS 2001: 1571-1578 | |
| 2000 | ||
| 9 | Rémi Munos, Andrew W. Moore: Rates of Convergence for Variable Resolution Schemes in Optimal Control. ICML 2000: 647-654 | |
| 8 | Rémi Munos: A Study of Reinforcement Learning in the Continuous Case by the Means of Viscosity Solutions. Machine Learning 40(3): 265-299 (2000) | |
| 1999 | ||
| 7 | Rémi Munos, Andrew W. Moore: Variable Resolution Discretization for High-Accuracy Solutions of Optimal Control Problems. IJCAI 1999: 1348-1355 | |
| 1998 | ||
| 6 | Rémi Munos: A General Convergence Method for Reinforcement Learning in the Continuous Case. ECML 1998: 394-405 | |
| 5 | Rémi Munos, Andrew W. Moore: Barycentric Interpolators for Continuous Space and Time Reinforcement Learning. NIPS 1998: 1024-1030 | |
| 1997 | ||
| 4 | Rémi Munos: Finite-Element Methods with Local Triangulation Refinement for Continuous Reinforcement Learning Problems. ECML 1997: 170-182 | |
| 3 | Rémi Munos: A Convergent Reinforcement Learning Algorithm in the Continuous Case Based on a Finite Difference Method. IJCAI (2) 1997: 826-831 | |
| 2 | Rémi Munos, Paul Bourgine: Reinforcement Learning for Continuous Stochastic Control Problems. NIPS 1997 | |
| 1996 | ||
| 1 | Rémi Munos: A Convergent Reinforcement Learning Algorithm in the Continuous Case: The Finite-Element Reinforcement Learning. ICML 1996: 337-345 | |
Last update Sun Jun 3 16:06:10 2012 CET by the DBLP Team.
Data released under the ODC-BY 1.0 license.