Matthieu Geist
Journal Articles
- 2024
- [j17]Eduardo Pignatelli, Johan Ferret, Matthieu Geist, Thomas Mesnard, Hado van Hasselt, Laura Toni:
A Survey of Temporal Credit Assignment in Deep Reinforcement Learning. Trans. Mach. Learn. Res. 2024 (2024)
- 2021
- [j16]Antoine Mahé, Antoine Richard, Stéphanie Aravecchia, Matthieu Geist, Cédric Pradalier:
Evaluation of Prioritized Deep System Identification on a Path Following Task. J. Intell. Robotic Syst. 101(4): 78 (2021)
- [j15]Othmane-Latif Ouabi, Pascal Pomarede, Matthieu Geist, Nico F. Declercq, Cédric Pradalier:
A FastSLAM Approach Integrating Beamforming Maps for Ultrasound-Based Robotic Inspection of Metal Structures. IEEE Robotics Autom. Lett. 6(2): 2908-2913 (2021)
- [j14]Antoine Richard, Stéphanie Aravecchia, Thomas Schillaci, Matthieu Geist, Cédric Pradalier:
How to Train Your HERON. IEEE Robotics Autom. Lett. 6(3): 5247-5252 (2021)
- 2019
- [j13]Daoming Lyu, Bo Liu, Matthieu Geist, Wen Dong, Saad Biaz, Qi Wang:
Stable and Efficient Policy Evaluation. IEEE Trans. Neural Networks Learn. Syst. 30(6): 1831-1840 (2019)
- 2017
- [j12]Bilal Piot, Matthieu Geist, Olivier Pietquin:
Bridging the Gap Between Imitation Learning and Inverse Reinforcement Learning. IEEE Trans. Neural Networks Learn. Syst. 28(8): 1814-1826 (2017)
- 2015
- [j11]Bruno Scherrer, Mohammad Ghavamzadeh, Victor Gabillon, Boris Lesner, Matthieu Geist:
Approximate modified policy iteration and its application to the game of Tetris. J. Mach. Learn. Res. 16: 1629-1676 (2015)
- [j10]Matthieu Geist:
Soft-max boosting. Mach. Learn. 100(2-3): 305-332 (2015)
- [j9]Bruno Scherrer, Matthieu Geist:
Recherche locale de politique dans un espace convexe [Local policy search in a convex space]. Rev. d'Intelligence Artif. 29(6): 685-704 (2015)
- 2014
- [j8]Matthieu Geist, Bruno Scherrer:
Off-policy learning with eligibility traces: a survey. J. Mach. Learn. Res. 15(1): 289-333 (2014)
- 2013
- [j7]Hervé Frezza-Buet, Matthieu Geist:
A C++ template-based reinforcement learning library: fitting the code to the mathematics. J. Mach. Learn. Res. 14(1): 625-628 (2013)
- [j6]Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin:
Classification structurée pour l'apprentissage par renforcement inverse [Structured classification for inverse reinforcement learning]. Rev. d'Intelligence Artif. 27(2): 155-169 (2013)
- [j5]Matthieu Geist, Olivier Pietquin:
Algorithmic Survey of Parametric Value Function Approximation. IEEE Trans. Neural Networks Learn. Syst. 24(6): 845-867 (2013)
- 2012
- [j4]Lucie Daubigney, Matthieu Geist, Senthilkumar Chandramohan, Olivier Pietquin:
A Comprehensive Reinforcement Learning Framework for Dialogue Management Optimization. IEEE J. Sel. Top. Signal Process. 6(8): 891-902 (2012)
- 2011
- [j3]Olivier Pietquin, Matthieu Geist, Senthilkumar Chandramohan, Hervé Frezza-Buet:
Sample-efficient batch reinforcement learning for dialogue management optimization. ACM Trans. Speech Lang. Process. 7(3): 7:1-7:21 (2011)
- 2010
- [j2]Matthieu Geist, Olivier Pietquin:
Kalman Temporal Differences. J. Artif. Intell. Res. 39: 483-532 (2010)
- [j1]Matthieu Geist, Olivier Pietquin, Gabriel Fricout:
Différences temporelles de Kalman. Cas déterministe [Kalman temporal differences: the deterministic case]. Rev. d'Intelligence Artif. 24(4): 423-443 (2010)
Conference and Workshop Papers
- 2024
- [c112]Kai Cui, Gökçe Dayanikli, Mathieu Laurière, Matthieu Geist, Olivier Pietquin, Heinz Koeppl:
Learning Discrete-Time Major-Minor Mean Field Games. AAAI 2024: 9616-9625
- [c111]Zida Wu, Mathieu Laurière, Samuel Jia Cong Chua, Matthieu Geist, Olivier Pietquin, Ankur Mehta:
Population-aware Online Mirror Descent for Mean-Field Games by Deep Reinforcement Learning. AAMAS 2024: 2561-2563
- [c110]Rishabh Agarwal, Nino Vieillard, Yongchao Zhou, Piotr Stanczyk, Sabela Ramos Garea, Matthieu Geist, Olivier Bachem:
On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes. ICLR 2024
- [c109]Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach:
Closing the Gap between TD Learning and Supervised Learning - A Generalisation Point of View. ICLR 2024
- [c108]Geoffrey Cideron, Sertan Girgin, Mauro Verzetti, Damien Vincent, Matej Kastelic, Zalán Borsos, Brian McWilliams, Victor Ungureanu, Olivier Bachem, Olivier Pietquin, Matthieu Geist, Léonard Hussenot, Neil Zeghidour, Andrea Agostinelli:
MusicRL: Aligning Music Generation to Human Preferences. ICML 2024
- [c107]Rémi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar, Mark Rowland, Zhaohan Daniel Guo, Yunhao Tang, Matthieu Geist, Thomas Mesnard, Côme Fiegel, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J. Mankowitz, Doina Precup, Bilal Piot:
Nash Learning from Human Feedback. ICML 2024
- 2023
- [c106]Paul Roit, Johan Ferret, Lior Shani, Roee Aharoni, Geoffrey Cideron, Robert Dadashi, Matthieu Geist, Sertan Girgin, Léonard Hussenot, Orgad Keller, Nikola Momchev, Sabela Ramos Garea, Piotr Stanczyk, Nino Vieillard, Olivier Bachem, Gal Elidan, Avinatan Hassidim, Olivier Pietquin, Idan Szpektor:
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback. ACL (1) 2023: 6252-6272
- [c105]Divyansh Garg, Joey Hejna, Matthieu Geist, Stefano Ermon:
Extreme Q-Learning: MaxEnt RL without Entropy. ICLR 2023
- [c104]Benjamin Eysenbach, Matthieu Geist, Sergey Levine, Ruslan Salakhutdinov:
A Connection between One-Step RL and Critic Regularization in Reinforcement Learning. ICML 2023: 9485-9507
- [c103]Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo:
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. ICML 2023: 17135-17175
- [c102]Batuhan Yardim, Semih Cayci, Matthieu Geist, Niao He:
Policy Mirror Ascent for Efficient and Independent Learning in Mean Field Games. ICML 2023: 39722-39754
- [c101]Othmane-Latif Ouabi, Neil Zeghidour, Nico F. Declercq, Matthieu Geist, Cédric Pradalier:
Pose-graph SLAM Using Multi-order Ultrasonic Echoes and Beamforming for Long-range Inspection Robots. ICRA 2023: 10623-10629
- [c100]Navdeep Kumar, Esther Derman, Matthieu Geist, Kfir Y. Levy, Shie Mannor:
Policy Gradient for Rectangular Robust Markov Decision Processes. NeurIPS 2023
- [c99]Giorgia Ramponi, Pavel Kolev, Olivier Pietquin, Niao He, Mathieu Laurière, Matthieu Geist:
On Imitation in Mean-field Games. NeurIPS 2023
- [c98]Laixi Shi, Gen Li, Yuting Wei, Yuxin Chen, Matthieu Geist, Yuejie Chi:
The Curious Price of Distributional Robustness in Reinforcement Learning with a Generative Model. NeurIPS 2023
- [c97]Laixi Shi, Robert Dadashi, Yuejie Chi, Pablo Samuel Castro, Matthieu Geist:
Offline Reinforcement Learning with On-Policy Q-Function Regularization. ECML/PKDD (4) 2023: 455-471
- 2022
- [c96]Shideh Rezaeifar, Robert Dadashi, Nino Vieillard, Léonard Hussenot, Olivier Bachem, Olivier Pietquin, Matthieu Geist:
Offline Reinforcement Learning as Anti-exploration. AAAI 2022: 8106-8114
- [c95]Sarah Perrin, Mathieu Laurière, Julien Pérolat, Romuald Élie, Matthieu Geist, Olivier Pietquin:
Generalization in Mean Field Games by Learning Master Policies. AAAI 2022: 9413-9421
- [c94]Nino Vieillard, Marcin Andrychowicz, Anton Raichuk, Olivier Pietquin, Matthieu Geist:
Implicitly Regularized RL with Implicit Q-values. AISTATS 2022: 1380-1402
- [c93]Sharan Vaswani, Olivier Bachem, Simone Totaro, Robert Müller, Shivam Garg, Matthieu Geist, Marlos C. Machado, Pablo Samuel Castro, Nicolas Le Roux:
A general class of surrogate functions for stable and efficient reinforcement learning. AISTATS 2022: 8619-8649
- [c92]Matthieu Geist, Julien Pérolat, Mathieu Laurière, Romuald Elie, Sarah Perrin, Olivier Bachem, Rémi Munos, Olivier Pietquin:
Concave Utility Reinforcement Learning: The Mean-field Game Viewpoint. AAMAS 2022: 489-497
- [c91]Alexis Jacq, Johan Ferret, Olivier Pietquin, Matthieu Geist:
Lazy-MDPs: Towards Interpretable RL by Learning When to Act. AAMAS 2022: 669-677
- [c90]Julien Pérolat, Sarah Perrin, Romuald Elie, Mathieu Laurière, Georgios Piliouras, Matthieu Geist, Karl Tuyls, Olivier Pietquin:
Scaling Mean Field Games by Online Mirror Descent. AAMAS 2022: 1028-1037
- [c89]Othmane-Latif Ouabi, Jiawei Yi, Neil Zeghidour, Nico F. Declercq, Matthieu Geist, Cédric Pradalier:
Polygonal Shapes Reconstruction from Acoustic Echoes Using a Mobile Sensor and Beamforming. EUSIPCO 2022: 1507-1511
- [c88]Robert Dadashi, Léonard Hussenot, Damien Vincent, Sertan Girgin, Anton Raichuk, Matthieu Geist, Olivier Pietquin:
Continuous Control with Action Quantization from Demonstrations. ICML 2022: 4537-4557
- [c87]Thibault Lahire, Matthieu Geist, Emmanuel Rachelson:
Large Batch Experience Replay. ICML 2022: 11790-11813
- [c86]Mathieu Laurière, Sarah Perrin, Sertan Girgin, Paul Muller, Ayush Jain, Theophile Cabannes, Georgios Piliouras, Julien Pérolat, Romuald Elie, Olivier Pietquin, Matthieu Geist:
Scalable Deep Reinforcement Learning Algorithms for Mean Field Games. ICML 2022: 12078-12095
- [c85]Othmane-Latif Ouabi, Ayoub Ridani, Pascal Pomarede, Neil Zeghidour, Nico F. Declercq, Matthieu Geist, Cédric Pradalier:
Combined Grid and Feature-based Mapping of Metal Structures with Ultrasonic Guided Waves. ICRA 2022: 5056-5062
- [c84]Mathieu Blondel, Felipe Llinares-López, Robert Dadashi, Léonard Hussenot, Matthieu Geist:
Learning Energy Networks with Generalized Fenchel-Young Losses. NeurIPS 2022
- 2021
- [c83]Johan Ferret, Olivier Pietquin, Matthieu Geist:
Self-Imitation Advantage Learning. AAMAS 2021: 501-509
- [c82]Léonard Hussenot, Robert Dadashi, Matthieu Geist, Olivier Pietquin:
Show Me the Way: Intrinsic Motivation from Demonstrations. AAMAS 2021: 620-628
- [c81]Antoine Richard, Stéphanie Aravecchia, Matthieu Geist, Cédric Pradalier:
Learning Behaviors through Physics-driven Latent Imagination. CoRL 2021: 1190-1199
- [c80]Marcin Andrychowicz, Anton Raichuk, Piotr Stanczyk, Manu Orsini, Sertan Girgin, Raphaël Marinier, Léonard Hussenot, Matthieu Geist, Olivier Pietquin, Marcin Michalski, Sylvain Gelly, Olivier Bachem:
What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study. ICLR 2021
- [c79]Robert Dadashi, Léonard Hussenot, Matthieu Geist, Olivier Pietquin:
Primal Wasserstein Imitation Learning. ICLR 2021
- [c78]Yannis Flet-Berliac, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist:
Adversarially Guided Actor-Critic. ICLR 2021
- [c77]Robert Dadashi, Shideh Rezaeifar, Nino Vieillard, Léonard Hussenot, Olivier Pietquin, Matthieu Geist:
Offline Reinforcement Learning with Pseudometric Learning. ICML 2021: 2307-2318
- [c76]Léonard Hussenot, Marcin Andrychowicz, Damien Vincent, Robert Dadashi, Anton Raichuk, Sabela Ramos, Nikola Momchev, Sertan Girgin, Raphaël Marinier, Lukasz Stafiniak, Manu Orsini, Olivier Bachem, Matthieu Geist, Olivier Pietquin:
Hyperparameter Selection for Imitation Learning. ICML 2021: 4511-4522
- [c75]Sarah Perrin, Mathieu Laurière, Julien Pérolat, Matthieu Geist, Romuald Élie, Olivier Pietquin:
Mean Field Games Flock! The Reinforcement Learning Way. IJCAI 2021: 356-362
- [c74]Nathan Grinsztajn, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist:
There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning. NeurIPS 2021: 1898-1911
- [c73]Manu Orsini, Anton Raichuk, Léonard Hussenot, Damien Vincent, Robert Dadashi, Sertan Girgin, Matthieu Geist, Olivier Bachem, Olivier Pietquin, Marcin Andrychowicz:
What Matters for Adversarial Imitation Learning? NeurIPS 2021: 14656-14668
- [c72]Esther Derman, Matthieu Geist, Shie Mannor:
Twice regularized MDPs and the equivalence between robustness and regularization. NeurIPS 2021: 22274-22287
- 2020
- [c71]Nino Vieillard, Olivier Pietquin, Matthieu Geist:
Deep Conservative Policy Iteration. AAAI 2020: 6070-6077
- [c70]Romuald Elie, Julien Pérolat, Mathieu Laurière, Matthieu Geist, Olivier Pietquin:
On the Convergence of Model Free Learning in Mean Field Games. AAAI 2020: 7143-7150
- [c69]Alexis Jacq, Julien Pérolat, Matthieu Geist, Olivier Pietquin:
Foolproof Cooperative Learning. ACML 2020: 401-416
- [c68]Nino Vieillard, Bruno Scherrer, Olivier Pietquin, Matthieu Geist:
Momentum in Reinforcement Learning. AISTATS 2020: 2529-2538
- [c67]Léonard Hussenot, Matthieu Geist, Olivier Pietquin:
CopyCAT: Taking Control of Neural Policies with Constant Attacks. AAMAS 2020: 548-556
- [c66]Erinc Merdivan, Sten Hanke, Matthieu Geist:
Modified Actor-Critics. AAMAS 2020: 1925-1927
- [c65]Assia Benbihi, Stéphanie Arravechia, Matthieu Geist, Cédric Pradalier:
Image-Based Place Recognition on Bucolic Environment Across Seasons From Semantic Edge Description. ICRA 2020: 3032-3038
- [c64]Johan Ferret, Raphaël Marinier, Matthieu Geist, Olivier Pietquin:
Self-Attentional Credit Assignment for Transfer in Reinforcement Learning. IJCAI 2020: 2655-2661
- [c63]Othmane-Latif Ouabi, Pascal Pomarede, Matthieu Geist, Nico F. Declercq, Cédric Pradalier:
Monte-Carlo Localization on Metal Plates Based on Ultrasonic Guided Waves. ISER 2020: 345-353
- [c62]Sarah Perrin, Julien Pérolat, Mathieu Laurière, Matthieu Geist, Romuald Elie, Olivier Pietquin:
Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications. NeurIPS 2020
- [c61]Nino Vieillard, Tadashi Kozuno, Bruno Scherrer, Olivier Pietquin, Rémi Munos, Matthieu Geist:
Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning. NeurIPS 2020
- [c60]Nino Vieillard, Olivier Pietquin, Matthieu Geist:
Munchausen Reinforcement Learning. NeurIPS 2020
- [c59]Antoine Richard, Lior Fine, Offer Rozenstein, Josef Tanny, Matthieu Geist, Cédric Pradalier:
Filling Gaps in Micro-meteorological Data. ECML/PKDD (5) 2020: 101-117
- 2019
- [c58]Anush Manukyan, Miguel A. Olivares-Méndez, Matthieu Geist, Holger Voos:
Deep Reinforcement Learning-based Continuous Control for Multicopter Systems. CoDIT 2019: 1876-1881
- [c57]Antoine Mahé, Antoine Richard, Benjamin Mouscadet, Cédric Pradalier, Matthieu Geist:
Importance Sampling for Deep System Identification. ICAR 2019: 43-48
- [c56]Assia Benbihi, Matthieu Geist, Cédric Pradalier:
ELF: Embedded Localisation of Features in Pre-Trained CNN. ICCV 2019: 7939-7948
- [c55]Matthieu Geist, Bruno Scherrer, Olivier Pietquin:
A Theory of Regularized Markov Decision Processes. ICML 2019: 2160-2169
- [c54]Alexis Jacq, Matthieu Geist, Ana Paiva, Olivier Pietquin:
Learning from a Learner. ICML 2019: 2990-2999
- [c53]Assia Benbihi, Matthieu Geist, Cédric Pradalier:
Semi-supervised Domain Adaptation with Representation Learning for Semantic Segmentation Across Time. ICONIP (5) 2019: 459-466
- [c52]Assia Benbihi, Matthieu Geist, Cédric Pradalier:
Learning Sensor Placement from Demonstration for UAV networks. ISCC 2019: 1-6
- [c51]Erinc Merdivan, Anastasios Vafeiadis, Dimitrios Kalatzis, Sten Hanke, Joahannes Kroph, Konstantinos Votis, Dimitrios Giakoumis, Dimitrios Tzovaras, Liming Chen, Raouf Hamzaoui, Matthieu Geist:
Image-Based Text Classification using 2D Convolutional Neural Networks. SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI 2019: 144-149
- 2018
- [c50]Ismini Psychoula, Erinc Merdivan, Deepika Singh, Liming Chen, Feng Chen, Sten Hanke, Johannes Kropf, Andreas Holzinger, Matthieu Geist:
A Deep Learning Approach for Privacy Preservation in Assisted Living. PerCom Workshops 2018: 710-715
- 2017
- [c49]Deepika Singh, Erinc Merdivan, Ismini Psychoula, Johannes Kropf, Sten Hanke, Matthieu Geist, Andreas Holzinger:
Human Activity Recognition Using Recurrent Neural Networks. CD-MAKE 2017: 267-274
- [c48]Matthieu Geist, Bilal Piot, Olivier Pietquin:
Is the Bellman residual a bad proxy? NIPS 2017: 3205-3214
- [c47]Erinc Merdivan, Mohammad Reza Loghmani, Matthieu Geist:
Reconstruct & Crush Network. NIPS 2017: 4548-4556
- 2016
- [c46]Layla El Asri, Bilal Piot, Matthieu Geist, Romain Laroche, Olivier Pietquin:
Score-based Inverse Reinforcement Learning. AAMAS 2016: 457-465
- [c45]Julien Pérolat, Bilal Piot, Matthieu Geist, Bruno Scherrer, Olivier Pietquin:
Softened Approximate Policy Iteration for Markov Games. ICML 2016: 1860-1868
- 2015
- [c44]Deepika Singh, Erinc Merdivan, Sten Hanke, Johannes Kropf, Matthieu Geist, Andreas Holzinger:
Convolutional and Recurrent Neural Networks for Activity Recognition in Smart Environment. BIRS-IMLKE 2015: 194-205
- [c43]Bilal Piot, Olivier Pietquin, Matthieu Geist:
Imitation Learning Applied to Embodied Conversational Agents. MLIS@ICML 2015: 1-5
- [c42]Thibaut Munzer, Bilal Piot, Matthieu Geist, Olivier Pietquin, Manuel Lopes:
Inverse Reinforcement Learning in Relational Domains. IJCAI 2015: 3735-3741
- 2014
- [c41]Bilal Piot, Matthieu Geist, Olivier Pietquin:
Boosted and reward-regularized classification for apprenticeship learning. AAMAS 2014: 1249-1256
- [c40]Bilal Piot, Olivier Pietquin, Matthieu Geist:
Predicting when to laugh with structured classification. INTERSPEECH 2014: 1786-1790
- [c39]Bilal Piot, Matthieu Geist, Olivier Pietquin:
Difference of Convex Functions Programming for Reinforcement Learning. NIPS 2014: 2519-2527
- [c38]Bruno Scherrer, Matthieu Geist:
Local Policy Search in a Convex Space and Conservative Policy Iteration as Boosted Policy Search. ECML/PKDD (3) 2014: 35-50
- [c37]Bilal Piot, Matthieu Geist, Olivier Pietquin:
Boosted Bellman Residual Minimization Handling Expert Demonstrations. ECML/PKDD (2) 2014: 549-564
- 2013
- [c36]Radoslaw Niewiadomski, Jennifer Hofmann, Jérôme Urbain, Tracey Platt, Johannes Wagner, Bilal Piot, Hüseyin Çakmak, Sathish Pammi, Tobias Baur, Stéphane Dupont, Matthieu Geist, Florian Lingenfelser, Gary McKeown, Olivier Pietquin, Willibald Ruch:
Laugh-aware virtual agent and its impact on user amusement. AAMAS 2013: 619-626
- [c35]Lucie Daubigney, Matthieu Geist, Olivier Pietquin:
Random projections: A remedy for overfitting issues in time series prediction with echo state networks. ICASSP 2013: 3253-3257
- [c34]Lucie Daubigney, Matthieu Geist, Olivier Pietquin:
Particle swarm optimisation of spoken dialogue system strategies. INTERSPEECH 2013: 470-474
- [c33]Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin:
A Cascaded Supervised Learning Approach to Inverse Reinforcement Learning. ECML/PKDD (1) 2013: 1-16
- [c32]Bilal Piot, Matthieu Geist, Olivier Pietquin:
Learning from Demonstrations: Is It Worth Estimating a Reward Function? ECML/PKDD (1) 2013: 17-32
- [c31]Lucie Daubigney, Matthieu Geist, Olivier Pietquin:
Model-free POMDP optimisation of tutoring systems with echo-state networks. SIGDIAL Conference 2013: 102-106
- 2012
- [c30]Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin:
Behavior Specific User Simulation in Spoken Dialogue Systems. ITG Conference on Speech Communication 2012: 1-4
- [c29]Jérémy Fix, Matthieu Geist:
Monte-Carlo Swarm Policy Search. ICAISC (SIDE-EC) 2012: 75-83
- [c28]Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin:
Clustering behaviors of Spoken Dialogue Systems users. ICASSP 2012: 4981-4984
- [c27]Lucie Daubigney, Matthieu Geist, Olivier Pietquin:
Off-policy learning in large-scale POMDP-based dialogue systems. ICASSP 2012: 4989-4992
- [c26]Matthieu Geist, Bruno Scherrer, Alessandro Lazaric, Mohammad Ghavamzadeh:
A Dantzig Selector Approach to Temporal Difference Learning. ICML 2012
- [c25]Bruno Scherrer, Victor Gabillon, Mohammad Ghavamzadeh, Matthieu Geist:
Approximate Modified Policy Iteration. ICML 2012
- [c24]Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin:
Co-adaptation in Spoken Dialogue Systems. IWSDS 2012: 343-353
- [c23]Edouard Klein, Matthieu Geist, Bilal Piot, Olivier Pietquin:
Inverse Reinforcement Learning through Structured Classification. NIPS 2012: 1016-1024
- [c22]Lucie Daubigney, Matthieu Geist, Olivier Pietquin:
Optimisation d'un tuteur intelligent à partir d'un jeu de données fixé (Optimization of a tutoring system from a fixed set of data) [in French]. JEP-TALN-RECITAL 2012: 241-248
- 2011
- [c21]Matthieu Geist, Olivier Pietquin:
Parametric value function approximation: A unified view. ADPRL 2011: 9-16
- [c20]Jérémy Fix, Matthieu Geist, Olivier Pietquin, Hervé Frezza-Buet:
Dynamic neural field optimization using the unscented Kalman filter. CCMB 2011: 74-80
- [c19]Matthieu Geist, Bruno Scherrer:
ℓ1-Penalized Projected Bellman Residual. EWRL 2011: 89-101
- [c18]Bruno Scherrer, Matthieu Geist:
Recursive Least-Squares Learning with Eligibility Traces. EWRL 2011: 115-127
- [c17]Edouard Klein, Matthieu Geist, Olivier Pietquin:
Batch, Off-Policy and Model-Free Apprenticeship Learning. EWRL 2011: 285-296
- [c16]Remi Chou, Yvo Boers, Martin Podt, Matthieu Geist:
Performance evaluation for particle filters. FUSION 2011: 1-7
- [c15]Hadrien Glaude, Fadi Akrimi, Matthieu Geist, Olivier Pietquin:
A Non-parametric Approach to Approximate Dynamic Programming. ICMLA (1) 2011: 317-322
- [c14]Olivier Pietquin, Matthieu Geist, Senthilkumar Chandramohan:
Sample Efficient On-Line Learning of Optimal Dialogue Policies with Kalman Temporal Differences. IJCAI 2011: 1878-1883
- [c13]Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin:
User Simulation in Dialogue Systems Using Inverse Reinforcement Learning. INTERSPEECH 2011: 1025-1028
- [c12]Lucie Daubigney, Milica Gasic, Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin, Steve J. Young:
Uncertainty Management for On-Line Optimisation of a POMDP-Based Large-Scale Spoken Dialogue System. INTERSPEECH 2011: 1301-1304
- [c11]Olivier Pietquin, Lucie Daubigney, Matthieu Geist:
Optimization of a tutoring system from a fixed set of data. SLaTE 2011: 97-100
- [c10]Matthieu Geist, Olivier Pietquin:
Managing Uncertainty within KTD. Active Learning and Experimental Design @ AISTATS 2011: 157-168
- 2010
- [c9]Matthieu Geist, Olivier Pietquin:
Statistically linearized least-squares temporal differences. ICUMT 2010: 450-457
- [c8]Matthieu Geist, Olivier Pietquin:
Eligibility traces through colored noises. ICUMT 2010: 458-465
- [c7]Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin:
Optimizing spoken dialogue management with fitted value iteration. INTERSPEECH 2010: 86-89
- [c6]Matthieu Geist, Olivier Pietquin:
Revisiting Natural Actor-Critics with Value Function Approximation. MDAI 2010: 207-218
- [c5]Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin:
Sparse Approximate Dynamic Programming for Dialog Management. SIGDIAL Conference 2010: 107-115
- 2009
- [c4]Matthieu Geist, Olivier Pietquin, Gabriel Fricout:
Kalman Temporal Differences: The deterministic case. ADPRL 2009: 185-192
- [c3]Matthieu Geist, Olivier Pietquin, Gabriel Fricout:
Kernelizing Vector Quantization Algorithms. ESANN 2009
- [c2]Matthieu Geist, Olivier Pietquin, Gabriel Fricout:
Tracking in Reinforcement Learning. ICONIP (1) 2009: 502-511
- 2008
- [c1]Matthieu Geist, Olivier Pietquin, Gabriel Fricout:
Bayesian Reward Filtering. EWRL 2008: 96-109
Informal and Other Publications
- 2024
- [i82]Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach:
Closing the Gap between TD Learning and Supervised Learning - A Generalisation Point of View. CoRR abs/2401.11237 (2024)
- [i81]Geoffrey Cideron, Sertan Girgin, Mauro Verzetti, Damien Vincent, Matej Kastelic, Zalán Borsos, Brian McWilliams, Victor Ungureanu, Olivier Bachem, Olivier Pietquin, Matthieu Geist, Léonard Hussenot, Neil Zeghidour, Andrea Agostinelli:
MusicRL: Aligning Music Generation to Human Preferences. CoRR abs/2402.04229 (2024)
- [i80]Zida Wu, Mathieu Laurière, Samuel Jia Cong Chua, Matthieu Geist, Olivier Pietquin, Ankur Mehta:
Population-aware Online Mirror Descent for Mean-Field Games by Deep Reinforcement Learning. CoRR abs/2403.03552 (2024)
- [i79]Andrej Orsula, Matthieu Geist, Miguel A. Olivares-Méndez, Carol Chamorro-Martínez:
Leveraging Procedural Generation for Learning Autonomous Peg-in-Hole Assembly in Space. CoRR abs/2405.01134 (2024)
- [i78]Eugene Choi, Arash Ahmadian, Matthieu Geist, Olivier Pietquin, Mohammad Gheshlaghi Azar:
Self-Improving Robust Preference Optimization. CoRR abs/2406.01660 (2024)
- [i77]Pierre Clavier, Emmanuel Rachelson, Erwan Le Pennec, Matthieu Geist:
Bootstrapping Expectiles in Reinforcement Learning. CoRR abs/2406.04081 (2024)
- [i76]Adil Zouitine, David Bertoin, Pierre Clavier, Matthieu Geist, Emmanuel Rachelson:
Time-Constrained Robust MDPs. CoRR abs/2406.08395 (2024)
- [i75]Adil Zouitine, David Bertoin, Pierre Clavier, Matthieu Geist, Emmanuel Rachelson:
RRLS : Robust Reinforcement Learning Suite. CoRR abs/2406.08406 (2024)
- [i74]Yannis Flet-Berliac, Nathan Grinsztajn, Florian Strub, Eugene Choi, Chris Cremer, Arash Ahmadian, Yash Chandak, Mohammad Gheshlaghi Azar, Olivier Pietquin, Matthieu Geist:
Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion. CoRR abs/2406.19185 (2024)
- [i73]Nathan Grinsztajn, Yannis Flet-Berliac, Mohammad Gheshlaghi Azar, Florian Strub, Bill Wu, Eugene Choi, Chris Cremer, Arash Ahmadian, Yash Chandak, Olivier Pietquin, Matthieu Geist:
Averaging log-likelihoods in direct alignment. CoRR abs/2406.19188 (2024)
- [i72]Markus Wulfmeier, Michael Bloesch, Nino Vieillard, Arun Ahuja, Jorg Bornschein, Sandy H. Huang, Artem Sokolov, Matt Barnes, Guillaume Desjardins, Alex Bewley, Sarah Maria Elisabeth Bechtle, Jost Tobias Springenberg, Nikola Momchev, Olivier Bachem, Matthieu Geist, Martin A. Riedmiller:
Imitating Language via Scalable Inverse Reinforcement Learning. CoRR abs/2409.01369 (2024)
- 2023
- [i71]Divyansh Garg, Joey Hejna, Matthieu Geist, Stefano Ermon:
Extreme Q-Learning: MaxEnt RL without Entropy. CoRR abs/2301.02328 (2023)
- [i70]Navdeep Kumar, Esther Derman, Matthieu Geist, Kfir Levy, Shie Mannor:
Policy Gradient for s-Rectangular Robust Markov Decision Processes. CoRR abs/2301.13589 (2023)
- [i69]Pierre Clavier, Erwan Le Pennec, Matthieu Geist:
Towards Minimax Optimality of Model-based Robust Reinforcement Learning. CoRR abs/2302.05372 (2023)
- [i68]Esther Derman, Yevgeniy Men, Matthieu Geist, Shie Mannor:
Twice Regularized Markov Decision Processes: The Equivalence between Robustness and Regularization. CoRR abs/2303.06654 (2023)
- [i67]Geoffrey Cideron, Baruch Tabanpour, Sebastian Curi, Sertan Girgin, Léonard Hussenot, Gabriel Dulac-Arnold, Matthieu Geist, Olivier Pietquin, Robert Dadashi:
Get Back Here: Robust Imitation by Return-to-Distribution Planning. CoRR abs/2305.01400 (2023)
- [i66]Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári, Wataru Kumagai, Yutaka Matsuo:
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice. CoRR abs/2305.13185 (2023)
- [i65]Laixi Shi, Gen Li, Yuting Wei, Yuxin Chen, Matthieu Geist, Yuejie Chi:
The Curious Price of Distributional Robustness in Reinforcement Learning with a Generative Model. CoRR abs/2305.16589 (2023)
- [i64]Paul Roit, Johan Ferret, Lior Shani, Roee Aharoni, Geoffrey Cideron, Robert Dadashi, Matthieu Geist, Sertan Girgin, Léonard Hussenot, Orgad Keller, Nikola Momchev, Sabela Ramos, Piotr Stanczyk, Nino Vieillard, Olivier Bachem, Gal Elidan, Avinatan Hassidim, Olivier Pietquin, Idan Szpektor:
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback. CoRR abs/2306.00186 (2023)
- [i63]Rishabh Agarwal, Nino Vieillard, Piotr Stanczyk, Sabela Ramos, Matthieu Geist, Olivier Bachem:
GKD: Generalized Knowledge Distillation for Auto-regressive Sequence Models. CoRR abs/2306.13649 (2023)
- [i62]Giorgia Ramponi, Pavel Kolev, Olivier Pietquin, Niao He, Mathieu Laurière, Matthieu Geist:
On Imitation in Mean-field Games. CoRR abs/2306.14799 (2023)
- [i61]Benjamin Eysenbach, Matthieu Geist, Sergey Levine, Ruslan Salakhutdinov:
A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning. CoRR abs/2307.12968 (2023)
- [i60]Laixi Shi, Robert Dadashi, Yuejie Chi, Pablo Samuel Castro, Matthieu Geist:
Offline Reinforcement Learning with On-Policy Q-Function Regularization. CoRR abs/2307.13824 (2023)
- [i59]Matteo El Hariry, Antoine Richard, Vivek Muralidharan, Baris Can Yalçin, Matthieu Geist, Miguel A. Olivares-Méndez:
DRIFT: Deep Reinforcement Learning for Intelligent Floating Platforms Trajectories. CoRR abs/2310.04266 (2023)
- [i58]Rémi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar, Mark Rowland, Zhaohan Daniel Guo, Yunhao Tang, Matthieu Geist, Thomas Mesnard, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J. Mankowitz, Doina Precup, Bilal Piot:
Nash Learning from Human Feedback. CoRR abs/2312.00886 (2023)
- [i57]Eduardo Pignatelli, Johan Ferret, Matthieu Geist, Thomas Mesnard, Hado van Hasselt, Laura Toni:
A Survey of Temporal Credit Assignment in Deep Reinforcement Learning. CoRR abs/2312.01072 (2023)
- [i56]Kai Cui, Gökçe Dayanikli, Mathieu Laurière, Matthieu Geist, Olivier Pietquin, Heinz Koeppl:
Learning Discrete-Time Major-Minor Mean Field Games. CoRR abs/2312.10787 (2023)
- 2022
- [i55]Alexis Jacq, Johan Ferret, Olivier Pietquin, Matthieu Geist:
Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act. CoRR abs/2203.08542 (2022)
- [i54]Mathieu Laurière, Sarah Perrin, Sertan Girgin, Paul Muller, Ayush Jain, Theophile Cabannes, Georgios Piliouras, Julien Pérolat, Romuald Élie, Olivier Pietquin, Matthieu Geist:
Scalable Deep Reinforcement Learning Algorithms for Mean Field Games. CoRR abs/2203.11973 (2022)
- [i53]Mathieu Blondel, Felipe Llinares-López, Robert Dadashi, Léonard Hussenot, Matthieu Geist:
Learning Energy Networks with Generalized Fenchel-Young Losses. CoRR abs/2205.09589 (2022)
- [i52]Mathieu Laurière, Sarah Perrin, Matthieu Geist, Olivier Pietquin:
Learning Mean Field Games: A Survey. CoRR abs/2205.12944 (2022)
- [i51]Tadashi Kozuno, Wenhao Yang, Nino Vieillard, Toshinori Kitamura, Yunhao Tang, Jincheng Mei, Pierre Ménard, Mohammad Gheshlaghi Azar, Michal Valko, Rémi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvári:
KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal. CoRR abs/2205.14211 (2022)
- [i50]Paul Muller, Romuald Elie, Mark Rowland, Mathieu Laurière, Julien Pérolat, Sarah Perrin, Matthieu Geist, Georgios Piliouras, Olivier Pietquin, Karl Tuyls:
Learning Correlated Equilibria in Mean-Field Games. CoRR abs/2208.10138 (2022)
- [i49]Alexis Jacq, Manu Orsini, Gabriel Dulac-Arnold, Olivier Pietquin, Matthieu Geist, Olivier Bachem:
C3PO: Learning to Achieve Arbitrary Goals via Massively Entropic Pretraining. CoRR abs/2211.03521 (2022)
- [i48]Batuhan Yardim, Semih Cayci, Matthieu Geist, Niao He:
Policy Mirror Ascent for Efficient and Independent Learning in Mean Field Games. CoRR abs/2212.14449 (2022)
- 2021
- [i47]Yannis Flet-Berliac, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist:
Adversarially Guided Actor-Critic. CoRR abs/2102.04376 (2021)
- [i46]Antoine Richard, Stéphanie Aravecchia, Thomas Schillaci, Matthieu Geist, Cédric Pradalier:
How To Train Your HERON. CoRR abs/2102.10357 (2021)
- [i45]Julien Pérolat, Sarah Perrin, Romuald Elie, Mathieu Laurière, Georgios Piliouras, Matthieu Geist, Karl Tuyls, Olivier Pietquin:
Scaling up Mean Field Games with Online Mirror Descent. CoRR abs/2103.00623 (2021)
- [i44]Robert Dadashi, Shideh Rezaeifar, Nino Vieillard, Léonard Hussenot, Olivier Pietquin, Matthieu Geist:
Offline Reinforcement Learning with Pseudometric Learning. CoRR abs/2103.01948 (2021)
- [i43]Sarah Perrin, Mathieu Laurière, Julien Pérolat, Matthieu Geist, Romuald Élie, Olivier Pietquin:
Mean Field Games Flock! The Reinforcement Learning Way. CoRR abs/2105.07933 (2021)
- [i42]Léonard Hussenot, Marcin Andrychowicz, Damien Vincent, Robert Dadashi, Anton Raichuk, Lukasz Stafiniak, Sertan Girgin, Raphaël Marinier, Nikola Momchev, Sabela Ramos, Manu Orsini, Olivier Bachem, Matthieu Geist, Olivier Pietquin:
Hyperparameter Selection for Imitation Learning. CoRR abs/2105.12034 (2021)
- [i41]Manu Orsini, Anton Raichuk, Léonard Hussenot, Damien Vincent, Robert Dadashi, Sertan Girgin, Matthieu Geist, Olivier Bachem, Olivier Pietquin, Marcin Andrychowicz:
What Matters for Adversarial Imitation Learning? CoRR abs/2106.00672 (2021)
- [i40]Matthieu Geist, Julien Pérolat, Mathieu Laurière, Romuald Elie, Sarah Perrin, Olivier Bachem, Rémi Munos, Olivier Pietquin:
Concave Utility Reinforcement Learning: the Mean-field Game viewpoint. CoRR abs/2106.03787 (2021)
- [i39]Nathan Grinsztajn, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist:
There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning. CoRR abs/2106.04480 (2021)
- [i38]Shideh Rezaeifar, Robert Dadashi, Nino Vieillard, Léonard Hussenot, Olivier Bachem, Olivier Pietquin, Matthieu Geist:
Offline Reinforcement Learning as Anti-Exploration. CoRR abs/2106.06431 (2021)
- [i37]Sharan Vaswani, Olivier Bachem, Simone Totaro, Robert Mueller, Matthieu Geist, Marlos C. Machado, Pablo Samuel Castro, Nicolas Le Roux:
A functional mirror ascent view of policy gradient methods with function approximation. CoRR abs/2108.05828 (2021)
- [i36]Nino Vieillard, Marcin Andrychowicz, Anton Raichuk, Olivier Pietquin, Matthieu Geist:
Implicitly Regularized RL with Implicit Q-Values. CoRR abs/2108.07041 (2021)
- [i35]Sarah Perrin, Mathieu Laurière, Julien Pérolat, Romuald Élie, Matthieu Geist, Olivier Pietquin:
Generalization in Mean Field Games by Learning Master Policies. CoRR abs/2109.09717 (2021)
- [i34]Thibault Lahire, Matthieu Geist, Emmanuel Rachelson:
Large Batch Experience Replay. CoRR abs/2110.01528 (2021)
- [i33]Esther Derman, Matthieu Geist, Shie Mannor:
Twice regularized MDPs and the equivalence between robustness and regularization. CoRR abs/2110.06267 (2021)
- [i32]Robert Dadashi, Léonard Hussenot, Damien Vincent, Sertan Girgin, Anton Raichuk, Matthieu Geist, Olivier Pietquin:
Continuous Control with Action Quantization from Demonstrations. CoRR abs/2110.10149 (2021)
- 2020
- [i31]Nino Vieillard, Tadashi Kozuno, Bruno Scherrer, Olivier Pietquin, Rémi Munos, Matthieu Geist:
Leverage the Average: an Analysis of Regularization in RL. CoRR abs/2003.14089 (2020) - [i30]Daoming Lyu, Bo Liu, Matthieu Geist, Wen Dong, Saad Biaz, Qi Wang:
Stable and Efficient Policy Evaluation. CoRR abs/2006.03978 (2020) - [i29]Robert Dadashi, Léonard Hussenot, Matthieu Geist, Olivier Pietquin:
Primal Wasserstein Imitation Learning. CoRR abs/2006.04678 (2020) - [i28]Marcin Andrychowicz, Anton Raichuk, Piotr Stanczyk, Manu Orsini, Sertan Girgin, Raphaël Marinier, Léonard Hussenot, Matthieu Geist, Olivier Pietquin, Marcin Michalski, Sylvain Gelly, Olivier Bachem:
What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study. CoRR abs/2006.05990 (2020) - [i27]Léonard Hussenot, Robert Dadashi, Matthieu Geist, Olivier Pietquin:
Show me the Way: Intrinsic Motivation from Demonstrations. CoRR abs/2006.12917 (2020) - [i26]Sarah Perrin, Julien Pérolat, Mathieu Laurière, Matthieu Geist, Romuald Elie, Olivier Pietquin:
Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications. CoRR abs/2007.03458 (2020) - [i25]Nino Vieillard, Olivier Pietquin, Matthieu Geist:
Munchausen Reinforcement Learning. CoRR abs/2007.14430 (2020) - [i24]Johan Ferret, Olivier Pietquin, Matthieu Geist:
Self-Imitation Advantage Learning. CoRR abs/2012.11989 (2020) - 2019
- [i23]Matthieu Geist, Bruno Scherrer, Olivier Pietquin:
A Theory of Regularized Markov Decision Processes. CoRR abs/1901.11275 (2019) - [i22]Léonard Hussenot, Matthieu Geist, Olivier Pietquin:
Targeted Attacks on Deep Reinforcement Learning Agents through Adversarial Observations. CoRR abs/1905.12282 (2019) - [i21]Nino Vieillard, Olivier Pietquin, Matthieu Geist:
Deep Conservative Policy Iteration. CoRR abs/1906.09784 (2019) - [i20]Alexis Jacq, Julien Pérolat, Matthieu Geist, Olivier Pietquin:
Foolproof Cooperative Learning. CoRR abs/1906.09831 (2019) - [i19]Lucas Beyer, Damien Vincent, Olivier Teboul, Sylvain Gelly, Matthieu Geist, Olivier Pietquin:
MULEX: Disentangling Exploitation from Exploration in Deep RL. CoRR abs/1907.00868 (2019) - [i18]Erinc Merdivan, Sten Hanke, Matthieu Geist:
Modified Actor-Critics. CoRR abs/1907.01298 (2019) - [i17]Romuald Elie, Julien Pérolat, Mathieu Laurière, Matthieu Geist, Olivier Pietquin:
Approximate Fictitious Play for Mean Field Games. CoRR abs/1907.02633 (2019) - [i16]Assia Benbihi, Matthieu Geist, Cédric Pradalier:
ELF: Embedded Localisation of Features in pre-trained CNN. CoRR abs/1907.03261 (2019) - [i15]Johan Ferret, Raphaël Marinier, Matthieu Geist, Olivier Pietquin:
Credit Assignment as a Proxy for Transfer in Reinforcement Learning. CoRR abs/1907.08027 (2019) - [i14]Nino Vieillard, Olivier Pietquin, Matthieu Geist:
On Connections between Constrained Optimization and Reinforcement Learning. CoRR abs/1910.08476 (2019) - [i13]Nino Vieillard, Bruno Scherrer, Olivier Pietquin, Matthieu Geist:
Momentum in Reinforcement Learning. CoRR abs/1910.09322 (2019) - [i12]Assia Benbihi, Matthieu Geist, Cédric Pradalier:
Image-Based Place Recognition on Bucolic Environment Across Seasons From Semantic Edge Description. CoRR abs/1910.12468 (2019) - 2018
- [i11]Ismini Psychoula, Erinc Merdivan, Deepika Singh, Liming Chen, Feng Chen, Sten Hanke, Johannes Kropf, Andreas Holzinger, Matthieu Geist:
A Deep Learning Approach for Privacy Preservation in Assisted Living. CoRR abs/1802.09359 (2018) - [i10]Deepika Singh, Erinc Merdivan, Ismini Psychoula, Johannes Kropf, Sten Hanke, Matthieu Geist, Andreas Holzinger:
Human Activity Recognition using Recurrent Neural Networks. CoRR abs/1804.07144 (2018) - [i9]Assia Benbihi, Matthieu Geist, Cédric Pradalier:
Deep Representation Learning for Domain Adaptation of Semantic Image Segmentation. CoRR abs/1805.04141 (2018) - [i8]Matthieu Geist, Bruno Scherrer:
Anderson Acceleration for Reinforcement Learning. CoRR abs/1809.09501 (2018) - [i7]Erinc Merdivan, Anastasios Vafeiadis, Dimitrios Kalatzis, Sten Hanke, Johannes Kropf, Konstantinos Votis, Dimitrios Giakoumis, Dimitrios Tzovaras, Liming Chen, Raouf Hamzaoui, Matthieu Geist:
Image-based Natural Language Understanding Using 2D Convolutional Neural Networks. CoRR abs/1810.10401 (2018) - 2016
- [i6]Bilal Piot, Matthieu Geist, Olivier Pietquin:
Difference of Convex Functions Programming Applied to Control with Expert Data. CoRR abs/1606.01128 (2016) - [i5]Matthieu Geist, Bilal Piot, Olivier Pietquin:
Should one minimize the expected Bellman residual or maximize the mean value? CoRR abs/1606.07636 (2016) - 2014
- [i4]Matthieu Geist, Olivier Pietquin:
Kalman Temporal Differences. CoRR abs/1406.3270 (2014) - 2013
- [i3]Matthieu Geist, Bruno Scherrer:
Off-policy Learning with Eligibility Traces: A Survey. CoRR abs/1304.3999 (2013) - [i2]Bruno Scherrer, Matthieu Geist:
Policy Search: Any Local Optimum Enjoys a Global Performance Guarantee. CoRR abs/1306.1520 (2013) - 2012
- [i1]Bruno Scherrer, Victor Gabillon, Mohammad Ghavamzadeh, Matthieu Geist:
Approximate Modified Policy Iteration. CoRR abs/1205.3054 (2012)
last updated on 2024-10-16 21:22 CEST by the dblp team
all metadata released as open data under CC0 1.0 license