


Shie Mannor
2020 – today
- 2023
- [i185]Shie Mannor, Aviv Tamar:
Towards Deployable RL - What's Broken with RL Research and a Potential Fix. CoRR abs/2301.01320 (2023)
- 2022
- [j79]Chen Tessler, Yuval Shpigelman, Gal Dalal, Amit Mandelbaum, Doron Haritan Kazakov, Benjamin Fuhrer, Gal Chechik, Shie Mannor:
Reinforcement Learning for Datacenter Congestion Control. SIGMETRICS Perform. Evaluation Rev. 49(2): 43-46 (2022) - [c224]Lior Shani, Tom Zahavy, Shie Mannor:
Online Apprenticeship Learning. AAAI 2022: 8240-8248 - [c223]Roy Zohar, Shie Mannor, Guy Tennenholtz:
Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning. AAAI 2022: 9278-9285 - [c222]Chen Tessler, Yuval Shpigelman, Gal Dalal, Amit Mandelbaum, Doron Haritan Kazakov, Benjamin Fuhrer, Gal Chechik, Shie Mannor:
Reinforcement Learning for Datacenter Congestion Control. AAAI 2022: 12615-12621 - [c221]Shie Mannor:
Reinforcement Learning for Extended Intelligence. ICINCO 2022: 5 - [c220]Guy Tennenholtz, Assaf Hallak, Gal Dalal, Shie Mannor, Gal Chechik, Uri Shalit:
On Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning. ICLR 2022 - [c219]Shirli Di-Castro Shashua, Shie Mannor, Dotan Di Castro:
Analysis of Stochastic Processes through Replay Buffers. ICML 2022: 5039-5060 - [c218]Jeongyeol Kwon, Yonathan Efroni, Constantine Caramanis, Shie Mannor:
Coordinated Attacks against Contextual Bandits: Fundamental Limits and Defense Mechanisms. ICML 2022: 11772-11789 - [c217]Eli A. Meirom, Haggai Maron, Shie Mannor, Gal Chechik:
Optimizing Tensor Network Contraction Using Reinforcement Learning. ICML 2022: 15278-15292 - [c216]Kaixin Wang, Navdeep Kumar, Kuangqi Zhou, Bryan Hooi, Jiashi Feng, Shie Mannor:
The Geometry of Robust Value Functions. ICML 2022: 22727-22751 - [c215]Mohammadi Zaki, Avi Mohan, Aditya Gopalan, Shie Mannor:
Actor-Critic based Improper Reinforcement Learning. ICML 2022: 25867-25919 - [i184]Aviv Rosenberg, Assaf Hallak, Shie Mannor, Gal Chechik, Gal Dalal:
Planning and Learning with Adaptive Lookahead. CoRR abs/2201.12403 (2022) - [i183]Jeongyeol Kwon, Yonathan Efroni, Constantine Caramanis, Shie Mannor:
Coordinated Attacks against Contextual Bandits: Fundamental Limits and Defense Mechanisms. CoRR abs/2201.12700 (2022) - [i182]Kaixin Wang, Navdeep Kumar, Kuangqi Zhou, Bryan Hooi, Jiashi Feng, Shie Mannor:
The Geometry of Robust Value Functions. CoRR abs/2201.12929 (2022) - [i181]Stav Belogolovsky, Ido Greenberg, Danny Eitan, Shie Mannor:
Continuous Forecasting via Neural Eigen Decomposition of Stochastic Dynamics. CoRR abs/2202.00117 (2022) - [i180]Yuval Atzmon, Eli A. Meirom, Shie Mannor, Gal Chechik:
Learning to reason about and to act on physical cascading events. CoRR abs/2202.01108 (2022) - [i179]Binyamin Perets, Mark Kozdoba, Shie Mannor:
Whats Missing? Learning Hidden Markov Models When the Locations of Missing Observations are Unknown. CoRR abs/2203.06527 (2022) - [i178]Eli A. Meirom, Haggai Maron, Shie Mannor, Gal Chechik:
Optimizing Tensor Network Contraction Using Reinforcement Learning. CoRR abs/2204.09052 (2022) - [i177]Ido Greenberg, Yinlam Chow, Mohammad Ghavamzadeh, Shie Mannor:
Efficient Risk-Averse Reinforcement Learning. CoRR abs/2205.05138 (2022) - [i176]Navdeep Kumar, Kfir Levy, Kaixin Wang, Shie Mannor:
Efficient Policy Iteration for Robust Markov Decision Processes via Regularization. CoRR abs/2205.14327 (2022) - [i175]Guy Tennenholtz, Nadav Merlis, Lior Shani, Shie Mannor, Uri Shalit, Gal Chechik, Assaf Hallak, Gal Dalal:
Reinforcement Learning with a Terminator. CoRR abs/2205.15376 (2022) - [i174]Shirli Di-Castro Shashua, Shie Mannor, Dotan Di Castro:
Analysis of Stochastic Processes through Replay Buffers. CoRR abs/2206.12848 (2022) - [i173]Benjamin Fuhrer, Yuval Shpigelman, Chen Tessler, Shie Mannor, Gal Chechik, Eitan Zahavi, Gal Dalal:
Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs. CoRR abs/2207.02295 (2022) - [i172]Mohammadi Zaki, Avinash Mohan, Aditya Gopalan, Shie Mannor:
Actor-Critic based Improper Reinforcement Learning. CoRR abs/2207.09090 (2022) - [i171]Gal Dalal, Assaf Hallak, Shie Mannor, Gal Chechik:
SoftTreeMax: Policy Gradient with Tree Search. CoRR abs/2209.13966 (2022) - [i170]Navdeep Kumar, Kaixin Wang, Kfir Levy, Shie Mannor:
Policy Gradient for Reinforcement Learning with General Utilities. CoRR abs/2210.00991 (2022) - [i169]Jeongyeol Kwon, Yonathan Efroni, Constantine Caramanis, Shie Mannor:
Reward-Mixing MDPs with a Few Latent Contexts are Learnable. CoRR abs/2210.02594 (2022) - [i168]Jeongyeol Kwon, Yonathan Efroni, Constantine Caramanis, Shie Mannor:
Tractable Optimality in Episodic Latent MABs. CoRR abs/2210.03528 (2022) - [i167]Péter Karkus, Boris Ivanovic, Shie Mannor, Marco Pavone:
DiffStack: A Differentiable and Modular Control Stack for Autonomous Vehicles. CoRR abs/2212.06437 (2022)
- 2021
- [j78]Stav Belogolovsky, Philip Korsunsky, Shie Mannor, Chen Tessler, Tom Zahavy:
Inverse reinforcement learning in contextual MDPs. Mach. Learn. 110(9): 2295-2334 (2021) - [c214]Yonathan Efroni, Nadav Merlis, Shie Mannor:
Reinforcement Learning with Trajectory Feedback. AAAI 2021: 7288-7295 - [c213]Nadav Merlis, Shie Mannor:
Lenient Regret for Multi-Armed Bandits. AAAI 2021: 8950-8957 - [c212]Avi Mohan, Shie Mannor, Arman C. Kizilkale:
On the Volatility of Optimal Control Policies of a Class of Linear Quadratic Regulators. ACC 2021: 4533-4540 - [c211]Roi Pony, Itay Naeh, Shie Mannor:
Over-the-Air Adversarial Flickering Attacks Against Video Recognition Networks. CVPR 2021: 515-524 - [c210]Esther Derman, Gal Dalal, Shie Mannor:
Acting in Delayed Environments with Non-Stationary Markov Policies. ICLR 2021 - [c209]Shauharda Khadka, Estelle Aflalo, Mattias Marder, Avrech Ben-David, Santiago Miret, Shie Mannor, Tamir Hazan, Hanlin Tang, Somdeb Majumdar:
Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning. ICLR 2021 - [c208]Yonathan Efroni, Nadav Merlis, Aadirupa Saha, Shie Mannor:
Confidence-Budget Matching for Sequential Budgeted Learning. ICML 2021: 2937-2947 - [c207]Ido Greenberg, Shie Mannor:
Detecting Rewards Deterioration in Episodic Reinforcement Learning. ICML 2021: 3842-3853 - [c206]Michael Lutter, Shie Mannor, Jan Peters, Dieter Fox, Animesh Garg:
Value Iteration in Continuous Actions, States and Time. ICML 2021: 7224-7234 - [c205]Eli A. Meirom, Haggai Maron, Shie Mannor, Gal Chechik:
Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks. ICML 2021: 7565-7577 - [c204]Ofir Nabati, Tom Zahavy, Shie Mannor:
Online Limited Memory Neural-Linear Bandits with Likelihood Matching. ICML 2021: 7905-7915 - [c203]Jeongyeol Kwon, Yonathan Efroni, Constantine Caramanis, Shie Mannor:
Reinforcement Learning in Reward-Mixing MDPs. NeurIPS 2021: 2253-2264 - [c202]Gal Dalal, Assaf Hallak, Steven Dalton, Iuri Frosio, Shie Mannor, Gal Chechik:
Improve Agents without Retraining: Parallel Tree Search with Off-Policy Correction. NeurIPS 2021: 5518-5530 - [c201]Shirli Di-Castro Shashua, Dotan Di Castro, Shie Mannor:
Sim and Real: Better Together. NeurIPS 2021: 6868-6880 - [c200]Esther Derman, Matthieu Geist, Shie Mannor:
Twice regularized MDPs and the equivalence between robustness and regularization. NeurIPS 2021: 22274-22287 - [c199]Jeongyeol Kwon, Yonathan Efroni, Constantine Caramanis, Shie Mannor:
RL for Latent MDPs: Regret Guarantees and a Lower Bound. NeurIPS 2021: 24523-24534 - [c198]Michael Lutter, Shie Mannor, Jan Peters, Dieter Fox, Animesh Garg:
Robust Value Iteration for Continuous Control Tasks. Robotics: Science and Systems 2021 - [c197]Nir Baram, Guy Tennenholtz, Shie Mannor:
Action redundancy in reinforcement learning. UAI 2021: 376-385 - [c196]Guy Tennenholtz, Uri Shalit, Shie Mannor, Yonathan Efroni:
Bandits with partially observable confounded data. UAI 2021: 430-439 - [c195]Harsh Agrawal, Eli A. Meirom, Yuval Atzmon, Shie Mannor, Gal Chechik:
Known unknowns: Learning novel concepts using reasoning-by-elimination. UAI 2021: 504-514 - [i166]Esther Derman, Gal Dalal, Shie Mannor:
Acting in Delayed Environments with Non-Stationary Markov Policies. CoRR abs/2101.11992 (2021) - [i165]Yonathan Efroni, Nadav Merlis, Aadirupa Saha, Shie Mannor:
Confidence-Budget Matching for Sequential Budgeted Learning. CoRR abs/2102.03400 (2021) - [i164]Ofir Nabati, Tom Zahavy, Shie Mannor:
Online Limited Memory Neural-Linear Bandits with Likelihood Matching. CoRR abs/2102.03799 (2021) - [i163]Mark Kozdoba, Shie Mannor:
Dimension Free Generalization Bounds for Non Linear Metric Learning. CoRR abs/2102.03802 (2021) - [i162]Jeongyeol Kwon, Yonathan Efroni, Constantine Caramanis, Shie Mannor:
RL for Latent MDPs: Regret Guarantees and a Lower Bound. CoRR abs/2102.04939 (2021) - [i161]Lior Shani, Tom Zahavy, Shie Mannor:
Online Apprenticeship Learning. CoRR abs/2102.06924 (2021) - [i160]Mohammadi Zaki, Avinash Mohan, Aditya Gopalan, Shie Mannor:
Improper Learning with Gradient-based Policy Optimization. CoRR abs/2102.08201 (2021) - [i159]Chen Tessler, Yuval Shpigelman, Gal Dalal, Amit Mandelbaum, Doron Haritan Kazakov, Benjamin Fuhrer, Gal Chechik, Shie Mannor:
Reinforcement Learning for Datacenter Congestion Control. CoRR abs/2102.09337 (2021) - [i158]Guy Tennenholtz, Nir Baram, Shie Mannor:
GELATO: Geometrically Enriched Latent Model for Offline Reinforcement Learning. CoRR abs/2102.11327 (2021) - [i157]Nir Baram, Guy Tennenholtz, Shie Mannor:
Action Redundancy in Reinforcement Learning. CoRR abs/2102.11329 (2021) - [i156]Nir Baram, Guy Tennenholtz, Shie Mannor:
Maximum Entropy Reinforcement Learning with Mixture Policies. CoRR abs/2103.10176 (2021) - [i155]Ido Greenberg, Shie Mannor, Netanel Yannay:
Using Kalman Filter The Right Way: Noise Estimation Is Not Optimal. CoRR abs/2104.02372 (2021) - [i154]Mohammadi Zaki, Avi Mohan, Aditya Gopalan, Shie Mannor:
Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling. CoRR abs/2105.00210 (2021) - [i153]Michael Lutter, Shie Mannor, Jan Peters, Dieter Fox, Animesh Garg:
Value Iteration in Continuous Actions, States and Time. CoRR abs/2105.04682 (2021) - [i152]Michael Lutter, Shie Mannor, Jan Peters, Dieter Fox, Animesh Garg:
Robust Value Iteration for Continuous Control Tasks. CoRR abs/2105.12189 (2021) - [i151]Assaf Hallak, Gal Dalal, Steven Dalton, Iuri Frosio, Shie Mannor, Gal Chechik:
Improve Agents without Retraining: Parallel Tree Search with Off-Policy Correction. CoRR abs/2107.01715 (2021) - [i150]Roy Zohar, Shie Mannor, Guy Tennenholtz:
Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning. CoRR abs/2109.10632 (2021) - [i149]Shirli Di-Castro Shashua, Dotan Di Castro, Shie Mannor:
Sim and Real: Better Together. CoRR abs/2110.00445 (2021) - [i148]Michael Lutter, Boris Belousov, Shie Mannor, Dieter Fox, Animesh Garg, Jan Peters:
Continuous-Time Fitted Value Iteration for Robust Policies. CoRR abs/2110.01954 (2021) - [i147]Jeongyeol Kwon, Yonathan Efroni, Constantine Caramanis, Shie Mannor:
Reinforcement Learning in Reward-Mixing MDPs. CoRR abs/2110.03743 (2021) - [i146]Nadav Merlis, Yonathan Efroni, Shie Mannor:
Dare not to Ask: Problem-Dependent Guarantees for Budgeted Bandits. CoRR abs/2110.05724 (2021) - [i145]Esther Derman, Matthieu Geist, Shie Mannor:
Twice regularized MDPs and the equivalence between robustness and regularization. CoRR abs/2110.06267 (2021) - [i144]Guy Tennenholtz, Assaf Hallak, Gal Dalal, Shie Mannor, Gal Chechik, Uri Shalit:
On Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning. CoRR abs/2110.06539 (2021)
- 2020
- [c194]Lior Shani, Yonathan Efroni, Shie Mannor:
Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs. AAAI 2020: 5668-5675 - [c193]Guy Tennenholtz, Uri Shalit, Shie Mannor:
Off-Policy Evaluation in Partially Observable Environments. AAAI 2020: 10276-10283 - [c192]Xavier Fontaine, Shie Mannor, Vianney Perchet:
An adaptive stochastic optimization algorithm for resource allocation. ALT 2020: 319-363 - [c191]Nadav Merlis, Shie Mannor:
Tight Lower Bounds for Combinatorial Multi-Armed Bandits. COLT 2020: 2830-2857 - [c190]Dan Fisher, Mark Kozdoba, Shie Mannor:
Topic Modeling via Full Dependence Mixtures. ICML 2020: 3188-3198 - [c189]Lior Shani, Yonathan Efroni, Aviv Rosenberg, Shie Mannor:
Optimistic Policy Optimization with Bandit Feedback. ICML 2020: 8604-8613 - [c188]Yonathan Efroni, Mohammad Ghavamzadeh, Shie Mannor:
Online Planning with Lookahead Policies. NeurIPS 2020 - [c187]Shreyansh Gandhi, Samrat Kokkula, Abon Chaudhuri, Alessandro Magnani, Theban Stanley, Behzad Ahmadi, Venkatesh Kandaswamy, Omer Ovenc, Shie Mannor:
Scalable Detection of Offensive and Non-compliant Content / Logo in Product Images. WACV 2020: 2236-2245 - [i143]Chen Tessler, Shie Mannor:
Maximizing the Total Reward via Reward Tweaking. CoRR abs/2002.03327 (2020) - [i142]Itay Naeh, Roi Pony, Shie Mannor:
Patternless Adversarial Attacks on Video Recognition Networks. CoRR abs/2002.05123 (2020) - [i141]Nadav Merlis, Shie Mannor:
Tight Lower Bounds for Combinatorial Multi-Armed Bandits. CoRR abs/2002.05392 (2020) - [i140]Avinash Mohan, Shie Mannor, Arman C. Kizilkale:
Price Volatility in Electricity Markets: A Stochastic Control Perspective. CoRR abs/2002.06808 (2020) - [i139]Shirli Di-Castro Shashua, Shie Mannor:
Kalman meets Bellman: Improving Policy Evaluation through Value Tracking. CoRR abs/2002.07171 (2020) - [i138]Yonathan Efroni, Lior Shani, Aviv Rosenberg, Shie Mannor:
Optimistic Policy Optimization with Bandit Feedback. CoRR abs/2002.08243 (2020) - [i137]Daniel Teitelman, Itay Naeh, Shie Mannor:
Stealing Black-Box Functionality Using The Deep Neural Tree Architecture. CoRR abs/2002.09864 (2020) - [i136]Yonathan Efroni, Shie Mannor, Matteo Pirotta:
Exploration-Exploitation in Constrained MDPs. CoRR abs/2003.02189 (2020) - [i135]Esther Derman, Shie Mannor:
Distributional Robustness and Regularization in Reinforcement Learning. CoRR abs/2003.02894 (2020) - [i134]Guy Tennenholtz, Uri Shalit, Shie Mannor, Yonathan Efroni:
Bandits with Partially Observable Offline Data. CoRR abs/2006.06731 (2020) - [i133]Shauharda Khadka, Estelle Aflalo, Mattias Marder, Avrech Ben-David, Santiago Miret, Hanlin Tang, Shie Mannor, Tamir Hazan, Somdeb Majumdar:
Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning. CoRR abs/2007.07298 (2020) - [i132]Asaf B. Cassel, Shie Mannor, Guy Tennenholtz:
The Pendulum Arrangement: Maximizing the Escape Time of Heterogeneous Random Walks. CoRR abs/2007.13232 (2020) - [i131]Nadav Merlis, Shie Mannor:
Lenient Regret for Multi-Armed Bandits. CoRR abs/2008.03959 (2020) - [i130]Yonathan Efroni, Nadav Merlis, Shie Mannor:
Reinforcement Learning with Trajectory Feedback. CoRR abs/2008.06036 (2020) - [i129]Eli A. Meirom, Haggai Maron, Shie Mannor, Gal Chechik:
How to Stop Epidemics: Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks. CoRR abs/2010.05313 (2020) - [i128]Ido Greenberg, Shie Mannor:
Drift Detection in Episodic Data: Detect When Your Agent Starts Faltering. CoRR abs/2010.11660 (2020) - [i127]Ahmet Fatih Inci, Evgeny Bolotin, Yaosheng Fu, Gal Dalal, Shie Mannor, David W. Nellans, Diana Marculescu:
The Architectural Implications of Distributed Reinforcement Learning on CPU-GPU Systems. CoRR abs/2012.04210 (2020)
2010 – 2019
- 2019
- [j77]Orly Avner, Shie Mannor:
Multi-User Communication Networks: A Coordinated Multi-Armed Bandit Approach. IEEE/ACM Trans. Netw. 27(6): 2192-2207 (2019) - [c186]Yonathan Efroni, Gal Dalal, Bruno Scherrer, Shie Mannor:
How to Combine Tree-Search Methods in Reinforcement Learning. AAAI 2019: 3494-3501 - [c185]Mark Kozdoba, Jakub Marecek, Tigran T. Tchrakian, Shie Mannor:
On-Line Learning of Linear Dynamical Systems: Exponential Forgetting in Kalman Filters. AAAI 2019: 4098-4105 - [c184]Nadav Merlis, Shie Mannor:
Batch-Size Independent Regret Bounds for the Combinatorial Multi-Armed Bandit Problem. COLT 2019: 2465-2489 - [c183]Chen Tessler, Daniel J. Mankowitz, Shie Mannor:
Reward Constrained Policy Optimization. ICLR (Poster) 2019 - [c182]Chao Qu, Shie Mannor, Huan Xu:
Nonlinear Distributional Gradient Temporal-Difference Learning. ICML 2019: 5251-5260 - [c181]Lior Shani, Yonathan Efroni, Shie Mannor:
Exploration Conscious Reinforcement Learning Revisited. ICML 2019: 5680-5689 - [c180]Guy Tennenholtz, Shie Mannor:
The Natural Language of Actions. ICML 2019: 6196-6205 - [c179]Chen Tessler, Yonathan Efroni, Shie Mannor:
Action Robust Reinforcement Learning and Applications in Continuous Control. ICML 2019: 6215-6224 - [c178]Chao Qu, Shie Mannor, Huan Xu, Yuan Qi, Le Song, Junwu Xiong:
Value Propagation for Decentralized Networked Deep Multi-agent Reinforcement Learning. NeurIPS 2019: 1182-1191 - [c177]Chen Tessler, Guy Tennenholtz, Shie Mannor:
Distributional Policy Optimization: An Alternative Approach for Continuous Control. NeurIPS 2019: 1350-1360 - [c176]Yonathan Efroni, Nadav Merlis, Mohammad Ghavamzadeh, Shie Mannor:
Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies. NeurIPS 2019: 12203-12213 - [c175]Esther Derman, Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor:
A Bayesian Approach to Robust Reinforcement Learning. UAI 2019: 648-658 - [i126]Shirli Di-Castro Shashua, Shie Mannor:
Trust Region Value Optimization using Kalman Filtering. CoRR abs/1901.07860 (2019) - [i125]Tom Zahavy, Shie Mannor:
Deep Neural Linear Bandits: Overcoming Catastrophic Forgetting through Likelihood Matching. CoRR abs/1901.08612 (2019) - [i124]Chen Tessler, Yonathan Efroni, Shie Mannor:
Action Robust Reinforcement Learning and Applications in Continuous Control. CoRR abs/1901.09184 (2019) - [i123]Chao Qu, Shie Mannor, Huan Xu, Yuan Qi, Le Song, Junwu Xiong:
Value Propagation for Decentralized Networked Deep Multi-agent Reinforcement Learning. CoRR abs/1901.09326 (2019) - [i122]Guy Tennenholtz, Shie Mannor:
The Natural Language of Actions. CoRR abs/1902.01119 (2019) - [i121]Xavier Fontaine, Shie Mannor, Vianney Perchet:
A Problem-Adaptive Algorithm for Resource Allocation. CoRR abs/1902.04376 (2019) - [i120]Shreyansh Gandhi, Samrat Kokkula, Abon Chaudhuri, Alessandro Magnani, Theban Stanley, Behzad Ahmadi, Venkatesh Kandaswamy, Omer Ovenc, Shie Mannor:
Image Matters: Detecting Offensive and Non-Compliant Content / Logo in Product Images. CoRR abs/1905.02234 (2019) - [i119]Nadav Merlis, Shie Mannor:
Batch-Size Independent Regret Bounds for the Combinatorial Multi-Armed Bandit Problem. CoRR abs/1905.03125 (2019) - [i118]Esther Derman, Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor:
A Bayesian Approach to Robust Reinforcement Learning. CoRR abs/1905.08188 (2019) - [i117]Chen Tessler, Tom Zahavy, Deborah Cohen, Daniel J. Mankowitz, Shie Mannor:
Action Assembly: Sparse Imitation Learning for Text Based Games with Combinatorial Action Spaces. CoRR abs/1905.09700 (2019) - [i116]Philip Korsunsky, Stav Belogolovsky, Tom Zahavy, Chen Tessler, Shie Mannor:
Inverse Reinforcement Learning in Contextual MDPs. CoRR abs/1905.09710 (2019) - [i115]Chen Tessler, Guy Tennenholtz, Shie Mannor:
Distributional Policy Optimization: An Alternative Approach for Continuous Control. CoRR abs/1905.09855 (2019) - [i114]