


Lihong Li 0001
Person information

- affiliation: Amazon, Seattle, WA, USA
- affiliation (former): Google, Kirkland, WA, USA
- affiliation (former): Microsoft Research, Redmond, WA, USA
- affiliation (former): Yahoo! Research, Santa Clara, CA, USA
- affiliation (former): Rutgers University, Piscataway, NJ, USA
- affiliation (former): University of Alberta, Edmonton, AB, Canada
Other persons with the same name
- Lihong Li — disambiguation page
- Lihong Li 0002 — City University of New York, NY, USA
- Lihong Li 0003 — Hebei Polytechnic University, Tangshan, Hebei, China
- Lihong Li 0004 — Chongqing University of Technology, Chongqing, China
- Lihong Li 0005 — Hebei University of Engineering, Handan, China
2020 – today
- 2022
- [c89] Xiaoyu Chen, Jiachen Hu, Chi Jin, Lihong Li, Liwei Wang: Understanding Domain Randomization for Sim-to-real Transfer. ICLR 2022
- [c88] Ziyang Tang, Yiheng Duan, Steven Zhu, Stephanie Zhang, Lihong Li: Estimating Long-term Effects from Experimental Data. RecSys 2022: 516-518
- [i66] Ziyang Tang, Yiheng Duan, Stephanie Zhang, Lihong Li: A Reinforcement Learning Approach to Estimating Long-term Treatment Effects. CoRR abs/2210.07536 (2022)
- 2021
- [j15] Yuxi Li, Alborz Geramifard, Lihong Li, Csaba Szepesvári, Tao Wang: Guest editorial: special issue on reinforcement learning for real life. Mach. Learn. 110(9): 2291-2293 (2021)
- [c87] Andrew Bennett, Nathan Kallus, Lihong Li, Ali Mousavi: Off-policy Evaluation in Infinite-Horizon Reinforcement Learning with Latent Confounders. AISTATS 2021: 1999-2007
- [c86] Xiaoyu Chen, Jiachen Hu, Lihong Li, Liwei Wang: Efficient Reinforcement Learning in Factored MDPs with Application to Constrained RL. ICLR 2021
- [c85] Weitong Zhang, Dongruo Zhou, Lihong Li, Quanquan Gu: Neural Thompson Sampling. ICLR 2021
- [c84] Jiachen Hu, Xiaoyu Chen, Chi Jin, Lihong Li, Liwei Wang: Near-Optimal Representation Learning for Linear Bandits and Linear RL. ICML 2021: 4349-4358
- [c83] Chenjun Xiao, Yifan Wu, Jincheng Mei, Bo Dai, Tor Lattimore, Lihong Li, Csaba Szepesvári, Dale Schuurmans: On the Optimality of Batch Policy Optimization Algorithms. ICML 2021: 11362-11371
- [i65] Jiachen Hu, Xiaoyu Chen, Chi Jin, Lihong Li, Liwei Wang: Near-optimal Representation Learning for Linear Bandits and Linear RL. CoRR abs/2102.04132 (2021)
- [i64] Chenjun Xiao, Yifan Wu, Tor Lattimore, Bo Dai, Jincheng Mei, Lihong Li, Csaba Szepesvári, Dale Schuurmans: On the Optimality of Batch Policy Optimization Algorithms. CoRR abs/2104.02293 (2021)
- [i63] Yi Liu, Lihong Li: A Map of Bandits for E-commerce. CoRR abs/2107.00680 (2021)
- [i62] Xiaoyu Chen, Jiachen Hu, Chi Jin, Lihong Li, Liwei Wang: Understanding Domain Randomization for Sim-to-real Transfer. CoRR abs/2110.03239 (2021)
- 2020
- [c82] Branislav Kveton, Manzil Zaheer, Csaba Szepesvári, Lihong Li, Mohammad Ghavamzadeh, Craig Boutilier: Randomized Exploration in Generalized Linear Bandits. AISTATS 2020: 2066-2076
- [c81] Ali Mousavi, Lihong Li, Qiang Liu, Denny Zhou: Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning. ICLR 2020
- [c80] Ziyang Tang, Yihao Feng, Lihong Li, Dengyong Zhou, Qiang Liu: Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation. ICLR 2020
- [c79] Ruiyi Zhang, Bo Dai, Lihong Li, Dale Schuurmans: GenDICE: Generalized Offline Estimation of Stationary Values. ICLR 2020
- [c78] Junfeng Wen, Bo Dai, Lihong Li, Dale Schuurmans: Batch Stationary Distribution Estimation. ICML 2020: 10203-10213
- [c77] Dongruo Zhou, Lihong Li, Quanquan Gu: Neural Contextual Bandits with UCB-based Exploration. ICML 2020: 11492-11502
- [c76] Bo Dai, Ofir Nachum, Yinlam Chow, Lihong Li, Csaba Szepesvári, Dale Schuurmans: CoinDICE: Off-Policy Confidence Interval Estimation. NeurIPS 2020
- [c75] Jincheng Mei, Chenjun Xiao, Bo Dai, Lihong Li, Csaba Szepesvári, Dale Schuurmans: Escaping the Gravitational Pull of Softmax. NeurIPS 2020
- [c74] Mengjiao Yang, Ofir Nachum, Bo Dai, Lihong Li, Dale Schuurmans: Off-Policy Evaluation via the Regularized Lagrangian. NeurIPS 2020
- [i61] Ge Liu, Rui Wu, Heng-Tze Cheng, Jing Wang, Jayden Ooi, Lihong Li, Ang Li, Wai Lok Sibon Li, Craig Boutilier, Ed H. Chi: Data Efficient Training for Reinforcement Learning with Adaptive Behavior Policy Sharing. CoRR abs/2002.05229 (2020)
- [i60] Ruiyi Zhang, Bo Dai, Lihong Li, Dale Schuurmans: GenDICE: Generalized Offline Estimation of Stationary Values. CoRR abs/2002.09072 (2020)
- [i59] Junfeng Wen, Bo Dai, Lihong Li, Dale Schuurmans: Batch Stationary Distribution Estimation. CoRR abs/2003.00722 (2020)
- [i58] Ali Mousavi, Lihong Li, Qiang Liu, Denny Zhou: Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning. CoRR abs/2003.11126 (2020)
- [i57] Mengjiao Yang, Ofir Nachum, Bo Dai, Lihong Li, Dale Schuurmans: Off-Policy Evaluation via the Regularized Lagrangian. CoRR abs/2007.03438 (2020)
- [i56] Andrew Bennett, Nathan Kallus, Lihong Li, Ali Mousavi: Off-policy Evaluation in Infinite-Horizon Reinforcement Learning with Latent Confounders. CoRR abs/2007.13893 (2020)
- [i55] Xiaoyu Chen, Jiachen Hu, Lihong Li, Liwei Wang: Efficient Reinforcement Learning in Factored MDPs with Application to Constrained RL. CoRR abs/2008.13319 (2020)
- [i54] Weitong Zhang, Dongruo Zhou, Lihong Li, Quanquan Gu: Neural Thompson Sampling. CoRR abs/2010.00827 (2020)
- [i53] Bo Dai, Ofir Nachum, Yinlam Chow, Lihong Li, Csaba Szepesvári, Dale Schuurmans: CoinDICE: Off-Policy Confidence Interval Estimation. CoRR abs/2010.11652 (2020)
2010 – 2019
- 2019
- [j14] Lihong Li: A perspective on off-policy evaluation in reinforcement learning. Frontiers Comput. Sci. 13(5): 911-912 (2019)
- [j13] Jianfeng Gao, Michel Galley, Lihong Li: Neural Approaches to Conversational AI. Found. Trends Inf. Retr. 13(2-3): 127-298 (2019)
- [c73] Honghua Dong, Jiayuan Mao, Tian Lin, Chong Wang, Lihong Li, Denny Zhou: Neural Logic Machines. ICLR (Poster) 2019
- [c72] Christoph Dann, Lihong Li, Wei Wei, Emma Brunskill: Policy Certificates: Towards Accountable Reinforcement Learning. ICML 2019: 1507-1516
- [c71] Ofir Nachum, Yinlam Chow, Bo Dai, Lihong Li: DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections. NeurIPS 2019: 2315-2325
- [c70] Yihao Feng, Lihong Li, Qiang Liu: A Kernel Loss for Solving the Bellman Equation. NeurIPS 2019: 15430-15441
- [i52] Honghua Dong, Jiayuan Mao, Tian Lin, Chong Wang, Lihong Li, Denny Zhou: Neural Logic Machines. CoRR abs/1904.11694 (2019)
- [i51] Yihao Feng, Lihong Li, Qiang Liu: A Kernel Loss for Solving the Bellman Equation. CoRR abs/1905.10506 (2019)
- [i50] Ofir Nachum, Yinlam Chow, Bo Dai, Lihong Li: DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections. CoRR abs/1906.04733 (2019)
- [i49] Branislav Kveton, Manzil Zaheer, Csaba Szepesvári, Lihong Li, Mohammad Ghavamzadeh, Craig Boutilier: Randomized Exploration in Generalized Linear Bandits. CoRR abs/1906.08947 (2019)
- [i48] Ziyang Tang, Yihao Feng, Lihong Li, Dengyong Zhou, Qiang Liu: Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation. CoRR abs/1910.07186 (2019)
- [i47] Dongruo Zhou, Lihong Li, Quanquan Gu: Neural Contextual Bandits with Upper Confidence Bound-Based Exploration. CoRR abs/1911.04462 (2019)
- [i46] Ofir Nachum, Bo Dai, Ilya Kostrikov, Yinlam Chow, Lihong Li, Dale Schuurmans: AlgaeDICE: Policy Gradient from Arbitrary Experience. CoRR abs/1912.02074 (2019)
- 2018
- [c69] Zachary C. Lipton, Xiujun Li, Jianfeng Gao, Lihong Li, Faisal Ahmed, Li Deng: BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems. AAAI 2018: 5237-5244
- [c68] Jianfeng Gao, Michel Galley, Lihong Li: Neural Approaches to Conversational AI. ACL (5) 2018: 2-7
- [c67] Da Tang, Xiujun Li, Jianfeng Gao, Chong Wang, Lihong Li, Tony Jebara: Subgoal Discovery for Hierarchical Dialogue Policy Learning. EMNLP 2018: 2298-2309
- [c66] Yuzhe Ma, Kwang-Sung Jun, Lihong Li, Xiaojin Zhu: Data Poisoning Attacks in Contextual Bandits. GameSec 2018: 186-204
- [c65] Bo Dai, Albert E. Shaw, Niao He, Lihong Li, Le Song: Boosting the Actor with Dual Critic. ICLR (Poster) 2018
- [c64] Yichen Chen, Lihong Li, Mengdi Wang: Scalable Bilinear Learning Using State and Action Features. ICML 2018: 833-842
- [c63] Bo Dai, Albert E. Shaw, Lihong Li, Lin Xiao, Niao He, Zhen Liu, Jianshu Chen, Le Song: SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation. ICML 2018: 1133-1142
- [c62] Kwang-Sung Jun, Lihong Li, Yuzhe Ma, Xiaojin (Jerry) Zhu: Adversarial Attacks on Stochastic Bandits. NeurIPS 2018: 3644-3653
- [c61] Qiang Liu, Lihong Li, Ziyang Tang, Dengyong Zhou: Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation. NeurIPS 2018: 5361-5371
- [c60] Jianfeng Gao, Michel Galley, Lihong Li: Neural Approaches to Conversational AI. SIGIR 2018: 1371-1374
- [i45] Da Tang, Xiujun Li, Jianfeng Gao, Chong Wang, Lihong Li, Tony Jebara: Subgoal Discovery for Hierarchical Dialogue Policy Learning. CoRR abs/1804.07855 (2018)
- [i44] Yichen Chen, Lihong Li, Mengdi Wang: Scalable Bilinear π Learning Using State and Action Features. CoRR abs/1804.10328 (2018)
- [i43] Yuzhe Ma, Kwang-Sung Jun, Lihong Li, Xiaojin Zhu: Data Poisoning Attacks in Contextual Bandits. CoRR abs/1808.05760 (2018)
- [i42] Jianfeng Gao, Michel Galley, Lihong Li: Neural Approaches to Conversational AI. CoRR abs/1809.08267 (2018)
- [i41] Kwang-Sung Jun, Lihong Li, Yuzhe Ma, Xiaojin Zhu: Adversarial Attacks on Stochastic Bandits. CoRR abs/1810.12188 (2018)
- [i40] Qiang Liu, Lihong Li, Ziyang Tang, Dengyong Zhou: Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation. CoRR abs/1810.12429 (2018)
- [i39] Christoph Dann, Lihong Li, Wei Wei, Emma Brunskill: Policy Certificates: Towards Accountable Reinforcement Learning. CoRR abs/1811.03056 (2018)
- 2017
- [c59] Bhuwan Dhingra, Lihong Li, Xiujun Li, Jianfeng Gao, Yun-Nung Chen, Faisal Ahmed, Li Deng: Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access. ACL (1) 2017: 484-495
- [c58] Baolin Peng, Xiujun Li, Lihong Li, Jianfeng Gao, Asli Celikyilmaz, Sungjin Lee, Kam-Fai Wong: Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning. EMNLP 2017: 2231-2240
- [c57] Emilio Parisotto, Abdel-rahman Mohamed, Rishabh Singh, Lihong Li, Dengyong Zhou, Pushmeet Kohli: Neuro-Symbolic Program Synthesis. ICLR (Poster) 2017
- [c56] Simon S. Du, Jianshu Chen, Lihong Li, Lin Xiao, Dengyong Zhou: Stochastic Variance Reduction Methods for Policy Evaluation. ICML 2017: 1049-1058
- [c55] Lihong Li, Yu Lu, Dengyong Zhou: Provably Optimal Algorithms for Generalized Linear Contextual Bandits. ICML 2017: 2071-2080
- [c54] Xiujun Li, Yun-Nung Chen, Lihong Li, Jianfeng Gao, Asli Celikyilmaz: End-to-End Task-Completion Neural Dialogue Systems. IJCNLP (1) 2017: 733-743
- [c53] Jianshu Chen, Chong Wang, Lin Xiao, Ji He, Lihong Li, Li Deng: Q-LDA: Uncovering Latent Patterns in Text-based Sequential Decision Processes. NIPS 2017: 4977-4986
- [i38] Simon S. Du, Jianshu Chen, Lihong Li, Lin Xiao, Dengyong Zhou: Stochastic Variance Reduction Methods for Policy Evaluation. CoRR abs/1702.07944 (2017)
- [i37] Asli Celikyilmaz, Li Deng, Lihong Li, Chong Wang: Scaffolding Networks for Teaching and Learning to Comprehend. CoRR abs/1702.08653 (2017)
- [i36] Lihong Li, Yu Lu, Dengyong Zhou: Provable Optimal Algorithms for Generalized Linear Contextual Bandits. CoRR abs/1703.00048 (2017)
- [i35] Xiujun Li, Yun-Nung Chen, Lihong Li, Jianfeng Gao: End-to-End Task-Completion Neural Dialogue Systems. CoRR abs/1703.01008 (2017)
- [i34] Xiujun Li, Yun-Nung Chen, Lihong Li, Jianfeng Gao, Asli Celikyilmaz: Investigation of Language Understanding Impact for Reinforcement Learning Based Dialogue Systems. CoRR abs/1703.07055 (2017)
- [i33] Baolin Peng, Xiujun Li, Lihong Li, Jianfeng Gao, Asli Celikyilmaz, Sungjin Lee, Kam-Fai Wong: Composite Task-Completion Dialogue System via Hierarchical Deep Reinforcement Learning. CoRR abs/1704.03084 (2017)
- [i32] Bo Dai, Albert E. Shaw, Niao He, Lihong Li, Le Song: Boosting the Actor with Dual Critic. CoRR abs/1712.10282 (2017)
- [i31] Bo Dai, Albert E. Shaw, Lihong Li, Lin Xiao, Niao He, Jianshu Chen, Le Song: Smoothed Dual Embedding Control. CoRR abs/1712.10285 (2017)
- 2016
- [j12] Katja Hofmann, Lihong Li, Filip Radlinski: Online Evaluation for Information Retrieval. Found. Trends Inf. Retr. 10(1): 1-117 (2016)
- [c52] Ji He, Jianshu Chen, Xiaodong He, Jianfeng Gao, Lihong Li, Li Deng, Mari Ostendorf: Deep Reinforcement Learning with a Natural Language Action Space. ACL (1) 2016
- [c51] Che-Yu Liu, Lihong Li: On the Prior Sensitivity of Thompson Sampling. ALT 2016: 321-336
- [c50] Shipra Agrawal, Nikhil R. Devanur, Lihong Li: An efficient algorithm for contextual bandits with knapsacks, and an extension to concave objectives. COLT 2016: 4-18
- [c49] Ji He, Mari Ostendorf, Xiaodong He, Jianshu Chen, Jianfeng Gao, Lihong Li, Li Deng: Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads. EMNLP 2016: 1838-1848
- [c48] Nan Jiang, Lihong Li: Doubly Robust Off-policy Value Evaluation for Reinforcement Learning. ICML 2016: 652-661
- [c47] Tzu-Kuo Huang, Lihong Li, Ara Vartanian, Saleema Amershi, Xiaojin Zhu: Active Learning with Oracle Epiphany. NIPS 2016: 2820-2828
- [c46] Masrour Zoghi, Tomás Tunys, Lihong Li, Damien Jose, Junyan Chen, Chun Ming Chin, Maarten de Rijke: Click-based Hot Fixes for Underperforming Torso Queries. SIGIR 2016: 195-204
- [i30] Ji He, Mari Ostendorf, Xiaodong He, Jianshu Chen, Jianfeng Gao, Lihong Li, Li Deng: Deep Reinforcement Learning with a Combinatorial Action Space for Predicting and Tracking Popular Discussion Threads. CoRR abs/1606.03667 (2016)
- [i29] Zachary C. Lipton, Jianfeng Gao, Lihong Li, Xiujun Li, Faisal Ahmed, Li Deng: Efficient Exploration for Dialog Policy Learning with Deep BBQ Networks & Replay Buffer Spiking. CoRR abs/1608.05081 (2016)
- [i28] Bhuwan Dhingra, Lihong Li, Xiujun Li, Jianfeng Gao, Yun-Nung Chen, Faisal Ahmed, Li Deng: End-to-End Reinforcement Learning of Dialogue Agents for Information Access. CoRR abs/1609.00777 (2016)
- [i27] Zachary C. Lipton, Jianfeng Gao, Lihong Li, Jianshu Chen, Li Deng: Combating Reinforcement Learning's Sisyphean Curse with Intrinsic Fear. CoRR abs/1611.01211 (2016)
- [i26] Emilio Parisotto, Abdel-rahman Mohamed, Rishabh Singh, Lihong Li, Dengyong Zhou, Pushmeet Kohli: Neuro-Symbolic Program Synthesis. CoRR abs/1611.01855 (2016)
- [i25] Xiujun Li, Zachary C. Lipton, Bhuwan Dhingra, Lihong Li, Jianfeng Gao, Yun-Nung Chen: A User Simulator for Task-Completion Dialogues. CoRR abs/1612.05688 (2016)
- 2015
- [c45] Lihong Li, Rémi Munos, Csaba Szepesvári: Toward Minimax Off-policy Value Estimation. AISTATS 2015
- [c44] Lihong Li, Jin Young Kim, Imed Zitouni: Toward Predicting the Outcome of an A/B Experiment for Search Relevance. WSDM 2015: 37-46
- [c43] Lihong Li: Offline Evaluation and Optimization for Interactive Systems. WSDM 2015: 413-414
- [c42] Lihong Li, Shunbao Chen, Jim Kleban, Ankur Gupta: Counterfactual Estimation and Optimization of Click Metrics in Search Engines: A Case Study. WWW (Companion Volume) 2015: 929-934
- [i24] Miroslav Dudík, Dumitru Erhan, John Langford, Lihong Li: Doubly Robust Policy Evaluation and Optimization. CoRR abs/1503.02834 (2015)
- [i23] Dragomir Yankov, Pavel Berkhin, Lihong Li: Evaluation of Explore-Exploit Policies in Multi-result Ranking Systems. CoRR abs/1504.07662 (2015)
- [i22] Shipra Agrawal, Nikhil R. Devanur, Lihong Li: Contextual Bandits with Global Constraints and Objective. CoRR abs/1506.03374 (2015)
- [i21] Che-Yu Liu, Lihong Li: On the Prior Sensitivity of Thompson Sampling. CoRR abs/1506.03378 (2015)
- [i20] Emma Brunskill, Lihong Li: The Online Discovery Problem and Its Application to Lifelong Reinforcement Learning. CoRR abs/1506.03379 (2015)
- [i19] Xiujun Li, Lihong Li, Jianfeng Gao, Xiaodong He, Jianshu Chen, Li Deng, Ji He: Recurrent Reinforcement Learning: A Hybrid Approach. CoRR abs/1509.03044 (2015)
- [i18] Nan Jiang, Lihong Li: Doubly Robust Off-policy Evaluation for Reinforcement Learning. CoRR abs/1511.03722 (2015)
- [i17] Ji He, Jianshu Chen, Xiaodong He, Jianfeng Gao, Lihong Li, Li Deng, Mari Ostendorf: Deep Reinforcement Learning with an Unbounded Action Space. CoRR abs/1511.04636 (2015)
- 2014
- [j11] Jiang Bian, Bo Long, Lihong Li, Taesup Moon, Anlei Dong, Yi Chang: Exploiting User Preference for Online Learning in Web Content Optimization Systems. ACM Trans. Intell. Syst. Technol. 5(2): 33:1-33:23 (2014)
- [c41] Emma Brunskill, Lihong Li: PAC-inspired Option Discovery in Lifelong Reinforcement Learning. ICML 2014: 316-324
- [c40] Alekh Agarwal, Daniel J. Hsu, Satyen Kale, John Langford, Lihong Li, Robert E. Schapire: Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits. ICML 2014: 1638-1646
- [c39] Lihong Li, He He, Jason D. Williams: Temporal supervised learning for inferring a dialog policy from example conversations. SLT 2014: 312-317
- [i16] Alekh Agarwal, Daniel J. Hsu, Satyen Kale, John Langford, Lihong Li, Robert E. Schapire: Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits. CoRR abs/1402.0555 (2014)
- [i15] Lihong Li, Shunbao Chen, Jim Kleban, Ankur Gupta: Counterfactual Estimation and Optimization of Click Metrics for Search Engines. CoRR abs/1403.1891 (2014)
- [i14] Lihong Li, Rémi Munos, Csaba Szepesvári: On Minimax Optimal Offline Policy Evaluation. CoRR abs/1409.3653 (2014)
- 2013
- [c38] Emma Brunskill, Lihong Li: Sample Complexity of Multi-task Reinforcement Learning. UAI 2013
- [i13] Emma Brunskill, Lihong Li: Sample Complexity of Multi-task Reinforcement Learning. CoRR abs/1309.6821 (2013)
- [i12] Lihong Li: Generalized Thompson Sampling for Contextual Bandits. CoRR abs/1310.7163 (2013)
- [i11] Zhen Qin, Vaclav Petricek, Nikos Karampatziakis, Lihong Li, John Langford: Efficient Online Bootstrapping for Large Scale Learning. CoRR abs/1312.5021 (2013)
- 2012
- [j10] John Langford, Lihong Li, R. Preston McAfee, Kishore Papineni: Cloud control: voluntary admission control for intranet traffic management. Inf. Syst. E Bus. Manag. 10(3): 295-308 (2012)
- [j9] Taesup Moon, Wei Chu, Lihong Li, Zhaohui Zheng, Yi Chang: An Online Learning Framework for Refining Recency Search Results with User Click Feedback. ACM Trans. Inf. Syst. 30(4): 20:1-20:28 (2012)
- [c37] Miroslav Dudík, Dumitru Erhan, John Langford, Lihong Li: Sample-efficient Nonstationary Policy Evaluation for Contextual Bandits. UAI 2012: 247-254
- [c36] Vidhya Navalpakkam, Ravi Kumar, Lihong Li, D. Sivakumar: Attention and Selection in Online Choice Tasks. UMAP 2012: 200-211
- [c35] Hongning Wang, Anlei Dong, Lihong Li, Yi Chang, Evgeniy Gabrilovich: Joint relevance and freshness learning from clickthroughs for news search. WWW 2012: 579-588
- [c34] Lihong Li, Wei Chu, John Langford, Taesup Moon, Xuanhui Wang: Bandits with Generalized Linear Models. ICML On-line Trading of Exploration and Exploitation 2012: 19-36
- [c33] Lihong Li, Olivier Chapelle: Open Problem: Regret Bounds for Thompson Sampling. COLT 2012: 43.1-43.3
- [p1] Lihong Li: Sample Complexity Bounds of Exploration. Reinforcement Learning 2012: 175-204
- [i10]