Suvrit Sra
Person information
- affiliation: Massachusetts Institute of Technology (MIT), Laboratory for Information and Decision Systems, Cambridge, MA, USA
- affiliation: Max Planck Institute for Biological Cybernetics, Tübingen, Germany
- affiliation: University of Texas at Austin, Department of Computer Sciences, Austin, TX, USA
2020 – today
- 2024
- [c108] Kwangjun Ahn, Xiang Cheng, Minhak Song, Chulhee Yun, Ali Jadbabaie, Suvrit Sra: Linear attention is (maybe) all you need (to understand Transformer optimization). ICLR 2024
- [c107] Kwangjun Ahn, Ali Jadbabaie, Suvrit Sra: How to Escape Sharp Minima with Random Perturbations. ICML 2024
- [c106] Xiang Cheng, Yuxin Chen, Suvrit Sra: Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context. ICML 2024
- [i81] Xiang Cheng, Jingzhao Zhang, Suvrit Sra: Efficient Sampling on Riemannian Manifolds via Langevin MCMC. CoRR abs/2402.10357 (2024)
- [i80] Sanchayan Dutta, Xiang Cheng, Suvrit Sra: Riemannian Bilevel Optimization. CoRR abs/2405.15816 (2024)
- [i79] Guy Kornowski, Swati Padmanabhan, Kai Wang, Zhe Zhang, Suvrit Sra: First-Order Methods for Linearly Constrained Bilevel Optimization. CoRR abs/2406.12771 (2024)
- 2023
- [j20] Melanie Weber, Suvrit Sra: Riemannian Optimization via Frank-Wolfe Methods. Math. Program. 199(1): 525-556 (2023)
- [j19] Peiyuan Zhang, Jingzhao Zhang, Suvrit Sra: Sion's Minimax Theorem in Geodesic Metric Spaces and a Riemannian Extragradient Algorithm. SIAM J. Optim. 33(4): 2885-2908 (2023)
- [c105] Yi Tian, Kaiqing Zhang, Russ Tedrake, Suvrit Sra: Toward Understanding State Representation Learning in MuZero: A Case Study in Linear Quadratic Gaussian Control. CDC 2023: 6166-6171
- [c104] Derek Lim, Joshua David Robinson, Lingxiao Zhao, Tess E. Smidt, Suvrit Sra, Haggai Maron, Stefanie Jegelka: Sign and Basis Invariant Networks for Spectral Graph Representation Learning. ICLR 2023
- [c103] Melanie Weber, Suvrit Sra: Global optimality for Euclidean CCCP under Riemannian convexity. ICML 2023: 36790-36803
- [c102] David Xing Wu, Chulhee Yun, Suvrit Sra: On the Training Instability of Shuffling SGD with Batch Normalization. ICML 2023: 37787-37845
- [c101] Yi Tian, Kaiqing Zhang, Russ Tedrake, Suvrit Sra: Can Direct Latent Model Learning Solve Linear Quadratic Gaussian Control? L4DC 2023: 51-63
- [c100] Yan Dai, Kwangjun Ahn, Suvrit Sra: The Crucial Role of Normalization in Sharpness-Aware Minimization. NeurIPS 2023
- [c99] Kwangjun Ahn, Xiang Cheng, Hadi Daneshmand, Suvrit Sra: Transformers learn to implement preconditioned gradient descent for in-context learning. NeurIPS 2023
- [i78] David Xing Wu, Chulhee Yun, Suvrit Sra: On the Training Instability of Shuffling SGD with Batch Normalization. CoRR abs/2302.12444 (2023)
- [i77] Yan Dai, Kwangjun Ahn, Suvrit Sra: The Crucial Role of Normalization in Sharpness-Aware Minimization. CoRR abs/2305.15287 (2023)
- [i76] Kwangjun Ahn, Ali Jadbabaie, Suvrit Sra: How to escape sharp minima. CoRR abs/2305.15659 (2023)
- [i75] Kwangjun Ahn, Xiang Cheng, Hadi Daneshmand, Suvrit Sra: Transformers learn to implement preconditioned gradient descent for in-context learning. CoRR abs/2306.00297 (2023)
- [i74] Adarsh Barik, Suvrit Sra, Jean Honorio: Invex Programs: First Order Algorithms and Their Convergence. CoRR abs/2307.04456 (2023)
- [i73] Kwangjun Ahn, Xiang Cheng, Minhak Song, Chulhee Yun, Ali Jadbabaie, Suvrit Sra: Linear attention is (maybe) all you need (to understand transformer optimization). CoRR abs/2310.01082 (2023)
- [i72] Xiang Cheng, Yuxin Chen, Suvrit Sra: Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context. CoRR abs/2312.06528 (2023)
- 2022
- [c98] Anshul Shah, Suvrit Sra, Rama Chellappa, Anoop Cherian: Max-Margin Contrastive Learning. AAAI 2022: 8220-8230
- [c97] Jikai Jin, Suvrit Sra: Understanding Riemannian Acceleration via a Proximal Extragradient Framework. COLT 2022: 2924-2962
- [c96] Chulhee Yun, Shashank Rajput, Suvrit Sra: Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond. ICLR 2022
- [c95] Kwangjun Ahn, Jingzhao Zhang, Suvrit Sra: Understanding the unstable convergence of gradient descent. ICML 2022: 247-257
- [c94] Jingzhao Zhang, Haochuan Li, Suvrit Sra, Ali Jadbabaie: Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective. ICML 2022: 26330-26346
- [c93] Jingzhao Zhang, Hongzhou Lin, Subhro Das, Suvrit Sra, Ali Jadbabaie: Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity. ICML 2022: 26347-26361
- [c92] Horia Mania, Ali Jadbabaie, Devavrat Shah, Suvrit Sra: Time Varying Regression with Hidden Linear Dynamics. L4DC 2022: 858-869
- [c91] Xiang Cheng, Jingzhao Zhang, Suvrit Sra: Efficient Sampling on Riemannian Manifolds via Langevin MCMC. NeurIPS 2022
- [c90] Alp Yurtsever, Suvrit Sra: CCCP is Frank-Wolfe in disguise. NeurIPS 2022
- [c89] Kwangjun Ahn, Suvrit Sra: Understanding Nesterov's Acceleration via Proximal Point Method. SOSA 2022: 117-130
- [i71] Peiyuan Zhang, Jingzhao Zhang, Suvrit Sra: Minimax in Geodesic Metric Spaces: Sion's Theorem and Algorithms. CoRR abs/2202.06950 (2022)
- [i70] Derek Lim, Joshua Robinson, Lingxiao Zhao, Tess E. Smidt, Suvrit Sra, Haggai Maron, Stefanie Jegelka: Sign and Basis Invariant Networks for Spectral Graph Representation Learning. CoRR abs/2202.13013 (2022)
- [i69] Kwangjun Ahn, Jingzhao Zhang, Suvrit Sra: Understanding the unstable convergence of gradient descent. CoRR abs/2204.01050 (2022)
- [i68] Suvrit Sra, Melanie Weber: On a class of geodesically convex optimization problems solved via Euclidean MM methods. CoRR abs/2206.11426 (2022)
- [i67] Melanie Weber, Suvrit Sra: Computing Brascamp-Lieb Constants through the lens of Thompson Geometry. CoRR abs/2208.05013 (2022)
- [i66] Yi Tian, Kaiqing Zhang, Russ Tedrake, Suvrit Sra: Can Direct Latent Model Learning Solve Linear Quadratic Gaussian Control? CoRR abs/2212.14511 (2022)
- 2021
- [c88] Chulhee Yun, Suvrit Sra, Ali Jadbabaie: Open Problem: Can Single-Shuffle SGD be Better than Reshuffling SGD and GD? COLT 2021: 4653-4658
- [c87] Joshua David Robinson, Ching-Yao Chuang, Suvrit Sra, Stefanie Jegelka: Contrastive Learning with Hard Negative Samples. ICLR 2021
- [c86] Jingzhao Zhang, Aditya Krishna Menon, Andreas Veit, Srinadh Bhojanapalli, Sanjiv Kumar, Suvrit Sra: Coping with Label Shift via Distributionally Robust Optimisation. ICLR 2021
- [c85] Yi Tian, Yuanhao Wang, Tiancheng Yu, Suvrit Sra: Online Learning in Unknown Markov Games. ICML 2021: 10279-10288
- [c84] Tiancheng Yu, Yi Tian, Jingzhao Zhang, Suvrit Sra: Provably Efficient Algorithms for Multi-Objective Competitive RL. ICML 2021: 12167-12176
- [c83] Alp Yurtsever, Varun Mangalick, Suvrit Sra: Three Operator Splitting with a Nonconvex Loss Function. ICML 2021: 12267-12277
- [c82] Joshua Robinson, Li Sun, Ke Yu, Kayhan Batmanghelich, Stefanie Jegelka, Suvrit Sra: Can contrastive learning avoid shortcut solutions? NeurIPS 2021: 4974-4986
- [c81] Alp Yurtsever, Alex Gu, Suvrit Sra: Three Operator Splitting with Subgradients, Stochastic Gradients, and Adaptive Learning Rates. NeurIPS 2021: 19743-19756
- [i65] Tiancheng Yu, Yi Tian, Jingzhao Zhang, Suvrit Sra: Provably Efficient Algorithms for Multi-Objective Competitive RL. CoRR abs/2102.03192 (2021)
- [i64] Chulhee Yun, Suvrit Sra, Ali Jadbabaie: Can Single-Shuffle SGD be Better than Reshuffling SGD and GD? CoRR abs/2103.07079 (2021)
- [i63] Joshua Robinson, Li Sun, Ke Yu, Kayhan Batmanghelich, Stefanie Jegelka, Suvrit Sra: Can contrastive learning avoid shortcut solutions? CoRR abs/2106.11230 (2021)
- [i62] Jingzhao Zhang, Haochuan Li, Suvrit Sra, Ali Jadbabaie: On Convergence of Training Loss Without Reaching Stationary Points. CoRR abs/2110.06256 (2021)
- [i61] Chulhee Yun, Shashank Rajput, Suvrit Sra: Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond. CoRR abs/2110.10342 (2021)
- [i60] Jikai Jin, Suvrit Sra: A Riemannian Accelerated Proximal Extragradient Framework and its Implications. CoRR abs/2111.02763 (2021)
- [i59] Anshul Shah, Suvrit Sra, Rama Chellappa, Anoop Cherian: Max-Margin Contrastive Learning. CoRR abs/2112.11450 (2021)
- 2020
- [j18] Ramkumar Hariharan, Johnna Sundberg, Giacomo Gallino, Ashley Schmidt, Drew Arenth, Suvrit Sra, Benjamin Fels: An Interpretable Predictive Model of Vaccine Utilization for Tanzania. Frontiers Artif. Intell. 3: 559617 (2020)
- [j17] Reshad Hosseini, Suvrit Sra: An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization. Math. Program. 181(1): 187-223 (2020)
- [c80] Florian Yger, Sylvain Chevallier, Quentin Barthélemy, Suvrit Sra: Geodesically-convex optimization for averaging partially observed covariance matrices. ACML 2020: 417-432
- [c79] Kwangjun Ahn, Suvrit Sra: From Nesterov's Estimate Sequence to Riemannian Acceleration. COLT 2020: 84-118
- [c78] Jingzhao Zhang, Tianxing He, Suvrit Sra, Ali Jadbabaie: Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity. ICLR 2020
- [c77] Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, Tiancheng Yu: Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition. ICML 2020: 4860-4869
- [c76] Joshua Robinson, Stefanie Jegelka, Suvrit Sra: Strength from Weakness: Fast Learning Using Weak Supervision. ICML 2020: 8127-8136
- [c75] Jingzhao Zhang, Hongzhou Lin, Stefanie Jegelka, Suvrit Sra, Ali Jadbabaie: Complexity of Finding Stationary Points of Nonconvex Nonsmooth Functions. ICML 2020: 11173-11182
- [c74] Kwangjun Ahn, Chulhee Yun, Suvrit Sra: SGD with shuffling: optimal rates without component convexity and large epoch requirements. NeurIPS 2020
- [c73] Yi Tian, Jian Qian, Suvrit Sra: Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes. NeurIPS 2020
- [c72] Jingzhao Zhang, Sai Praneeth Karimireddy, Andreas Veit, Seungyeon Kim, Sashank J. Reddi, Sanjiv Kumar, Suvrit Sra: Why are Adaptive Methods Good for Attention Models? NeurIPS 2020
- [i58] Jingzhao Zhang, Hongzhou Lin, Suvrit Sra, Ali Jadbabaie: On Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions. CoRR abs/2002.04130 (2020)
- [i57] Joshua Robinson, Stefanie Jegelka, Suvrit Sra: Strength from Weakness: Fast Learning Using Weak Supervision. CoRR abs/2002.08483 (2020)
- [i56] Kwangjun Ahn, Suvrit Sra: On Tight Convergence Rates of Without-replacement SGD. CoRR abs/2004.08657 (2020)
- [i55] Jingzhao Zhang, Hongzhou Lin, Subhro Das, Suvrit Sra, Ali Jadbabaie: Stochastic Optimization with Non-stationary Noise. CoRR abs/2006.04429 (2020)
- [i54] Yi Tian, Jian Qian, Suvrit Sra: Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes. CoRR abs/2006.13405 (2020)
- [i53] Joshua Robinson, Ching-Yao Chuang, Suvrit Sra, Stefanie Jegelka: Contrastive Learning with Hard Negative Samples. CoRR abs/2010.04592 (2020)
- [i52] Jingzhao Zhang, Aditya Krishna Menon, Andreas Veit, Srinadh Bhojanapalli, Sanjiv Kumar, Suvrit Sra: Coping with Label Shift via Distributionally Robust Optimisation. CoRR abs/2010.12230 (2020)
- [i51] Yi Tian, Yuanhao Wang, Tiancheng Yu, Suvrit Sra: Provably Efficient Online Agnostic Learning in Markov Games. CoRR abs/2010.15020 (2020)
- [i50] Horia Mania, Suvrit Sra: Why do classifier accuracies show linear trends under distribution shift? CoRR abs/2012.15483 (2020)
2010 – 2019
- 2019
- [c71] Zelda Mariet, Mike Gartrell, Suvrit Sra: Learning Determinantal Point Processes by Corrective Negative Sampling. AISTATS 2019: 2251-2260
- [c70] Jingzhao Zhang, Suvrit Sra, Ali Jadbabaie: Acceleration in First Order Quasi-strongly Convex Optimization by ODE Discretization. CDC 2019: 1501-1506
- [c69] Chulhee Yun, Suvrit Sra, Ali Jadbabaie: Efficiently testing local optimality and escaping saddles for ReLU networks. ICLR (Poster) 2019
- [c68] Chulhee Yun, Suvrit Sra, Ali Jadbabaie: Small nonlinearities in activation functions create bad local minima in neural networks. ICLR (Poster) 2019
- [c67] Jeff Z. HaoChen, Suvrit Sra: Random Shuffling Beats SGD after Finite Epochs. ICML 2019: 2624-2633
- [c66] Matthew Staib, Sashank J. Reddi, Satyen Kale, Sanjiv Kumar, Suvrit Sra: Escaping Saddle Points with Adaptive Gradient Methods. ICML 2019: 5956-5965
- [c65] Alp Yurtsever, Suvrit Sra, Volkan Cevher: Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator. ICML 2019: 7282-7291
- [c64] Joshua Robinson, Suvrit Sra, Stefanie Jegelka: Flexible Modeling of Diversity with Strongly Log-Concave Distributions. NeurIPS 2019: 15199-15209
- [c63] Chulhee Yun, Suvrit Sra, Ali Jadbabaie: Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity. NeurIPS 2019: 15532-15543
- [c62] Chulhee Yun, Suvrit Sra, Ali Jadbabaie: Are deep ResNets provably better than linear predictors? NeurIPS 2019: 15660-15669
- [i49] Matthew Staib, Sashank J. Reddi, Satyen Kale, Sanjiv Kumar, Suvrit Sra: Escaping Saddle Points with Adaptive Gradient Methods. CoRR abs/1901.09149 (2019)
- [i48] Jingzhao Zhang, Tianxing He, Suvrit Sra, Ali Jadbabaie: Analysis of Gradient Clipping and Adaptive Scaling with a Relaxed Smoothness Condition. CoRR abs/1905.11881 (2019)
- [i47] Joshua Robinson, Suvrit Sra, Stefanie Jegelka: Flexible Modeling of Diversity with Strongly Log-Concave Distributions. CoRR abs/1906.05413 (2019)
- [i46] Chulhee Yun, Suvrit Sra, Ali Jadbabaie: Are deep ResNets provably better than linear predictors? CoRR abs/1907.03922 (2019)
- [i45] Melanie Weber, Suvrit Sra: Nonconvex stochastic optimization on manifolds via Riemannian Frank-Wolfe methods. CoRR abs/1910.04194 (2019)
- [i44] Suvrit Sra: Metrics Induced by Quantum Jensen-Shannon-Renyí and Related Divergences. CoRR abs/1911.02643 (2019)
- [i43] Jingzhao Zhang, Sai Praneeth Karimireddy, Andreas Veit, Seungyeon Kim, Sashank J. Reddi, Sanjiv Kumar, Suvrit Sra: Why ADAM Beats SGD for Attention Models. CoRR abs/1912.03194 (2019)
- 2018
- [j16] Álvaro Barbero Jiménez, Suvrit Sra: Modular Proximal Optimization for Multidimensional Total-Variation Regularization. J. Mach. Learn. Res. 19: 56:1-56:82 (2018)
- [c61] Sashank J. Reddi, Manzil Zaheer, Suvrit Sra, Barnabás Póczos, Francis R. Bach, Ruslan Salakhutdinov, Alexander J. Smola: A Generic Approach for Escaping Saddle points. AISTATS 2018: 1233-1242
- [c60] Suvrit Sra, Nisheeth K. Vishnoi, Ozan Yildiz: On Geodesically Convex Formulations for the Brascamp-Lieb Constant. APPROX-RANDOM 2018: 25:1-25:15
- [c59] Hongyi Zhang, Suvrit Sra: An Estimate Sequence for Geodesically Convex Optimization. COLT 2018: 1703-1723
- [c58] Anoop Cherian, Suvrit Sra, Stephen Gould, Richard Hartley: Non-Linear Temporal Subspace Representations for Activity Recognition. CVPR 2018: 2197-2206
- [c57] Chengtao Li, David Alvarez-Melis, Keyulu Xu, Stefanie Jegelka, Suvrit Sra: Distributional Adversarial Networks. ICLR (Workshop) 2018
- [c56] Chulhee Yun, Suvrit Sra, Ali Jadbabaie: Global Optimality Conditions for Deep Neural Networks. ICLR (Poster) 2018
- [c55] Jingzhao Zhang, Aryan Mokhtari, Suvrit Sra, Ali Jadbabaie: Direct Runge-Kutta Discretization Achieves Acceleration. NeurIPS 2018: 3904-3913
- [c54] Zelda E. Mariet, Suvrit Sra, Stefanie Jegelka: Exponentiated Strongly Rayleigh Distributions. NeurIPS 2018: 4464-4474
- [i42] Chulhee Yun, Suvrit Sra, Ali Jadbabaie: A Critical View of Global Optimality in Deep Learning. CoRR abs/1802.03487 (2018)
- [i41] Zelda Mariet, Mike Gartrell, Suvrit Sra: Learning Determinantal Point Processes by Sampling Inferred Negatives. CoRR abs/1802.05649 (2018)
- [i40] Anoop Cherian, Suvrit Sra, Stephen Gould, Richard Hartley: Non-Linear Temporal Subspace Representations for Activity Recognition. CoRR abs/1803.11064 (2018)
- [i39] Jingzhao Zhang, Aryan Mokhtari, Suvrit Sra, Ali Jadbabaie: Direct Runge-Kutta Discretization Achieves Acceleration. CoRR abs/1805.00521 (2018)
- [i38] Hongyi Zhang, Suvrit Sra: Towards Riemannian Accelerated Gradient Methods. CoRR abs/1806.02812 (2018)
- [i37] Chulhee Yun, Suvrit Sra, Ali Jadbabaie: Efficiently testing local optimality and escaping saddles for ReLU networks. CoRR abs/1809.10858 (2018)
- [i36] Chulhee Yun, Suvrit Sra, Ali Jadbabaie: Finite sample expressive power of small-width ReLU networks. CoRR abs/1810.07770 (2018)
- [i35] Jingzhao Zhang, Hongyi Zhang, Suvrit Sra: R-SPIDER: A Fast Riemannian Stochastic Optimization Algorithm with Curvature Independent Rate. CoRR abs/1811.04194 (2018)
- [i34] Pourya Habib Zadeh, Reshad Hosseini, Suvrit Sra: Deep-RBF Networks Revisited: Robust Classification with Rejection. CoRR abs/1812.03190 (2018)
- 2017
- [j15] Anoop Cherian, Suvrit Sra: Riemannian Dictionary Learning and Sparse Coding for Positive Definite Matrices. IEEE Trans. Neural Networks Learn. Syst. 28(12): 2859-2871 (2017)
- [c53] Ke Jiang, Suvrit Sra, Brian Kulis: Combinatorial Topic Models using Small-Variance Asymptotics. AISTATS 2017: 421-429
- [c52] Zelda E. Mariet, Suvrit Sra: Elementary Symmetric Polynomials for Optimal Experimental Design. NIPS 2017: 2139-2148
- [c51] Chengtao Li, Stefanie Jegelka, Suvrit Sra: Polynomial time algorithms for dual volume sampling. NIPS 2017: 5038-5047
- [i33] Anoop Cherian, Suvrit Sra, Richard Hartley: Sequence Summarization Using Order-constrained Kernelized Feature Subspaces. CoRR abs/1705.08583 (2017)
- [i32] Reshad Hosseini, Suvrit Sra: An Alternative to EM for Gaussian Mixture Models: Batch and Stochastic Riemannian Optimization. CoRR abs/1706.03267 (2017)
- [i31] Chengtao Li, David Alvarez-Melis, Keyulu Xu, Stefanie Jegelka, Suvrit Sra: Distributional Adversarial Networks. CoRR abs/1706.09549 (2017)
- [i30] Chulhee Yun, Suvrit Sra, Ali Jadbabaie: Global optimality conditions for deep neural networks. CoRR abs/1707.02444 (2017)
- [i29] Mikhail A. Langovoy, Akhilesh Gotmare, Martin Jaggi, Suvrit Sra: Unsupervised robust nonparametric learning of hidden community properties. CoRR abs/1707.03494 (2017)
- [i28] Sashank J. Reddi, Manzil Zaheer, Suvrit Sra, Barnabás Póczos, Francis R. Bach, Ruslan Salakhutdinov, Alexander J. Smola: A Generic Approach for Escaping Saddle points. CoRR abs/1709.01434 (2017)
- [i27] Melanie Weber, Suvrit Sra: Frank-Wolfe methods for geodesically convex optimization with application to the matrix geometric mean. CoRR abs/1710.10770 (2017)
- 2016
- [j14] Reshad Hosseini, Suvrit Sra, Lucas Theis, Matthias Bethge: Inference and mixture modeling with the Elliptical Gamma Distribution. Comput. Stat. Data Anal. 101: 29-43 (2016)
- [j13] Suvrit Sra: On inequalities for normalized Schur functions. Eur. J. Comb. 51: 492-494 (2016)
- [j12] Justin Solomon, Gabriel Peyré, Vladimir G. Kim, Suvrit Sra: Entropic metric alignment for correspondence problems. ACM Trans. Graph. 35(4): 72:1-72:13 (2016)
- [c50] Suvrit Sra, Adams Wei Yu, Mu Li, Alexander J. Smola: AdaDelay: Delay Adaptive Distributed Stochastic Optimization. AISTATS 2016: 957-965
- [c49]