


Zhao Song 0002
Person information
- affiliation: Adobe Research
- affiliation (former): Institute for Advanced Study, Princeton, NJ, USA
- affiliation (former): Princeton University, NJ, USA
- affiliation (former): University of Washington, Seattle, WA, USA
- affiliation (PhD 2019): University of Texas at Austin, Department of Computer Science, USA
- affiliation (former): Harvard University, Cambridge, MA, USA
- affiliation (former): University of California Berkeley, CA, USA
- affiliation (former): Simon Fraser University, School of Computing Science, Burnaby, Canada
Other persons with the same name
- Zhao Song
- Zhao Song 0001 — Amazon AWS AI Labs, Santa Clara, CA, USA (and 2 more)
- Zhao Song 0003 — Iowa State University, Department of Electrical and Computer Engineering, Ames, IA, USA
- Zhao Song 0004 — Chinese Academy of Sciences, Shenzhen Institutes of Advanced Technology, China (and 1 more)
- Zhao Song 0005 — Zhengzhou Institute of Aeronautical Industry Management, Henan, China
- Zhao Song 0006 — University of Missouri, Department of Computer Science, Columbia, USA
- Zhao Song 0007 — Dartmouth College, Department of Mathematics, Hanover, NH, USA (and 1 more)
- Zhao Song 0008 — Northwestern Polytechnical University, School of Mechanical Engineering, OPTIMAL, Xi'an, China
- Zhao Song 0009 — Munich University of Applied Sciences, Laboratory for Mechatronic and Renewable Energy Systems, Germany
- Zhao Song 0010 — Alibaba Group
- Zhao Song 0011 — Defense Innovation Institute, Beijing, China
- Zhao Song 0012 — Southern Medical University, Shenzhen Hospital, China
2020 – today
- 2025
- [i209]Yekun Ke, Xiaoyu Li, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song:
On Computational Limits and Provably Efficient Criteria of Visual Autoregressive Models: A Fine-Grained Complexity Analysis. CoRR abs/2501.04377 (2025)
- 2024
- [c121]Timothy Chu, Zhao Song, Chiwun Yang:
How to Protect Copyright Data in Optimization of Large Language Models? AAAI 2024: 17871-17879
- [c120]Zhao Song, Junze Yin, Lichen Zhang:
Solving Attention Kernel Regression Problem via Pre-conditioner. AISTATS 2024: 208-216
- [c119]Zhao Song, Junze Yin, Lichen Zhang, Ruizhe Zhang:
Fast Dynamic Sampling for Determinantal Point Processes. AISTATS 2024: 244-252
- [c118]Lianke Qin, Zhao Song, Ruizhe Zhang:
A General Algorithm for Solving Rank-one Matrix Sensing. AISTATS 2024: 757-765
- [c117]Josh Alman, Zhao Song:
How to Capture Higher-order Correlations? Generalizing Matrix Softmax Attention to Kronecker Computation. ICLR 2024
- [c116]Yeqi Gao, Lianke Qin, Zhao Song, Yitan Wang:
A Sublinear Adversarial Training Algorithm. ICLR 2024
- [c115]Yuzhou Gu, Zhao Song, Junze Yin, Lichen Zhang:
Low Rank Matrix Completion via Robust Alternating Minimization in Nearly Linear Time. ICLR 2024
- [c114]Jan van den Brand, Zhao Song, Tianyi Zhou:
Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models. ICML 2024
- [c113]Jerry Yao-Chieh Hu, Thomas Lin, Zhao Song, Han Liu:
On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis. ICML 2024
- [c112]Zhao Song, Lichen Zhang, Ruizhe Zhang:
Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time. ITCS 2024: 93:1-93:15
- [c111]Zhao Song, Ali Vakilian, David P. Woodruff, Samson Zhou:
On Socially Fair Low-Rank Approximation and Column Subset Selection. NeurIPS 2024
- [c110]Josh Alman, Zhao Song:
The Fine-Grained Complexity of Gradient Computation for Training Large Language Models. NeurIPS 2024
- [c109]Timothy Chu, Josh Alman, Gary L. Miller, Shyam Narayanan, Mark Sellke, Zhao Song:
Metric Transforms and Low Rank Representations of Kernels for Fast Attention. NeurIPS 2024
- [c108]Yuzhou Gu, Nikki Lijing Kuang, Yian Ma, Zhao Song, Lichen Zhang:
Log-concave Sampling from a Convex Body with a Barrier: a Robust and Unified Dikin Walk. NeurIPS 2024
- [c107]Jerry Yao-Chieh Hu, Weimin Wu, Zhuoru Li, Sophia Pi, Zhao Song, Han Liu:
On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs). NeurIPS 2024
- [c106]Shuai Li, Zhao Song, Yu Xia, Tong Yu, Tianyi Zhou:
The Closeness of In-Context Learning and Weight Shifting for Softmax Regression. NeurIPS 2024
- [c105]Haotian Jiang, Yin Tat Lee, Zhao Song, Lichen Zhang:
Convex Minimization with Integer Minima in Õ(n^4) Time. SODA 2024: 3659-3684
- [i208]Yichuan Deng, Zhao Song, Chiwun Yang:
Enhancing Stochastic Gradient Descent: A Unified Framework and Novel Acceleration Methods for Faster Convergence. CoRR abs/2402.01515 (2024)
- [i207]Josh Alman, Zhao Song:
The Fine-Grained Complexity of Gradient Computation for Training Large Language Models. CoRR abs/2402.04497 (2024)
- [i206]Jerry Yao-Chieh Hu, Thomas Lin, Zhao Song, Han Liu:
On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis. CoRR abs/2402.04520 (2024)
- [i205]Yeqi Gao, Zhao Song, Ruizhe Zhang:
Quantum Speedup for Spectral Approximation of Kronecker Products. CoRR abs/2402.07027 (2024)
- [i204]Jiuxiang Gu, Chenyang Li, Yingyu Liang, Zhenmei Shi, Zhao Song, Tianyi Zhou:
Fourier Circuits in Neural Networks: Unlocking the Potential of Large Language Models in Mathematical Reasoning and Modular Arithmetic. CoRR abs/2402.09469 (2024)
- [i203]Yichuan Deng, Zhao Song, Chiwun Yang:
Attention is Naturally Sparse with Gaussian Distributed Input. CoRR abs/2404.02690 (2024)
- [i202]Zhihang Li, Zhao Song, Weixin Wang, Junze Yin, Zheng Yu:
How to Inverting the Leverage Score Distribution? CoRR abs/2404.13785 (2024)
- [i201]Jiuxiang Gu, Chenyang Li, Yingyu Liang, Zhenmei Shi, Zhao Song:
Exploring the Frontiers of Softmax: Provable Optimization, Applications in Diffusion Model, and Beyond. CoRR abs/2405.03251 (2024)
- [i200]Jiuxiang Gu, Yingyu Liang, Heshan Liu, Zhenmei Shi, Zhao Song, Junze Yin:
Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers. CoRR abs/2405.05219 (2024)
- [i199]Yeqi Gao, Yuzhou Gu, Zhao Song:
Binary Hypothesis Testing for Softmax Models and Leverage Score Models. CoRR abs/2405.06003 (2024)
- [i198]Jiuxiang Gu, Yingyu Liang, Zhenmei Shi, Zhao Song, Yufa Zhou:
Tensor Attention Training: Provably Efficient Learning of Higher-order Transformers. CoRR abs/2405.16411 (2024)
- [i197]Jiuxiang Gu, Yingyu Liang, Zhenmei Shi, Zhao Song, Yufa Zhou:
Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective. CoRR abs/2405.16418 (2024)
- [i196]Jerry Yao-Chieh Hu, Maojiang Su, En-Jui Kuo, Zhao Song, Han Liu:
Computational Limits of Low-Rank Adaptation (LoRA) for Transformer-Based Models. CoRR abs/2406.03136 (2024)
- [i195]Jiuxiang Gu, Yingyu Liang, Zhenmei Shi, Zhao Song, Chiwun Yang:
Toward Infinite-Long Prefix in Transformer. CoRR abs/2406.14036 (2024)
- [i194]Jerry Yao-Chieh Hu, Weimin Wu, Zhuoru Li, Zhao Song, Han Liu:
On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs). CoRR abs/2407.01079 (2024)
- [i193]Jiuxiang Gu, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song:
Differential Privacy Mechanisms in Neural Tangent Kernel Regression. CoRR abs/2407.13621 (2024)
- [i192]Jiuxiang Gu, Yingyu Liang, Zhenmei Shi, Zhao Song, Yufa Zhou:
Differential Privacy of Cross-Attention with Provable Guarantee. CoRR abs/2407.14717 (2024)
- [i191]Jiuxiang Gu, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song, Junwei Yu:
Fast John Ellipsoid Computation with Differential Privacy Optimization. CoRR abs/2408.06395 (2024)
- [i190]Chenyang Li, Zhao Song, Zhaoxing Xu, Junze Yin:
Inverting the Leverage Score Gradient: An Efficient Approximate Newton Method. CoRR abs/2408.11267 (2024)
- [i189]Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song:
A Tighter Complexity Analysis of SparseGPT. CoRR abs/2408.12151 (2024)
- [i188]Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song, Yufa Zhou:
Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time. CoRR abs/2408.13233 (2024)
- [i187]Xiaoyu Li, Zhao Song, Junwei Yu:
Quantum Speedups for Approximating the John Ellipsoid. CoRR abs/2408.14018 (2024)
- [i186]Erzhi Liu, Jerry Yao-Chieh Hu, Alex Daniel Reneau, Zhao Song, Han Liu:
Differentially Private Kernel Density Estimation. CoRR abs/2409.01688 (2024)
- [i185]Yuzhou Gu, Nikki Lijing Kuang, Yi-An Ma, Zhao Song, Lichen Zhang:
Log-concave Sampling over a Convex Body with a Barrier: a Robust and Unified Dikin Walk. CoRR abs/2410.05700 (2024)
- [i184]Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song, Yufa Zhou:
Looped ReLU MLPs May Be All You Need as Practical Programmable Computers. CoRR abs/2410.09375 (2024)
- [i183]Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song, Yufa Zhou:
Fine-grained Attention I/O Complexity: Comprehensive Analysis for Backward Passes. CoRR abs/2410.09397 (2024)
- [i182]Bo Chen, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song:
HSR-Enhanced Sparse Attention Acceleration. CoRR abs/2410.10165 (2024)
- [i181]Yingyu Liang, Jiangxuan Long, Zhenmei Shi, Zhao Song, Yufa Zhou:
Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix. CoRR abs/2410.11261 (2024)
- [i180]Bo Chen, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song:
Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent. CoRR abs/2410.11268 (2024)
- [i179]Yekun Ke, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song:
Advancing the Understanding of Fixed Point Iterations in Deep Neural Networks: A Detailed Analytical Study. CoRR abs/2410.11279 (2024)
- [i178]Majid Daliri, Zhao Song, Chiwun Yang:
Unlocking the Theory Behind Scaling 1-Bit Neural Networks. CoRR abs/2411.01663 (2024)
- [i177]Jerry Yao-Chieh Hu, Erzhi Liu, Han Liu, Zhao Song, Lichen Zhang:
On Differentially Private String Distances. CoRR abs/2411.05750 (2024)
- [i176]Bo Chen, Xiaoyu Li, Yingyu Liang, Jiangxuan Long, Zhenmei Shi, Zhao Song:
Circuit Complexity Bounds for RoPE-based Transformer Architecture. CoRR abs/2411.07602 (2024)
- [i175]Jerry Yao-Chieh Hu, Wei-Po Wang, Ammar Gilani, Chenyang Li, Zhao Song, Han Liu:
Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency. CoRR abs/2411.16525 (2024)
- [i174]Weimin Wu, Maojiang Su, Jerry Yao-Chieh Hu, Zhao Song, Han Liu:
Transformers are Deep Optimizers: Provable In-Context Learning for Deep Model Training. CoRR abs/2411.16549 (2024)
- [i173]Xiaoyu Li, Yuanpeng Li, Yingyu Liang, Zhenmei Shi, Zhao Song:
On the Expressive Power of Modern Hopfield Networks. CoRR abs/2412.05562 (2024)
- [i172]Yekun Ke, Yingyu Liang, Zhenmei Shi, Zhao Song, Chiwun Yang:
Curse of Attention: A Kernel-Based Perspective for Why Transformers Fail to Generalize on Time Series Forecasting and Beyond. CoRR abs/2412.06061 (2024)
- [i171]Zhao Song, Ali Vakilian, David P. Woodruff, Samson Zhou:
On Socially Fair Low-Rank Approximation and Column Subset Selection. CoRR abs/2412.06063 (2024)
- [i170]Yifang Chen, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song:
The Computational Limits of State-Space Models and Mamba via the Lens of Circuit Complexity. CoRR abs/2412.06148 (2024)
- [i169]Xuan Shen, Zhao Song, Yufa Zhou, Bo Chen, Jing Liu, Ruiyi Zhang, Ryan A. Rossi, Hao Tan, Tong Yu, Xiang Chen, Yufan Zhou, Tong Sun, Pu Zhao, Yanzhi Wang, Jiuxiang Gu:
Numerical Pruning for Efficient Autoregressive Models. CoRR abs/2412.12441 (2024)
- [i168]Xuan Shen, Zhao Song, Yufa Zhou, Bo Chen, Yanyu Li, Yifan Gong, Kai Zhang, Hao Tan, Jason Kuen, Henghui Ding, Zhihao Shu, Wei Niu, Pu Zhao, Yanzhi Wang, Jiuxiang Gu:
LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers. CoRR abs/2412.12444 (2024)
- [i167]Yifang Chen, Jiayan Huo, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song:
Fast Gradient Computation for RoPE Attention in Almost Linear Time. CoRR abs/2412.17316 (2024)
- [i166]Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song, Mingda Wan:
Theoretical Constraints on the Expressive Power of RoPE-based Tensor Attention Transformers. CoRR abs/2412.18040 (2024)
- 2023
- [c104]Kai Wang, Zhao Song, Georgios Theocharous, Sridhar Mahadevan:
Smoothed Online Combinatorial Optimization Using Imperfect Predictions. AAAI 2023: 12130-12137
- [c103]Lianke Qin, Zhao Song, Lichen Zhang, Danyang Zhuo:
An Online and Unified Algorithm for Projection Matrix Vector Multiplication with Application to Empirical Risk Minimization. AISTATS 2023: 101-156
- [c102]Zhaozhuo Xu, Zhao Song, Anshumali Shrivastava:
A Tale of Two Efficient Value Iteration Algorithms for Solving Linear MDPs with Large Action Space. AISTATS 2023: 788-836
- [c101]Yichuan Deng, Yeqi Gao, Zhao Song:
Solving Tensor Low Cycle Rank Approximation. IEEE Big Data 2023: 6-16
- [c100]Lianke Qin, Aravind Reddy, Zhao Song:
Online Adaptive Mahalanobis Distance Estimation. IEEE Big Data 2023: 56-65
- [c99]Lianke Qin, Saayan Mitra, Zhao Song, Yuanyuan Yang, Tianyi Zhou:
Fast Heavy Inner Product Identification Between Weights and Inputs in Neural Network Training. IEEE Big Data 2023: 128-133
- [c98]Zhao Song, Baocheng Sun, Omri Weinstein, Ruizhe Zhang:
Quartic Samples Suffice for Fourier Interpolation. FOCS 2023: 1414-1425
- [c97]S. Cliff Liu, Zhao Song, Hengjie Zhang, Lichen Zhang, Tianyi Zhou:
Space-Efficient Interior Point Method, with Applications to Linear Programming and Maximum Weight Bipartite Matching. ICALP 2023: 88:1-88:14
- [c96]Xiaoxiao Li, Zhao Song, Jiaming Yang:
Federated Adversarial Learning: A Framework with Convergence Analysis. ICML 2023: 19932-19959
- [c95]Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, Zhao Song, Anshumali Shrivastava, Ce Zhang, Yuandong Tian, Christopher Ré, Beidi Chen:
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time. ICML 2023: 22137-22176
- [c94]Zhao Song, Yitan Wang, Zheng Yu, Lichen Zhang:
Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability. ICML 2023: 32365-32417
- [c93]Zhao Song, Xin Yang, Yuanyuan Yang, Lichen Zhang:
Sketching Meets Differential Privacy: Fast Algorithm for Dynamic Kronecker Projection Maintenance. ICML 2023: 32418-32462
- [c92]Zhao Song, Mingquan Ye, Junze Yin, Lichen Zhang:
A Nearly-Optimal Bound for Fast Regression with ℓ∞ Guarantee. ICML 2023: 32463-32482
- [c91]Josh Alman, Zhao Song:
Fast Attention Requires Bounded Entries. NeurIPS 2023
- [c90]Josh Alman, Jiehao Liang, Zhao Song, Ruizhe Zhang, Danyang Zhuo:
Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing. NeurIPS 2023
- [c89]Sudhanshu Chanpuriya, Ryan A. Rossi, Anup B. Rao, Tung Mai, Nedim Lipka, Zhao Song, Cameron Musco:
Exact Representation of Sparse Networks with Symmetric Nonnegative Embeddings. NeurIPS 2023
- [c88]Junda Wu, Tong Yu, Rui Wang, Zhao Song, Ruiyi Zhang, Handong Zhao, Chaochao Lu, Shuai Li, Ricardo Henao:
InfoPrompt: Information-Theoretic Soft Prompt Tuning for Natural Language Understanding. NeurIPS 2023
- [c87]Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark W. Barrett, Zhangyang Wang, Beidi Chen:
H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models. NeurIPS 2023
- [c86]Lijie Chen, Gillat Kol, Dmitry Paramonov, Raghuvansh R. Saxena, Zhao Song, Huacheng Yu:
Towards Multi-Pass Streaming Lower Bounds for Optimal Approximation of Max-Cut. SODA 2023: 878-924
- [c85]Yaonan Jin, Daogao Liu, Zhao Song:
Super-resolution and Robust Sparse Continuous Fourier Transform in Any Constant Dimension: Nearly Linear Time and Sample Complexity. SODA 2023: 4667-4767
- [i165]Zhao Song, Tianyi Zhou:
Faster Sinkhorn's Algorithm with Small Treewidth. CoRR abs/2301.06741 (2023)
- [i164]Zhao Song, Mingquan Ye, Junze Yin, Lichen Zhang:
A Nearly-Optimal Bound for Fast Regression with ℓ∞ Guarantee. CoRR abs/2302.00248 (2023)
- [i163]Yuzhou Gu, Zhao Song, Junze Yin, Lichen Zhang:
Low Rank Matrix Completion via Robust Alternating Minimization in Nearly Linear Time. CoRR abs/2302.11068 (2023)
- [i162]Josh Alman, Zhao Song:
Fast Attention Requires Bounded Entries. CoRR abs/2302.13214 (2023)
- [i161]Yichuan Deng, Zhao Song, Zifan Wang, Han Zhang:
Streaming Kernel PCA Algorithm With Small Space. CoRR abs/2303.04555 (2023)
- [i160]Anshumali Shrivastava, Zhao Song, Zhaozhuo Xu:
A Theoretical Analysis Of Nearest Neighbor Search On Approximate Near Neighbor Graph. CoRR abs/2303.06210 (2023)
- [i159]Yichuan Deng, Zhihang Li, Zhao Song:
An Improved Sample Complexity for Rank-1 Matrix Sensing. CoRR abs/2303.06895 (2023)
- [i158]Lianke Qin, Zhao Song, Ruizhe Zhang:
A General Algorithm for Solving Rank-one Matrix Sensing. CoRR abs/2303.12298 (2023)
- [i157]Zhihang Li, Zhao Song, Tianyi Zhou:
Solving Regularized Exp, Cosh and Sinh Regression Problems. CoRR abs/2303.15725 (2023)
- [i156]Yeqi Gao, Sridhar Mahadevan, Zhao Song:
An Over-parameterized Exponential Regression. CoRR abs/2303.16504 (2023)
- [i155]Jan van den Brand, Zhao Song, Tianyi Zhou:
Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models. CoRR abs/2304.02207 (2023)
- [i154]Haotian Jiang, Yin Tat Lee, Zhao Song, Lichen Zhang:
Convex Minimization with Integer Minima in Õ(n^4) Time. CoRR abs/2304.03426 (2023)
- [i153]Yichuan Deng, Sridhar Mahadevan, Zhao Song:
Randomized and Deterministic Attention Sparsification Algorithms for Over-parameterized Feature Dimension. CoRR abs/2304.04397 (2023)
- [i152]Yichuan Deng, Yeqi Gao, Zhao Song:
Solving Tensor Low Cycle Rank Approximation. CoRR abs/2304.06594 (2023)
- [i151]Yichuan Deng, Zhihang Li, Zhao Song:
Attention Scheme Inspired Softmax Regression. CoRR abs/2304.10411 (2023)
- [i150]Shuai Li, Zhao Song, Yu Xia, Tong Yu, Tianyi Zhou:
The Closeness of In-Context Learning and Weight Shifting for Softmax Regression. CoRR abs/2304.13276 (2023)
- [i149]Yeqi Gao, Zhao Song, Junze Yin:
An Iterative Algorithm for Rescaled Hyperbolic Functions Regression. CoRR abs/2305.00660 (2023)
- [i148]Yeqi Gao, Zhao Song, Xin Yang:
Differentially Private Attention Computation. CoRR abs/2305.04701 (2023)
- [i147]Zhao Song, Mingquan Ye:
Efficient Asynchronize Stochastic Gradient Algorithm with Structured Data. CoRR abs/2305.08001 (2023)
- [i146]Zhao Song, Weixin Wang, Chenbo Yin:
Fast and Efficient Matching Algorithm with Deadline Instances. CoRR abs/2305.08353 (2023)
- [i145]Lianke Qin, Zhao Song, Yitan Wang:
Fast Submodular Function Maximization. CoRR abs/2305.08367 (2023)
- [i144]Song Bian, Zhao Song, Junze Yin:
Federated Empirical Risk Minimization via Second-Order Method. CoRR abs/2305.17482 (2023)
- [i143]Yichuan Deng, Zhao Song, Junze Yin:
Faster Robust Tensor Power Method for Arbitrary Order. CoRR abs/2306.00406 (2023)
- [i142]Ritwik Sinha, Zhao Song, Tianyi Zhou:
A Mathematical Abstraction for Balancing the Trade-off Between Creativity and Reality in Large Language Models. CoRR abs/2306.02295 (2023)
- [i141]Xiaoxiao Li, Zhao Song, Guangyi Zhang:
Sparse Convolution for Approximate Sparse Instance. CoRR abs/2306.02381 (2023)
- [i140]Xiang Chen, Zhao Song, Baocheng Sun, Junze Yin, Danyang Zhuo:
Query Complexity of Active Learning for Function Family With Nearly Orthogonal Basis. CoRR abs/2306.03356 (2023)
- [i139]Zhao Song, Mingquan Ye, Junze Yin, Lichen Zhang:
Efficient Alternating Minimization with Applications to Weighted Low Rank Approximation. CoRR abs/2306.04169 (2023)
- [i138]Junda Wu, Tong Yu, Rui Wang, Zhao Song, Ruiyi Zhang, Handong Zhao, Chaochao Lu, Shuai Li, Ricardo Henao:
InfoPrompt: Information-Theoretic Soft Prompt Tuning for Natural Language Understanding. CoRR abs/2306.04933 (2023)
- [i137]Yichuan Deng, Zhao Song, Lichen Zhang, Ruizhe Zhang:
Efficient Algorithm for Solving Hyperbolic Programs. CoRR abs/2306.07587 (2023)
- [i136]Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark W. Barrett, Zhangyang Wang, Beidi Chen:
H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models. CoRR abs/2306.14048 (2023)
- [i135]Yeqi Gao, Zhao Song, Shenghao Xie:
In-Context Learning for Attention Scheme: from Single Softmax Regression to Multiple Softmax Regression via a Tensor Trick. CoRR abs/2307.02419 (2023)
- [i134]Lianke Qin, Zhao Song, Yuanyuan Yang:
Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification. CoRR abs/2307.06565 (2023)
- [i133]Yuzhou Gu, Zhao Song, Lichen Zhang:
A Nearly-Linear Time Algorithm for Structured Support Vector Machines. CoRR abs/2307.07735 (2023)
- [i132]Yeqi Gao, Zhao Song, Xin Yang, Ruizhe Zhang:
Fast Quantum Algorithm for Attention Computation. CoRR abs/2307.08045 (2023)
- [i131]Yichuan Deng, Zhihang Li, Sridhar Mahadevan, Zhao Song:
Zero-th Order Algorithm for Softmax Attention Optimization. CoRR abs/2307.08352 (2023)
- [i130]Yichuan Deng, Zhao Song, Shenghao Xie:
Convergence of Two-Layer Regression with Nonlinear Units. CoRR abs/2308.08358 (2023)
- [i129]Yeqi Gao, Zhao Song, Junze Yin:
GradientCoin: A Peer-to-Peer Decentralized Large Language Models. CoRR abs/2308.10502 (2023)
- [i128]Yichuan Deng, Michalis Mamakos, Zhao Song:
Clustered Linear Contextual Bandits with Knapsacks. CoRR abs/2308.10722 (2023)
- [i127]Timothy Chu, Zhao Song, Chiwun Yang:
How to Protect Copyright Data in Optimization of Large Language Models? CoRR