Keywords: Shuffling, Convergence (economics), Applied mathematics, Mathematical economics, Mathematical optimization, Mathematics, Econometrics, Computer science, Economics, Statistics, Economic growth
Authors
Zijian Liu, Zhengyuan Zhou
Source
Journal: Cornell University - arXiv
Date: 2024-03-12
Identifier
DOI: 10.48550/arxiv.2403.07723
Abstract
Shuffling gradient methods, also known as stochastic gradient descent (SGD) without replacement, are widely used in practice, most notably in three popular algorithms: Random Reshuffle (RR), Shuffle Once (SO), and Incremental Gradient (IG). Despite their empirical success, the theoretical guarantees of shuffling gradient methods were not well understood for a long time. Only recently were convergence rates established, for the average iterate on convex functions and for the last iterate on strongly convex problems (using the squared distance as the metric). However, when the function value gap is used as the convergence criterion, existing theories cannot explain the good performance of the last iterate in different settings (e.g., constrained optimization). To bridge this gap between practice and theory, we prove last-iterate convergence rates for shuffling gradient methods with respect to the objective value, even without strong convexity. Our new results either (nearly) match the existing last-iterate lower bounds or are as fast as the previous best upper bounds for the average iterate.
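The three shuffling schemes named in the abstract differ only in how the pass order over the n component functions is chosen each epoch. The following minimal Python sketch contrasts RR, SO, and IG; it is an illustration under assumed notation, not the paper's implementation, and the helper names `shuffling_sgd` and `grad` are hypothetical:

```python
import numpy as np

def shuffling_sgd(grad, x0, n, epochs, lr, scheme="RR", seed=0):
    """Sketch of SGD without replacement under the three shuffling schemes.

    grad(x, i): gradient of the i-th component function at x (user-supplied).
    scheme: "RR" draws a fresh permutation every epoch, "SO" draws one
    permutation and reuses it, "IG" uses the fixed order 0, 1, ..., n-1.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    perm = rng.permutation(n) if scheme == "SO" else np.arange(n)
    for _ in range(epochs):
        if scheme == "RR":
            perm = rng.permutation(n)  # reshuffle at the start of each epoch
        for i in perm:  # one full pass: every component used exactly once
            x = x - lr * grad(x, i)
    return x  # the last iterate, the object of the paper's analysis

# Toy usage (hypothetical data): least squares with component gradients
# grad_i(x) = (a_i @ x - b_i) * a_i.
rng = np.random.default_rng(1)
A, b = rng.normal(size=(50, 5)), rng.normal(size=50)
g = lambda x, i: (A[i] @ x - b[i]) * A[i]
x_last = shuffling_sgd(g, np.zeros(5), n=50, epochs=200, lr=0.01, scheme="RR")
```

Whichever scheme is chosen, the quantity the paper analyzes is the objective value gap of the returned last iterate, rather than the average of all iterates.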