by M. R. Karimi, Y. P. Hsieh, A. Krause
Abstract:
Non-convex sampling is a key challenge in machine learning, central to non-convex optimization in deep learning as well as to approximate probabilistic inference. Despite its significance, important theoretical challenges remain: existing guarantees do not cover the last iterates, and little is known beyond the elementary schemes of stochastic gradient Langevin dynamics. To address these issues, we develop a novel framework harnessing several tools from the theory of dynamical systems. Our key result is that, for a large class of state-of-the-art sampling schemes, their last-iterate convergence in Wasserstein distances can be reduced to the study of their continuous-time counterparts, which is much better understood. Coupled with standard assumptions of MCMC sampling, our theory immediately yields the last-iterate Wasserstein convergence of many advanced sampling schemes such as mirror Langevin, proximal, randomized mid-point, and Runge-Kutta integrators.
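As a concrete reference point for the elementary scheme the abstract mentions, below is a minimal sketch of unadjusted (stochastic gradient) Langevin dynamics in Python. The function name, step size, and double-well target are illustrative assumptions, not taken from the paper; the paper's results concern the law of the last iterate returned by such schemes and their more advanced variants.

```python
import numpy as np

def sgld_last_iterate(grad_log_pi, x0, step_size, n_steps, rng=None):
    """Unadjusted Langevin iteration:
        x_{k+1} = x_k + h * grad_log_pi(x_k) + sqrt(2h) * N(0, I).
    Returns only the last iterate, the quantity whose Wasserstein
    convergence the paper's guarantees address."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + step_size * grad_log_pi(x) + np.sqrt(2.0 * step_size) * noise
    return x

# Illustrative non-convex target: double-well density pi(x) ~ exp(-(x^2 - 1)^2)
grad_log_pi = lambda x: -4.0 * x * (x**2 - 1.0)
sample = sgld_last_iterate(grad_log_pi, x0=np.zeros(1), step_size=1e-3, n_steps=10_000)
```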
Reference:
A Dynamical System View of Langevin-Based Non-Convex Sampling. M. R. Karimi, Y. P. Hsieh, A. Krause. In Proc. of Thirty Seventh Conference on Neural Information Processing Systems (NeurIPS), 2023.
Bibtex Entry:
@inproceedings{karimi2022langevin,
abstract = {Non-convex sampling is a key challenge in machine learning, central to non-convex optimization in deep learning as well as to approximate probabilistic inference. Despite its significance, theoretically there remain some important challenges: Existing guarantees suffer from the drawback of lacking guarantees for the \emph{last-iterates}. Moreover, little is known beyond the elementary schemes of stochastic gradient Langevin dynamics. To address these issues, we develop a novel framework that lifts the above issues by harnessing several tools from the theory of dynamical systems. Our key result is that, for a large class of state-of-the-art sampling schemes, their last-iterate convergence in Wasserstein distances can be reduced to the study of their continuous-time counterparts, which is much better understood. Coupled with standard assumptions of MCMC sampling, our theory immediately yields the last-iterate Wasserstein convergence of many advanced sampling schemes such as mirror Langevin, proximal, randomized mid-point, and Runge-Kutta integrators.},
author = {Karimi, Mohammad Reza and Hsieh, Ya-Ping and Krause, Andreas},
booktitle = {Proc. of Thirty Seventh Conference on Neural Information Processing Systems (NeurIPS)},
month = {December},
title = {A Dynamical System View of Langevin-Based Non-Convex Sampling},
year = {2023}}