We propose a novel approach to analyzing the generalization error of the stochastic gradient Langevin dynamics (SGLD) algorithm, a popular alternative to stochastic gradient descent. Discrete-time algorithms such as SGLD typically do not admit an explicit formula for their (time-marginal) distributions, which makes theoretical analysis difficult. Previous non-asymptotic generalization bounds for SGLD used the distribution associated with the continuous-time Langevin diffusion as an approximation. However, that approximation error is at best first order in the step size, and the resulting bounds either suffer from a slow convergence rate or rely on implicit conditions on the step size. In this paper, we construct a high-order approximation framework with time-independent error using weak backward error analysis. We then derive a non-asymptotic generalization bound for SGLD, with explicit and less restrictive conditions on the step size.
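For concreteness, below is a minimal sketch of a single SGLD update: a minibatch gradient step plus injected Gaussian noise scaled by the step size and an inverse temperature. The function name, loss, step size, and temperature values are illustrative placeholders, not quantities or code from the paper.

import numpy as np

def sgld_step(theta, grad_minibatch_loss, step_size, inv_temperature, rng):
    """One SGLD update: stochastic gradient step plus Gaussian noise.

    theta               : current parameter vector (np.ndarray)
    grad_minibatch_loss : callable returning the minibatch gradient at theta
    step_size           : eta, the discretization step size
    inv_temperature     : beta, controlling the scale of the injected noise
    rng                 : np.random.Generator for reproducibility
    """
    noise = rng.normal(size=theta.shape)
    return (theta
            - step_size * grad_minibatch_loss(theta)
            + np.sqrt(2.0 * step_size / inv_temperature) * noise)

# Toy usage: SGLD on the quadratic loss 0.5 * ||theta||^2, whose gradient is theta.
rng = np.random.default_rng(0)
theta = rng.normal(size=5)
for _ in range(1000):
    theta = sgld_step(theta, lambda t: t, step_size=1e-2, inv_temperature=10.0, rng=rng)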
Bibtex
@inproceedings{LiGazeau19,
  title     = {Uniform Stability and High Order Approximation of {SGLD} in Non-Convex Learning},
  author    = {Mufan Li and Maxime Gazeau},
  booktitle = {International Conference on Machine Learning (ICML) Workshop on Understanding and Improving Generalization in Deep Learning},
  year      = {2019},
}
Related Research
- Borealis AI at International Conference on Learning Representations (ICLR): Machine Learning for a better financial future. Learning and Generalization; Natural Language Processing; Time Series Modelling.
- Few-Shot Learning & Meta-Learning | Tutorial. W. Zi, L. S. Ghoraie, and S. Prince.
- Self-supervised Learning in Time-Series Forecasting — A Contrastive Learning Approach. T. Sylvain, L. Meng, and A. Lehrmann.