
Séminaire Images Optimisation et Probabilités

Second-order algorithms for large-scale optimization and deep learning

Camille Castera

( University of Tübingen )

Conference room

14 December 2023 at 11:00

Non-convex non-smooth optimization has gained a lot of interest due to the efficiency of neural networks in many practical applications and the need to "train" them. Training amounts to solving very large-scale optimization problems. In this context, standard algorithms almost exclusively rely on inexact (sub-)gradients obtained through automatic differentiation and mini-batch sub-sampling. As a result, first-order methods (SGD, ADAM, etc.) remain the most widely used for training neural networks.
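To fix ideas, here is a minimal sketch of the mini-batch sub-sampling paradigm described above, on a toy least-squares loss standing in for a training objective (all names and hyperparameters here are illustrative, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem standing in for a training loss:
# f(w) = (1/n) * sum_i 0.5 * (x_i @ w - y_i)^2
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true

def minibatch_grad(w, batch):
    # Inexact gradient estimated from a random mini-batch,
    # the only oracle assumed available in large-scale training.
    Xb, yb = X[batch], y[batch]
    return Xb.T @ (Xb @ w - yb) / len(batch)

w = np.zeros(d)
for k in range(3000):
    batch = rng.choice(n, size=20, replace=False)
    w -= 0.05 * minibatch_grad(w, batch)  # plain SGD step

print(np.linalg.norm(w - w_true))  # small: SGD recovers w_true here
```

Methods such as ADAM modify only how the mini-batch gradient is combined across iterations; the oracle itself stays first order.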
Driven by a dynamical-system approach, we build INNA, an inertial and Newtonian algorithm that exploits second-order information on the function using only first-order automatic differentiation and mini-batch sub-sampling. By jointly analyzing the dynamical system and INNA, we prove the almost-sure convergence of the algorithm. We discuss practical considerations and empirical results on deep learning experiments.
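The abstract does not spell out the recursion, but the flavor of such an inertial, Newton-inspired update driven by a first-order oracle can be sketched as follows; the coupling coefficients, hyperparameter values, and discretization below are illustrative assumptions, not necessarily the exact INNA iteration from the paper:

```python
import numpy as np

def inertial_newton_step(theta, psi, grad, gamma, alpha=0.5, beta=0.1):
    """One step of an inertial, Newton-like update (illustrative sketch).

    The auxiliary variable `psi` carries inertia and curvature-like
    information coming from the underlying dynamical system, yet the
    step consumes only a (mini-batch) first-order gradient `grad`.
    alpha (damping) and beta (Newton coupling) are hyperparameters.
    """
    common = (1.0 / beta - alpha) * theta - (1.0 / beta) * psi
    theta_new = theta + gamma * (common - beta * grad)
    psi_new = psi + gamma * common
    return theta_new, psi_new

# Toy usage: minimize f(x) = 0.5 * ||x||^2, whose gradient is x itself.
theta = np.array([2.0, -1.0])
psi = theta.copy()
for k in range(2000):
    theta, psi = inertial_newton_step(theta, psi, grad=theta, gamma=0.1)
print(theta)  # close to the minimizer [0, 0]
```

The point of the construction is that curvature enters through the coupled variables, so no Hessian-vector products or second-order automatic differentiation are ever needed.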
We finally depart from non-smooth optimization and provide insights into recent results that pave the way for designing faster second-order methods.