Speaker

Prof. Guido Montúfar

Department of Mathematics and Department of Statistics & Data Science

UCLA - University of California, Los Angeles

montufar@math.ucla.edu

At the 10th International Summer School on AI and Big Data, Prof. Guido Montúfar will talk about Biases of gradient optimization in neural networks.

Talk: Biases of gradient optimization in neural networks

This lecture discusses some of the mechanisms that bias the optimization of overparametrized neural networks towards simpler solutions. In particular, we take a look at parameter optimization in ReLU networks and show that gradient descent training is biased towards smooth solution functions. The specific form of this bias depends on the network architecture and activation function, as well as on the particular form of the gradient optimization procedure and the initialization.

We then consider quantitative bounds on the difference in function space between the trajectory of a finite-width network trained on finitely many samples and the trajectory of an idealized kernel dynamics. These results show that the network is biased to learn the top eigenfunctions of a neural tangent kernel integral operator, not just on the training set but over the entire input space. The discussion is based on works with Hui Jin and Benjamin Bowman.

Bio

Prof. Guido Montúfar is an Associate Professor of Mathematics and Statistics & Data Science at UCLA. He is also the leader of the Math Machine Learning Group at the Max Planck Institute for Mathematics in the Sciences. His research focuses on the mathematical foundations of machine learning, specifically deep learning theory. He studied mathematics and theoretical physics at TU Berlin and obtained his Dr. rer. nat. in Mathematics in 2012 as an IMPRS fellow in Leipzig. His work has been recognized with awards from the ERC, DFG, and NSF. Guido Montúfar is a 2022 Alfred P. Sloan Research Fellow.

Funded by the Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung).

ScaDS.AI Dresden/Leipzig (Center for Scalable Data Analytics and Artificial Intelligence) is a center for Data Science, Artificial Intelligence and Big Data with locations in Dresden and Leipzig.