
Mean field theory of neural networks: From stochastic gradient descent to Wasserstein gradient flows

[Moved Online] Hot Topics: Optimal Transport and Applications to Machine Learning and Statistics (May 04, 2020 - May 08, 2020)

May 08, 2020 (02:00 PM PDT - 03:00 PM PDT)
Speaker(s): Andrea Montanari (Stanford University)
Location: SLMath: Online/Virtual
Tags/Keywords
  • Neural networks
  • Mean field
  • Wasserstein gradient flow


Abstract

Modern neural networks contain millions of parameters, and training them requires optimizing a highly non-convex objective. Despite the apparent complexity of this task, practitioners successfully train such models using simple first-order methods such as stochastic gradient descent (SGD). I will survey recent efforts to understand this surprising phenomenon using tools from the theory of partial differential equations. Namely, I will discuss a mean field limit in which the number of neurons becomes large, and the SGD dynamics are approximated by a certain Wasserstein gradient flow. [Joint work with Adel Javanmard, Song Mei, Theodor Misiakiewicz, Marco Mondelli, Phan-Minh Nguyen]
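As a rough illustration of the mean field parameterization discussed in the abstract (this sketch is not taken from the talk), the Python snippet below trains a two-layer ReLU network f(x) = (1/N) sum_i a_i ReLU(<w_i, x>) with one-pass SGD. The teacher function, step size, and sample counts are arbitrary choices made for this example. In this 1/N scaling, as the number of neurons N grows, the empirical distribution of the particles (a_i, w_i) is expected to track a Wasserstein gradient flow of the population risk.

# Minimal sketch (illustrative only): two-layer ReLU network in the
# mean-field parameterization, f(x) = (1/N) * sum_i a_i * relu(<w_i, x>),
# trained with one-pass SGD on data from a hypothetical teacher.
import numpy as np

rng = np.random.default_rng(0)
d, N = 5, 1000                        # input dimension, number of neurons

w_star = rng.normal(size=d)           # hypothetical teacher direction
def target(x):
    return np.maximum(x @ w_star, 0.0)

a = rng.normal(size=N)                # each neuron is a "particle" (a_i, w_i)
W = rng.normal(size=(N, d))

def forward(x):
    # Note the mean-field 1/N normalization (rather than 1/sqrt(N)).
    return (a * np.maximum(W @ x, 0.0)).mean()

lr, steps = 0.02, 10000               # step size plays the role of dt in the flow
for _ in range(steps):
    x = rng.normal(size=d)
    err = forward(x) - target(x)      # residual on a fresh sample
    pre = W @ x
    act = np.maximum(pre, 0.0)
    # Per-particle gradients of the squared loss; the 1/N factor from the
    # forward pass is absorbed into the step size (mean-field time scaling).
    grad_a = err * act
    grad_W = err * (a * (pre > 0.0))[:, None] * x[None, :]
    a -= lr * grad_a
    W -= lr * grad_W

xs = rng.normal(size=(2000, d))       # Monte Carlo estimate of the risk
risk = np.mean([(forward(x) - target(x)) ** 2 for x in xs])
print(f"estimated population risk after training: {risk:.4f}")

As N increases (with the step size held fixed), the trajectory of the empirical measure over particles becomes increasingly deterministic, which is the sense in which the SGD dynamics are approximated by the limiting gradient flow.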

Supplements: No notes or supplements uploaded.
Video/Audio Files

H.264 Video 928_28405_8336_Mean_Field_Theory_of_Neural_Networks-_From_Stochastic_Gradient_Descent_to_Wasserstein_Gradient_Flows.mp4