# Mean field theory of neural networks: From stochastic gradient descent to Wasserstein gradient flows

## [Moved Online] Hot Topics: Optimal transport and applications to machine learning and statistics May 04, 2020 - May 08, 2020

**Speaker(s):** Andrea Montanari (Stanford University)

**Location:** SLMath: Online/Virtual

**Tags/Keywords**

Neural networks

Mean field

Wasserstein gradient flow


#### Mean Field Theory Of Neural Networks: From Stochastic Gradient Descent To Wasserstein Gradient Flows

Modern neural networks contain millions of parameters, and training them requires optimizing a highly non-convex objective. Despite the apparent complexity of this task, practitioners successfully train such models using simple first-order methods such as stochastic gradient descent (SGD). I will survey recent efforts to understand this surprising phenomenon using tools from the theory of partial differential equations. Namely, I will discuss a mean field limit in which the number of neurons becomes large, and the SGD dynamics is approximated by a certain Wasserstein gradient flow. [Joint work with Adel Javanmard, Song Mei, Theodor Misiakiewicz, Marco Mondelli, Phan-Minh Nguyen]
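To make the setting concrete, here is a minimal sketch (not from the talk; the teacher function, network width, activation, and step size are all illustrative assumptions) of the kind of model the mean field analysis treats: a two-layer network with the 1/N output scaling, trained by online SGD. Each hidden neuron is a "particle" (a_i, w_i), and as N grows the empirical measure of these particles evolves approximately along a Wasserstein gradient flow.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: N hidden neurons, input dimension d, mean-field 1/N scaling.
N, d = 500, 3
W = rng.normal(size=(N, d))   # hidden weights w_i: the "particles"
a = rng.normal(size=N)        # output weights a_i

def f(x, W, a):
    # Mean-field two-layer network: f(x) = (1/N) * sum_i a_i * tanh(<w_i, x>)
    return (a * np.tanh(W @ x)).mean()

def y_true(x):
    # Synthetic teacher function to fit (an illustrative choice)
    return np.tanh(x[0] - x[1])

# Held-out test points to track the squared error
test_x = rng.normal(size=(200, d))
test_y = np.array([y_true(x) for x in test_x])

def mse(W, a):
    preds = np.array([f(x, W, a) for x in test_x])
    return ((preds - test_y) ** 2).mean()

mse_init = mse(W, a)

# Online SGD on the squared loss. The 1/N factor in f is absorbed into the
# step size, so each particle receives an O(1) update per sample; in the
# N -> infinity limit the particles' empirical measure follows the flow.
lr = 0.05
for step in range(20000):
    x = rng.normal(size=d)
    err = f(x, W, a) - y_true(x)
    pre = W @ x
    a -= lr * err * np.tanh(pre)
    W -= lr * err * (a * (1.0 - np.tanh(pre) ** 2))[:, None] * x[None, :]

mse_final = mse(W, a)
```

Running the loop drives the test error down from its random-initialization value; the point of the mean field viewpoint is that this particle dynamics, despite the non-convex objective, is a discretization of a gradient flow over probability measures on parameter space.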


H.264 Video: 928_28405_8336_Mean_Field_Theory_of_Neural_Networks-_From_Stochastic_Gradient_Descent_to_Wasserstein_Gradient_Flows.mp4

