D’ASCOLI Stéphane

PhD student

École normale supérieure - PSL / FAIR Paris

stephane.dascoli [at] gmail.com

Short bio

Master's degree in Theoretical Physics, ENS Paris

Thesis title

Deep learning: from toy models to modern architectures.

Short abstract

My research focuses on understanding how deep neural networks generalize despite being heavily overparametrized. On the one hand, I use tools from statistical mechanics to study simple models and to understand when and why they overfit. On the other hand, I investigate how different types of inductive bias affect learning, from fully connected networks to convolutional networks to transformers.