Short bio
Master's in Theoretical Physics, ENS Paris
Thesis title
Deep learning: from toy models to modern architectures.
Short abstract
My research focuses on understanding how deep neural networks generalize despite being heavily overparametrized. On the one hand, I use tools from statistical mechanics to study simple models, aiming to understand when and why they overfit. On the other hand, I investigate how different types of inductive biases affect learning, from fully-connected networks to convolutional networks to transformers.