VESTERGAARD Christian

Postdoc

CARPENTIER Justin

Robotics

Inria / L’Ecole normale supérieure - PSL

justin.carpentier [at] inria.fr

Short bio

Justin Carpentier is a Researcher at Inria Paris within the WILLOW team. His research lies at the interface of Learning, Perception and Control for Robotics. From 2018 to 2019, he was a postdoctoral research associate in the same WILLOW team. He obtained a PhD degree from the University of Toulouse in Robotics in 2017.

Topics of interest

Robotics, Control, Vision, Optimization and Machine Learning.

Project in Prairie

The main scientific objective of Justin’s project within PRAIRIE is to lay the mathematical and algorithmic foundations to enable robotic systems (i) to learn precise models of their dynamics and their interactions and (ii) to exploit these learned models inside advanced control schemes in order to precisely achieve dynamic motions and solve complex tasks, all with an advanced level of autonomy and agility.

Quote

Despite major progress in mechatronics, planning, automatic control and perception over the past decades, current robots remain limited overall in their capacity to comprehend and control the heterogeneous set of interactions with their environment. The overall ambition of Justin’s research is to enhance the capacity of robots to interact precisely and safely with their surroundings in order to achieve fine manipulation gestures and agile locomotion, by leveraging recent progress in Vision, Machine Learning, Optimization and Control.

Team

ROYER Clément

Dauphine - PSL

clement.royer [at] dauphine.psl.eu

Short bio

Clément Royer is an associate professor of computer science at Université Paris Dauphine-PSL and a researcher in the MILES team at LAMSADE. From 2016 to 2019, he was a postdoctoral research associate at the Wisconsin Institute for Discovery, University of Wisconsin-Madison, USA. He received his Ph.D. in applied mathematics from the University of Toulouse, France, in 2016. Clément is a recipient of the COAP Best Paper Prize for 2019.

Topics of interest

Numerical optimization, Optimization for machine learning, Randomized algorithms.

Project in Prairie

As the amount of data available and the complexity of the models keep increasing, a number of issues arise in deploying optimization techniques for artificial intelligence at scale. Such challenges have long been integrated in high-performance computing, where the combination of optimization with other fields from numerical linear algebra to differential equations has led to powerful algorithms. This project aims at adopting a similar approach with optimization methods for data science.

Quote

My research aims at developing optimization methods for artificial intelligence that leverage existing methodology and advances from scientific computing along two axes. On the one hand, we motivate the use of standard algorithmic frameworks for scientific computing in modern learning tasks by proposing practical schemes with complexity guarantees. Our research will aim at analyzing the complexity of classical second-order methods used in scientific computing so as to design frameworks with theoretical grounds and practical appeal for artificial intelligence. On the other hand, we develop derivative-free algorithms for automated parameter tuning of complex data science models. Our setting will be that of expensive, black-box systems for which a number of parameters require calibration.

Team

GOYENS Florentin

Postdoctoral researcher

Natural language processing

Postdoc

BAWDEN Rachel

Inria

rachel.bawden [at] inria.fr

Short bio

Researcher (Chargée de recherches) at Inria in the ALMAnaCH project-team since 2020. Previously obtained a PhD from Université Paris-Sud (awarded the ATALA thesis prize) and spent 2 years as a postdoc in the Machine Translation group at the University of Edinburgh.

Topics of interest

Natural language processing, multilinguality, machine translation

Project in Prairie

Rachel Bawden will focus on improving Machine Translation in the face of language variation (texts from different domains, user-generated texts and historical language). Alongside the development of models, she will also explore the interpretability of models in a bid to make them more robust to variation. Finally, she will experiment with the integration of other input modalities (e.g. image and video data), to help tackle ambiguity and scenarios for which the input signal is impoverished or incomplete.

Quote

Huge progress has been seen in Machine Translation in recent years. However, the translation of domain-specific texts (e.g. biomedical and financial), those displaying a high degree of language variation (e.g. social media texts containing spelling errors, acronyms and marks of expressiveness) and other non-standard varieties of language (including dialects and old languages) remains a challenge. Developing models that (i) are robust to variation, (ii) are able to handle the low-resource settings that these scenarios often present and (iii) can incorporate all external context is therefore fundamental to progress in Machine Translation.

Team

NISHIMWE Lydia

PhD student

PhD student

Matthieu FUTERAL-PETER

PhD student

CANTINI Laura

Genomics

CNRS / Institut Pasteur

laura.cantini [at] pasteur.fr

Short bio

Young PI (G5) at Institut Pasteur and CNRS permanent researcher. Her research activity is focused on the design of machine learning methods for the integration of single-cell multi-modal data. A mathematician by training, Laura received her PhD in cancer systems biology from the University of Turin (Italy). She then pursued a postdoc in the cancer systems biology group at Institut Curie (Paris). In 2018, having been awarded L’Oréal-UNESCO for Women in Science and EMBO fellowships, she joined CSAIL at MIT (USA), before being selected as a CNRS permanent researcher. She is a recipient of the ERC StG 2023, ANR JCJC 2020, Sanofi iTech Awards 2020 and a L’Oréal-UNESCO for Women in Science fellowship (2018 edition).

Topics of interest

Single-cell omics data, multi-modal integration, network inference

Project in Prairie

Laura Cantini will develop computational methods for multi-modal single-cell data integration. In particular, she will combine multi-omics joint dimensionality reduction, to identify the cell types and states present in a biological sample, with network-based methods, to reconstruct the multi-omics regulatory mechanisms underlying each cell type/state. Finally, by applying the developed approaches to patient-derived data, she will contribute to improving our understanding of cancer heterogeneity and its underlying molecular mechanisms.

Quote

The timely detection and successful treatment of cancer depend on our ability to understand when, why, and how a subpopulation of cells deviates from a healthy state or acquires drug resistance. Single-cell multi-modal data, produced at an increasing pace, offer the opportunity to tackle these questions. The current major bottleneck is the crucial need for computational methods able to translate this wealth of information into actionable biological knowledge.

Team

SAMARAN Jules

PhD student

Natural language processing

PhD student

LASRI Karim

PhD Student

L’Ecole normale supérieure - PSL

karim.lasri [at] ens.fr

Short bio

Engineering degree from CentraleSupélec (formerly Ecole Centrale Paris)

Master’s degree in Cognitive Science at the Ecole Normale Supérieure

Thesis title

Linguistic generalization in transformer-based neural language models.

Short abstract

Transformer-based neural architectures hold great promise, as they seem to address a wide range of linguistic tasks after learning a language model. However, the level of abstraction they reach after training is still opaque. My main research focus is to better understand how neural language models generalize. What linguistic properties do these architectures acquire during learning? How is linguistic information encoded in their intermediate representation spaces?

DO Virginie

PhD Student

Autonomous agents and multi-agent systems

Dauphine - PSL

virginie.do [at] dauphine.eu

Short bio

MSc in Applied Mathematics / Diplôme d’Ingénieur – Ecole Polytechnique

MSc in Social Data Science – University of Oxford

Thesis title

Fairness in machine learning: insights from social choice.

Short abstract

Designing fair algorithms has recently emerged as a major issue in machine learning, and more generally in AI, while it has long been studied in economics, especially in social choice theory. My goal is to bring together the notions of fairness of the two communities, and leverage the concepts and mathematical tools of social choice to address the new challenges of fairness in machine learning.

AYADI Manel

Postdoctoral researcher

Dauphine - PSL

manel.ayadi [at] dauphine.eu

Short bio

PhD in Computer Science at LAMSADE – Paris-Dauphine

Research project

How does changing the voting system and the electoral district boundaries impact the outcome of the French legislative elections?

Short abstract

The aim of the project is to study the impact of changing the voting system (mixed electoral system, proportional representation …) and the electoral district boundaries on the outcome of the French legislative elections of 2017.

BARRÉ Chloé

Postdoctoral researcher

Statistical physics

Institut Pasteur

chloe.barre [at] pasteur.fr

Short bio

PhD, LPTMC (Laboratoire de Physique Théorique de la Matière Condensée), Sorbonne University, Paris

Research project

Bayesian induction of the behavior of the larva.

Short abstract

Making decisions is a fundamental feature of animal behavior. Nevertheless, there remains a large knowledge gap in linking neural architecture and behavioral response. To bridge this gap, targeting individual neurons and having a simple read-out of their activity is crucial, and Drosophila larvae are ideal organisms for such an approach. My work is part of a larger project to explore the relationship between neural network dynamics and decision making in Drosophila larvae. I use Bayesian induction techniques and physical modeling to understand this relationship.

By combining video measurement experiments of larval behavior with advances in modern optogenetics that allow the activation/inactivation of individual neurons, a database of millions of larvae responding to the activation of single neurons has been constructed. Although a machine learning approach that projects larval videos into complex behavioral dictionaries has been developed, some images remain ambiguous and the corresponding behavior is therefore poorly detected. To improve behavior detection, we describe the shape of the larva using insights from solid mechanics. Using this physical model, we perform a Bayesian induction to find parameters that describe the behavior of the larvae in a more robust way.

Once the behaviors are properly detected and quantified, we want to detect all possible responses and modulations induced by the activation or inactivation of a neuron. We have written a simplified model that describes the dynamics and sequences of behaviors. With Bayesian inference, I learn the parameters of my model; with a generative model and these parameters, I can recreate virtual larvae. These virtual larvae made it possible to separate neural responses that provoke simple and immediate actions from those that generate complex behaviors. It is thus possible to group neurons in terms of response.

By combining the techniques of biologists with probabilistic analysis techniques (including Bayesian inference), we can identify behavioral changes due to the activation/inactivation of neurons. This will allow us to infer causal relationships between neural activity and behavioral patterns, and uncover how behavior emerges from activity in the connectome.

ANDRAL Charly

PhD Student

Data science

Dauphine - PSL

andral [at] ceremade.dauphine.fr

Short bio

Diplôme d’ingénieur – ENSAE Paris

Master Statistics And Machine Learning, Paris Saclay University

Thesis title

Improvement of MCMC methods and adaptation to the Big Data.

Short abstract

MCMC methods can have difficulty exploring the space, especially in the high-dimensional settings that can occur in a Big Data context. The goal of my PhD thesis is to find enhancements to MCMC that address this exploration issue.

GODARD Charlotte

PhD Student

Biomedical Imaging

Institut Pasteur

charlotte.godard [at] pasteur.fr

Short bio

Engineering degree – Télécom Physique Strasbourg
Master’s degree in Imaging, Robotics and Engineering for Healthcare – University of Strasbourg

Thesis title

Semi-automatic and amortized developments of transfer function for surgery planning in virtual reality.

Short abstract

Interpretation of medical images, such as MRI or CT scans, can be challenging for non-radiologists because of the variable image quality and the similarities between different structures of interest. However, surgeons need to understand these images to prepare surgeries and define the corresponding anatomical landmarks. As universal segmentation is not possible due to the diversity of images between patients, we focused on optimizing the visualization process applied only to the raw data. The AVATAR MEDICAL platform uses virtual reality for intuitive visualization and manipulation of the images. Visualization parameters (color, transparency) are currently defined manually using a user-friendly transfer function desktop interface. The objective of the thesis is to automate transfer function generation for faster isolation of the structures of interest in the image, by combining a statistical approach and pre-trained models.

MISHRA Shrey

PhD student

L’Ecole normale supérieure - PSL

Shrey.Mishra [at] ens.fr

Short bio

Manipal University (India, BTech)
CESI School of Engineering (software major, école d’ingénieurs)
Munster Technological University (MSc Artificial Intelligence)

Thesis title

Extracting information related to the Scientific Articles published and making a knowledge base out of it, with the application of various AI / Machine learning based techniques.

Short abstract

Every year, thousands of scientific papers are published in academia, covering various scientific proofs, theorems and relations in the form of PDF documents. I am part of TheoremKb (a project led by Pierre Senellart), which aims to extract information from scientific articles while training machine learning models to identify and relate documents based on the information expressed in the articles (including the mathematical proofs).

KMETZSCH Virgilio

PhD student

Genomics

INRIA

virgilio.kmetzsch [at] inria.fr

Short bio

MSc in Data Science – Grenoble INP Ensimag & UGA

Thesis title

Multimodal analysis of neuroimaging and transcriptomic data in genetic frontotemporal dementia.

Short abstract

Frontotemporal dementia (FTD) is a devastating neurodegenerative disease with no effective treatments so far. The Paris Brain Institute has assembled one of the largest cohorts worldwide on genetic forms of FTD, comprising multimodal data including neuroimaging (MRI, PET), cognition and transcriptomics (RNA-seq). The present PhD project aims at designing and applying new approaches for integrating multimodal transcriptomic and neuroimaging data, to characterize biomarkers of the presymptomatic phase of the disease, in order to design upcoming therapeutic trials.

TEBOUL Raphaël

PhD Student

Genomics

INSERM

raphael.teboul [at] inserm.fr

Short bio

Master’s degree in Engineering from Télécom Paris

Thesis title

Unravelling non-coding driver alterations in cancer with deep learning.

Short abstract

Of the 3 gigabases that constitute the human genome, only about 50 megabases (<2%) encode protein-coding genes. Particular attention has been paid to somatic mutations affecting the coding sequence of these genes, leading to the almost exhaustive characterization of 723 genes implicated in cancer (cancer gene census, COSMIC database, September 2019). By contrast, with the notable exception of TERT promoter mutations that induce the expression of telomerase (a key enzyme necessary for unlimited cell proliferation), very few driver alterations have been identified in the non-coding genome. Analyses of mutation hotspots or known regulatory regions like promoters and enhancers have failed to identify significantly recurrent mutations with a strong transcriptional impact on cancer genes. The main reason for this is the difficulty of predicting the functional consequence of non-coding mutations. Although these mutations can alter important regulatory regions and modulate the expression of key cancer genes, there is no established method to predict the transcriptional impact of a non-coding mutation. To fill this gap, we will develop a deep neural network able to predict gene expression based on the local sequence context. Pioneering studies have demonstrated the ability of deep neural networks to learn how to recognize several regulatory motifs from the DNA sequence, including splicing sites, chromatin accessibility and 3D conformation or transcription factor binding sites. More recently, Olga Troyanskaya’s team has developed a deep neural network able to predict, from the DNA sequence, the expression level of genes in a cell-type specific manner, by integrating predictions of chromatin state and transcription factor binding. Once trained, these neural networks are able to predict in silico the regulatory impact of any sequence variant, and are thus extremely valuable assets to identify disease coding variants.
Deep learning analysis has been used to identify causal variants in several diseases including autism, but has not yet been applied to cancer. Our hypothesis is that leveraging the power of deep neural networks to explore the millions of somatic alterations identified in cancer sequencing projects is a promising approach to uncover the missing driver events involving the non-coding human genome.

ZHOU Anqi

PhD Student

Institut Pasteur

anqi.zhou [at] pasteur.fr

Short bio

BSc. Applied Mathematics, BA. Neuroscience
MSc. Biotechnology, Brown University, USA

Thesis title

Rapidly identifying therapeutics of Alzheimer’s Disease using millions of Drosophila larvae and amortized inference.

Short abstract

Alzheimer’s Disease (AD) affects millions of people worldwide, yet the limited treatments address only the physiological symptoms instead of the cause of pathogenesis. The goal of this PhD project is to establish a new pipeline for measuring AD phenotypes that leverages the advantages of Drosophila as a model system for circuit studies and links probabilistic behavior to disease progression. The pipeline builds on automated machine learning to rapidly analyze data from millions of larvae.

SAUTY DE CHALON Benoit

PhD student

Cognition

INRIA

benoit.sauty-de-chalon [at] inria.fr

Short bio

Diplôme d’ingénieur – Ecole Polytechnique

Thesis title

Multimodal modelling of neurodegenerative diseases.

Short abstract

The goal is to find quantitative links between the decay of structural properties of the brain, shown through imaging techniques such as MRI and PET scans, and the decay of the cognitive abilities of patients, shown through cognitive assessment tests. The research focuses on Alzheimer’s and Parkinson’s patients.

D’ASCOLI Stéphane

PhD student

L’Ecole normale supérieure - PSL / FAIR Paris

stephane.dascoli [at] gmail.com

Short bio

Master in Theoretical Physics, ENS Paris

Thesis title

Deep learning: from toy models to modern architectures.

Short abstract

My research focuses on understanding how deep neural networks are able to generalize despite being heavily overparametrized. On the one hand, I use tools from statistical mechanics to study simple models, and try to understand when and why they overfit. On the other hand, I investigate how different types of inductive biases affect learning, from fully-connected networks to convolutional networks to transformers.

HAIRAULT Adrien

PhD student

Data science

PSL

hairault [at] ceremade.dauphine.fr

Autonomous agents and multi-agent systems

Short bio

MSc in Statistical Science, Oxford University
Double licence M.I.A.S.H.S, Université Paris 1 Panthéon-Sorbonne & SciencesPo Paris

Thesis title

Foundations and applications in Bayesian Mixture Modelling.

Short abstract

Mixtures are a popular class of models bridging parametric and non-parametric statistics and, as part of the standard data analysis toolkit, have ubiquitous applications in regression, clustering, machine learning, etc. One of the main goals of this thesis is to ease model selection within such a class of models, in particular by finding efficient ways of computing the marginal likelihood (aka evidence) of semi-parametric models (such as Dirichlet Process Mixtures). We also study the convergence properties of the Bayes Factor when comparing such parametric and semi-parametric models.

ALLOUCHE Tahar

PhD Student

Université Paris-Dauphine and CNRS

tahar.allouche [at] dauphine.eu

Short bio

Mathematical Engineering degree from ENSTA Paris – M2 Optimization from Paris-Saclay university

Thesis title

Learning societal preferences for automated collective decision making.

Short abstract

We study sophisticated models of agents’ preferences as data structures in a learning framework for collective decision aiding. Statistical, computational and epistemic aspects of the preferences are considered in order to thoroughly explore their structure and efficiently infer optimal decisions.

YAMANE Ikko

Postdoctoral researcher

Dauphine - PSL

ikko.yamane [at] dauphine.psl.eu