VERDIER Hippolyte

PhD Student

Institut Pasteur

hverdier [at] pasteur.fr

Short bio

Engineering degree (Ingénieur polytechnicien), École polytechnique (Palaiseau)

MPhil in Computational biology, University of Cambridge (UK)

Thesis title

Combine artificial intelligence with high resolution microscopy to better dissect the mechanism of binding and mechanism of action of multi-specific biologics.

Short abstract

Photo-activated localization microscopy (PALM) enables high-resolution recording of single-protein trajectories in live cells, thus providing valuable probes of the small-scale properties of the protein environment. I use graph neural networks to characterize relevant physical properties of these dynamics, and have developed a flexible analysis scheme that handles both the diversity of motion types encountered in nature and the fact that observed trajectories inevitably differ, to some extent, from archetypal theoretical models.

TRIBOULIN Amaury

PhD Student

Inria

amaury.triboulin [at] inria.fr

Short bio

Master’s Degree at Ecole normale supérieure

Thesis title

Symmetries in Machine Learning for Structured Data.

Short abstract

In this thesis, we consider high-dimensional problems with additional structure that comes from the geometry of the input signal, and explore ways to incorporate this geometric structure into learning algorithms. We have already started to investigate new architectures based on equivariant layers, which we tested on combinatorial optimization problems, showing that it is possible to learn representations of hard (typically NP-hard) problems. We believe this could lead to new, less resource-dependent algorithms for learning efficient heuristics for practical instances.

LASRI Karim

PhD Student

L’Ecole normale supérieure - PSL

karim.lasri [at] ens.fr

Short bio

Engineering degree at CentraleSupélec (formerly Ecole Centrale Paris)

Master’s degree (1st year) in Cognitive Science at the Ecole Normale Supérieure

Thesis title

Linguistic generalization in transformer-based neural language models.

Short abstract

Transformer-based neural architectures hold great promise, as they seem to address a wide range of linguistic tasks after learning a language model. However, the level of abstraction they reach after training is still opaque. My main research focus is to better understand how neural language models generalize. What linguistic properties do these architectures acquire during learning? How is linguistic information encoded in their intermediate representation spaces?

DO Virginie

PhD Student

Dauphine - PSL

virginie.do [at] dauphine.eu

Short bio

MSc in Applied Mathematics / Diplôme d’Ingénieur – Ecole Polytechnique

MSc in Social Data Science – University of Oxford

Thesis title

Fairness in machine learning: insights from social choice.

Short abstract

Designing fair algorithms has recently emerged as a major issue in machine learning, and more generally in AI, while it has long been studied in economics, especially in social choice theory. My goal is to bring together the notions of fairness of the two communities, and to leverage the concepts and mathematical tools of social choice to address the new challenges of fairness in machine learning.

ANDRAL Charly

PhD Student

Dauphine - PSL

andral [at] ceremade.dauphine.fr

Short bio

Diplôme d’ingénieur – ENSAE Paris

Master's degree in Statistics and Machine Learning, Paris-Saclay University

Thesis title

Improvement of MCMC methods and adaptation to the Big Data.

Short abstract

MCMC methods can have difficulty exploring the space, especially in the high-dimensional settings that arise in a Big Data context. The goal of my PhD thesis is to develop improvements to MCMC that address this exploration issue.

THIBEAU-SUTRE Elina

PhD Student

Paris Brain Institute

elina.ts [at] free.fr

Short bio

Master's degree (Diplôme d’ingénieur) at Ecole des Mines de Paris

Master's degree in biomedical engineering (ESPCI, Université Paris Descartes, Arts et Métiers)

Thesis title

Unsupervised learning from neuroimaging data to identify disease subtypes in Alzheimer’s disease and related disorders.

Short abstract

The objective of my PhD thesis is to develop and evaluate clinically relevant approaches for unsupervised learning to characterize disease heterogeneity in Alzheimer's disease (AD) and related dementias. Specific objectives include: 1) To adequately account for normal variability. For instance, in a clustering approach, the aim would be to cluster the deviations from normal variability, rather than the raw characteristics of the patients. 2) To design approaches that can handle the structure and high dimensionality of neuroimaging data. 3) To define clinically relevant measures to assess the results of the unsupervised learning.

GODARD Charlotte

PhD Student

Institut Pasteur

charlotte.godard [at] pasteur.fr

Short bio

Engineering degree – Télécom Physique Strasbourg

Master's degree in Imaging, Robotics and Engineering for Healthcare – University of Strasbourg

Thesis title

Semi-automatic and amortized developments of transfer function for surgery planning in virtual reality.

Short abstract

Interpreting medical images, such as MRI or CT scans, can be challenging for experts who are not radiologists because of variable image quality and the similarities between different structures of interest. However, surgeons need to understand these images to prepare surgeries and define the corresponding anatomical landmarks. As universal segmentation is not possible due to the diversity of images between patients, we focus on optimizing the visualization process applied directly to the raw data. The AVATAR MEDICAL platform uses virtual reality for intuitive visualization and manipulation of the images. Visualization parameters (color, transparency) are currently defined manually using a user-friendly transfer function desktop interface. The objective of the thesis is to automate transfer function generation for faster isolation of the structures of interest in the image, by combining a statistical approach with pre-trained models.

MISHRA Shrey

PhD Student

L’Ecole normale supérieure - PSL

Shrey.Mishra [at] ens.fr

Short bio

Manipal University (India, BTech)

CESI School of Engineering (software major, école d’ingénieurs)

Munster Technological University (MSc Artificial Intelligence)

Thesis title

Extracting information related to the Scientific Articles published and making a knowledge base out of it, with the application of various AI / Machine learning based techniques.

Short abstract

Every year, thousands of scientific papers are published in academia, covering various scientific proofs, theorems, and relations in the form of PDF documents. I am part of TheoremKb (a project led by Pierre Senellart), whose aim is to extract information from scientific articles while training machine learning models to identify and relate documents based on the information expressed in the articles (including the mathematical proofs).

KMETZSCH Virgilio

PhD Student

Inria

virgilio.kmetzsch [at] inria.fr

Short bio

MSc in Data Science – Grenoble INP Ensimag & UGA

Thesis title

Multimodal analysis of neuroimaging and transcriptomic data in genetic frontotemporal dementia.

Short abstract

Frontotemporal dementia (FTD) is a devastating neurodegenerative disease with no effective treatment so far. The Paris Brain Institute has assembled one of the largest cohorts worldwide on genetic forms of FTD, comprising multimodal data including neuroimaging (MRI, PET), cognition, and transcriptomics (RNA-seq). The present PhD project aims to design and apply new approaches for integrating multimodal transcriptomic and neuroimaging data, to characterize biomarkers of the presymptomatic phase of the disease, with a view to designing upcoming therapeutic trials.

TEBOUL Raphaël

PhD Student

INSERM

raphael.teboul [at] inserm.fr

Short bio

Master's degree in Engineering at Telecom Paris

Thesis title

Unravelling non-coding driver alterations in cancer with deep learning.

Short abstract

Of the 3 gigabases that constitute the human genome, only about 50 megabases (<2%) encode protein-coding genes. Particular attention has been paid to somatic mutations affecting the coding sequence of these genes, leading to the almost exhaustive characterization of 723 genes implicated in cancer (Cancer Gene Census, COSMIC database, September 2019). By contrast, with the notable exception of TERT promoter mutations that induce the expression of telomerase (a key enzyme necessary for unlimited cell proliferation), very few driver alterations have been identified in the non-coding genome. Analyses of mutation hotspots or of known regulatory regions such as promoters and enhancers have failed to identify significantly recurrent mutations with a strong transcriptional impact on cancer genes. The main reason is the difficulty of predicting the functional consequences of non-coding mutations. Although these mutations can alter important regulatory regions and modulate the expression of key cancer genes, there is no established method to predict the transcriptional impact of a non-coding mutation. To fill this gap, we will develop a deep neural network able to predict gene expression based on the local sequence context. Pioneering studies have demonstrated the ability of deep neural networks to learn to recognize several regulatory features from the DNA sequence, including splice sites, chromatin accessibility, 3D conformation, and transcription factor binding sites. More recently, Olga Troyanskaya’s team has developed a deep neural network able to predict, from the DNA sequence, the expression level of genes in a cell-type-specific manner, by integrating predictions of chromatin state and transcription factor binding. Once trained, these neural networks are able to predict in silico the regulatory impact of any sequence variant, and are thus extremely valuable assets for identifying disease-causing variants.
Deep learning analysis has been used to identify causal variants in several diseases, including autism, but has not yet been applied to cancer. Our hypothesis is that leveraging the power of deep neural networks to explore the millions of somatic alterations identified in cancer sequencing projects is a promising approach to uncover the missing driver events involving the non-coding human genome.