ZINOVYEV Andrei

Computational biology

andrei.zinovyev [at] curie.fr / Twitter: @SysBioCurie

Andrei Zinovyev

Short bio

Senior permanent researcher at Institut Curie and scientific coordinator of the Computational Systems Biology of Cancer group inside the Bioinformatics department (2005-). Postdoctoral fellow at Institut des Hautes Etudes Scientifiques (IHES) (2001-2005). Habilitation in biology at Ecole Normale Supérieure in Paris (2014).

Topics of interest

Machine Learning, Unsupervised learning, High-dimensional geometry, Omics data, Mathematical Modeling, Cancer biology

Project in Prairie

Andrei Zinovyev will focus on developing and adapting methods for learning latent spaces and structures in high-dimensional data, with principal applications to biomedical data analysis. The main research line inside PRAIRIE will be on learning representations of multi-omics and single-cell data. Andrei Zinovyev will develop a course on applications of machine learning in molecular oncology.

Quote

Modern datasets in biology and medicine contain millions of objects (patients, biopsies, tumors, cells) characterized by hundreds of thousands of features such as expression of genes and proteins, properties of DNA or concentration of metabolites. How can we use these data to make discoveries in biology or propose better disease treatments? We can learn a lot by investigating the corresponding high-dimensional data point clouds, whose intrinsic geometry is shaped by biological processes, experimental designs and technical biases and is affected by the heterogeneity and uncertainty of molecular measurements. With machine learning methods allowing us to explore complex multidimensional data structures, one can tackle the problem of extracting the most relevant part of the information contained in omics data and using it further in the most efficient way.

LETOUZÉ Eric

Cancer genomics

eric.letouze [at] inserm.fr

Eric Letouze

Short bio

Senior INSERM researcher, leader of the computational biology group within the « Functional Genomics of Solid Tumors » team at Cordeliers Research Center. INSERM excellence award (2015). Institut Necker Fondation Tourre best post-doctoral student award (2015).

Topics of interest

Cancer genomics, bioinformatics, machine learning

Project in Prairie

Discovering cancer-causing mutations using deep learning approaches. Eric Letouzé will develop deep learning approaches to predict cell-type-specific regulatory features of gene expression, splicing and translation from the DNA sequence, and use these tools to discover new driver events among the millions of non-coding mutations identified in human cancer genomes.

Quote

Next-generation sequencing technologies have allowed the identification of millions of mutations in tens of thousands of tumor samples. Yet the vast majority of driver mutations functionally associated with cancer development lie within the <2% of the genome that encodes proteins. Although non-coding mutations can dramatically modulate the expression of cancer genes, predicting their precise functional impact remains extremely challenging. By developing deep neural networks able to learn regulatory features from the DNA sequence, we will be able to predict which mutations are likely to alter the expression of oncogenes and tumor suppressors, and unravel the missing drivers within the huge pan-cancer mutation catalogues available in public databases.

GASCUEL Olivier

Computational Biology

olivier.gascuel [at] pasteur.fr

Olivier Gascuel

Short bio

Research Director at CNRS. Head of the Department of Computational Biology at Institut Pasteur. Associate Editor of Systematic Biology. Fast Breaking Paper 2005 and Current Classic in Environment and Ecology from 2007 to 2011 (most cited paper in the field, Science Watch – Thomson Reuters). Silver Medal in Computer Science of the CNRS, 2009. Grand Prix Inria – Académie des Sciences for Numerical Sciences, 2017.

Topics of interest

Computational biology, genomics, evolution, pathogens

Project in Prairie

Olivier Gascuel’s research will focus on the analysis of genomic data. Modeling, statistical/deep learning and algorithmics will be combined to take advantage of the evolutionary relationships among sequences and solve key questions on the function of pathogenic genes, the emergence of drug resistance, and the dynamics of epidemics. He will develop interdisciplinary courses intended for a wide audience.

Quote

The amount of genomic data is increasing at an exponential rate. These data contain a wealth of information on diseases, biodiversity, and many other important societal issues. Their analysis poses constantly renewed challenges, at both the algorithmic and the modeling level. We are helped in this task by the traces left by evolution in the genes and genomes of species, as predicted by Theodosius Dobzhansky in his famous sentence “Nothing in biology makes sense except in the light of evolution” (1973). Evolutionary approaches combined with the latest advances in AI, especially deep learning, will be key to harnessing today’s and tomorrow’s genomic data, and solving key questions in biology and health.

CHIKHI Rayan

Bioinformatics

rayan.chikhi [at] pasteur.fr

Rayan Chikhi

Short bio

Recently appointed G5 group leader at Institut Pasteur and CNRS researcher, in the Department of Computational Biology, Sequence Bioinformatics team. Previously affiliated with Université de Lille.

Topics of interest

Analysis of DNA sequencing data, algorithms, data structures

Project in Prairie

The project has the ambitious goal of finding new genetic determinants in a common form of Alzheimer’s disease. We will combine algorithms, machine learning and statistical techniques to mine through large amounts of DNA sequencing data. The plan is to develop new computational methods to perform an initial analysis of raw sequencing data, and then apply supervised machine learning methods to detect clinically relevant variants.

Quote

This project fosters connections between three areas: sequence bioinformatics, AI, and a high-profile clinical application. It is thus part of the biological and interdisciplinary side of PRAIRIE. We will also tackle the analysis of ‘very big data’, as each human genome yields around 100 gigabases of raw data, and studied cohorts typically gather thousands of samples or more.

BARILLOT Emmanuel

Computational Molecular Oncology

emmanuel.Barillot [at] curie.fr

Emmanuel Barillot

Short bio

Director of U900 Research Department (Institut Curie – INSERM – PSL Research University / Mines ParisTech), Head of U900 Computational Systems Biology of Cancer team, Director of Institut Curie Bioinformatics Core Facility, Chair at Paris Artificial Intelligence Research Institute (PRAIRIE).

Topics of interest

Computational molecular oncology, systems biology of cancer, Biological Network modeling, omics data analysis

Project in Prairie

Emmanuel Barillot’s research focuses on Computational Systems Biology of Cancer. It aims at understanding the molecular basis of cancer using large-scale molecular profiles (omics) and clinical records, and at predicting disease evolution, potential therapeutic targets, and treatment outcome (precision medicine). To achieve this goal, he develops computational approaches based on machine learning, network and prior knowledge modeling.

Quote

Molecular and phenotypic data about tumors and models are accumulating at an ever-increasing pace, and are becoming a routine source of information in the medical setting, thanks to lowering costs and improved biotechnological devices (DNA sequencing, mass spectrometry, imaging…). As a consequence, the bottleneck in cancer research has shifted from data acquisition to computational analysis. We still lack powerful computational models and analytical approaches to convert our deepened observations into full understanding of the biology of cancer and to optimize the benefit for patients. My work at the intersection of molecular oncology, mathematical modeling and machine learning is designed to overcome these limitations.

AZENCOTT Chloé-Agathe

Computational Biology

chloe-agathe.azencott [at] mines-paristech.fr

Chloe-Agathe Azencott Nicolas Ravelli

Short bio

Assistant professor at the Centre for Computational Biology (CBIO) of MINES ParisTech and Institut Curie (since 2013). Recipient of an ANR Young Researcher grant (2019-2021) and member of an H2020 Initial Training Network (2019-2022). Instructor at Open Classrooms. Co-founder of Paris Women in Machine Learning and Data Science.

Topics of interest

Machine learning, statistical genetics, genomics, precision medicine

Project in Prairie

Chloé-Agathe Azencott will address feature selection in high-dimensional, heterogeneous data, with applications to biomarker discovery from multi-omics data. She will most notably focus on using biological networks both to constrain the feature selection problem and to facilitate the integration of heterogeneous datatypes. She will teach courses on high-dimensional machine learning, as well as courses with a focus on omics data.

Quote

Many of the molecular data sets collected in the context of precision medicine and health pose statistical and machine learning challenges that are very different from those encountered in most artificial intelligence applications. Indeed, we are facing a setting where data are scarce and high-dimensional – there are orders of magnitude more nucleotides in a human genome than patients suffering from a specific disease. This is therefore an exciting field providing us with many open problems and challenges.