To receive PRAIRIE news and colloquium announcements, sign up for the PRAIRIE mailing list.
(In case this automatic link does not work, send an email to firstname.lastname@example.org with the subject: subscribe prairie_news [your e-mail address].)
25th PRAIRIE seminar: 12 December at 14h
Speaker: George Deligiannidis, University of Oxford
Title: Quantitative Uniform Stability of the Iterative Proportional Fitting Procedure
— Bio —
After obtaining my PhD from the School of Mathematical Sciences of the University of Nottingham under the supervision of Sergey Utev and Huiling Le, I moved to the Department of Mathematics of the University of Leicester as a Teaching Assistant/Fellow. In 2012 I moved to the Department of Statistics of the University of Oxford as Departmental Lecturer. I stayed in Oxford until September 2016 when I moved to the Department of Mathematics of King’s College London as Lecturer in Statistics. I moved back to the University of Oxford in December 2017 as Associate Professor of Statistics.
— Résumé —
We establish the uniform-in-time stability, with respect to the marginals, of the Iterative Proportional Fitting Procedure, also known as the Sinkhorn algorithm, used to solve entropy-regularised Optimal Transport problems. Our result is quantitative and stated in terms of the 1-Wasserstein metric. As a corollary we establish a quantitative stability result for Schrödinger bridges.
This is joint work with V. de Bortoli and A. Doucet.
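For readers unfamiliar with the procedure named in the abstract, the following is a minimal sketch of Iterative Proportional Fitting (Sinkhorn) on discrete marginals. It is not taken from the talk: the function names `sinkhorn` and `max_abs_diff`, the toy marginals, the cost matrix and the regularisation value `eps` are all illustrative assumptions. The last lines perturb one marginal slightly and compare the two resulting plans, which is only a numerical illustration of the kind of stability the talk quantifies, not the result itself.

```python
import math


def sinkhorn(mu, nu, cost, eps=1.0, n_iters=200):
    """Entropy-regularised transport plan between discrete marginals mu and nu.

    Illustrative sketch: alternately rescales rows and columns of the Gibbs
    kernel exp(-cost / eps) so that the plan matches each marginal in turn.
    """
    n, m = len(mu), len(nu)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Fit the first marginal: row sums of the plan become mu.
        u = [mu[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Fit the second marginal: column sums of the plan become nu.
        v = [nu[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def max_abs_diff(p, q):
    """Largest entrywise difference between two plans of the same shape."""
    return max(abs(a - b) for rp, rq in zip(p, q) for a, b in zip(rp, rq))


# Toy problem (assumed values, chosen only for illustration).
mu = [0.5, 0.5]
nu = [0.25, 0.75]
cost = [[0.0, 1.0], [1.0, 0.0]]

plan = sinkhorn(mu, nu, cost)

# Slightly perturbed second marginal: the fitted plan should move only slightly.
nu_shifted = [0.26, 0.74]
plan_shifted = sinkhorn(mu, nu_shifted, cost)
gap = max_abs_diff(plan, plan_shifted)
```

The sketch uses pure Python lists to stay self-contained; for real problems a numerical library and log-domain updates would be the usual choice, since the kernel underflows when `eps` is small.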
24th PRAIRIE seminar: 7 November at 14h
Speaker: Thierry Poibeau, CNRS
Title: Artificial Intelligence and Society: What would a better AI mean?
— Bio —
CNRS Research Director, Head of the CNRS Lattice research unit (2012-2018) and adjunct head since 2019. Affiliated lecturer, Language Technology Laboratory, U. of Cambridge since 2009. Rutherford fellowship, Turing institute, London, 2018-2019. Teaching NLP in the PSL Master in Digital Humanities.
— Résumé —
Artificial Intelligence (AI) has made huge progress in the last few years. Applications are now deployed and have a real impact on society. The press regularly echoes concerns, from the general public as well as from professionals and even researchers themselves: if AI has achieved human-like performance on various tasks, should we fear the consequences? For example, the production of ‘fake news’ and ‘deep fakes’ on a large scale can be a danger for democracy. If language models reflect or even amplify the biases of the training data, there is a risk of discrimination, and so on.
In this presentation, we will come back to these thorny and topical questions. We will recall some well-known cases, which have made the headlines, where AI has been called into question in various ways. It seems pretty clear that some scandals could have been avoided and were due to problematic deployment of poorly developed systems. However, beyond that, we will show that the issues raised are complex: the notion of bias, for example, implies the idea of a norm. Who sets the standard? And, if unbiasing the models seems a laudable goal in itself, who could decide what a neutral, unbiased model would be? The notion of human or superhuman performance (which suggests a risk of humans losing control to the machine) must also be questioned: we still seem far from a general, autonomous AI, able to take power against humans.
In the end, our position is close to that of Kate Crawford: AI is too often described as an autonomous force, whereas it is made by humans, for humans, with specific interests that have to be unraveled. It is also clear that we, as researchers, have our responsibilities too and we cannot hide behind the supposed neutrality of technology. A better account of what the technology can do, and cannot do, would help raise the debate on these important questions.
23rd PRAIRIE seminar: 27 September 2022 at 11h
- Conference room of the Centre Sciences des Données, 45 rue d’Ulm, 3rd floor between stairs B and C
- Connection link: https://u-paris.zoom.us/j/82231267433?pwd=SHl6YkpIM3ZFck5oNTN4UWR1dkRldz09
Speaker: Masashi Sugiyama, RIKEN/The University of Tokyo
Title: Recent advances in robust machine learning
— Bio —
Masashi Sugiyama received a Ph.D. in Computer Science from Tokyo Institute of Technology in 2001. He has been a Professor at the University of Tokyo since 2014 and concurrently Director of the RIKEN Center for Advanced Intelligence Project (AIP) since 2016. His research interests include theories and algorithms of machine learning. In 2022, he received the Award for Science and Technology from Japan’s Minister of Education, Culture, Sports, Science, and Technology. He served as Program Co-chair for the Neural Information Processing Systems (NeurIPS) conference in 2015, the International Conference on Artificial Intelligence and Statistics (AISTATS) in 2019, and the Asian Conference on Machine Learning (ACML) in 2010 and 2020. He (co)authored Machine Learning in Non-Stationary Environments (MIT Press, 2012), Density Ratio Estimation in Machine Learning (Cambridge University Press, 2012), Statistical Reinforcement Learning (Chapman & Hall, 2015), Introduction to Statistical Machine Learning (Morgan Kaufmann, 2015), and Machine Learning from Weak Supervision (MIT Press, 2022).
— Résumé —
When machine learning systems are trained and deployed in the real world, we face various types of uncertainty. For example, training data at hand may contain insufficient information, label noise, and bias. In this talk, I will give an overview of our recent advances in robust machine learning, including weakly supervised classification (positive-unlabeled classification, positive-confidence classification, complementary-label classification, etc.), noisy label learning (noise transition estimation, instance-dependent noise, clean sample selection, etc.), and domain adaptation (joint importance-predictor learning for covariate shift adaptation, dynamic importance-predictor learning for full distribution shift, etc.).
23rd PRAIRIE seminar: 6 September 2022 at 14h
Speaker: Francis Bach
Title: Information theory through kernel method.
— Bio —
Researcher at Inria, leading since 2011 the machine learning team which is part of the Computer Science department at Ecole Normale Supérieure. Ph.D. Berkeley (2005). ERC Starting grant (2009) and Consolidator Grant (2016), Inria young researcher prize (2012), ICML test-of-time award (2014), Lagrange prize in continuous optimization (2018). Co-editor-in-chief of the Journal of Machine Learning Research. Member of the Academy of Sciences.
— Résumé —
Estimating and computing entropies of probability distributions are key computational tasks throughout data science. In many situations, the underlying distributions are only known through the expectation of some feature vectors, which has led to a series of works within kernel methods. In this talk, I will explore the particular situation where the feature vector is a rank-one positive definite matrix, and show how the associated expectation (a covariance matrix) can be used with information divergences from quantum information theory to draw direct links with the classical notions of Shannon entropies.
22nd PRAIRIE seminar: 22 June 2022 at 14h
Speaker: Xiaoyi Chen
Title: Patient phenotypic similarity for diagnosis of rare diseases.
— Bio —
Xiaoyi Chen is a researcher at Institut Imagine, a research institute specialized in genetic diseases. Her research focuses on automated methods to identify rare disease patients in huge real-world data repositories. She received her PhD in applied mathematics and computational biology at Institut Pasteur, Paris (2015). Between 2016 and 2022, she was a researcher in the Information Sciences to support Personalized Medicine group at Inserm UMR 1138 (now team HeKA Inria-Inserm-Université Paris Cité).
— Résumé —
Many rare diseases suffer from significant delays in diagnosis or from underdiagnosis, due to a broad spectrum of phenotypes and high genetic and clinical heterogeneity. One way to accelerate the diagnosis process is to rely on patients’ electronic health records (EHRs) for automatic phenotyping and to develop algorithms that identify, in large-scale clinical data warehouses, patients whose profiles are similar to those of already diagnosed patients.
In this talk, I will summarize recent efforts in the context of the RHU C’IL-LICO project to develop diagnosis support systems that take into consideration the semantic relations between clinical concepts and the different levels of relevance presented in patients’ EHRs – including incompleteness, inaccurate phenotyping, noisy phenotypes related to multiple comorbidities and medical histories, as well as the clinical heterogeneity of complex rare diseases and the important imbalance issues.
21st PRAIRIE seminar: 10 May 2022 at 14h
Speaker: Alessandro Rudi
Title: Representing non-negative functions, with applications to non-convex optimization and beyond
— Bio —
Alessandro Rudi has been a researcher at INRIA, Paris, since 2017. He received his PhD in 2014 from the University of Genova, after being a visiting student at the Center for Biological and Computational Learning at the Massachusetts Institute of Technology. Between 2014 and 2017 he was a postdoctoral fellow at the Laboratory of Computational and Statistical Learning at the Italian Institute of Technology and the University of Genova.
— Résumé —
Many problems in applied mathematics admit a natural representation in terms of non-negative functions, e.g. probability representation and inference, optimal transport, optimal control, non-convex optimization, to name a few. While linear models are well suited to represent functions with output in R or C, being at the same time very expressive and flexible, the situation is different for the case of non-negative functions where the existing models lack one of these good properties.
In this talk we present a model for non-negative functions that promises to bring to these problems the same benefits that linear models brought to interpolation, approximation, quadrature and supervised learning, leading to a new class of adaptive algorithms with provably fast convergence.
In particular, we will show direct applications in numerical methods for probability representation and non-convex optimization. We will see in more detail that the model allows us to derive an algorithm for non-convex optimization that is adaptive to the degree of differentiability of the objective function and achieves optimal rates of convergence. Finally, we show how to apply the same technique to other interesting problems in applied mathematics that can be easily expressed in terms of inequalities.
Ulysse Marteau-Ferey, Francis Bach, Alessandro Rudi. Non-parametric Models for Non-negative Functions. https://arxiv.org/abs/2007.03926
Alessandro Rudi, Ulysse Marteau-Ferey, Francis Bach. Finding Global Minima via Kernel Approximations. https://arxiv.org/abs/2012.11978
Alessandro Rudi, Carlo Ciliberto. PSD Representations for Effective Probability Models. https://arxiv.org/pdf/2106.16116.pdf
20th PRAIRIE seminar: 6 April 2022 at 15h
Speaker: Justin Solomon
Title: Application-Driven Geometric Machine Learning
— Bio —
Justin Solomon is an associate professor of Electrical Engineering and Computer Science in the MIT Computer Science and Artificial Intelligence Laboratory. He runs the MIT Geometric Data Processing group, which studies problems at the intersection of geometry, large-scale optimization, and applications in machine learning, graphics, and vision.
— Résumé —
From 3D modeling to autonomous driving, a variety of applications can benefit from data-driven reasoning about geometric problems. The available data and preferred shape representation, however, vary widely from one application to the next. Indeed, the one commonality among most of these settings is that they are not easily approached using the data-driven methods that have become de rigueur in other branches of computer vision and machine learning. In this talk, I will summarize recent efforts in my group to develop learning architectures and methodologies paired to specific applications, from point cloud processing to mesh and implicit surface modeling. In each case, we will see how mathematical structures and application-specific demands drive our design of the learning methodology, rather than bending application demands or ignoring geometric details to apply a standard data analysis technique.
19th PRAIRIE seminar: 9 March 2022 at 14h
Speaker: Stéphanie Allassonnière
Title: Data Augmentation in High Dimensional Low Sample Size Setting Using a Geometry-Based Variational Autoencoder
— Bio —
Professor of Mathematics at the School of Medicine, University of Paris and associated Professor in the applied Mathematics department of Ecole Polytechnique. Manager of master programs and masterclasses in AI in healthcare.
— Résumé —
In this presentation, we propose a new method to perform data augmentation in a reliable way in the High Dimensional Low Sample Size (HDLSS) setting using a geometry-based variational autoencoder. Our approach combines a proper latent space modeling of the VAE seen as a Riemannian manifold with a new generation scheme which produces more meaningful samples, especially in the context of small data sets. The proposed method is tested through a wide experimental study where its robustness to data sets, classifiers and training sample size is stressed. It is also validated on a medical imaging classification task on the challenging ADNI database where a small number of 3D brain MRIs are considered and augmented using the proposed VAE framework. In each case, the proposed method allows for a significant and reliable gain in the classification metrics. For instance, balanced accuracy jumps from 66.3% to 74.3% for a state-of-the-art CNN classifier trained with 50 MRIs of cognitively normal (CN) and 50 Alzheimer’s disease (AD) patients, and from 77.7% to 86.3% when trained with 243 CN and 210 AD, while greatly improving sensitivity and specificity metrics.
18th PRAIRIE seminar: 16 February 2022 at 14h
Speaker: Abdellah Fourtassi
Title: How AI can help us study the complexity of children’s early language acquisition
— Bio —
I am currently a researcher (“délégation recherche”) at INRIA Paris, visiting from Aix-Marseille University where I have been Assistant Professor (Maître de Conférence) of computer science since late 2019. I am also a research fellow at the Institute of Language, Communication, and the Brain (ILCB) where I direct the interdisciplinary research group “Computational Communication, and Development” (cocodev.fr). Prior to that, I was a postdoctoral research fellow at Stanford University. I completed my PhD at Ecole Normale Supérieure rue d’Ulm and my undergraduate studies at Ecole Polytechnique.
— Résumé —
To acquire language, children need to learn the form (e.g., phonology and syntax), the content (e.g., word and sentence meanings), and the use (e.g., finding the right words to convey communicative intents). Research in language development has traditionally simplified this process by studying these dimensions separately. The reality of the situation is that children have to deal with aspects of form, content, and use simultaneously. In addition, experimental studies suggest that the timelines of acquisition of these dimensions largely overlap, indicating that children learn them in parallel, not one at a time. While this fact makes language acquisition seem even harder than we previously thought, here I show that the joint learning of form, content, and use may actually be more a help than a hindrance: These dimensions are *interdependent* in many ways and can therefore constrain/disambiguate each other.
More generally, I argue that research into the complex interaction/synergy across linguistic levels during child development requires going beyond traditional research tools in the field of child development (e.g., controlled experiments) and integrating cutting-edge methods from AI in our research toolkit. This new research method is instrumental not only in piercing some lingering mysteries in children’s language learning but also in understanding the development of this complex phenomenon in its natural context (e.g., as opposed to in-lab studies), thus making it much easier to translate scientific findings into real-life interventions and societal applications.
17th PRAIRIE seminar: 19 January 2022 at 14h
Speaker: Marc Lelarge, Inria
Title: “Exploiting Graph Invariants in Deep Learning”
— Bio —
Dr. Marc Lelarge is a researcher at INRIA. He is also a lecturer in deep learning at Ecole Polytechnique (Palaiseau, France) and Ecole Normale Superieure. He graduated from Ecole Polytechnique, qualified as an engineer at Ecole Nationale Superieure des Telecommunications (Paris) and received a PhD in Applied Mathematics from Ecole Polytechnique in 2005. Recipient of the 2012 SIGMETRICS rising star researcher award and the 2015 Best Publication in Applied Probability Award with Mohsen Bayati and Andrea Montanari for their work on compressed sensing.
— Résumé —
Geometric deep learning is an attempt at a geometric unification of a broad class of machine learning problems from the perspectives of symmetry and invariance. In this talk, I will present some advances of geometric deep learning applied to combinatorial structures. I will focus on various classes of graph neural networks that have been shown to be successful in a wide range of applications with graph-structured data.
16th PRAIRIE seminar: 8 December 2021 at 14h
Speaker: Jean-François Ethier, Université de Sherbrooke
Title: “Learning health systems to support data intensive research: a Canadian perspective”
— Bio —
Jean-François Ethier is associate researcher at Unit 1138 Cordeliers Research Center at INSERM and Paris Descartes University, and at the Research Center of the University Hospital of Sherbrooke. He is a specialist in general internal medicine.
Jean-François Ethier’s work focuses on learning health systems. In particular, it centres on methods of access to health data, research systems and clinical decision aid tools where citizens play active roles. He develops theoretical approaches and concrete tools so that information and research systems can communicate with each other.
His research influences how biomedical data warehouses are structured. It also makes it possible to combine a variety of health data stored in computer systems that operate differently.
Jean-François Ethier develops ontologies and biomedical terminologies. These integrate diverse and heterogeneous databases within a unified data network. These solutions facilitate the flow of information between science and clinical practice. They propel health research while supporting health care professionals who make many clinical decisions every day.
Jean-François Ethier participated in the development of the TRANSFoRm project in Europe (project funded by the European Community under the FP7 program). TRANSFoRm has created a prototype of a learning health system to support primary health care and services.
— Résumé —
The learning health system (LHS) approach is increasingly regarded as an optimal paradigm to foster concrete health improvements for the population. AI plays a significant role in offering new insights into health data, yet difficulties in accessing data currently curtail its potential. By placing it in the context of learning health systems, it is possible to foster data access securely and ethically while ensuring the evaluation of anticipated benefits for care delivery. Canada is currently implementing structures at the national, provincial and regional levels to facilitate this. The presentation will therefore briefly present how AI can fit in the LHS paradigm, explore challenges regarding this integration and discuss Canadian organisations supporting it, like the Health Data Research Network Canada.
15th PRAIRIE seminar: 17 November at 14h
Speaker: Stanley Durrleman
Title: “Modelling and predicting the progression of neurodegenerative diseases”
— Bio —
Dr. Stanley Durrleman is a senior researcher at Inria and head of the ARAMIS Lab at the Paris Brain Institute (ICM) on the campus of the Pitié-Salpêtrière hospital. He is a fellow of the Paris AI research institute (PRAIRIE). He holds a PhD in applied mathematics from the university of Nice (2010) and a habilitation from Sorbonne University in Paris (2018). His research interests lie in the field of mathematical modeling and statistical learning applied to imaging and medical data. S. Durrleman has received several awards including the second Gilles Kahn award for the best dissertation in computer science in 2010 and a starting grant from the European research council (ERC) in 2015, and was the first laureate of a Sanofi iDEA award outside of the USA in 2019. In 2020, he received the Inria – Académie des Sciences young researcher award.
— Résumé —
In this talk, we will review disease course mapping, a statistical technique aiming to chart the range of trajectories of a series of imaging biomarkers and clinical endpoints changing during disease progression. The technique relies on differential geometric principles and may be used for any data that can be represented on Riemannian manifolds. It uniquely separates variations due to differences in the dynamics of the progression from variations due to differences in the presentation of the disease.
We will show that this technique can forecast the values of the biomarkers and clinical endpoints with smaller errors than state-of-the-art methods. Such predictions can be used, in turn, to design clinical trials with better statistical power by selecting patients with homogeneous progression profiles.
We will illustrate these methods on three therapeutic areas: Alzheimer’s, Parkinson’s and Huntington’s disease.
14th PRAIRIE seminar – 6 October at 14h
Speaker: Ron Kimmel (Technion)
Title: “On Geometry and Learning”
— Bio —
Ron Kimmel is a Professor of Computer Science and Electrical & Computer Eng. (by courtesy) at the Technion where he holds the Montreal Chair in Sciences. He held a post-doctoral position at UC Berkeley and a visiting professorship at Stanford University. He has worked in various areas of shape reconstruction and analysis in computer vision, image processing, deep learning of big geometric data, and computer graphics. Kimmel’s interest in recent years has been understanding of machine learning, medical imaging and computational biometry, optimization of solvers to problems with a geometric flavor, and applications of metric, spectral, Riemannian, and differential geometries. Kimmel is an IEEE Fellow and SIAM Fellow for his contributions to image processing, shape reconstruction and geometric analysis. He is the founder of the Geometric Image Processing Lab. and a founder and advisor of several successful image processing and analysis companies.
— Résumé —
Geometry means understanding in the sense that it involves finding the most basic invariants or Ockham’s razor explanation for a given phenomenon. At the other end, modern Machine Learning has little to do with explanation or interpretation of solutions to a given problem.
I’ll try to give some examples about the relation between learning and geometry, focusing on learning geometry, starting with the most basic notion of planar shape invariants, efficient distance computation on surfaces, and treating surfaces as metric spaces within a deep learning framework. I will introduce some links between these two seemingly orthogonal philosophical directions.
13th PRAIRIE seminar – 8 September at 14h
Connection link: https://global.gotomeeting.com/join/834058333
Speaker: Christel Daniel
Title: Innovation through healthcare data at Greater Paris University Hospital (AP-HP)
— Bio —
Pathologist (MD) with PhD in biomedical informatics, associate director at Assistance Publique – Hôpitaux de Paris (AP-HP) in charge of AP-HP clinical terminologies and of data-driven research and innovation (reuse of AP-HP real-world big data (AP-HP Clinical Data Repository (CDR), https://eds.aphp.fr) and clinical research data). Primary areas of research are clinical informatics, clinical research informatics, semantic interoperability. Past co-chair of IHE Anatomic Pathology domain. Member of DICOM WG26, HL7 France, HL7 Pathology SIG and CDISC France. Co-editor of the Clinical Research Informatics section of the IMIA yearbook.
— Résumé —
Greater Paris University Hospital (AP-HP) is a globally recognized university hospital center with a European dimension welcoming more than 10 million patients in its 39 hospitals: in consultation, in emergency, during scheduled hospitalizations or in hospitalization at home. AP-HP is committed to a proactive policy of accelerating the use of clinical data collected during clinical care. Developing AI-powered decision aids is one of the major components of a Learning Health System: a system able to learn and improve from its data. With the constant concern of improving the health and well-being of citizens, the challenge is to integrate and promote digital innovations with demonstrated impact on clinical outcomes at an acceptable cost. The directions of Clinical Research and Innovation and of Information System are offering tools and services to a broad set of users supporting piloting, research and innovation activities. Supported by an institutional secured and high-performance cloud, the AP-HP data space integrates a large amount of healthcare data collected during both routine clinical care and research activities that can be leveraged for secondary use. The major component of the AP-HP data space is the AP-HP Clinical Data Warehouse (CDW) (https://eds.aphp.fr), the first CDW authorized by the French Data Protection Authority, enabling the processing of deidentified health data from more than 10 million patients to facilitate research and improve the health system, making it more efficient and more personalized. More than 130 research projects, authorized by the AP-HP Institutional Review Board, have been conducted or are running on the AP-HP healthcare data (observational studies, development and external validation of AI/ML algorithms), including 63 projects related to the COVID-19 pandemic. New services that leverage EHR data to accelerate clinical research are under construction.
12th PRAIRIE seminar – 12 July at 14h
Speaker: Christian Robert
Title: From Geyer’s reverse logistic regression to GANs, a statistician tale on normalising constants
— Bio —
Professor at Université Paris Dauphine since 2000, part-time professor at University of Warwick (Fall 2013- ), fellow of the ASA (2012) and the IMS (1996), former editor of the Journal of the Royal Statistical Society (2006-2010) and deputy editor of Biometrika (2018-), senior member of Institut Universitaire de France (2010-2021)
— Résumé —
The problem of unknown normalising constants has been a long-standing issue in statistics and in particular Bayesian statistics. While many simulation-based proposals have been made to address this issue, a class of methods stands out as relying on statistical representations to produce estimators of these normalising constants, along with uncertainty quantification. The starting point is Geyer’s (1994) reverse logistic regression, which proves highly efficient and robust to the curse of dimensionality. It relates to later Monte Carlo methods like bridge sampling and multiple mixtures, as well as to statistical and learning principles such as non-parametric MLE, noise contrastive estimation (NCE), and generative adversarial networks (GANs).
[This talk is based on on-going, joint, work with Jean-Michel Marin and Judith Rousseau.]
11th PRAIRIE seminar – 16 June at 14h
Speaker: Max Welling, University of Amsterdam
Title: Unsupervised Learning of Equivariant Space-Time Capsules
— Bio —
Prof. Dr. Max Welling is a research chair in Machine Learning at the University of Amsterdam and a VP Technologies at Qualcomm. He has a secondary appointment as a fellow at the Canadian Institute for Advanced Research (CIFAR). Max Welling served as associate editor in chief of IEEE TPAMI from 2011 to 2015. He has served on the board of the NeurIPS foundation since 2015 and was program chair and general chair of NeurIPS in 2013 and 2014 respectively. He was also program chair of AISTATS in 2009 and ECCV in 2016 and general chair of MIDL 2018. Max Welling is recipient of the ECCV Koenderink Prize in 2010. He directs the Amsterdam Machine Learning Lab (AMLAB) and co-directs the Qualcomm-UvA deep learning lab (QUVA) and the Bosch-UvA Deep Learning lab (DELTA). He is a fellow and founding board member of the European Lab for Learning and Intelligent Systems (ELLIS).
— Résumé —
Equivariance is an organizing principle in deep learning that expresses how internal representations should behave under symmetry transformations. To learn equivariant neural networks, we usually must know the representation theory for the symmetry group under consideration. This raises the question: can this structure also be learned in a completely unsupervised way? In this talk I will argue that we can use a connection between topographic representations (like the ones developed in topographic ICA) and the notion of equivariant capsules. Capsules also organize representations such that nearby filters in the topographic map are similar. This means that as we observe a stimulus over time, we expect that the activations change smoothly and slowly through this “neural space-time”. By structuring these representations as circular capsules, internal representations behave as oscillators (one oscillator per capsule), and we can predict the future by rolling forward activated oscillators. If time allows, I will try to make a connection to quantum field theory and Hinton particles inside neural networks which end up being quantum excitations of these space-time capsule oscillators.
10th PRAIRIE seminar – 12 May at 14h
Speaker: Raphaël Porcher
Title: Population benefit and practical implementation of individualized treatment strategies
— Bio —
Associate Professor of Biostatistics at Université de Paris, co-director of the Centre Virchow-Villermé Paris Berlin, and member of the METHODS team of CRESS-UMR1153. Member of the Comité d’Evaluation Ethique / Institutional Review Board of Inserm. Senior Associate Editor for Methods at Clinical Orthopaedics and Related Research, and Associate Editor for Statistics, Artificial Intelligence and Modeling Outcomes at the Journal of Hepatology.
— Résumé —
In recent years, numerous methods have been developed to estimate individualized treatment effects, and associated individualized treatment rules (ITRs), making it possible to identify who benefits more from one treatment or another, which is at the core of personalized or precision medicine. Approaches range from the use of traditional risk prediction models to estimate individualized treatment effects in a counterfactual framework to sophisticated machine learning approaches targeting the individualized treatment effects or directly learning the ITR. Moreover, interest (and methods) is shifting to so-called dynamic treatment regimes (DTRs), where the issue is not only who benefits but when (e.g. starting or stopping a treatment).
In this talk, we will present the counterfactual framework and points to consider when developing ITRs. We will then discuss issues on how to estimate the population benefit of an ITR, and develop more on how to account for the implementation or adoption of ITRs in practice, using practical examples. Last, we will briefly discuss DTRs.
9th PRAIRIE seminar – 14 April 2021, at 14h (webinar)
Speaker: Hélène Dessales (ENS – PSL)
Title: From 3D to 4D modelling in archaeology: an application in Roman Pompeii
— Bio —
Hélène Dessales is a lecturer in Roman archaeology at the Ecole normale supérieure, in the Département des Sciences de l’Antiquité, as a member of the AOROC research unit. She was a fellow of the Ecole française de Rome and a junior member of the Institut Universitaire de France. Her research focuses on building techniques in the Roman world. She is also a specialist in the history of archaeology, studying graphical archives of the 19th century, through the corpus of the “Grand Tour” travellers in Italy. She has led several field missions, in France, Spain and Italy. In Pompeii in particular, she has recently coordinated various research projects (PSL structuring program – Pompeii 3D; ANR RECAP – Rebuilding after an earthquake: ancient experiences and innovations in Pompeii) and published a volume on a significant monument (The Villa of Diomedes. The making of a Roman villa in Pompeii, Paris, 2020).
— Résumé —
3D modelling in archaeology and architectural studies is both a research tool and an important medium for dissemination to the public. During the last decade, the role of computer vision and photogrammetry has grown considerably and changed the practices of archaeological surveys and drawings. The purpose of this talk is to explore the challenges of 4D visualization through a case study in Pompeii, where time is integrated as a fourth dimension alongside the three spatial dimensions of the virtual space. A distinctive feature of Pompeii is that it allows the various building stages to be traced back to Roman times, from the first phases of the urban settlement to the eruption of Vesuvius, and also allows the evolution and restoration of the archaeological site to be integrated, from its discovery at the end of the eighteenth century until today. In this way, the 4D model functions as a veritable time machine and is implemented as a scientific research tool to interpret the archaeological data.
8th Prairie seminar – 10 March 2021, at 14h (webinar)
Speaker: Jean-Paul Laumond
Title: Nonholonomic motion: from the rolling car to the rolling man
=== Bio ===
CNRS Research Director (DRCE2), President-CEO of Kineo CAM (2000-2002), IEEE Fellow (2007), Professor at Collège de France (2011-2012), ERC Adv. Grant (2014-2018), Member of the Academy of Technology (2015), IEEE Inaba Technical Award for Innovation Leading to Production (2016), Member of the Academy of Sciences (2017).
==== Résumé ====
The purpose of the presentation is to show how, since the 1990s, robotics has integrated techniques from geometrical control (optimal control, differential flatness) to automate the computation of the movements of mobile robots subjected to the nonholonomic constraint of rolling without slipping. We will present both the problems solved and the questions still open today. In a second step, we take a multidisciplinary perspective combining robotics, neurophysiology and biomechanics to better understand the geometry of bipedal locomotion.
7th Prairie seminar – 10 February 2021, at 14h (webinar)
Speaker: Joan Bruna, New York University
Title: Mathematical aspects of neural network approximation and learning
=== Bio ===
Joan Bruna is an Assistant Professor at Courant Institute, New York University (NYU), in the Department of Computer Science, Department of Mathematics (affiliated) and the Center for Data Science, since Fall 2016. He belongs to the CILVR group and to the Math and Data groups. From 2015 to 2016, he was Assistant Professor of Statistics at UC Berkeley and part of BAIR (Berkeley AI Research). Before that, he worked at FAIR (Facebook AI Research) in New York. Prior to that, he was a postdoctoral researcher at Courant Institute, NYU. He completed his PhD in 2013 at Ecole Polytechnique, France. Before his PhD he was a Research Engineer at a semiconductor company, developing real-time video processing algorithms. Even before that, he did an MSc at Ecole Normale Superieure de Cachan in Applied Mathematics (MVA) and a BA and MS at UPC (Universitat Politecnica de Catalunya, Barcelona) in both Mathematics and Telecommunication Engineering. For his research contributions, he has been awarded a Sloan Research Fellowship (2018), an NSF CAREER Award (2019) and a best paper award at ICMLA (2018).
==== Résumé ====
High-dimensional learning remains an outstanding phenomenon where experimental evidence outpaces our current mathematical understanding, mostly due to the recent empirical successes of Deep Learning. Neural Networks provide a rich yet intricate class of functions with statistical abilities to break the curse of dimensionality, and where physical priors can be tightly integrated into the architecture to improve sample efficiency. Despite these advantages, an outstanding theoretical challenge in these models is computational: providing an analysis that explains successful optimization and generalization in the face of existing worst-case computational hardness results.
In this talk, we will describe snippets of this challenge, covering optimization and approximation in turn. First, we will focus on the framework that lifts parameter optimization to an appropriate measure space. We will review existing results that guarantee global convergence of the resulting Wasserstein gradient flows, and present our recent results that study typical fluctuations of the dynamics around their mean field evolution, as well as extensions of this framework beyond vanilla supervised learning to account for symmetries in the function and in competitive optimization. Next, we will discuss the role of depth in terms of approximation, and present novel results establishing so-called ‘depth separation’ for a broad class of functions. We will conclude by discussing consequences in terms of optimization, highlighting current and future mathematical challenges.
6th Prairie seminar – 13 January 2021, at 11h CET (webinar)
Speaker: Béatrice Joyeux-Prunel, University of Geneva
Title: Data Science applied to Visual Globalization. The project Visual Contagions
=== Bio ===
Béatrice Joyeux-Prunel is a full Professor at the University of Geneva in Switzerland, where she holds the chair for Digital Humanities (dh.unige.ch). She leads the FNS Project Visual Contagions and the IMAGO Centre at the École Normale Supérieure, a center dedicated to teaching, research and creation on the circulation of images in Europe (www.imago.ens.fr). In 2009, Joyeux-Prunel founded Artl@s, a platform that federates several international research projects on the globalization of art and images and digital approaches. She works on the social and global history of modern art, on visual globalization, the Digital Humanities, and the visual history of petroleum. Among her publications: Béatrice Joyeux-Prunel (ed.) with the collaboration of Luc Sigalo-Santos, L’art et la mesure. Histoire de l’art et méthodes quantitatives: sources, outils, bonnes pratiques (ed. Rue d’Ulm, 2010); Catherine Dossin, Thomas DaCosta Kaufmann, and Béatrice Joyeux-Prunel (ed.), Circulations in the Global History of Art (Routledge, 2015). And as sole author: Les avant-gardes artistiques – une histoire transnationale 1848-1918 (Gallimard Folio histoire pocket series, 2016); Les avant-gardes artistiques – une histoire transnationale 1918-1945 (Gallimard Folio histoire pocket series, 2017); and Naissance de l’art contemporain (1945-1970) – Une histoire mondiale (Editions du CNRS, 2021).
==== Résumé ====
Images are the somewhat sickly child of globalization studies. We know that they have conveyed and still convey behavioural models, representations and values that participate in the cultural homogenization by which globalization is most often identified. But we are quite incapable of explaining how this homogenization has taken place; which images have circulated or been imitated the most in the past; according to which social, cultural, geographic channels; what were their success factors; and whether there has been more homogenization than fabrication of heterogeneity in the global circulation of images.
Data science can be very useful in trying to answer these questions, or at least in providing hypotheses about image-based globalization. The Visual Contagions project (Swiss National Science Foundation, 2021-2025) and the Imago Center (Label ERC European Center of Excellence Jean Monnet, ENS/Beaux-Arts de Paris, 2019-2022) address these questions. The particular case of the age of the illustrated print makes it possible to study the matter over a long period (1890-1990) and on a global scale, since we have an unprecedented quantity of digitized sources. What remains is to establish a workflow that is as relevant as possible, which raises decisive issues for the digital humanities: how to organize the infrastructure for hosting and retrieving our images, so as not to re-host data already made available by others? How can we minimize the computing time of our algorithms? Can we envisage pattern descriptions that are interoperable and can be exchanged between projects that apply the same pattern recognition methods? Once the images have been described, how can we visualize their circulation in time, space, social and cultural environments? What interpretations can then be made of the results obtained?
5th Prairie seminar – 18 November 2020, at 11h CET (webinar)
Speaker: Marta Ruiz Costa-Jussa, Universitat Politècnica de Catalunya
Title: Insights in (Spoken) Multilingual Machine Translation: examining Continuous Learning and Fairness.
=== Bio ===
Marta R. Costa-Jussà is a Ramon y Cajal Researcher at the Universitat Politècnica de Catalunya (UPC, Barcelona). She received her PhD from the UPC in 2008. Her research experience is mainly in Machine Translation. She has worked at LIMSI-CNRS (Paris), Barcelona Media Innovation Center, Universidade de São Paulo, Institute for Infocomm Research (Singapore), Instituto Politécnico Nacional (Mexico) and the University of Edinburgh. Recently, she has received an ERC Starting Grant 2020 and two Google Faculty Research Awards (2018 and 2019).
==== Résumé ====
Multilingual Machine Translation is at the core of social communication. In everyday situations, we rely on free commercial services. These systems have improved their quality thanks to the use of deep learning techniques. Despite the considerable progress that machine translation is making, why do we still see that translation quality is much better from English to Portuguese than between spoken Dutch and Catalan? In addition, demographic biases widely affect our systems, ranging from poorer speech recognition for women than for men to stereotyped translations: why do neutral words such as “doctor” tend to be assigned the “male” gender when translated into a language that requires gender inflection for this word?
In this talk, we will give some profound insights into (spoken) multilingual machine translation, pursuing similar quality for all languages and allowing for incremental addition of new languages. Moreover, we will give details on the fairness challenge, focusing on producing multilingual data balanced in terms of gender, working towards transparency, and debiasing algorithms.
4th Prairie seminar – 16 September 2020, at 11h CET (webinar)
Speaker: Éric Moulines, École Polytechnique
Title: “MCMC, Variational Inference, Invertible Flows… Bridging the gap?”
==== Résumé ====
Variational Autoencoders (VAE), generative models combining variational inference and autoencoding, have found widespread application in learning latent representations for high-dimensional observations. However, most VAEs rely on simple mean-field variational distributions and usually suffer from somewhat limited expressiveness, which results in a poor approximation of the conditional latent distribution and in particular mode dropping. In this work, we propose Metropolized VAE (MetVAE), a VAE approach based on a new class of variational distributions enriched with Markov Chain Monte Carlo. We develop a specific instance of MetVAE with Hamiltonian Monte Carlo and demonstrate clear improvements in the latent distribution approximations at the price of a moderate increase in computational cost. We consider application to probabilistic collaborative filtering models, and numerical experiments on classical benchmarks support the performance of MetVAE.
3rd Prairie seminar – 9 June 2020, at 14h CET (webinar)
Speaker: Alex Cristia, Laboratoire de Sciences Cognitives et Psycholinguistique, Département d’études cognitives, ENS, EHESS, Centre National de la Recherche Scientifique, PSL Research University, https://sites.google.com/site/acrsta/
Title: « Unsupervised learning of sounds and words: Is it easier from child-directed speech? »
==== Résumé ====
Developments in recent years have sometimes led to systems that can achieve super-human performance even in tasks previously thought to require human cognition. As of today, however, humans remain simply unsurpassable in the domain of native language acquisition. Children routinely become fluent in one or more languages by about 4 years of age, after exposure to possibly as little as 500 hours, and at most 8,000 hours, of speech. In stark contrast, the best speech recognition and natural language processing systems on the market today require up to 100 times those quantities of input to achieve a level of performance that is substantially lower than that of humans, often having to employ at least some labeled data. It has been argued that infants’ acquisition is aided by cooperative tutors: child-directed speech may be simplified in ways that boost learning. In this talk, I present results from several studies assessing the learnability of speech sounds and words from child- versus adult-directed speech. I demonstrate that learnability is increased in input to children only when we assume the learner has access to representations that abstract from the acoustic signal; when presented with acoustic speech features, however, learnability is lower for child- than for adult-directed speech. These results suggest present-day machines are unlikely to benefit from infant-directed input, unless we improve our acoustic representations of speech.
2nd Prairie seminar – 6 May 2020 (webinar)
Speaker: Marc Mézard, Ecole normale supérieure – Université PSL
Title: « The light shed by statistical physics on some questions in machine learning »
==== Résumé ====
For more than thirty years, there have been a number of attempts to use concepts and methods from statistical physics to develop a theoretical framework for machine learning, with mixed success. This line of research has recently been revived around the important open questions raised by recent developments in deep learning, in particular questions related to the dynamics of learning algorithms and to the structure of data.
This talk will present some of these recent developments from a global perspective, highlighting the strengths and weaknesses of such approaches.
1st Prairie seminar – 5 February 2020
Speaker: Jean François Cardoso, CNRS and Institut d’Astrophysique de Paris (http://www2.iap.fr/users/cardoso/)
Title: « Information geometry of Independent Component Analysis »
==== Résumé ====
Independent Component Analysis is an exploratory technique which, as its name implies, aims at decomposing a vector of observations into components which are statistically independent (or as independent as possible). It has numerous applications, particularly in neuroscience for extracting brain sources from their observed mixtures collected on the scalp.
ICA goes well beyond PCA (Principal Component Analysis) because statistical independence is a much stronger property than mere decorrelation. Of course, this program implies that an ICA method must use non-Gaussian statistics in order to express independence (otherwise, independence would reduce to decorrelation).
In this (non-technical) seminar, I use a simple construction of Information Geometry (a Pythagorean theorem in distribution space) to elucidate the connections in ICA between the main players: correlation, independence, non-Gaussianity, mutual information and entropy.
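The statement in this abstract that independence is a stronger property than decorrelation can be checked numerically. The following is a minimal sketch and is not taken from the seminar: it assumes two independent unit-variance sources mixed by a 45-degree rotation, and the helper names (`rotate_sources`, `correlation`, `squared_covariance`) are hypothetical, chosen for illustration only. The covariance between the squared components is used here as one simple higher-order statistic; it is not the information-geometric construction mentioned in the abstract.

```python
import numpy as np


def rotate_sources(sources, angle):
    """Mix two source signals (shape (2, n)) with a plane rotation."""
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s], [s, c]])
    return rotation @ sources


def correlation(y):
    """Second-order dependence: correlation between the two components."""
    return np.corrcoef(y)[0, 1]


def squared_covariance(y):
    """Higher-order dependence: covariance between the squared components.

    This is zero when the components are independent.
    """
    return np.cov(y[0] ** 2, y[1] ** 2)[0, 1]


rng = np.random.default_rng(0)
n = 200_000

# Independent, zero-mean, unit-variance sources (assumed for illustration).
uniform_sources = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n))
gaussian_sources = rng.standard_normal((2, n))

# A rotation keeps unit-variance independent sources uncorrelated,
# but it mixes them, so the outputs are generally no longer independent.
mixed_uniform = rotate_sources(uniform_sources, np.pi / 4)
mixed_gaussian = rotate_sources(gaussian_sources, np.pi / 4)

print("uniform  corr:", correlation(mixed_uniform))
print("uniform  cov of squares:", squared_covariance(mixed_uniform))
print("gaussian corr:", correlation(mixed_gaussian))
print("gaussian cov of squares:", squared_covariance(mixed_gaussian))
```

Both mixtures are uncorrelated, so PCA-style second-order statistics see nothing to fix. For the uniform sources the covariance of the squares is close to its theoretical value of -0.6, which reveals the remaining dependence. For the Gaussian sources it is close to 0, because rotated independent Gaussians are still independent, which is why non-Gaussian statistics are needed to identify the components.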