Colloquium PRAIRIE

To receive PRAIRIE news and colloquium announcements, sign up for the PRAIRIE mailing list.

(In case this automatic link does not work, send an email to sympa@inria.fr with the subject: subscribe prairie_news [your e-mail address].)

Connection link: https://u-paris.zoom.us/j/82231267433?pwd=SHl6YkpIM3ZFck5oNTN4UWR1dkRldz09

———————————————–

PAST SEMINARS

14th PRAIRIE seminar – 6 October at 14h

Speaker: Ron Kimmel (Technion)

Title: « On Geometry and Learning »

—- Bio —-

Ron Kimmel is a Professor of Computer Science and Electrical & Computer Engineering (by courtesy) at the Technion, where he holds the Montreal Chair in Sciences. He held a post-doctoral position at UC Berkeley and a visiting professorship at Stanford University. He has worked in various areas of shape reconstruction and analysis in computer vision, image processing, deep learning of big geometric data, and computer graphics. Kimmel’s interests in recent years include the understanding of machine learning, medical imaging and computational biometry, the optimization of solvers for problems with a geometric flavor, and applications of metric, spectral, Riemannian, and differential geometries. Kimmel is an IEEE Fellow and a SIAM Fellow for his contributions to image processing, shape reconstruction and geometric analysis. He is the founder of the Geometric Image Processing Lab and a founder and advisor of several successful image processing and analysis companies.

— Résumé —

Geometry means understanding in the sense that it involves finding the most basic invariants, or the Ockham’s-razor explanation, for a given phenomenon. At the other extreme, modern machine learning has little to do with explaining or interpreting solutions to a given problem.
I’ll try to give some examples of the relation between learning and geometry, focusing on learning geometry: starting with the most basic notion of planar shape invariants, moving to efficient distance computation on surfaces, and treating surfaces as metric spaces within a deep learning framework. I will introduce some links between these two seemingly orthogonal philosophical directions.

Video



13th PRAIRIE seminar – 8 September at 14h

Connection link: https://global.gotomeeting.com/join/834058333

Speaker: Christel Daniel

Title: Innovation through healthcare data at Greater Paris University Hospital (AP-HP)

—- Bio —-

Pathologist (MD) with a PhD in biomedical informatics, associate director at Assistance Publique – Hôpitaux de Paris (AP-HP) in charge of AP-HP clinical terminologies and of data-driven research and innovation, i.e. the reuse of AP-HP real-world big data (the AP-HP Clinical Data Repository (CDR), https://eds.aphp.fr) and clinical research data. Primary areas of research: clinical informatics, clinical research informatics and semantic interoperability. Past co-chair of the IHE Anatomic Pathology domain. Member of DICOM WG26, HL7 France, the HL7 Pathology SIG and CDISC France. Co-editor of the Clinical Research Informatics section of the IMIA Yearbook.

— Résumé —

Greater Paris University Hospital (AP-HP) is a globally recognized university hospital center with a European dimension, welcoming more than 10 million patients in its 39 hospitals: in consultation, in emergency care, during scheduled hospitalizations or in hospitalization at home. AP-HP is committed to a proactive policy of accelerating the use of clinical data collected during care. Developing AI-powered decision aids is one of the major components of a Learning Health System: a system able to learn and improve from its data. With the constant concern of improving the health and well-being of citizens, the challenge is to promote digital innovations with a demonstrated impact on clinical outcomes at an acceptable cost. The directions of Clinical Research and Innovation and of Information Systems offer tools and services to a broad set of users, supporting piloting, research and innovation activities. Supported by a secure, high-performance institutional cloud, the AP-HP data space integrates a large amount of massive healthcare data, collected during both routine clinical care and research activities, that can be leveraged for secondary use. Its major component is the AP-HP Clinical Data Warehouse (CDW) (https://eds.aphp.fr), the first CDW authorized by the French Data Protection Authority, enabling the processing of de-identified health data from more than 10 million patients to facilitate research and to make the health system more efficient and more personalized. More than 130 research projects authorized by the AP-HP Institutional Review Board have been conducted or are running on AP-HP healthcare data (observational studies, development and external validation of AI/ML algorithms), including 63 projects related to the COVID-19 pandemic. New services aiming to leverage EHR data to accelerate clinical research are under construction.

12th PRAIRIE seminar – 12 July at 14h

Speaker:  Christian Robert

Title: From Geyer’s reverse logistic regression to GANs, a statistician tale on normalising constants

—- Bio —-

Professor at Université Paris Dauphine since 2000, part-time professor at the University of Warwick (Fall 2013-), fellow of the ASA (2012) and the IMS (1996), former editor of the Journal of the Royal Statistical Society (2006-2010), deputy editor of Biometrika (2018-), and senior member of the Institut Universitaire de France (2010-2021).

— Résumé —

The problem of unknown normalising constants has been a long-standing issue in statistics, and in Bayesian statistics in particular. While many simulation-based proposals have been made to address this issue, one class of methods stands out in relying on statistical representations to produce estimators of these normalising constants, along with uncertainty quantification. The starting point is Geyer’s (1994) reverse logistic regression, which proves highly efficient and robust to the curse of dimension. It relates to later Monte Carlo methods such as bridge sampling and multiple mixtures, as well as to statistical and learning principles such as non-parametric MLE, noise contrastive estimation (NCE), and generative adversarial networks (GANs).

[This talk is based on on-going, joint, work with Jean-Michel Marin and Judith Rousseau.]
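The core idea, that a normalising constant can be recovered by solving a classification problem, fits in a few lines. The toy sketch below (an illustration inspired by the abstract, not code from the talk) treats the log normalising constant as the single parameter of a logistic regression discriminating target samples from noise samples, in the spirit of Geyer’s reverse logistic regression and NCE; the Gaussian target, the noise choice and the sample sizes are all arbitrary assumptions.

```python
import numpy as np

# Toy target: unnormalised density p*(x) = exp(-x^2 / 2), whose true
# normalising constant is Z = sqrt(2*pi).
def log_p_star(x):
    return -0.5 * x**2

true_log_z = 0.5 * np.log(2 * np.pi)

# Known noise distribution q = N(0, 2^2) with a tractable log-density.
SIGMA_Q = 2.0

def log_q(x):
    return -0.5 * (x / SIGMA_Q) ** 2 - np.log(SIGMA_Q * np.sqrt(2 * np.pi))

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

rng = np.random.default_rng(0)
n = 50_000
x_data = rng.normal(0.0, 1.0, n)       # draws from the (normalised) target
x_noise = rng.normal(0.0, SIGMA_Q, n)  # draws from the noise distribution

# Logistic regression with a single parameter c, using the classifier
# logit(x) = log p*(x) - c - log q(x): at the optimum of the logistic
# loss, c recovers log Z (Geyer 1994 / NCE with equal sample sizes).
c_hat = 0.0
for _ in range(500):
    grad = np.mean(1.0 - sigmoid(log_p_star(x_data) - c_hat - log_q(x_data))) \
         - np.mean(sigmoid(log_p_star(x_noise) - c_hat - log_q(x_noise)))
    c_hat -= 0.5 * grad  # gradient step on the logistic loss (convex in c)

print(c_hat, true_log_z)  # the estimate should be close to log sqrt(2*pi)
```

The estimator only requires evaluating the unnormalised target and a tractable noise density, which is what makes the statistical (classification) view of normalising constants so robust.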

Video

11th PRAIRIE seminar – 16 June at 14h

Speaker: Max Welling, University of Amsterdam

Title: Unsupervised Learning of Equivariant Space-Time Capsules

—- Bio —-

Prof. Dr. Max Welling is a research chair in Machine Learning at the University of Amsterdam and a VP Technologies at Qualcomm. He has a secondary appointment as a fellow of the Canadian Institute for Advanced Research (CIFAR). He served as associate editor-in-chief of IEEE TPAMI from 2011 to 2015, has served on the board of the NeurIPS Foundation since 2015, and was program chair and general chair of NeurIPS in 2013 and 2014, respectively. He was also program chair of AISTATS in 2009 and of ECCV in 2016, and general chair of MIDL 2018. He is a fellow and founding board member of the European Laboratory for Learning and Intelligent Systems (ELLIS). Max Welling is the recipient of the ECCV Koenderink Prize in 2010. He directs the Amsterdam Machine Learning Lab (AMLAB) and co-directs the Qualcomm-UvA deep learning lab (QUVA) and the Bosch-UvA Deep Learning lab (DELTA).

— Résumé —

Equivariance is an organizing principle in deep learning that expresses how internal representations should behave under symmetry transformations. To learn equivariant neural networks, we usually must know the representation theory of the symmetry group under consideration. This raises the question: can this structure also be learned completely unsupervised? In this talk I will argue that we can, using a connection between topographic representations (like the ones developed in topographic ICA) and the notion of equivariant capsules. Capsules also organize representations such that nearby filters in the topographic map are similar. This means that as we observe a stimulus over time, we expect the activations to change smoothly and slowly through this “neural space-time”. By structuring these representations as circular capsules, internal representations behave as oscillators (one oscillator per capsule), and we can predict the future by rolling forward the activated oscillators. If time allows, I will try to make a connection to quantum field theory and to Hinton particles inside neural networks, which end up being quantum excitations of these space-time capsule oscillators.

Video

======================================

10th PRAIRIE seminar – 12 May at 14h

Speaker:  Raphaël Porcher

Title: Population benefit and practical implementation of individualized treatment strategies

—- Bio —-

Associate Professor of Biostatistics at Université de Paris, co-director of the Centre Virchow-Villermé Paris Berlin, and member of the METHODS team of CRESS-UMR1153. Member of the Comité d’Evaluation Ethique / Institutional Review Board of Inserm. Senior Associate Editor for Methods at Clinical Orthopaedics and Related Research, and Associate Editor for Statistics, Artificial Intelligence and Modeling Outcomes at the Journal of Hepatology.

— Résumé —

In recent years, numerous methods have been developed to estimate individualized treatment effects and the associated individualized treatment rules (ITRs), making it possible to identify who benefits more from one treatment or another, which is at the core of personalized or precision medicine. Approaches range from the use of traditional risk-prediction models to estimate individualized treatment effects in a counterfactual framework, to sophisticated machine learning approaches that target the individualized treatment effects or learn the ITR directly. Moreover, interest (and methods) are shifting to so-called dynamic treatment regimes (DTRs), where the issue is not only who benefits but when (e.g. when to start or stop a treatment).

In this talk, we will present the counterfactual framework and points to consider when developing ITRs. We will then discuss how to estimate the population benefit of an ITR, and elaborate on how to account for the implementation or adoption of ITRs in practice, using practical examples. Last, we will briefly discuss DTRs.
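As a deliberately simplistic illustration of the counterfactual framework described above, the sketch below simulates a randomized two-arm trial, fits one outcome model per arm (a so-called T-learner), takes the difference of the two predictions as the estimated individualized treatment effect, and derives the ITR "treat if the estimated effect is positive". The data-generating process, the linear models and the zero threshold are illustrative assumptions, not methods from the talk.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Simulated trial: two covariates, randomized treatment T, and an outcome
# with a heterogeneous treatment effect tau(x) = x0 (benefit when x0 > 0).
X = rng.normal(size=(n, 2))
T = rng.integers(0, 2, n)
tau = X[:, 0]
Y = 0.5 * X[:, 0] - X[:, 1] + T * tau + rng.normal(size=n)

def fit_linear(features, y):
    """Ordinary least squares with an intercept."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict(coef, features):
    return np.column_stack([np.ones(len(features)), features]) @ coef

# T-learner: one outcome model per arm, mimicking the two potential outcomes.
mu1 = fit_linear(X[T == 1], Y[T == 1])      # model of Y under treatment
mu0 = fit_linear(X[T == 0], Y[T == 0])      # model of Y under control
ite_hat = predict(mu1, X) - predict(mu0, X) # estimated individual effects

# ITR: treat exactly those predicted to benefit, and compare with the
# oracle rule derived from the true effect tau.
rule = ite_hat > 0
agreement = np.mean(rule == (tau > 0))
print(f"agreement with oracle rule: {agreement:.3f}")
```

In a real application the counterfactual outcomes are never jointly observed, which is why the population benefit of such a rule has to be estimated rather than read off, as the talk discusses.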

Video

================================================

9th Prairie seminar – 14 April 2021, at 14h (webinar)

Speaker: Hélène Dessales (ENS – PSL)

Title: From 3D to 4D modelling in archaeology: an application in Roman Pompeii

—- Bio —-

Hélène Dessales is a lecturer in Roman archaeology at the Ecole normale supérieure, in the Département des Sciences de l’Antiquité, as a member of the AOROC research unit. She was a fellow of the Ecole française de Rome and a junior member of the Institut Universitaire de France. Her research focuses on building techniques in the Roman world. She is also a specialist in the history of archaeology, studying graphical archives of the 19th century through the corpus of the “Grand Tour” travellers in Italy. She has led several field missions in France, Spain and Italy. In Pompeii in particular, she has recently coordinated various research projects (PSL structuring program – Pompeii 3D; ANR RECAP – Rebuilding after an earthquake: ancient experiences and innovations in Pompeii) and published a volume on a significant monument (The Villa of Diomedes. The making of a Roman villa in Pompeii, Paris, 2020).

— Résumé —

3D modelling in archaeology and architectural studies is both a research tool and an important medium for dissemination to the public. Over the last decade, the role of computer vision and photogrammetry has grown strongly and changed the practices of archaeological surveying and drawing. The purpose of this talk is to explore the challenges of 4D visualization through a case study in Pompeii, where the three spatial dimensions of the virtual space integrate time as a fourth dimension. The distinctive feature of Pompeii is that it allows the various building stages to be traced back to Roman times, from the first phases of urban settlement to the eruption of Vesuvius, and also makes it possible to integrate the evolution and restoration of the archaeological site, from its discovery at the end of the eighteenth century until today. In this way, the 4D model functions as a veritable time machine and is implemented as a scientific research tool to interpret the archaeological data.

Video

=================================================

8th Prairie seminar – 10 March 2021, at 14h (webinar)

Speaker: Jean-Paul Laumond

Title: Nonholonomic motion: from the rolling car to the rolling man

=== Bio ===

Directeur de Recherche CNRS (DRCE2), President-CEO of Kineo CAM (2000-2002), IEEE Fellow (2007), Professor at the Collège de France (2011-2012), ERC Advanced Grant (2014-2018), Member of the Academy of Technology (2015), IEEE Inaba Technical Award for Innovation Leading to Production (2016), Member of the Academy of Sciences (2017).

==== Résumé ====

The purpose of the presentation is to show how, from the 1990s onward, robotics has integrated techniques from geometric control (optimal control, differential flatness) to automate the computation of the movements of mobile robots subject to the nonholonomic constraint of rolling without slipping. We will present both the problems solved and the questions still open today. In a second step, we will take a pluridisciplinary perspective combining robotics, neurophysiology and biomechanics to better understand the geometry of bipedal locomotion.

Video

================================================

7th Prairie seminar – 10 February 2021, at 14h (webinar)

Speaker: Joan Bruna, New York University

Title: Mathematical aspects of neural network approximation and learning

=== Bio ===

Joan Bruna has been an Assistant Professor at the Courant Institute, New York University (NYU), in the Department of Computer Science, the Department of Mathematics (affiliated) and the Center for Data Science since Fall 2016. He belongs to the CILVR group and to the Math and Data groups. From 2015 to 2016, he was an Assistant Professor of Statistics at UC Berkeley and part of BAIR (Berkeley AI Research). Before that, he worked at FAIR (Facebook AI Research) in New York, and prior to that he was a postdoctoral researcher at the Courant Institute, NYU. He completed his PhD in 2013 at Ecole Polytechnique, France. Before his PhD he was a research engineer at a semiconductor company, developing real-time video processing algorithms. Even before that, he did an MSc in Applied Mathematics (MVA) at Ecole Normale Superieure de Cachan, and a BA and MS in both Mathematics and Telecommunication Engineering at UPC (Universitat Politecnica de Catalunya, Barcelona). For his research contributions, he has been awarded a Sloan Research Fellowship (2018), an NSF CAREER Award (2019) and a best paper award at ICMLA (2018).

==== Résumé ====

High-dimensional learning remains an outstanding phenomenon where experimental evidence outpaces our current mathematical understanding, mostly due to the recent empirical successes of deep learning. Neural networks provide a rich yet intricate class of functions with the statistical ability to break the curse of dimensionality, and into whose architecture physical priors can be tightly integrated to improve sample efficiency. Despite these advantages, an outstanding theoretical challenge in these models is computational: providing an analysis that explains successful optimization and generalization in the face of existing worst-case computational hardness results.

In this talk, we will describe snippets of this challenge, covering optimization and approximation respectively. First, we will focus on the framework that lifts parameter optimization to an appropriate measure space. We will overview existing results that guarantee global convergence of the resulting Wasserstein gradient flows, and present our recent results studying typical fluctuations of the dynamics around their mean-field evolution, as well as extensions of this framework beyond vanilla supervised learning, to account for symmetries in the function and for competitive optimization. Next, we will discuss the role of depth in terms of approximation, and present novel results establishing so-called ‘depth separation’ for a broad class of functions. We will conclude by discussing consequences in terms of optimization, highlighting current and future mathematical challenges.

Video

================================================

6th Prairie seminar – 13 January 2021, at 11h CET (webinar)

Speaker: Béatrice Joyeux-Prunel, University of Geneva

Title: Data Science applied to Visual Globalization. The project Visual Contagions

=== Bio ===

Béatrice Joyeux-Prunel is a full Professor at the University of Geneva in Switzerland, where she holds the chair for Digital Humanities (dh.unige.ch). She leads the FNS project Visual Contagions and the IMAGO Centre at the École Normale Supérieure, a center dedicated to teaching, research and creation on the circulation of images in Europe (www.imago.ens.fr). In 2009, Joyeux-Prunel founded Artl@s, a platform that federates several international research projects on the globalization of art and images and on digital approaches. She works on the social and global history of modern art, on visual globalization, the Digital Humanities, and the visual history of petroleum. Among her publications: Béatrice Joyeux-Prunel (ed.), with the collaboration of Luc Sigalo-Santos, L’art et la mesure. Histoire de l’art et méthodes quantitatives: sources, outils, bonnes pratiques (ed. Rue d’Ulm, 2010); Catherine Dossin, Thomas DaCosta Kaufmann, and Béatrice Joyeux-Prunel (eds.), Circulations in the Global History of Art (Routledge, 2015); and, as sole author, Les avant-gardes artistiques – une histoire transnationale 1848-1918 (Gallimard Folio histoire pocket series, 2016), Les avant-gardes artistiques – une histoire transnationale 1918-1945 (Gallimard Folio histoire pocket series, 2017), and Naissance de l’art contemporain (1945-1970) – Une histoire mondiale (Editions du CNRS, 2021).

==== Résumé ====

Images are the somewhat sickly child of globalization studies. We know that they have conveyed, and still convey, behavioural models, representations and values that participate in the cultural homogenization by which globalization is most often identified. But we are quite incapable of explaining how this homogenization has taken place; which images have circulated or been imitated the most in the past; through which social, cultural and geographic channels; what their success factors were; and whether there has been more homogenization than fabrication of heterogeneity in the global circulation of images.

Data science can be very useful in trying to answer these questions, or at least in providing hypotheses about image-based globalization. The Visual Contagions project (Swiss National Science Foundation, 2021-2025) and the Imago Center (Label ERC European Center of Excellence Jean Monnet, ENS/Beaux-Arts de Paris, 2019-2022) are interested in these questions. The particular case of the age of the illustrated print makes it possible to study the matter over a long period (1890-1990) and on a global scale, since we have an unprecedented quantity of digitized sources. What remains is to establish a workflow that is as relevant as possible, which raises decisive issues for the digital humanities: how do we organize the infrastructure for hosting and retrieving our images, so as not to re-host data already made available by others? How can we minimize the computing time of our algorithms? Can we envisage pattern descriptions that are interoperable and can be exchanged between projects that apply the same pattern-recognition methods? Once the images have been described, how can we visualize their circulation in time, space, and social and cultural environments? What interpretations can then be made of the results obtained?

Presentation

Video

================================================

5th Prairie seminar – 18 November 2020, at 11h CET (webinar)

Speaker: Marta Ruiz Costa-Jussa, Universitat Politècnica de Catalunya

Title: Insights in (Spoken) Multilingual Machine Translation: examining Continuous Learning and Fairness.

=== Bio ===

Marta R. Costa-Jussà is a Ramon y Cajal Researcher at the Universitat Politècnica de Catalunya (UPC, Barcelona). She received her PhD from the UPC in 2008. Her research experience is mainly in machine translation. She has worked at LIMSI-CNRS (Paris), the Barcelona Media Innovation Center, the Universidade de São Paulo, the Institute for Infocomm Research (Singapore), the Instituto Politécnico Nacional (Mexico) and the University of Edinburgh. Recently, she received an ERC Starting Grant (2020) and two Google Faculty Research Awards (2018 and 2019).

==== Résumé ====
Multilingual machine translation is at the core of social communication. In everyday situations, we rely on free commercial services, whose quality has improved thanks to the use of deep learning techniques. Despite the considerable progress that machine translation is making, why do we still see that translation quality is much better from English to Portuguese than between spoken Dutch and Catalan? In addition, demographic biases widely affect our systems, from poorer speech recognition for women than for men to stereotyped translations: why do neutral words such as “doctor” tend to be assigned the “male” gender when translated into a language that requires gender inflection for this word?

In this talk, we will give some profound insights into (spoken) multilingual machine translation, pursuing similar quality for all languages and allowing for the incremental addition of new languages. Moreover, we will give details on the fairness challenge, focusing on producing multilingual data balanced in terms of gender, working towards transparency, and debiasing algorithms.

Video

================================================

4th Prairie seminar – 16 September 2020, at 11h CET (webinar)

Speaker: Éric Moulines, École Polytechnique

Title: “MCMC, Variational Inference, Invertible Flows… Bridging the gap?”

==== Résumé ====
Variational Autoencoders (VAEs), generative models combining variational inference and autoencoding, have found widespread application in learning latent representations of high-dimensional observations. However, most VAEs rely on simple mean-field variational distributions and usually suffer from somewhat limited expressiveness, which results in a poor approximation of the conditional latent distribution and, in particular, mode dropping. In this work, we propose the Metropolized VAE (MetVAE), a VAE approach based on a new class of variational distributions enriched with Markov Chain Monte Carlo. We develop a specific instance of MetVAE with Hamiltonian Monte Carlo and demonstrate clear improvements in the latent distribution approximations at the cost of a moderate increase in computational cost. We consider an application to probabilistic collaborative filtering models, and numerical experiments on classical benchmarks support the performance of MetVAE.

Video

================================================

3rd Prairie seminar – 9 June 2020, at 14h CET (webinar)

Speaker: Alex Cristia, Laboratoire de Sciences Cognitives et Psycholinguistique, Département d’études cognitives ENS, EHESS, Centre National de la Recherche Scientifique PSL Research University, https://sites.google.com/site/acrsta/

Title: « Unsupervised learning of sounds and words: Is it easier from child-directed speech? »

==== Résumé ====
Developments in recent years have sometimes led to systems that achieve super-human performance, even in tasks previously thought to require human cognition. As of today, however, humans remain simply unsurpassable in the domain of native language acquisition. Children routinely become fluent in one or more languages by about 4 years of age, after exposure to possibly as little as 500 hours, and at most 8,000 hours, of speech. In stark contrast, the best speech recognition and natural language processing systems on the market today require up to 100 times those quantities of input to achieve a level of performance that is substantially lower than that of humans, often having to employ at least some labeled data. It has been argued that infants’ acquisition is aided by cooperative tutors: child-directed speech may be simplified in ways that boost learning. In this talk, I present results from several studies assessing the learnability of speech sounds and words from child- versus adult-directed speech. I show that learnability is higher in input to children only when we assume the learner has access to representations that abstract away from the acoustic signal; when presented with acoustic speech features, learnability is lower for child- than for adult-directed speech. These results suggest that present-day machines are unlikely to benefit from infant-directed input unless we improve our acoustic representations of speech.

Video

================================================

2nd Prairie seminar – 6 May 2020 (webinar)

Speaker: Marc Mézard, Ecole normale supérieure – Université PSL

Title: « Insights from statistical physics into some machine learning questions »

==== Résumé ====

For more than thirty years, there have been a number of attempts to use concepts and methods from statistical physics to develop a theoretical framework for machine learning, with mixed success. This line of research has recently been revived around the important open questions raised by recent developments in deep learning, notably questions related to the dynamics of learning algorithms and to the structure of data.

This talk will present some of these recent developments from a global perspective, highlighting the strengths and weaknesses of such approaches.

Presentation

Video

================================================

1st Prairie seminar – 5 February 2020

Speaker: Jean François Cardoso, CNRS et Institut d’Astrophysique de Paris (http://www2.iap.fr/users/cardoso/)

Title: « Information geometry of Independent Component Analysis »

==== Résumé ====
Independent Component Analysis (ICA) is an exploratory technique which, as its name implies, aims at decomposing a vector of observations into components that are statistically independent (or as independent as possible). It has numerous applications, particularly in neuroscience for extracting brain sources from their observed mixtures collected on the scalp.

ICA goes well beyond PCA (Principal Component Analysis) because statistical independence is a much stronger property than mere decorrelation. Of course, this program implies that an ICA method must use non-Gaussian statistics in order to express independence (otherwise, independence would reduce to decorrelation).

In this (non technical) seminar, I use a simple construction of Information Geometry (a Pythagorean theorem in distribution space) to elucidate the connections in ICA between the main players: correlation, independence, non Gaussianity, mutual information and entropy.
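The gap between decorrelation and independence is easy to demonstrate numerically. In the illustrative sketch below (not code from the seminar), two independent uniform sources are mixed; whitening (a PCA step) decorrelates the mixtures but leaves the sources entangled up to a rotation, and a one-parameter search for the rotation maximizing non-Gaussianity (here, the magnitude of excess kurtosis) recovers them. The source distributions, the mixing matrix and the kurtosis contrast are all arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000

# Two independent, non-Gaussian (uniform), unit-variance sources.
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n))
A = np.array([[1.0, 0.6], [0.4, 1.0]])   # mixing matrix
X = A @ S                                 # observed mixtures

# Whitening (PCA step): decorrelates the data, but the sources remain
# mixed by an unknown orthogonal transform.
Xc = X - X.mean(axis=1, keepdims=True)
eigval, eigvec = np.linalg.eigh(np.cov(Xc))
Z = eigvec @ np.diag(1.0 / np.sqrt(eigval)) @ eigvec.T @ Xc

def non_gaussianity(Y):
    """Sum of |excess kurtosis| over components (zero for Gaussians)."""
    return np.sum(np.abs(np.mean(Y**4, axis=1) - 3.0))

# ICA step: search for the rotation of the whitened data that is most
# non-Gaussian; rotations preserve decorrelation, so PCA alone cannot
# distinguish between them.
best = max(
    (np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]]) @ Z
     for t in np.linspace(0.0, np.pi / 2, 180)),
    key=non_gaussianity,
)

# Each true source should be recovered up to permutation and sign.
recovery = [max(abs(np.corrcoef(S[i], best[j])[0, 1]) for j in range(2))
            for i in range(2)]
print(recovery)
```

Restricting the search to rotations of the whitened data is exactly why the non-Gaussian contrast is needed: all such rotations are equally decorrelated, and only independence (via non-Gaussianity) singles one out.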

Presentation