Colloquium PR[AI]RIE

Insights in (Spoken) Multilingual Machine Translation: examining Continuous Learning and Fairness

18/11/2021

11h

Speaker: Marta Ruiz Costa-Jussà, Universitat Politècnica de Catalunya

Bio

Marta R. Costa-Jussà is a Ramón y Cajal Researcher at the Universitat Politècnica de Catalunya (UPC, Barcelona). She received her PhD from the UPC in 2008. Her research experience is mainly in Machine Translation. She has worked at LIMSI-CNRS (Paris), Barcelona Media Innovation Center, Universidade de São Paulo, Institute for Infocomm Research (Singapore), Instituto Politécnico Nacional (Mexico) and the University of Edinburgh. She recently received an ERC Starting Grant 2020 and two Google Faculty Research Awards (2018 and 2019).

Abstract

Multilingual Machine Translation is at the core of social communication. In everyday situations, we rely on free commercial services. These systems have improved their quality thanks to the use of deep learning techniques. Despite the considerable progress that machine translation is making, why is translation quality still much better from English to Portuguese than between spoken Dutch and Catalan? In addition, demographic biases widely affect our systems, ranging from poorer speech recognition for women than for men to stereotyped translations: why do neutral words such as “doctor” tend to be assigned the male gender when translated into a language that requires gender inflection for this word?

In this talk, we will give some profound insights into (spoken) multilingual machine translation pursuing similar quality for all languages and allowing for incremental addition of new languages. Moreover, we will give details on the fairness challenge, focusing on producing multilingual balanced data in terms of gender; working towards transparency; and debiasing algorithms.

Video

Colloquium PR[AI]RIE

MCMC, Variational Inference, Invertible Flows… Bridging the gap?

16/09/2020

11h

Speaker: Éric Moulines, École Polytechnique

Abstract

Variational Autoencoders (VAEs), generative models combining variational inference and autoencoding, have found widespread application in learning latent representations for high-dimensional observations. However, most VAEs rely on simple mean-field variational distributions and usually suffer from somewhat limited expressiveness, which results in a poor approximation of the conditional latent distribution and, in particular, in mode dropping. In this work, we propose Metropolized VAE (MetVAE), a VAE approach based on a new class of variational distributions enriched with Markov Chain Monte Carlo. We develop a specific instance of MetVAE with Hamiltonian Monte Carlo and demonstrate clear improvements in the latent distribution approximations at a moderate increase in computational cost. We consider application to probabilistic collaborative filtering models, and numerical experiments on classical benchmarks support the performance of MetVAE.

Video

Colloquium PR[AI]RIE

Unsupervised learning of sounds and words: Is it easier from child-directed speech?

14h

Speaker: Alex Cristia, Laboratoire de Sciences Cognitives et Psycholinguistique, Département d’études cognitives ENS, EHESS, Centre National de la Recherche Scientifique, PSL Research University

Abstract

Developments in recent years have sometimes led to systems that can achieve super-human performance even in tasks previously thought to require human cognition. As of today, however, humans remain simply unsurpassable in the domain of native language acquisition. Children routinely become fluent in one or more languages by about 4 years of age, after exposure to possibly as little as 500 hours, and at most 8,000 hours, of speech. In stark contrast, the best speech recognition and natural language processing systems on the market today require up to 100 times those quantities of input to achieve a level of performance that is substantially lower than that of humans, often having to employ at least some labeled data. It has been argued that infants’ acquisition is aided by cooperative tutors: child-directed speech may be simplified in ways that boost learning. In this talk, I present results from several studies assessing the learnability of speech sounds and words from child- versus adult-directed speech. I demonstrate that learnability is increased in input to children only when we assume the learner has access to representations that abstract from the acoustic signal; when presented with acoustic speech features, however, learnability is lower for child- than for adult-directed speech. These results suggest present-day machines are unlikely to benefit from infant-directed input, unless we improve our acoustic representations of speech.

Video

Colloquium PR[AI]RIE

Insights from statistical physics on some questions in machine learning

06/05/2020

Speaker: Marc Mézard, Ecole normale supérieure – Université PS

Abstract

For more than thirty years, there have been a number of attempts to use concepts and methods from statistical physics to develop a theoretical framework for machine learning, with mixed success. This line of research has recently been revived around the important open questions raised by recent developments in deep learning, notably questions related to the dynamics of learning algorithms and to the structure of data.

This talk will present some of these recent developments from a global perspective, highlighting the strengths and weaknesses of such approaches.

Presentation

Video

Colloquium PR[AI]RIE

Information geometry of Independent Component Analysis

05/02/2020

Speaker: Jean François Cardoso, CNRS and Institut d’Astrophysique de Paris

Abstract

Independent Component Analysis is an exploratory technique which, as its name implies, aims at decomposing a vector of observations into components which are statistically independent (or as independent as possible). It has numerous applications, particularly in neurosciences for extracting brain sources from their observed mixtures collected on the scalp.

ICA goes well beyond PCA (Principal Component Analysis) because statistical independence is a much stronger property than mere decorrelation. Of course, this program implies that an ICA method must use non-Gaussian statistics in order to express independence (otherwise, independence would reduce to decorrelation).

In this (non-technical) seminar, I use a simple construction of Information Geometry (a Pythagorean theorem in distribution space) to elucidate the connections in ICA between the main players: correlation, independence, non-Gaussianity, mutual information and entropy.

Presentation

Colloquium PR[AI]RIE

Why are we still translating sentences?

14/02/2023

15h

Speaker: Matt Post, Microsoft

Bio

Matt Post is a research scientist working in the Microsoft Translator group, where he has been since 2021. He holds a courtesy appointment in the department of computer science at Johns Hopkins University, where, prior to joining Microsoft, he worked for ten years or so as a research scientist at the HLTCOE (Human Language Technology Center of Excellence) and with the Center for Language and Speech Processing (CLSP). He is interested mostly in machine translation, but also enjoys working on practical applied problems in many areas within NLP. He has contributed to many open source projects, including Joshua, Sockeye, Fairseq, and sacrebleu. He helped organize the WMT manual evaluation for many years, has served on the NAACL executive board, and is the director of the ACL Anthology.

Abstract

The technology and architectures underlying machine translation have changed a number of times over the decades, but apart from occasional research projects, the basic unit of translation has always been, and remains, the sentence. This paradigm persists despite the many clear advantages of translating at the document level, and it grows more glaring as much of NLP technology moves to large language models, which are natively document-based. This talk will survey research in document translation, highlighting difficulties in training, models, and evaluation. We then propose simple, workable solutions in each of these areas that may help the field escape its sentence-level rut.
Joint work with Marcin Junczys-Dowmunt.

Replay

coming soon

Colloquium PR[AI]RIE

Exploiting Graph Invariants in Deep Learning

19/01/2022

14h

Speaker: Marc Lelarge, Inria

Bio

Dr. Marc Lelarge is a researcher at INRIA. He is also a lecturer in deep learning at Ecole Polytechnique (Palaiseau, France) and Ecole Normale Superieure. He graduated from Ecole Polytechnique, qualified as an engineer at Ecole Nationale Superieure des Telecommunications (Paris) and received a PhD in Applied Mathematics from Ecole Polytechnique in 2005. He is a recipient of the 2012 SIGMETRICS Rising Star Researcher Award and of the 2015 Best Publication in Applied Probability Award, with Mohsen Bayati and Andrea Montanari, for their work on compressed sensing.

Abstract

Geometric deep learning is an attempt at the geometric unification of a broad class of machine learning problems from the perspectives of symmetry and invariance. In this talk, I will present some advances in geometric deep learning applied to combinatorial structures. I will focus on various classes of graph neural networks that have been shown to be successful in a wide range of applications with graph-structured data.

Colloquium PR[AI]RIE

How AI can help us study the complexity of children’s early language acquisition

16/02/2022

14h

Speaker: Abdellah Fourtassi

Bio

I am currently a researcher (“délégation recherche”) at INRIA Paris, visiting from Aix-Marseille University, where I have been Assistant Professor (Maître de Conférence) of computer science since late 2019. I am also a research fellow at the Institute of Language, Communication, and the Brain (ILCB) where I direct the interdisciplinary research group “Computational Communication, and Development” (cocodev.fr). Prior to that, I was a postdoctoral research fellow at Stanford University. I completed my PhD at Ecole Normale Supérieure rue d’Ulm and my undergraduate studies at Ecole Polytechnique.

Abstract

To acquire language, children need to learn the form (e.g., phonology and syntax), the content (e.g., word and sentence meanings), and the use (e.g., finding the right words to convey communicative intents). Research in language development has traditionally simplified this process by studying these dimensions separately. The reality of the situation is that children have to deal with aspects of form, content, and use simultaneously. In addition, experimental studies suggest that the timelines of acquisition of these dimensions largely overlap, indicating that children learn them in parallel, not one at a time. While this fact makes language acquisition seem even harder than we previously thought, here I show that the joint learning of form, content, and use may actually be more a help than a hindrance: These dimensions are interdependent in many ways and can therefore constrain/disambiguate each other.

More generally, I argue that research into the complex interaction and synergy across linguistic levels during child development requires going beyond the traditional research tools of the field (e.g., controlled experiments) and integrating cutting-edge methods from AI into our research toolkit. This new research method is instrumental not only in piercing some lingering mysteries in children’s language learning but also in understanding the development of this complex phenomenon in its natural context (as opposed to in-lab studies), thus easing the translation of scientific findings into real-life interventions and societal applications.

Colloquium PR[AI]RIE

Data Augmentation in High Dimensional Low Sample Size Setting Using a Geometry-Based Variational Autoencoder

09/03/2022

14h

Speaker: Stéphanie Allassonnière

Bio

Professor of Mathematics at the School of Medicine, University of Paris and associated Professor in the applied Mathematics department of Ecole Polytechnique. Manager of master programs and masterclasses in AI in healthcare.

Abstract

In this presentation, we propose a new method to perform data augmentation in a reliable way in the High Dimensional Low Sample Size (HDLSS) setting using a geometry-based variational autoencoder. Our approach combines a proper latent space modeling of the VAE, seen as a Riemannian manifold, with a new generation scheme which produces more meaningful samples, especially in the context of small data sets. The proposed method is tested through a wide experimental study where its robustness to data sets, classifiers and training sample size is stressed. It is also validated on a medical imaging classification task on the challenging ADNI database, where a small number of 3D brain MRIs are considered and augmented using the proposed VAE framework. In each case, the proposed method allows for a significant and reliable gain in the classification metrics. For instance, balanced accuracy jumps from 66.3% to 74.3% for a state-of-the-art CNN classifier trained with 50 MRIs of cognitively normal (CN) and 50 Alzheimer’s disease (AD) patients, and from 77.7% to 86.3% when trained with 243 CN and 210 AD, while greatly improving sensitivity and specificity metrics.

Colloquium PR[AI]RIE

Application-Driven Geometric Machine Learning

06/04/2022

15h

Speaker: Justin Solomon

Bio

Justin Solomon is an associate professor of Electrical Engineering and Computer Science in the MIT Computer Science and Artificial Intelligence Laboratory. He runs the MIT Geometric Data Processing group, which studies problems at the intersection of geometry, large-scale optimization, and applications in machine learning, graphics, and vision.

Abstract

From 3D modeling to autonomous driving, a variety of applications can benefit from data-driven reasoning about geometric problems. The available data and preferred shape representation, however, vary widely from one application to the next. Indeed, the one commonality among most of these settings is that they are not easily approached using the data-driven methods that have become de rigueur in other branches of computer vision and machine learning. In this talk, I will summarize recent efforts in my group to develop learning architectures and methodologies paired to specific applications, from point cloud processing to mesh and implicit surface modeling. In each case, we will see how mathematical structures and application-specific demands drive our design of the learning methodology, rather than bending application demands or ignoring geometric details to apply a standard data analysis technique.

Colloquium PR[AI]RIE

Representing non-negative functions, with applications to non-convex optimization and beyond

10/05/2022

14h

Speaker: Alessandro Rudi

Bio

Alessandro Rudi has been a researcher at INRIA, Paris, since 2017. He received his PhD in 2014 from the University of Genova, after being a visiting student at the Center for Biological and Computational Learning at Massachusetts Institute of Technology. Between 2014 and 2017 he was a postdoctoral fellow at the Laboratory of Computational and Statistical Learning at the Italian Institute of Technology and University of Genova.

Abstract

Many problems in applied mathematics admit a natural representation in terms of non-negative functions, e.g. probability representation and inference, optimal transport, optimal control, non-convex optimization, to name a few. While linear models are well suited to represent functions with output in R or C, being at the same time very expressive and flexible, the situation is different for the case of non-negative functions where the existing models lack one of these good properties.

In this talk we present a model for non-negative functions that promises to bring to these problems the same benefits that linear models brought to interpolation, approximation, quadrature and supervised learning, leading to a new class of adaptive algorithms with provably fast convergence.

In particular, we will show direct applications in numerical methods for probability representation and non-convex optimization. We will see in more detail that the model allows us to derive an algorithm for non-convex optimization that is adaptive to the degree of differentiability of the objective function and achieves optimal rates of convergence. Finally, we show how to apply the same technique to other interesting problems in applied mathematics that can be easily expressed in terms of inequalities.

Ulysse Marteau-Ferey, Francis Bach, Alessandro Rudi. Non-parametric Models for Non-negative Functions. https://arxiv.org/abs/2007.03926
Alessandro Rudi, Ulysse Marteau-Ferey, Francis Bach. Finding Global Minima via Kernel Approximations. https://arxiv.org/abs/2012.11978
Alessandro Rudi, Carlo Ciliberto. PSD Representations for Effective Probability Models. https://arxiv.org/pdf/2106.16116.pdf

Colloquium PR[AI]RIE

Patient phenotypic similarity for diagnosis of rare diseases

22/06/2022

14h

Speaker: Xiaoyi CHEN

Bio

Xiaoyi Chen is a researcher at Institut Imagine, a research institute specialized in genetic diseases. Her research focuses on automated methods to identify rare disease patients in huge real-world-data repositories. She received her PhD in applied mathematics and computational biology at Institut Pasteur, Paris (2015). Between 2016 and 2022, she was a researcher in the Information Sciences to support Personalized Medicine group at Inserm UMR 1138 (now team HeKA Inria-Inserm-Université Paris Cité).

Abstract

Many rare diseases suffer from substantial diagnostic delay or underdiagnosis due to a broad spectrum of phenotypes and high genetic and clinical heterogeneity. One solution to accelerate the diagnosis process is to rely on patients’ electronic health records (EHRs) for automatic phenotyping and to develop algorithms that identify, in large-scale clinical data warehouses, patients whose profiles are similar to those of already diagnosed patients.
In this talk, I will summarize recent efforts, in the context of the RHU C’IL-LICO project, to develop diagnosis support systems that take into consideration the semantic relations between clinical concepts and the different levels of relevance presented in patients’ EHRs – including incompleteness, inaccurate phenotyping, noisy phenotypes related to multiple comorbidities and medical histories, as well as the clinical heterogeneity of complex rare diseases and the important imbalance issues.

Colloquium PR[AI]RIE

Information theory through kernel method

06/09/2022

14h

Speaker: Francis Bach

Bio

Researcher at Inria, leading since 2011 the machine learning team which is part of the Computer Science department at Ecole Normale Supérieure. Ph.D. Berkeley (2005). ERC Starting grant (2009) and Consolidator Grant (2016), Inria young researcher prize (2012), ICML test-of-time award (2014), Lagrange prize in continuous optimization (2018). Co-editor-in-chief of the Journal of Machine Learning Research. Member of the Academy of Sciences.

Abstract

Estimating and computing entropies of probability distributions are key computational tasks throughout data science. In many situations, the underlying distributions are only known through the expectation of some feature vectors, which has led to a series of works within kernel methods. In this talk, I will explore the particular situation where the feature vector is a rank-one positive definite matrix, and show how the associated expectation (a covariance matrix) can be used with information divergences from quantum information theory to draw direct links with the classical notions of Shannon entropies.

Colloquium PR[AI]RIE

Recent advances in robust machine learning

27/09/2022

11h

Hybrid format:

Conference room of the Centre Sciences des Données, 45 rue d’Ulm, 3rd floor between stairs B&C
Connection link

Speaker: Masashi Sugiyama, RIKEN/The University of Tokyo

Bio

Masashi Sugiyama received a Ph.D. in Computer Science from Tokyo Institute of Technology in 2001. He has been a Professor at the University of Tokyo since 2014 and concurrently Director of the RIKEN Center for Advanced Intelligence Project (AIP) since 2016. His research interests include theories and algorithms of machine learning. In 2022, he received the Award for Science and Technology from Japan’s Minister of Education, Culture, Sports, Science, and Technology. He served as Program Co-chair for the Neural Information Processing Systems (NeurIPS) Conference in 2015, the International Conference on Artificial Intelligence and Statistics (AISTATS) in 2019, and the Asian Conference on Machine Learning (ACML) in 2010 and 2020. He (co)authored Machine Learning in Non-Stationary Environments (MIT Press, 2012), Density Ratio Estimation in Machine Learning (Cambridge University Press, 2012), Statistical Reinforcement Learning (Chapman & Hall, 2015), Introduction to Statistical Machine Learning (Morgan Kaufmann, 2015), and Machine Learning from Weak Supervision (MIT Press, 2022).

Abstract

When machine learning systems are trained and deployed in the real world, we face various types of uncertainty. For example, training data at hand may contain insufficient information, label noise, and bias. In this talk, I will give an overview of our recent advances in robust machine learning, including weakly supervised classification (positive-unlabeled classification, positive-confidence classification, complementary-label classification, etc.), noisy label learning (noise transition estimation, instance-dependent noise, clean sample selection, etc.), and domain adaptation (joint importance-predictor learning for covariate shift adaptation, dynamic importance-predictor learning for full distribution shift, etc.).

Colloquium PR[AI]RIE

Artificial Intelligence and Society: What would a better AI mean?

14h

Speaker: Thierry Poibeau, CNRS

Bio

CNRS Research Director, Head of the CNRS Lattice research unit (2012-2018) and adjunct head since 2019. Affiliated lecturer, Language Technology Laboratory, U. of Cambridge since 2009. Rutherford fellowship, Turing institute, London, 2018-2019. Teaching NLP in the PSL Master in Digital Humanities.

Abstract

Artificial Intelligence (AI) has made huge progress in the last few years. Applications are now deployed and have a real impact on society. The press regularly echoes concerns from the general public as well as from professionals and even researchers themselves: if AI has achieved human-like performance on various tasks, should we fear the consequences? For example, the large-scale production of ‘fake news’ and ‘deep fakes’ can be a danger for democracy, and if language models reflect or even amplify the biases of the training data, there is a risk of discrimination.

In this presentation, we will come back to these thorny and topical questions. We will recall some well-known cases, which have made the headlines, where AI has been called into question in various ways. It seems pretty clear that some scandals could have been avoided and were due to problematic deployment of poorly developed systems. However, beyond that, we will show that the issues raised are complex: the notion of bias, for example, implies the idea of a norm. Who sets the standard? And, if unbiasing the models seems a laudable goal in itself, who could decide what a neutral, unbiased model would be? The notion of human or superhuman performance (which suggests a risk of humans losing control to the machine) must also be questioned: we still seem far from a general, autonomous AI, able to take power against humans.

In the end, our position is close to that of Kate Crawford: AI is too often described as an autonomous force, whereas it is made by humans, for humans, with specific interests that have to be unraveled. It is also clear that we, as researchers, have our responsibilities too and we cannot hide behind the supposed neutrality of technology. A better account of what the technology can do, and cannot do, would help raise the debate on these important questions.

Colloquium PR[AI]RIE

Quantitative Uniform Stability of the Iterative Proportional Fitting Procedure

12/12/2022

Speaker: George Deligiannidis, University of Oxford

Bio

After obtaining my PhD from the School of Mathematical Sciences of the University of Nottingham under the supervision of Sergey Utev and Huiling Le, I moved to the Department of Mathematics of the University of Leicester as a Teaching Assistant/Fellow. In 2012 I moved to the Department of Statistics of the University of Oxford as Departmental Lecturer. I stayed in Oxford until September 2016, when I moved to the Department of Mathematics of King’s College London as Lecturer in Statistics. I moved back to the University of Oxford in December 2017 as Associate Professor of Statistics.

Abstract

We establish the uniform-in-time stability, w.r.t. the marginals, of the Iterative Proportional Fitting Procedure, also known as the Sinkhorn algorithm, used to solve entropy-regularised Optimal Transport problems. Our result is quantitative and stated in terms of the 1-Wasserstein metric. As a corollary we establish a quantitative stability result for Schrödinger bridges.

This is joint work with V. de Bortoli and A. Doucet.

Past related events

Multimodal modeling of tumors

Planetary-Scale Sequencing Data Analysis Surveys Life’s Diversity

Computational social choice: ten little talks (a tribute to Agatha Christie)

Modelling and controlling biological systems: restricted Boltzmann machines revisited
