To receive PRAIRIE news and colloquium announcements, sign up for the PRAIRIE mailing list.
(In case this automatic link does not work, send an email to firstname.lastname@example.org with the subject: subscribe prairie_news [your e-mail address].)
6th Prairie seminar – 13 January 2021, at 11h CET (webinar)
From telephone: +33 170 950 590, access code: 888-062-845
Video conferencing system: 18.104.22.168 or inroomlink.goto.com, meeting ID: 888 062 845. Or call directly: email@example.com or 22.214.171.124##888062845
Speaker: Béatrice Joyeux-Prunel, University of Geneva
Title: Data Science applied to Visual Globalization. The project Visual Contagions
=== Bio ===
Béatrice Joyeux-Prunel is a full Professor at the University of Geneva in Switzerland, where she holds the chair for Digital Humanities (dh.unige.ch). She leads the FNS project Visual Contagions and the IMAGO Centre at the École Normale Supérieure, a center dedicated to teaching, research and creation on the circulation of images in Europe (www.imago.ens.fr). In 2009, Joyeux-Prunel founded Artl@s, a platform that federates several international research projects on the globalization of art and images and on digital approaches. She works on the social and global history of modern art, on visual globalization, the Digital Humanities, and the visual history of petroleum. Among her publications: Béatrice Joyeux-Prunel (ed.) with the collaboration of Luc Sigalo-Santos, L’art et la mesure. Histoire de l’art et méthodes quantitatives: sources, outils, bonnes pratiques (ed. Rue d’Ulm, 2010); Catherine Dossin, Thomas DaCosta Kaufmann, and Béatrice Joyeux-Prunel (eds.), Circulations in the Global History of Art (Routledge, 2015). As sole author: Les avant-gardes artistiques – une histoire transnationale 1848-1918 (Gallimard Folio histoire pocket series, 2016); Les avant-gardes artistiques – une histoire transnationale 1918-1945 (Gallimard Folio histoire pocket series, 2017); and Naissance de l’art contemporain (1945-1970) – Une histoire mondiale (Editions du CNRS, 2021).
==== Résumé ====
Images are the somewhat sickly child of globalization studies. We know that they have conveyed and still convey behavioural models, representations and values that participate in the cultural homogenization by which globalization is most often identified. But we are quite incapable of explaining how this homogenization has taken place; which images have circulated or been imitated the most in the past; according to which social, cultural, geographic channels; what were their success factors; and whether there has been more homogenization than fabrication of heterogeneity in the global circulation of images.
Data science can be very useful in trying to answer these questions, or at least in providing hypotheses about image-based globalization. The Visual Contagions project (Swiss National Science Foundation, 2021-2025) and the Imago Center (Label ERC European Center of Excellence Jean Monnet, ENS/Beaux-Arts de Paris, 2019-2022) are interested in these questions. The particular case of the age of the illustrated print makes it possible to study the matter over a long period (1890-1990) and on a global scale, since we have an unprecedented quantity of digitized sources. What remains is to establish a workflow that is as relevant as possible, which raises decisive issues for the digital humanities: how should we organize the infrastructure for hosting and retrieving our images, so as not to re-host data already made available by others? How can we minimize the computing time of our algorithms? Can we envisage pattern descriptions that are interoperable and can be exchanged between projects that apply the same pattern recognition methods? Once the images have been described, how can we visualize their circulation across time, space, and social and cultural environments? What interpretations can then be made of the results obtained?
5th Prairie seminar – 18 November 2020, at 11h CET (webinar)
Speaker: Marta Ruiz Costa-Jussa, Universitat Politècnica de Catalunya
Title: Insights in (Spoken) Multilingual Machine Translation: examining Continuous Learning and Fairness.
=== Bio ===
Marta R. Costa-Jussà is a Ramon y Cajal Researcher at the Universitat Politècnica de Catalunya (UPC, Barcelona). She received her PhD from the UPC in 2008. Her research experience is mainly in Machine Translation. She has worked at LIMSI-CNRS (Paris), Barcelona Media Innovation Center, Universidade de São Paulo, Institute for Infocomm Research (Singapore), Instituto Politécnico Nacional (Mexico) and the University of Edinburgh. Recently, she has received an ERC Starting Grant 2020 and two Google Faculty Research Awards (2018 and 2019).
==== Résumé ====
Multilingual Machine Translation is at the core of social communication. In everyday situations, we rely on free commercial services. These systems have improved in quality thanks to deep learning techniques. Despite the considerable progress that machine translation is making, why is translation quality still much better from English to Portuguese than between spoken Dutch and Catalan? In addition, demographic biases widely affect our systems, from poorer speech recognition for women than for men to stereotyped translations: why do neutral words such as “doctor” tend to be assigned the “male” gender when translated into a language that requires gender inflection for this word?
In this talk, we will give some profound insights into (spoken) multilingual machine translation pursuing similar quality for all languages and allowing for incremental addition of new languages. Moreover, we will give details on the fairness challenge, focusing on producing multilingual balanced data in terms of gender; working towards transparency; and debiasing algorithms.
4th Prairie seminar – 16 September 2020, at 11h CET (webinar)
Speaker: Éric Moulines, École Polytechnique
Title: “MCMC, Variational Inference, Invertible Flows… Bridging the gap?”
==== Résumé ====
Variational Autoencoders (VAEs), generative models combining variational inference and autoencoding, have found widespread application in learning latent representations for high-dimensional observations. However, most VAEs rely on simple mean-field variational distributions and usually suffer from somewhat limited expressiveness, which results in a poor approximation of the conditional latent distribution and, in particular, in mode dropping. In this work, we propose Metropolized VAE (MetVAE), a VAE approach based on a new class of variational distributions enriched with Markov Chain Monte Carlo. We develop a specific instance of MetVAE with Hamiltonian Monte Carlo and demonstrate clear improvements in the latent distribution approximations, at the price of a moderate increase in computational cost. We consider application to probabilistic collaborative filtering models, and numerical experiments on classical benchmarks support the performance of MetVAE.
3rd Prairie seminar – 9 June 2020, at 14h CET (webinar)
Speaker: Alex Cristia, Laboratoire de Sciences Cognitives et Psycholinguistique, Département d’études cognitives ENS, EHESS, Centre National de la Recherche Scientifique PSL Research University, https://sites.google.com/site/acrsta/
Title: « Unsupervised learning of sounds and words: Is it easier from child-directed speech? »
==== Résumé ====
Developments in recent years have sometimes led to systems that can achieve super-human performance even in tasks previously thought to require human cognition. As of today, however, humans remain simply unsurpassable in the domain of native language acquisition. Children routinely become fluent in one or more languages by about 4 years of age, after exposure to possibly as little as 500 hours, and at most 8,000 hours, of speech. In stark contrast, the best speech recognition and natural language processing systems on the market today require up to 100 times those quantities of input to achieve a level of performance that is substantially lower than that of humans, often having to employ at least some labeled data. It has been argued that infants’ acquisition is aided by cooperative tutors: child-directed speech may be simplified in ways that boost learning. In this talk, I present results from several studies assessing the learnability of speech sounds and words from child- versus adult-directed speech. I demonstrate that learnability is increased in input to children only when we assume the learner has access to representations that abstract from the acoustic signal; when presented with acoustic speech features, however, learnability is lower for child- than adult-directed speech. These results suggest present-day machines are unlikely to benefit from infant-directed input, unless we improve our acoustic representations of speech.
2nd Prairie seminar – 6 May 2020 (webinar)
Speaker: Marc Mézard, Ecole normale supérieure – Université PSL
Title: « L’éclairage de la physique statistique sur quelques questions d’apprentissage machine » (Insights from statistical physics on some questions in machine learning)
==== Résumé ====
For more than thirty years, there have been a number of attempts to use concepts and methods from statistical physics to develop a theoretical framework for machine learning, with mixed success. This line of research has recently been revived around important open questions raised by recent developments in deep learning, in particular questions related to the dynamics of learning algorithms and the structure of data.
This talk will present some of these recent developments from a global perspective, highlighting the strengths and weaknesses of such approaches.
1st Prairie seminar – 5 February 2020
Speaker: Jean François Cardoso, CNRS and Institut d’Astrophysique de Paris (http://www2.iap.fr/users/cardoso/)
Title: « Information geometry of Independent Component Analysis »
==== Résumé ====
Independent Component Analysis is an exploratory technique which, as its name implies, aims at decomposing a vector of observations into components which are statistically independent (or as independent as possible). It has numerous applications, particularly in neurosciences for extracting brain sources from their observed mixtures collected on the scalp.
ICA goes well beyond PCA (Principal Component Analysis) because statistical independence is a much stronger property than mere decorrelation. Of course, this program implies that an ICA method must use non-Gaussian statistics in order to express independence (otherwise, independence would reduce to decorrelation).
In this (non-technical) seminar, I use a simple construction of Information Geometry (a Pythagorean theorem in distribution space) to elucidate the connections in ICA between the main players: correlation, independence, non-Gaussianity, mutual information and entropy.
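As a small illustration of the abstract's remark that independence is stronger than decorrelation, and that the two coincide in the Gaussian case, here is a minimal Python sketch. It is not taken from the seminar: the 45-degree rotation, the sample size, the uniform source distribution and all function names are assumptions chosen for illustration. Two independent unit-variance sources are mixed by a rotation, and the correlation of the mixtures is then compared with the correlation of their squares.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rotate_45(s1, s2):
    """Mix two sources with a 45-degree rotation (an orthogonal mixing)."""
    c = 1.0 / math.sqrt(2.0)
    y1 = [c * (a + b) for a, b in zip(s1, s2)]
    y2 = [c * (a - b) for a, b in zip(s1, s2)]
    return y1, y2


def mixture_correlations(draw, n, rng):
    """Return (correlation of mixtures, correlation of squared mixtures)."""
    s1 = [draw(rng) for _ in range(n)]
    s2 = [draw(rng) for _ in range(n)]
    y1, y2 = rotate_45(s1, s2)
    linear = pearson(y1, y2)
    squares = pearson([v * v for v in y1], [v * v for v in y2])
    return linear, squares


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 50_000              # illustrative sample size

half_width = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has unit variance


def draw_uniform(r):
    return r.uniform(-half_width, half_width)


def draw_gaussian(r):
    return r.gauss(0.0, 1.0)


lin_u, sq_u = mixture_correlations(draw_uniform, n, rng)
lin_g, sq_g = mixture_correlations(draw_gaussian, n, rng)

print(f"uniform sources : corr(y1, y2) = {lin_u:+.3f}, corr(y1^2, y2^2) = {sq_u:+.3f}")
print(f"gaussian sources: corr(y1, y2) = {lin_g:+.3f}, corr(y1^2, y2^2) = {sq_g:+.3f}")
```

In both cases the mixtures are (nearly) uncorrelated. With uniform sources the squares are clearly negatively correlated, so the mixtures are decorrelated but not independent. With Gaussian sources the squares are also uncorrelated, consistent with the point that second-order statistics cannot distinguish the mixture from the sources in the Gaussian case.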