SAGOT Benoît

Natural Language Processing

benoit.sagot [at] inria.fr

Short bio

Researcher at Inria, head of the ALPAGE (2014-2016) and ALMAnaCH (2017-) teams. Co-founder of the Verbatim Analysis (2009-) and opensquare (2016-) Inria start-ups.

Topics of interest

Computational linguistics, Natural Language Processing (NLP), NLP applications.

Project in Prairie

Benoît Sagot will focus on improving and better understanding neural approaches to NLP and integrating linguistic and extra-linguistic contextual information. He will study how non-neural approaches and language resources can contribute to improving neural NLP systems in low-resource and non-edited scenarios. Applications, both academic and industrial, will include computational linguistics and sociolinguistics, opinion mining in survey results, NLP for financial and historical documents, and text simplification to help people with disabilities.

Quote

Most current research in NLP focuses on neural architectures that rely on large volumes of data, in the form of both raw text and costly annotated corpora. The increasing amount of data necessary to train such models is not available for all languages and can require massive computational resources. Moreover, these approaches are highly sensitive to language variation, illustrated for instance by domain-specific texts, historical documents and non-edited content as found on social media. To address these issues and allow for a wider deployment of NLP technologies, this bottleneck must be overcome. This will require new models that better exploit the complex structure of language and the context in which it is used.

POIBEAU Thierry

Natural Language Processing, Digital Humanities

thierry.poibeau [at] ens.fr

Short bio

CNRS Research Director, head of the CNRS Lattice research unit (2012-2018) and adjunct head since 2019. Affiliated lecturer at the Language Technology Laboratory, University of Cambridge, since 2009. Rutherford Fellowship, Turing Institute, London, 2018-2019. Teaches NLP in the PSL Master in Digital Humanities.

Topics of interest

Computational linguistics, low-resource languages, corpora, distant reading, AI and creativity.

Project in Prairie

Thierry Poibeau’s work focuses on Natural Language Processing. He is especially interested in developing techniques for low-resource languages, which have largely been left out of the machine learning revolution. He is also interested in applying AI techniques to the study of literature and the social sciences, shedding new light on the notions of culture and creativity.

Quote

Natural Language Processing (NLP) has made considerable progress over the last few years, mainly due to impressive advances in machine learning. We now have efficient and accurate tools for 20+ languages, but the vast majority of the world's languages lack the resources for state-of-the-art NLP. This is a major challenge for our field, since preserving linguistic and cultural diversity is as important as preserving biodiversity. Technology is not the only solution, but it helps facilitate this process by leveraging resources, bridging the gap between languages, and enhancing our understanding of culture and society.