Unsupervised learning of sounds and words: Is it easier from child-directed speech?
Speaker: Alex Cristia, Laboratoire de Sciences Cognitives et Psycholinguistique, Département d’études cognitives ENS, EHESS, Centre National de la Recherche Scientifique PSL Research University
Abstract
Developments in recent years have sometimes led to systems that can achieve super-human performance even in tasks previously thought to require human cognition. As of today, however, humans remain simply unsurpassable in the domain native language acquisition. Children routinely become fluent in one or more languages by about 4 years of age, after exposure to possibly as little as 500h, and maximally 8k hours of speech. In stark contrast, the best speech recognition and natural language processing systems on the market today require up to 100 times those quantities of input to achieve a level of performance that is substantially lower than that of humans, often having to employ at least some labeled data. It has been argued that infants’ acquisition is aided by cooperative tutors: Child-directed speech may be simplified in ways that boost learning. In this talk, I present results from several studies assessing the learnability of speech sounds and words from child- versus adult-directed speech. I demonstrate that learnability is increased in input to children only when we assume the learner has access to representations that abstract from the acoustic signal; when presented with acoustic speech features, however, learnability is lower for child- than adult-directed speech. These results suggest present-day machines are unlikely to benefit from infant-directed input, unless we improve our acoustic representations of speech.