Sensorimotor Foundations of Speech Perception in Infancy: Q&A with Dr. Janet Werker & Dr. Henny Yeung

September 10, 2023

The perceptual system for speech is highly organized from early infancy. This organization bootstraps young human learners’ ability to acquire their native speech and language from speech input.

In the study, 'Sensorimotor Foundations of Speech Perception in Infancy,' researchers reviewed behavioural and neuroimaging evidence that perceptual systems beyond the auditory modality are specialized for speech in infancy, and that motor and sensorimotor systems can influence speech perception in young infants, even those too young to produce speech-like vocalizations.

In this Q&A with Language Sciences, Dr. Janet Werker, University Killam Professor and Canada Research Chair in the Department of Psychology at the University of British Columbia, and Dr. Henny Yeung, Associate Professor of Linguistics and Cognitive Sciences at Simon Fraser University, explore and explain how infant babbling is essential to speech production, the process of the palate's development in utero, and what's next for their research!

What is considered a ‘speech sound’ and how are speech sounds generated? What are ‘speech movements’ and what do they produce?

When we refer to a speech sound, we are typically talking about the units of spoken language: the individual consonants and vowels that make up the syllables that we produce when talking. While speaking is essential to spoken language and comes naturally to fluent speakers, the gestures and neuromuscular commands behind these syllables are quite complex. We are referring to how one controls air pressure buildup, its release controlled by the larynx, and other movements by the jaw, tongue and lips as air moves through the mouth (upper vocal tract). The combined involvement of the vocal cords, jaw, tongue, and lips (as well as the movements of the body that accompany these speech movements) is what creates speech sounds.

Why are speech-like vocalizations in infants such as babbling and protophones essential to speech production? 

Because speech production is extraordinarily complex, learning how to coordinate all of these parts of the body requires development and practice. Protophones are pre-speech vocalizations that are mostly recognized for their laryngeal (vocal cord) control, which gives them their resonant “vowel-like” qualities, but are not typically recognizable as speech due to poorer control of the upper part of the vocal tract, like the lips, tongue and jaw. Babbling involves those upper parts as well, and is further characterized by a regular opening and constriction of air that makes syllables, like ‘ma’ or ‘da.’ Moreover, babbling is reduplicated when a child produces a string of the same consonant-vowel syllable – e.g. ba-ba-ba – and non-reduplicated when the baby varies these syllables – e.g. ba-da-ba-da.

Can you expand on how spontaneous oral movements in utero support the development of the palate and later swallowing and breathing?

One type of biological process in human development is what we often call an “experience-expectant” process: we all develop in a certain way, but this development requires a common experience that happens in almost all humans. One example is tongue thrusting, which is common in prenatal development: the roof of the mouth, or the palate, would not develop properly without the tongue thrusting against it. The baby also would not be able to suckle properly after birth without having exercised the tongue in utero. Another example is fetal breathing, where the fetus breathes the amniotic fluid in and out – usually not swallowing it (although it can happen near the end of pregnancy). If you think about a baby sucking on a nipple, they must continue to breathe while sucking and swallowing – otherwise they would suffocate. These spontaneous movements that babies make prenatally allow them to develop the motor coordination needed to breathe while sucking.

How might the sensorimotor mappings for vocal tract articulators, available to infants prior to their own production of syllables, interface with the development of their speech and language network?

That’s the central point of the paper! A great analogy is that the brain delivers commands that allow you to reach for and grasp an object that you see with your eyes – you can calculate the distance and the size of the grip just from your eyes – so your motor movement is coordinated with your visual input. Similarly, we already knew that your hearing and vocal motor movements are coordinated in producing and hearing speech. In this paper, we argue that this coordination is established prior to babbling. In other words, while the brain is still developing early in life, there are already connections between the speech motor control areas in the brain and the auditory speech perception areas in the brain. Activation of either area – by moving the articulators or by hearing speech – thus also activates the other through very early connections between these brain areas (i.e., an early version of a neural network). So the brain regions that support oral-motor movements and those that support listening to speech are all part of the same circuit, which later on becomes part of the ‘speech and language network’ in adults.

Can you explain how the link between perceptual and motor control processes is relevant throughout our lifetimes?

We think that there are several possible implications of this work. First, we can think of situations when this process does not run smoothly, or when speech processing is more difficult. You might watch someone’s face more at a noisy party than you would need to if sitting next to them on a bench in a quiet park (watching lips to help improve your listening abilities). We would predict that you might activate your motor cortex in those noisy situations as well, and likewise when listening to someone with accented speech or in your second language. And while it is not clear whether motor involvement is as necessary in everyday speech perception in good listening conditions, it does seem to be relevant in more challenging situations. Second, we can extend this idea to imagine what everyday speech perception is like in individuals who have, or commonly interact in, challenging auditory environments, such as children who are learning in very noisy classrooms.

What’s next for this research?

We are interested in exploring how motor, auditory, and also visual influences change across age. As we have said, lots of research has suggested that motor networks are less important in clear listening conditions in adults, but this is less well explored in early development, when listening to anything is a challenge: for instance, when learning to sound out words, when mapping sounds to letters, when learning a second language, etc. Eventually, it would be very interesting, and potentially quite important, to explore whether children with oral-motor difficulties have speech perception and later language learning difficulties. Many pervasive developmental disorders include oral motor difficulties (e.g., children with Down’s syndrome often have some challenges controlling tongue movements). Do these contribute to the difficulties that children might experience, and could there be interventions that target them?

Written by Kelsea Franzke


First Nations land acknowledgement

We acknowledge that UBC’s campuses are situated within the traditional territories of the Musqueam, Squamish and Tsleil-Waututh, and in the traditional, ancestral, unceded territory of the Syilx Okanagan Nation and their peoples.

