Neurophysiological indices of audiovisual speech integration are enhanced at the phonetic level for speech in noise (bibtex)
by O'Sullivan, Aisling E., Crosse, Michael J., Di Liberto, Giovanni M, de Cheveigne, Alain and Lalor, Edmund C
Abstract:
Seeing a speaker's face benefits speech comprehension, especially in challenging listening conditions. This perceptual benefit is thought to stem from the neural integration of visual and auditory speech at multiple stages of processing, whereby movement of a speaker's face provides temporal cues to auditory cortex, and articulatory information from the speaker's mouth can aid recognizing specific linguistic units (e.g., phonemes, syllables). However, it remains unclear how the integration of these cues varies as a function of listening conditions. Here we sought to provide insight on these questions by examining EEG responses to natural audiovisual, audio, and visual speech in quiet and in noise. Specifically, we represented our speech stimuli in terms of their spectrograms and their phonetic features, and then quantified the strength of the encoding of those features in the EEG using canonical correlation analysis. The encoding of both spectrotemporal and phonetic features was shown to be more robust in audiovisual speech responses than what would have been expected from the summation of the audio and visual speech responses, consistent with the literature on multisensory integration. Furthermore, the strength of this multisensory enhancement was more pronounced at the level of phonetic processing for speech in noise relative to speech in quiet, indicating that listeners rely more on articulatory details from visual speech in challenging listening conditions. These findings support the notion that the integration of audio and visual speech is a flexible, multistage process that adapts to optimize comprehension based on the current listening conditions.
Competing Interest Statement: The authors have declared no competing interest.
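The abstract's key analysis step is canonical correlation analysis (CCA), which finds linear projections of two multivariate signals (here, stimulus features and EEG channels) that are maximally correlated. A minimal NumPy sketch of plain CCA on toy data follows; the paper's actual pipeline (regularization choices, time-lagged features, cross-validation) is not specified here, so the `reg` parameter, the toy data, and the function itself are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def cca(X, Y, n_components=1, reg=1e-6):
    """Plain CCA: singular values of the whitened cross-covariance
    Cxx^{-1/2} Cxy Cyy^{-1/2} are the canonical correlations.
    `reg` is a small ridge term for numerical stability (illustrative)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        # inverse matrix square root via eigendecomposition
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    K = inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(K)
    A = inv_sqrt(Cxx) @ U[:, :n_components]   # projection weights for X
    B = inv_sqrt(Cyy) @ Vt[:n_components].T   # projection weights for Y
    return A, B, s[:n_components]             # canonical correlations

# Toy example: X and Y share one latent signal z among noise dimensions.
rng = np.random.default_rng(0)
z = rng.standard_normal((500, 1))
X = np.hstack([z, rng.standard_normal((500, 3))])
Y = np.hstack([z + 0.1 * rng.standard_normal((500, 1)),
               rng.standard_normal((500, 2))])
A, B, corrs = cca(X, Y, n_components=1)
# corrs[0] is high because the shared latent signal dominates
```

In the paper's framing, a stronger first canonical correlation between stimulus features and EEG indicates more robust neural encoding of those features.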
Reference:
O'Sullivan, Aisling E., Crosse, Michael J., Di Liberto, Giovanni M., de Cheveigné, Alain and Lalor, Edmund C. (2021). Neurophysiological indices of audiovisual speech integration are enhanced at the phonetic level for speech in noise. Journal of Neuroscience, in press.
Bibtex Entry:
@article{OSullivan_2021,
	Abstract = {Seeing a speaker{\textquoteright}s face benefits speech comprehension, especially in challenging listening conditions. This perceptual benefit is thought to stem from the neural integration of visual and auditory speech at multiple stages of processing, whereby movement of a speaker{\textquoteright}s face provides temporal cues to auditory cortex, and articulatory information from the speaker{\textquoteright}s mouth can aid recognizing specific linguistic units (e.g., phonemes, syllables). However, it remains unclear how the integration of these cues varies as a function of listening conditions. Here we sought to provide insight on these questions by examining EEG responses to natural audiovisual, audio, and visual speech in quiet and in noise. Specifically, we represented our speech stimuli in terms of their spectrograms and their phonetic features, and then quantified the strength of the encoding of those features in the EEG using canonical correlation analysis. The encoding of both spectrotemporal and phonetic features was shown to be more robust in audiovisual speech responses than what would have been expected from the summation of the audio and visual speech responses, consistent with the literature on multisensory integration. Furthermore, the strength of this multisensory enhancement was more pronounced at the level of phonetic processing for speech in noise relative to speech in quiet, indicating that listeners rely more on articulatory details from visual speech in challenging listening conditions. These findings support the notion that the integration of audio and visual speech is a flexible, multistage process that adapts to optimize comprehension based on the current listening conditions. Competing Interest Statement: The authors have declared no competing interest.},
	Author = {O'Sullivan, Aisling E. and Crosse, Michael J. and Di Liberto, Giovanni M and de Cheveigne, Alain and Lalor, Edmund C},
	Date-Added = {2020-04-21 07:27:22 +0100},
	Date-Modified = {2021-04-07 08:52:28 +0200},
	Doi = {10.1523/JNEUROSCI.0906-20.2021},
	Journal = {Journal of Neuroscience},
	Title = {Neurophysiological indices of audiovisual speech integration are enhanced at the phonetic level for speech in noise},
	Note = {in press},
	Year = {2021}}