The Influence of Perceptual Categories on Auditory Feedback Control During Speech

C Niziolek (1), F Guenther (1, 2)
(1) Harvard-MIT Division of Health Sciences and Technology, Cambridge, MA, United States
(2) Boston University, Boston, MA, United States


Introduction: Speakers use auditory feedback to monitor their own speech, ensuring that the observed output matches the intended output.  By altering the speech feedback signal before it reaches the ear, we can induce perceived errors and observe both the acoustic and neural consequences.  This study investigates the neural mechanisms responsible for the detection and consequent correction of auditory errors, as well as the influence of phonetic categories on this auditory feedback control.  Because subjects’ perceptual boundaries for vowels are often asymmetric around the vowel production target, within- and across-category perturbations of the same magnitude can be contrasted within a single subject.  Our aim is to compare the neural activation due to a phonetic change — for example, from the word "bed" to the word "bad" — to that due to a non-linguistic auditory change — for example, from a prototypical example of "bed" to an altered version of the same word.

Methods: For each subject, vowel production data were collected for six English vowels.  Perceptual boundaries between pairs of vowels were assayed using a word categorization task with formant-shifted stimuli: subjects assigned phonetic labels to vowel tokens generated from their own productions and shifted in the first and second formant frequencies (F1 and F2).  Functional magnetic resonance imaging (fMRI) was then used to measure neural responses to subjects’ speech with and without auditory perturbation.  During each trial, subjects were presented with either a word to read aloud or a control stimulus.  On one out of every four trials, F1 and F2 were perturbed in real time in one of two directions: one that resulted in a perceptual category change and one that did not.  The altered speech was fed back to subjects through headphones, creating a sudden, unexpected mismatch between the vowel target and the perceived realization.
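
As an illustration of how a perceptual boundary might be read off categorization responses, the sketch below linearly interpolates the point at which labeling crosses 50% along a formant-shift continuum.  This is a minimal sketch under stated assumptions, not the procedure used in the study: the function name `crossover_point`, the shift steps, and the response proportions are all hypothetical, and linear interpolation is swapped in for whatever fitting method was actually applied.

```python
def crossover_point(shifts_hz, prop_other):
    """Estimate where responses cross 50% along a formant-shift continuum.

    shifts_hz  -- stimulus shifts in ascending order (hypothetical values)
    prop_other -- proportion of trials labeled as the neighboring vowel
    Returns the interpolated shift at the 50% point, or None if the
    responses never cross 50%.
    """
    for i in range(len(shifts_hz) - 1):
        p0, p1 = prop_other[i], prop_other[i + 1]
        if p0 < 0.5 <= p1:
            x0, x1 = shifts_hz[i], shifts_hz[i + 1]
            # Linear interpolation between the two bracketing stimuli
            return x0 + (0.5 - p0) * (x1 - x0) / (p1 - p0)
    return None


# Hypothetical responses for one subject: shifts toward a neighboring vowel
shifts = [0, 25, 50, 75, 100, 125, 150]
labeled_other = [0.0, 0.05, 0.10, 0.30, 0.70, 0.95, 1.0]
boundary = crossover_point(shifts, labeled_other)
print(f"Estimated boundary: {boundary:.1f} Hz from the production target")
```

Repeating the estimate for shifts in the opposite direction would give a second boundary distance; when the two distances differ, the boundaries are asymmetric around the production target.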

Results: Psychophysical results showed a compensatory shift of the first two formants during the perturbed conditions.  fMRI data for two subjects (see Figure 1) showed increased activation of bilateral inferior frontal gyrus, superior temporal gyrus, and supplementary motor areas for across-category shifts compared with within-category shifts.  In accordance with previous results, shifted conditions showed more cortical activation in superior temporal gyrus and right inferior frontal gyrus than unshifted conditions.  In general, cortical activation was greater in extent for shifts that crossed a category boundary than for those that did not, even though these shifts were of the same magnitude.

Conclusions: Vowel category boundaries, as assayed by a perceptual categorization test, are often asymmetric within subjects.  This asymmetry enables a direct contrast between across- and within-category perturbations in a single subject.  That is, a constant shift magnitude can elicit different phonetic percepts, depending on the direction of the shift and the location of category boundaries in formant space.  A within-category shift was found to activate bilateral superior temporal gyrus and right inferior frontal gyrus, whereas an across-category shift evoked greater activation in superior temporal gyrus and inferior frontal gyrus bilaterally.
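
The point that a constant shift magnitude can yield different percepts can be made concrete with a small sketch.  It assumes, purely for illustration, that shifts are measured as Euclidean distance in Hz in the (F1, F2) plane and that each boundary is summarized by its distance from the production target along the shift axis; the target values, shift direction, boundary distances, and function names are hypothetical and are not taken from the study.

```python
import math


def shift_formants(target, direction, magnitude_hz):
    """Move an (F1, F2) point by magnitude_hz along a direction vector."""
    norm = math.hypot(*direction)
    return tuple(t + magnitude_hz * d / norm for t, d in zip(target, direction))


def crosses_boundary(magnitude_hz, boundary_distance_hz):
    """A shift changes category if it reaches past the boundary."""
    return magnitude_hz > boundary_distance_hz


# Hypothetical production target (F1, F2) in Hz for one vowel
target = (580.0, 1800.0)
magnitude = 100.0  # same magnitude in both directions

# Hypothetical asymmetric boundaries: near in one direction, far in the other
toward = shift_formants(target, (3, -4), magnitude)   # raise F1, lower F2
away = shift_formants(target, (-3, 4), magnitude)     # lower F1, raise F2

print(toward, "across-category:", crosses_boundary(magnitude, 70.0))
print(away, "across-category:", crosses_boundary(magnitude, 160.0))
```

With these illustrative numbers, the two perturbations are equal in size, yet only the shift toward the nearer boundary is heard as a different vowel.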

References:
Houde, J.F., Jordan, M.I. (1998), 'Sensorimotor adaptation in speech production', Science, vol. 279, no. 5354, pp. 1213-1216.
Tourville, J.A., Reilly, K.J., Guenther, F.H. (2008), 'Neural mechanisms underlying auditory feedback control of speech', NeuroImage, vol. 39, no. 3, pp. 1429-1443.