[P2 evaluation] Articles

Choose two articles from the list, from two different speakers. Don't forget to tell me which one is for the oral and which is for the written exam. Since Barbara Tillmann will not be able to attend the oral, it would be preferable to choose her articles for the written exam.


AdC1: Neuroscientist. 2010 Aug;16(4):453-69.

Sensitivity and selectivity of neurons in auditory cortex to the pitch, timbre, and location of sounds.

Bizley JK, Walker KM.

We are able to rapidly recognize and localize the many sounds in our environment. We can describe any of these sounds in terms of various independent "features" such as their loudness, pitch, or position in space. However, we still know surprisingly little about how neurons in the auditory brain, specifically the auditory cortex, might form representations of these perceptual characteristics from the information that the ear provides about sound acoustics. In this article, the authors examine evidence that the auditory cortex is necessary for processing the pitch, timbre, and location of sounds, and document how neurons across multiple auditory cortical fields might represent these as trains of action potentials. They conclude by asking whether neurons in different regions of the auditory cortex are not simply sensitive to each of these three sound features but actually selective for one of them. The few studies that have examined neural sensitivity to multiple sound attributes provide only limited support for neural selectivity within auditory cortex. Explaining the neural basis of feature invariance is thus one of the major challenges facing sensory neuroscience in reaching its ultimate goal of understanding how neural firing patterns in the brain give rise to perception.


AdC2: Neuroimage. 2010 Jun;51(2):808-16.

The effect of stimulus context on pitch representations in the human auditory cortex.

Garcia D, Hall DA, Plack CJ.

Neuroimaging studies of pitch coding seek to identify pitch-related responses separate from responses to other properties of the stimulus, such as its energy onset, and other general aspects of the listening context. The current study reports the first attempt to evaluate these modulatory influences using functional magnetic resonance imaging (fMRI) measures of cortical pitch representations. Stimulus context was manipulated using a 'classical stimulation paradigm' (whereby successive pitch stimuli were separated by gaps of silence) and a 'continuous stimulation paradigm' (whereby successive pitch stimuli were interspersed with noise to maintain a stable envelope). Pitch responses were measured for two types of pitch-evoking stimuli: a harmonic-complex tone and a complex Huggins pitch. Results for a group of 15 normally hearing listeners revealed that context effects were mostly observed in primary auditory regions, while the most significant pitch responses were localized to posterior nonprimary auditory cortex, specifically planum temporale. Sensitivity to pitch was greater for the continuous stimulation conditions, perhaps because they better controlled for concurrent responses to the noise energy onset and reduced the potential problem of a non-linear fMRI response becoming saturated. These results provide support for hierarchical processing within human auditory cortex, with some parts of primary auditory cortex engaged by general auditory energy, some parts of planum temporale specifically responsible for representing pitch information, and adjacent regions responsible for complex higher-level auditory processing, such as representing pitch information as a function of listening context.


AdC3: Curr Opin Neurobiol. 2008 Aug;18(4):452-63.

Music perception, pitch, and the auditory system.

McDermott JH, Oxenham AJ.

The perception of music depends on many culture-specific factors, but is also constrained by properties of the auditory system. This has been best characterized for those aspects of music that involve pitch. Pitch sequences are heard in terms of relative as well as absolute pitch. Pitch combinations give rise to emergent properties not present in the component notes. In this review we discuss the basic auditory mechanisms contributing to these and other perceptual effects in music.


AdC4: Curr Biol. 2010 Jun 8;20(11):1035-41.

Individual differences reveal the basis of consonance.

McDermott JH, Lehr AJ, Oxenham AJ.

Some combinations of musical notes are consonant (pleasant), whereas others are dissonant (unpleasant), a distinction central to music. Explanations of consonance in terms of acoustics, auditory neuroscience, and enculturation have been debated for centuries. We utilized individual differences to distinguish the candidate theories. We measured preferences for musical chords as well as nonmusical sounds that isolated particular acoustic factors: specifically, the beating and the harmonic relationships between frequency components, two factors that have long been thought to potentially underlie consonance. Listeners preferred stimuli without beats and with harmonic spectra, but across more than 250 subjects, only the preference for harmonic spectra was consistently correlated with preferences for consonant over dissonant chords. Harmonicity preferences were also correlated with the number of years subjects had spent playing a musical instrument, suggesting that exposure to music amplifies preferences for harmonic frequencies because of their musical importance. Harmonic spectra are prominent features of natural sounds, and our results indicate that they also underlie the perception of consonance.


AdC5: J Neurosci. 2010 Oct 6;30(40):13362-6.

Auditory cortical neurons convey maximal stimulus-specific information at their best frequency.

Montgomery N, Wehr M.

Sensory neurons are often thought to encode information about their preferred stimuli. It has also been proposed that neurons convey the most information about stimuli in the flanks of their tuning curves, where firing rate changes most steeply. Here we demonstrate that the responses of rat auditory cortical neurons convey maximal stimulus-specific information about sound frequency at their best frequency, rather than in the flanks of their tuning curves. Theoretical work has shown that stimulus-specific information shifts from tuning curve slope to peak as neuronal variability increases. These results therefore suggest that with respect to the most informative regions of the tuning curve, auditory cortical neurons operate in a regime of high variability.
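
For readers who want to see what "stimulus-specific information" means in practice, here is a minimal Python sketch, assuming the Butts-style definition of SSI (the average specific information of the responses a stimulus evokes) and a toy Gaussian tuning curve with Gaussian response noise. All parameter values are arbitrary and not taken from the paper; comparing where SSI peaks at low versus high noise illustrates the slope-to-peak shift the abstract refers to.

    import numpy as np

    # Toy model: Gaussian tuning curve, Gaussian response noise, uniform stimulus prior.
    stims = np.linspace(-2.0, 2.0, 41)             # stimulus axis (e.g. octaves re: best frequency)
    tuning = np.exp(-stims**2 / (2 * 0.5**2))      # mean response, peaking at the BF (s = 0)
    resp = np.linspace(-1.0, 2.0, 61)              # discretized response axis

    def ssi(sigma):
        """Stimulus-specific information SSI(s), in bits (Butts-style definition)."""
        p_r_s = np.exp(-(resp[None, :] - tuning[:, None])**2 / (2 * sigma**2))
        p_r_s /= p_r_s.sum(axis=1, keepdims=True)          # p(r|s), rows sum to 1
        p_s = np.full(len(stims), 1.0 / len(stims))        # uniform prior p(s)
        p_r = p_s @ p_r_s                                  # marginal p(r)
        p_s_r = p_r_s * p_s[:, None] / p_r                 # posterior p(s|r) by Bayes
        h_s = -np.sum(p_s * np.log2(p_s))                  # prior entropy H(S)
        h_s_r = -np.sum(p_s_r * np.log2(p_s_r + 1e-12), axis=0)  # H(S|r) for each r
        return p_r_s @ (h_s - h_s_r)                       # SSI(s) = E[i_sp(r) | s]

    for sigma in (0.05, 0.5):                      # low vs. high neuronal variability
        s_best = stims[np.argmax(ssi(sigma))]
        print(f"sigma = {sigma}: SSI is maximal at s = {s_best:+.2f}")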


AdC6: Curr Biol. 2010 Jun 8;20(11):R476-8.

Musical consonance: the importance of harmonicity.

Plack CJ.

A recent study suggests that musical consonance is based on harmonicity, a preference that reflects the central role of harmonicity in auditory perception.


AdC7: Hear Res. 2010 May 10. [Epub ahead of print]

Cortical encoding of pitch: Recent results and open questions.

Walker KM, Bizley JK, King AJ, Schnupp JW.

It is widely appreciated that the key predictor of the pitch of a sound is its periodicity. Neural structures which support pitch perception must therefore be able to reflect the repetition rate of a sound, but this alone is not sufficient. Since pitch is a psychoacoustic property, a putative cortical code for pitch must also be able to account for the relationship between the degree to which a sound is periodic (i.e. its temporal regularity) and the perceived pitch salience, as well as limits in our ability to detect pitch changes or to discriminate rising from falling pitch. Pitch codes must also be robust in the presence of nuisance variables such as loudness or timbre. Here, we review a large body of work on the cortical basis of pitch perception, which illustrates that the distribution of cortical processes that give rise to pitch perception is likely to depend on both the acoustical features and functional relevance of a sound. While previous studies have greatly advanced our understanding, we highlight several open questions regarding the neural basis of pitch perception. These questions can begin to be addressed through cooperative investigative efforts across species and experimental techniques, and, critically, by examining the responses of single neurons in behaving animals.


BT1: Proc Natl Acad Sci U S A. 2005 Aug 30;102(35):12639-43.

Tuning in to musical rhythms: infants learn more readily than adults.

Hannon EE, Trehub SE.

Domain-general tuning processes may guide the acquisition of perceptual knowledge in infancy. Here, we demonstrate that 12-month-old infants show an adult-like, culture-specific pattern of responding to musical rhythms, in contrast to the culture-general responding that is evident at 6 months of age. Nevertheless, brief exposure to foreign music enables 12-month-olds, but not adults, to perceive rhythmic distinctions in foreign musical contexts. These findings may indicate a sensitive period early in life for acquiring rhythm in particular or socially and biologically important structures more generally.


BT2: J Neurosci. 2009 Aug 19;29(33):10215-20.

Tone deafness: a new disconnection syndrome?

Loui P, Alsop D, Schlaug G.

Communicating with one's environment requires efficient neural interaction between action and perception. Neural substrates of sound perception and production are connected by the arcuate fasciculus (AF). Although AF is known to be involved in language, its roles in non-linguistic functions are unexplored. Here, we show that tone-deaf people, with impaired sound perception and production, have reduced AF connectivity. Diffusion tensor tractography and psychophysics were assessed in tone-deaf individuals and matched controls. Abnormally reduced AF connectivity was observed in the tone deaf. Furthermore, we observed relationships between AF and auditory-motor behavior: superior and inferior AF branches predict psychophysically assessed pitch discrimination and sound production-perception abilities, respectively. This neural abnormality suggests that, in tone deafness, reduced connectivity results in pitch-related impairments. Results support a dual-stream anatomy of sound production and perception implicated in vocal communications. By identifying white matter differences and their psychophysical correlates, results contribute to our understanding of how neural connectivity subserves behavior.


BT3: Cognition. 2008 Feb;106(2):975-83.

Songs as an aid for language acquisition.

Schön D, Boyer M, Moreno S, Besson M, Peretz I, Kolinsky R.

In previous research, Saffran and colleagues [Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926-1928; Saffran, J. R., Newport, E. L., & Aslin, R. N. (1996). Word segmentation: The role of distributional cues. Journal of Memory and Language, 35, 606-621.] have shown that adults and infants can use the statistical properties of syllable sequences to extract words from continuous speech. They also showed that a similar learning mechanism operates with musical stimuli [Saffran, J. R., Johnson, E. K., Aslin, R. N., & Newport, E. L. (1999). Statistical learning of tone sequences by human infants and adults. Cognition, 70, 27-52.]. In this work we combined linguistic and musical information and we compared language learning based on speech sequences to language learning based on sung sequences. We hypothesized that, compared to speech sequences, a consistent mapping of linguistic and musical information would enhance learning. Results confirmed the hypothesis, showing a strong learning facilitation of song compared to speech. Most importantly, the present results show that learning a new language, especially in the first learning phase wherein one needs to segment new words, may largely benefit from the motivational and structuring properties of music in song.


BT4: Psychon Bull Rev. 2009 Apr;16(2):374-81.

Making psycholinguistics musical: self-paced reading time evidence for shared processing of linguistic and musical syntax.

Slevc LR, Rosenberg JC, Patel AD.

Linguistic processing, especially syntactic processing, is often considered a hallmark of human cognition; thus, the domain specificity or domain generality of syntactic processing has attracted considerable debate. The present experiments address this issue by simultaneously manipulating syntactic processing demands in language and music. Participants performed self-paced reading of garden path sentences, in which structurally unexpected words cause temporary syntactic processing difficulty. A musical chord accompanied each sentence segment, with the resulting sequence forming a coherent chord progression. When structurally unexpected words were paired with harmonically unexpected chords, participants showed substantially enhanced garden path effects. No such interaction was observed when the critical words violated semantic expectancy or when the critical chords violated timbral expectancy. These results support a prediction of the shared syntactic integration resource hypothesis (Patel, 2003), which suggests that music and language draw on a common pool of limited processing resources for integrating incoming elements into syntactic structures. Notations of the stimuli from this study may be downloaded from pbr.psychonomic-journals.org/content/supplemental.


CL1: J Acoust Soc Am. 1994 Apr;95(4):2277-2280.

Effects of spectral smearing on the intelligibility of sentences in the presence of interfering speech.

Baer T, Moore BCJ.

In a previous study [T. Baer and B. C. J. Moore, J. Acoust. Soc. Am. 94, 1229–1241 (1993)], a spectral smearing technique was used to simulate some of the effects of impaired frequency selectivity so as to assess its influence on speech intelligibility. Results showed that spectral smearing to simulate broadening of the auditory filters by a factor of 3 or 6 had little effect on the intelligibility of speech in quiet but had a large effect on the intelligibility of speech in noise. The present study examines the effect of spectral smearing on the intelligibility of speech in the presence of a single interfering talker. The results were generally consistent with those of the previous study, suggesting that impaired frequency selectivity contributes significantly to the problems experienced by people with cochlear hearing loss when they listen to speech in the presence of interfering sounds.
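
The smearing technique itself is easy to approximate. Baer and Moore convolved short-term spectra with broadened auditory (roex) filters; the sketch below is a simplified stand-in that blurs each STFT frame's power spectrum with a Gaussian whose width is a multiple of the normal ERB (Glasberg and Moore). The function name and the Gaussian substitution are mine, not the authors'.

    import numpy as np
    from scipy.signal import stft, istft

    def smear(x, fs, broadening=3.0, nfft=512):
        """Crude spectral smearing: Gaussian blur of each frame's power spectrum."""
        f, _, X = stft(x, fs, nperseg=nfft)
        power, phase = np.abs(X)**2, np.angle(X)
        erb = 24.7 * (4.37 * f / 1000.0 + 1.0)       # normal ERB (Glasberg & Moore, 1990)
        out = np.empty_like(power)
        for i, fc in enumerate(f):
            sigma = broadening * erb[i] / 2.0         # assumed blur width per bin
            w = np.exp(-0.5 * ((f - fc) / sigma)**2)  # Gaussian stand-in for a broadened filter
            out[i] = (w @ power) / w.sum()
        _, y = istft(np.sqrt(out) * np.exp(1j * phase), fs)
        return y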


CL2: Ear Hear. 2004 Jun;25(3):242-50.

Temporal fine-structure cues to speech and pure tone modulation in observers with sensorineural hearing loss.

Buss E, Hall JW 3rd, Grose JH.

OBJECTIVE: The purpose of this study was to examine the effect of sensorineural hearing loss on the ability to make use of fine temporal information and to evaluate the relation between this ability and the ability to recognize speech. DESIGN: Fourteen observers with normal hearing and 12 observers with sensorineural hearing loss were tested on open-set word recognition and on psychophysical tasks thought to reflect use of fine-structure cues: the detection of 2 Hz frequency modulation (FM) and the discrimination of the rate of amplitude modulation (AM) and quasifrequency modulation (QFM). RESULTS: The results showed relatively poor performance for observers with sensorineural hearing loss on both the speech recognition and psychoacoustical tasks. Of particular interest was the finding of significant correlations within the hearing-loss group between speech recognition performance and the psychoacoustical tasks based on frequency modulation, which are thought to reflect the quality of the coding of temporal fine structure. CONCLUSIONS: These results suggest that sensorineural hearing loss may be associated with a reduced ability to use fine temporal information that is coded by neural phase-locking to stimulus fine-structure and that this may contribute to poor speech recognition performance and to poor performance on psychoacoustical tasks that depend on temporal fine structure.


CL3: J Acoust Soc Am. 2005 Oct;118(4):2519-26.

Consequences of cochlear damage for the detection of interaural phase differences.

Lacher-Fougere S, Demany L.

Thresholds for detecting interaural phase differences (IPDs) in sinusoidally amplitude-modulated pure tones were measured in seven normal-hearing listeners and nine listeners with bilaterally symmetric hearing losses of cochlear origin. The IPDs were imposed either on the carrier signal alone (not the amplitude modulation) or vice versa. The carrier frequency was 250, 500, or 1000 Hz, the modulation frequency 20 or 50 Hz, and the sound pressure level was fixed at 75 dB. A three-interval two-alternative forced choice paradigm was used. For each type of IPD (carrier or modulation), thresholds were on average higher for the hearing-impaired than for the normal listeners. However, the impaired listeners' detection deficit was markedly larger for carrier IPDs than for modulation IPDs. This was not predictable from the effect of hearing loss on the sensation level of the stimuli since, for normal listeners, large reductions of sensation level appeared to be more deleterious to the detection of modulation IPDs than to the detection of carrier IPDs. The results support the idea that one consequence of cochlear damage is a deterioration in the perceptual sensitivity to the temporal fine structure of sounds.
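
The stimulus manipulation is straightforward to reproduce. Here is a minimal Python sketch, with illustrative values drawn from the ranges quoted above (the IPD size is arbitrary); applying the phase offset to the carrier term or to the envelope term yields the two IPD types.

    import numpy as np

    fs = 44100
    t = np.arange(int(0.5 * fs)) / fs              # 500-ms stimulus

    def sam_tone(fc, fm, carrier_phase=0.0, mod_phase=0.0, m=1.0):
        # Sinusoidally amplitude-modulated tone; the two phase arguments let an
        # interaural difference be placed on the fine structure or on the envelope.
        env = 1.0 + m * np.sin(2 * np.pi * fm * t + mod_phase)
        return env * np.sin(2 * np.pi * fc * t + carrier_phase)

    fc, fm, ipd = 500.0, 20.0, np.pi / 4           # example carrier/modulation frequencies
    left = sam_tone(fc, fm)
    right_carrier = sam_tone(fc, fm, carrier_phase=ipd)  # IPD on the carrier only
    right_envelope = sam_tone(fc, fm, mod_phase=ipd)     # IPD on the modulation only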


CL4: J Acoust Soc Am. 1989 Dec;86(6):2103-6.

Apparent auditory deprivation effects of late onset: the role of presentation level.

Gatehouse S.

Silman and colleagues [J. Acoust. Soc. Am. 76, 1347-1362 (1984)] have reported an apparent effect of late auditory deprivation; this presents as loss of discrimination over time in the unaided ear of individuals using a single hearing aid fitted in middle age. In a replication of the basic effect, the influence of presentation level was examined in 24 monaurally aided subjects. The effect was reversed at presentation levels below about 75 dB SPL. The ear that is normally aided performs better at high presentation levels, while, at lower presentation levels, the converse is true. Thus it appears that a form of selective adjustment takes place in a particular part of the dynamic range, at least in ears with a dynamic range limited by a sensory hearing loss. If this interpretation is correct, there are important implications for research on perceptual learning and for the time course of evaluation in hearing aid provision.


CL5: J Acoust Soc Am. 1994 Jan;95(1):518-29.

Masking of speech by amplitude-modulated noise.

Gustafsson HA, Arlinger SD.

The masking of speech by amplitude-modulated and unmodulated speech-spectrum noise has been evaluated by the measurement of monaural speech recognition in such noise in young and elderly subjects with normal hearing and elderly hearing-impaired subjects with and without a hearing aid. Sinusoidal modulation with frequencies covering the range 2-100 Hz, as well as an irregular modulation generated by the sum of four sinusoids in random phase relation, was used. Modulation degrees were 100%, +/- 6 dB, and +/- 12 dB. Root mean-square sound pressure level was equal for modulated and unmodulated maskers. For the normal-hearing subjects, essentially all types of modulated noise provided some release of speech masking as compared to unmodulated noise. Sinusoidal modulation provided more release of masking than the irregular modulation. The release of masking increased with modulation depth. It is proposed that the number and duration of low-level intervals are essential factors for the degree of masking. The release of masking was found to reach a maximum at a modulation frequency between 10 and 20 Hz for sinusoidal modulation. For elderly hearing-impaired subjects, the release of masking obtained from amplitude modulation was consistently smaller than in the normal-hearing groups, presumably related to changes in auditory temporal resolution caused by the hearing loss. The average speech-to-noise ratio required for 30% correct speech recognition varied greatly between the groups: For young normal-hearing subjects it was -15 dB, for elderly normal-hearing it was -9 dB, for elderly hearing-impaired subjects in the unaided listening condition it was +2 dB and in the aided condition it was +3 dB. The results support the conclusion that within the methodological context of the study, age as well as sensorineural hearing loss, as such, influence speech recognition in noise more than can be explained by the loss of audibility, according to the audiogram and the masking noise spectrum.
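
The maskers can be sketched in a few lines of Python. Below, plain Gaussian noise stands in for speech-spectrum noise (the spectral shaping is omitted), the final line enforces the equal-RMS constraint mentioned in the abstract, and the +/- dB modulation cases are interpreted as sinusoidal level swings of that depth (my reading, not the authors' code).

    import numpy as np

    fs, dur = 44100, 2.0
    t = np.arange(int(dur * fs)) / fs
    noise = np.random.randn(len(t))                # stand-in for speech-spectrum noise

    def am_masker(fm, depth_db=None):
        # Sinusoidal AM; depth_db = None gives the 100% case, otherwise a +/- depth_db
        # level swing. RMS is equalized to the unmodulated masker, as in the study.
        if depth_db is None:
            env = 1.0 + np.sin(2 * np.pi * fm * t)
        else:
            env = 10.0 ** (depth_db * np.sin(2 * np.pi * fm * t) / 20.0)
        y = env * noise
        return y * np.sqrt(np.mean(noise**2) / np.mean(y**2))

    masker_100 = am_masker(10.0)                   # 10-20 Hz gave the most release of masking
    masker_12db = am_masker(10.0, depth_db=12.0)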


CL6: Science. 1995 Oct 13;270(5234):303-4.

Speech recognition with primarily temporal cues.

Shannon RV, Zeng FG, Kamath V, Wygonski J, Ekelid M.

Nearly perfect speech recognition was observed under conditions of greatly reduced spectral information. Temporal envelopes of speech were extracted from broad frequency bands and were used to modulate noises of the same bandwidths. This manipulation preserved temporal envelope cues in each band but restricted the listener to severely degraded information on the distribution of spectral energy. The identification of consonants, vowels, and words in simple sentences improved markedly as the number of bands increased; high speech recognition performance was obtained with only three bands of modulated noise. Thus, the presentation of a dynamic temporal pattern in only a few broad spectral regions is sufficient for the recognition of speech.
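
The signal processing behind this classic result is simple enough to sketch. The paper extracted envelopes by half-wave rectification and low-pass filtering; the Python version below substitutes a Hilbert envelope and an assumed 16-Hz smoothing cutoff, with log-spaced band edges chosen for illustration.

    import numpy as np
    from scipy.signal import butter, sosfilt, hilbert

    def noise_vocode(speech, fs, n_bands=3, lo=100.0, hi=4000.0):
        # Split speech into n_bands log-spaced bands, take each band's envelope,
        # use it to modulate noise filtered into the same band, and sum the bands.
        edges = np.geomspace(lo, hi, n_bands + 1)
        noise = np.random.randn(len(speech))
        smooth = butter(2, 16.0, fs=fs, output="sos")       # assumed envelope cutoff
        out = np.zeros(len(speech))
        for f1, f2 in zip(edges[:-1], edges[1:]):
            band = butter(4, [f1, f2], btype="band", fs=fs, output="sos")
            env = sosfilt(smooth, np.abs(hilbert(sosfilt(band, speech))))
            out += env * sosfilt(band, noise)               # envelope-modulated band noise
        return out

Increasing n_bands from 1 to 3 or 4 reproduces the improvement in intelligibility the abstract describes.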


CL7: J Acoust Soc Am. 2001 Jul;110(1):529-42.

Effects of degradation of intensity, time, or frequency content on speech intelligibility for normal-hearing and hearing-impaired listeners.

van Schijndel NH, Houtgast T, Festen JM.

Many hearing-impaired listeners suffer from distorted auditory processing capabilities. This study examines which aspects of auditory coding (i.e., intensity, time, or frequency) are distorted and how this affects speech perception. The distortion-sensitivity model is used: The effect of distorted auditory coding of a speech signal is simulated by an artificial distortion, and the sensitivity of speech intelligibility to this artificial distortion is compared for normal-hearing and hearing-impaired listeners. Stimuli (speech plus noise) are wavelet coded using a complex sinusoidal carrier with a Gaussian envelope (1/4 octave bandwidth). Intensity information is distorted by multiplying the modulus of each wavelet coefficient by a random factor. Temporal and spectral information are distorted by randomly shifting the wavelet positions along the temporal or spectral axis, respectively. The measurements were (1) detection thresholds for each type of distortion and (2) speech-reception thresholds for various degrees of distortion. For spectral distortion, hearing-impaired listeners showed increased detection thresholds and were also less sensitive to the distortion with respect to speech perception. For intensity and temporal distortion, this was not observed. Results indicate that a distorted coding of spectral information may be an important factor underlying reduced speech intelligibility for the hearing impaired.


DP1: Psychol Sci. 2008 Jan;19(1):85-91.

Auditory change detection: simple sounds are not memorized better than complex sounds.

Demany L, Trost W, Serman M, Semal C.

Previous research has shown that the detectability of a local change in a visual image is essentially independent of the complexity of the image when the interstimulus interval (ISI) is very short, but is limited by a low-capacity memory system when the ISI exceeds 100 ms. In the study reported here, listeners made same/different judgments on pairs of successive "chords" (sums of pure tones with random frequencies). The change to be detected was always a frequency shift in one of the tones, and which tone would change was unpredictable. Performance worsened as the number of tones increased, but this effect was not larger for 2-s ISIs than for 0-ms ISIs. Similar results were obtained when a chord was followed by a single tone that had to be judged as higher or lower than the closest component of the chord. Overall, our data suggest that change detection is based on different mechanisms in audition and vision.
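
A sketch of the stimulus construction in Python (the frequency range, number of tones, and semitone shift are illustrative choices, not the paper's exact parameters):

    import numpy as np

    fs, dur = 44100, 0.3
    t = np.arange(int(dur * fs)) / fs
    rng = np.random.default_rng()

    def chord(freqs):
        # Sum of equal-amplitude pure tones with random starting phases.
        phases = rng.uniform(0, 2 * np.pi, len(freqs))
        return np.sin(2 * np.pi * freqs[:, None] * t + phases[:, None]).sum(axis=0)

    n_tones = 4
    freqs = rng.uniform(200.0, 4000.0, n_tones)     # random-frequency components
    shifted = freqs.copy()
    shifted[rng.integers(n_tones)] *= 2.0 ** (rng.choice([-1, 1]) / 12)  # one tone moves a semitone
    first, second = chord(freqs), chord(shifted)    # a "different" trial of the same/different task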


DP2: Brain. 2004 Apr;127(Pt 4):801-10.

Characterization of deficits in pitch perception underlying 'tone deafness'.

Foxton JM, Dean JL, Gee R, Peretz I, Griffiths TD.

Congenital amusia is a disorder characterized by life-long, selective deficits in the perception of music. This study examined pitch-perception abilities in a group of 10 adults with this disorder. Tests were administered that assessed fine-grained pitch perception by determining thresholds both for the detection of continuous and segmented pitch changes, and for the recognition of pitch direction. Tests were also administered that assessed the perception of more complex pitch patterns, using pitch-sequence comparison tasks. In addition, the perceptual organization of pitch was also examined, using stream segregation tasks that assess the assignment of sounds differing in pitch to one or two distinct perceptual sources. In comparison with 10 control subjects, it was found that the participants with congenital amusia exhibited deficits both at the level of detecting fine-grained differences in pitch, and at the level of perceiving patterns in pitch. In contrast, no abnormalities were identified in the perceptual organization of pitch. The pitch deficits identified are able to account for the music perception difficulties in this disorder, and implicate deficient cortical processing.


DP3: J Acoust Soc Am. 2006 Jan;119(1):491-506.

Factors affecting the use of noise-band vocoders as acoustic models for pitch perception in cochlear implants.

Laneau J, Moonen M, Wouters J.

Although in a number of experiments noise-band vocoders have been shown to provide acoustic models for speech perception in cochlear implants (CI), the present study assesses in four experiments whether and under what limitations noise-band vocoders can be used as an acoustic model for pitch perception in CI. The first two experiments examine the effect of spectral smearing on simulated electrode discrimination and fundamental frequency (F0) discrimination. The third experiment assesses the effect of spectral mismatch in an F0-discrimination task with two different vocoders. The fourth experiment investigates the effect of amplitude compression on modulation rate discrimination. For each experiment, the results obtained from normal-hearing subjects presented with vocoded stimuli are compared to results obtained directly from CI recipients. The results show that place pitch sensitivity drops with increased spectral smearing and that place pitch cues for multi-channel stimuli can adequately be mimicked when the discriminability of adjacent channels is adjusted by varying the spectral slopes to match that of CI subjects. The results also indicate that temporal pitch sensitivity is limited for noise-band carriers with low center frequencies and that the absence of a compression function in the vocoder might alter the saliency of the temporal pitch cues.


DP4: Ear Hear. 2008 Jun;29(3):421-34.

Music perception of cochlear implant users compared with that of hearing aid users.

Looi V, McDermott H, McKay C, Hickson L.

OBJECTIVES: To investigate the music perception skills of adult cochlear implant (CI) users in comparison with hearing aid (HA) users who have similar levels of hearing impairment. It was hypothesized that the HA users would perform better than the CI recipients on tests involving pitch, instrument, and melody perception, but similarly for rhythm perception.
DESIGN: Fifteen users of the Nucleus CI system and 15 HA users participated in a series of music perception tests. All subjects were postlingually deafened adults, with the HA subjects being required to meet the current audiological criteria for CI candidacy. A music test battery was designed for the study incorporating four major tasks: (1) discrimination of 38 pairs of rhythms; (2) pitch ranking of one-octave, half-octave, and quarter-octave intervals; (3) instrument recognition incorporating three subtests, each with 12 different instruments or ensembles; and (4) recognition of 10 familiar melodies. Stimuli were presented via direct audio input at comfortable presentation levels. The test battery was administered to each subject on two separate occasions, approximately 4 mo apart.
RESULTS: The results from the rhythm test were 93% correct for the CI group and 94% correct for the HA group; these scores were not significantly different. For the pitch test, there was a significant difference between the HA group and the CI group (p < 0.001), with higher mean scores recorded by the HA group for all three interval sizes. The CI subject group was unable to rank pitches a quarter-octave apart, only scoring at chance level for this interval size. In the instrument recognition test, although there was no significant difference between the mean scores of the two groups, both groups obtained significantly higher scores for the subtest incorporating single instrument stimuli than those incorporating multiple instrumentations (p < 0.001). In the melody test, there was a significant difference between the implantees' mean score of 52% correct and the HA group's mean of 91% (p < 0.001).
CONCLUSIONS: As hypothesized, results from the two groups were almost identical for the rhythm test, with the HA group performing significantly better than the CI group on the pitch and melody tests. However, there was no difference between the groups in their ability to identify musical instruments or ensembles. The results of this study indicate that HA users with similar levels of hearing loss perform at least equal to, if not better than, CI users on these music perception tests. However, despite the differences between scores obtained by the CI and HA subject groups, both these subject groups were largely unable to achieve accurate or effective music perception, regardless of the device they used.


DP5: Psychol Res. 2010 Sep;74(5):437-56.

Context sensitivity and invariance in perception of octave-ambiguous tones.

Repp BH, Thompson JM.

Three experiments investigated the influence of unambiguous (UA) context tones on the perception of octave-ambiguous (OA) tones. In Experiment 1, pairs of OA tones spanning a tritone interval were preceded by pairs of UA tones instantiating a rising or falling interval between the same pitch classes. Despite the inherent ambiguity of OA tritone pairs, most participants showed little or no priming when judging the OA tritone as rising or falling. In Experiments 2 and 3, participants compared the pitch heights of single OA and UA tones representing either the same pitch class or being a tritone apart. These judgments were strongly influenced by the pitch range of the UA tones, but only slightly by the spectral center of the OA tones. Thus, the perceived pitch height of single OA tones is context sensitive, but the perceived relative pitch height of two OA tones, as described in previous research on the "tritone paradox," is largely invariant in UA tone contexts.
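
Octave-ambiguous tones of the kind used here (Shepard tones) are sums of octave-spaced partials under a fixed spectral envelope, so pitch class is defined but pitch height is not. A minimal Python sketch, with an assumed Gaussian spectral envelope (shape and width are illustrative):

    import numpy as np

    fs, dur = 44100, 0.5
    t = np.arange(int(dur * fs)) / fs

    def oa_tone(pitch_class_hz, center_hz=1000.0, width_oct=2.0):
        # Octave-spaced partials under a fixed Gaussian spectral envelope: pitch
        # class is well defined, pitch height (which octave) is ambiguous.
        y = np.zeros_like(t)
        f = pitch_class_hz
        while f / 2.0 >= 20.0:           # start from the lowest audible octave
            f /= 2.0
        while f < fs / 2.0:
            amp = np.exp(-0.5 * (np.log2(f / center_hz) / width_oct) ** 2)
            y += amp * np.sin(2 * np.pi * f * t)
            f *= 2.0
        return y / np.abs(y).max()

    d_tone = oa_tone(293.7)              # pitch class D
    g_sharp = oa_tone(415.3)             # a tritone away: heard as rising or falling?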


DP6: J Assoc Res Otolaryngol. 2010 Dec;11(4):709-24.

Objective and subjective psychophysical measures of auditory stream integration and segregation.

Micheyl C, Oxenham AJ.

The perceptual organization of sound sequences into auditory streams involves the integration of sounds into one stream and the segregation of sounds into separate streams. "Objective" psychophysical measures of auditory streaming can be obtained using behavioral tasks where performance is facilitated by segregation and hampered by integration, or vice versa. Traditionally, these two types of tasks have been tested in separate studies involving different listeners, procedures, and stimuli. Here, we tested subjects in two complementary temporal-gap discrimination tasks involving similar stimuli and procedures. One task was designed so that performance in it would be facilitated by perceptual integration; the other, so that performance would be facilitated by perceptual segregation. Thresholds were measured in both tasks under a wide range of conditions produced by varying three stimulus parameters known to influence stream formation: frequency separation, tone-presentation rate, and sequence length. In addition to these performance-based measures, subjective judgments of perceived segregation were collected in the same listeners under corresponding stimulus conditions. The patterns of results obtained in the two temporal-discrimination tasks, and the relationships between thresholds and perceived-segregation judgments, were mostly consistent with the hypothesis that stream segregation helped performance in one task and impaired performance in the other task. The tasks and stimuli described here may prove useful in future behavioral or neurophysiological experiments, which seek to manipulate and measure neural correlates of auditory streaming while minimizing differences between the physical stimuli.


DP7: J Acoust Soc Am. 2009 Oct;126(4):1975-87.

Auditory stream segregation in cochlear implant listeners: measures based on temporal discrimination and interleaved melody recognition.

Cooper HR, Roberts B.

The evidence that cochlear implant listeners routinely experience stream segregation is limited and equivocal. Streaming in these listeners was explored using tone sequences matched to the center frequencies of the implant's 22 electrodes. Experiment 1 measured temporal discrimination for short (ABA triplet) and longer (12 AB cycles) sequences (tone/silence durations = 60/40 ms). Tone A stimulated electrode 11; tone B stimulated one of 14 electrodes. On each trial, one sequence remained isochronous, and tone B was delayed in the other; listeners had to identify the anisochronous interval. The delay was introduced in the second half of the longer sequences. Prior build-up of streaming should cause thresholds to rise more steeply with increasing electrode separation, but no interaction with sequence length was found. Experiment 2 required listeners to identify which of two target sequences was present when interleaved with distractors (tone/silence durations = 120/80 ms). Accuracy was high for isolated targets, but most listeners performed near chance when loudness-matched distractors were added, even when remote from the target. Only a substantial reduction in distractor level improved performance, and this effect did not interact with target-distractor separation. These results indicate that implantees often do not achieve stream segregation, even in relatively unchallenging tasks.


DP8: J Exp Psychol Hum Percept Perform. 2008 Aug;34(4):1007-16.

Effects of context on auditory stream segregation.

Snyder JS, Carter OL, Lee SK, Hannon EE, Alain C.

The authors examined the effect of preceding context on auditory stream segregation. Low tones (A), high tones (B), and silences (-) were presented in an ABA- pattern. Participants indicated whether they perceived 1 or 2 streams of tones. The A tone frequency was fixed, and the B tone was the same as the A tone or had 1 of 3 higher frequencies. Perception of 2 streams in the current trial increased with greater frequency separation between the A and B tones (Δf). Larger Δf in previous trials modified this pattern, causing less streaming in the current trial. This occurred even when listeners were asked to bias their perception toward hearing 1 stream or 2 streams. The effect of previous Δf was not due to response bias because simply perceiving 2 streams in the previous trial did not cause less streaming in the current trial. Finally, the effect of previous Δf was diminished, though still present, when the silent duration between trials was increased to 5.76 s. The time course of this context effect on streaming implicates the involvement of auditory sensory memory or neural adaptation.
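
The ABA- sequences are easy to generate. A minimal Python sketch with illustrative tone and gap durations (the paper's exact timing may differ):

    import numpy as np

    fs = 44100

    def aba_sequence(f_a=500.0, delta_semitones=7, n_triplets=10, tone_ms=100, gap_ms=20):
        # ABA- triplet sequence; the trailing "-" silence lasts one tone slot.
        f_b = f_a * 2.0 ** (delta_semitones / 12.0)
        n = int(fs * tone_ms / 1000)
        tone = lambda f: np.sin(2 * np.pi * f * np.arange(n) / fs)
        gap = np.zeros(int(fs * gap_ms / 1000))
        triplet = np.concatenate([tone(f_a), gap, tone(f_b), gap, tone(f_a), gap,
                                  np.zeros(n), gap])
        return np.tile(triplet, n_triplets)

    one_stream = aba_sequence(delta_semitones=1)    # small Δf: tends to be heard as 1 stream
    two_streams = aba_sequence(delta_semitones=12)  # large Δf: tends to split into 2 streams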


DP9: Curr Biol. 2005 Nov 8;15(21):1943-7.

Mechanisms for allocating auditory attention: an auditory saliency map.

Kayser C, Petkov CI, Lippert M, Logothetis NK.

Our nervous system is confronted with a barrage of sensory stimuli, but neural resources are limited and not all stimuli can be processed to the same extent. Mechanisms exist to bias attention toward the particularly salient events, thereby providing a weighted representation of our environment. Our understanding of these mechanisms is still limited, but theoretical models can replicate such a weighting of sensory inputs and provide a basis for understanding the underlying principles. Here, we describe such a model for the auditory system: an auditory saliency map. We experimentally validate the model on natural acoustical scenarios, demonstrating that it reproduces human judgments of auditory saliency and predicts the detectability of salient sounds embedded in noisy backgrounds. In addition, it also predicts the natural orienting behavior of naive macaque monkeys to the same salient stimuli. The structure of the suggested model is identical to that of successfully used visual saliency maps. Hence, we conclude that saliency is determined either by implementing similar mechanisms in different unisensory pathways or by the same mechanism in multisensory areas. In any case, our results demonstrate that different primate sensory systems rely on common principles for extracting relevant sensory events.


DP10: Curr Opin Neurobiol. 2009 Aug;19(4):430-3.

Representations in auditory cortex.

Hromadka T, Zador AM.

How does auditory cortex represent auditory stimuli, and how do these representations contribute to behavior? Recent experimental evidence suggests that activity in auditory cortex consists of sparse and highly synchronized volleys of activity, observed both in anesthetized and awake animals. Many neurons are capable of remarkably precise activity with very low jitter or spike count variability. Most importantly, animals are capable of exploiting such precise neuronal activity in making sensory decisions. Whether the ability of auditory cortex to exploit fine temporal differences in cortical activity is unique to the auditory modality or represents a general strategy used by cortical circuits remains an open question.


DP11: J Acoust Soc Am. 2008 Feb;123(2):899-909.

Phoneme representation and classification in primary auditory cortex.

Mesgarani N, David SV, Fritz JB, Shamma SA.

A controversial issue in neurolinguistics is whether basic neural auditory representations found in many animals can account for human perception of speech. This question was addressed by examining how a population of neurons in the primary auditory cortex (A1) of the naive awake ferret encodes phonemes and whether this representation could account for the human ability to discriminate them. When neural responses were characterized and ordered by spectral tuning and dynamics, perceptually significant features including formant patterns in vowels and place and manner of articulation in consonants, were readily visualized by activity in distinct neural subpopulations. Furthermore, these responses faithfully encoded the similarity between the acoustic features of these phonemes. A simple classifier trained on the neural representation was able to simulate human phoneme confusion when tested with novel exemplars. These results suggest that A1 responses are sufficiently rich to encode and discriminate phoneme classes and that humans and animals may build upon the same general acoustic representations to learn boundaries for categorical and robust sound classification.


DP12: J Neurosci. 2010 Jun 2;30(22):7604-12.

Cortical representation of natural complex sounds: effects of acoustic features and auditory object category.

Leaver AM, Rauschecker JP.

How the brain processes complex sounds, like voices or musical instrument sounds, is currently not well understood. The features comprising the acoustic profiles of such sounds are thought to be represented by neurons responding to increasing degrees of complexity throughout auditory cortex, with complete auditory "objects" encoded by neurons (or small networks of neurons) in anterior superior temporal regions. Although specialized voice and speech-sound regions have been proposed, it is unclear how other types of complex natural sounds are processed within this object-processing pathway. Using functional magnetic resonance imaging, we sought to demonstrate spatially distinct patterns of category-selective activity in human auditory cortex, independent of semantic content and low-level acoustic features. Category-selective responses were identified in anterior superior temporal regions, consisting of clusters selective for musical instrument sounds and for human speech. An additional subregion was identified that was particularly selective for the acoustic-phonetic content of speech. In contrast, regions along the superior temporal plane closer to primary auditory cortex were not selective for stimulus category, responding instead to specific acoustic features embedded in natural sounds, such as spectral structure and temporal modulation. Our results support a hierarchical organization of the anteroventral auditory-processing stream, with the most anterior regions representing the complete acoustic signature of auditory objects.


DP13: Neuron. 2009 Feb 12;61(3):467-80.

Task difficulty and performance induce diverse adaptive patterns in gain and shape of primary auditory cortical receptive fields.

Atiani S, Elhilali M, David SV, Fritz JB, Shamma SA.

Attention is essential for navigating complex acoustic scenes, when the listener seeks to extract a foreground source while suppressing background acoustic clutter. This study explored the neural correlates of this perceptual ability by measuring rapid changes of spectrotemporal receptive fields (STRFs) in primary auditory cortex during detection of a target tone embedded in noise. Compared with responses in the passive state, STRF gain decreased during task performance in most cells. By contrast, STRF shape changes were excitatory and specific, and were strongest in cells with best frequencies near the target tone. The net effect of these adaptations was to accentuate the representation of the target tone relative to the noise by enhancing responses of near-target cells to the tone during high-signal-to-noise ratio (SNR) tasks while suppressing responses of far-from-target cells to the masking noise in low-SNR tasks. These adaptive STRF changes were largest in high-performance sessions, confirming a close correlation with behavior.


DP14: Neuron. 2010 Jun 24;66(6):937-48.

Adaptation to stimulus statistics in the perception and neural representation of auditory space.

Dahmen JC, Keating P, Nodal FR, Schulz AL, King AJ.

Sensory systems are known to adapt their coding strategies to the statistics of their environment, but little is still known about the perceptual implications of such adjustments. We investigated how auditory spatial processing adapts to stimulus statistics by presenting human listeners and anesthetized ferrets with noise sequences in which interaural level differences (ILD) rapidly fluctuated according to a Gaussian distribution. The mean of the distribution biased the perceived laterality of a subsequent stimulus, whereas the distribution's variance changed the listeners' spatial sensitivity. The responses of neurons in the inferior colliculus changed in line with these perceptual phenomena. Their ILD preference adjusted to match the stimulus distribution mean, resulting in large shifts in rate-ILD functions, while their gain adapted to the stimulus variance, producing pronounced changes in neural sensitivity. Our findings suggest that processing of auditory space is geared toward emphasizing relative spatial differences rather than the accurate representation of absolute position.


JME1: Curr Biol. 2010 Jan 12;20(1):19-24.

Visual enhancement of the information representation in auditory cortex.

Kayser C, Logothetis NK, Panzeri S.

Combining information across different sensory modalities can greatly facilitate our ability to detect, discriminate, or recognize sensory stimuli. Although this process of sensory integration has usually been attributed to classical association cortices, recent work has demonstrated that neuronal activity in early sensory cortices can also be influenced by cross-modal inputs. Here we demonstrate that such "early" multisensory influences enhance the information carried by neurons about multisensory stimuli. By recording in auditory cortex of alert monkeys watching naturalistic audiovisual stimuli, we quantified the effect of visual influences on the trial-to-trial response variability and on the amount of information carried by neural responses. We found that firing rates and precisely timed spike patterns of individual units became more reliable across trials and time when multisensory stimuli were presented, leading to greater encoded stimulus information. Importantly, this multisensory information enhancement was much reduced when the visual stimulus did not match the sound. These results demonstrate that multisensory influences enhance information processing already at early stages in cortex, suggesting that sensory integration is a distributed process, commencing in lower sensory areas and continuing in higher association cortices.


JME2: Nature. 2007 Nov 15;450(7168):425-9.

A synaptic memory trace for cortical receptive field plasticity.

Froemke RC, Merzenich MM, Schreiner CE.

Receptive fields of sensory cortical neurons are plastic, changing in response to alterations of neural activity or sensory experience. In this way, cortical representations of the sensory environment can incorporate new information about the world, depending on the relevance or value of particular stimuli. Neuromodulation is required for cortical plasticity, but it is uncertain how subcortical neuromodulatory systems, such as the cholinergic nucleus basalis, interact with and refine cortical circuits. Here we determine the dynamics of synaptic receptive field plasticity in the adult primary auditory cortex (also known as AI) using in vivo whole-cell recording. Pairing sensory stimulation with nucleus basalis activation shifted the preferred stimuli of cortical neurons by inducing a rapid reduction of synaptic inhibition within seconds, which was followed by a large increase in excitation, both specific to the paired stimulus. Although nucleus basalis was stimulated only for a few minutes, reorganization of synaptic tuning curves progressed for hours thereafter: inhibition slowly increased in an activity-dependent manner to rebalance the persistent enhancement of excitation, leading to a retuned receptive field with new preference for the paired stimulus. This restricted period of disinhibition may be a fundamental mechanism for receptive field plasticity, and could serve as a memory trace for stimuli or episodes that have acquired new behavioural significance.


JME3: J Neurophysiol. 2005 Oct;94(4):2738-47.

Affects of aging on receptive fields in rat primary auditory cortex layer V neurons.

Turner JG, Hughes LF, Caspary DM.

Advanced age is commonly associated with progressive cochlear pathology and central auditory deficits, collectively known as presbycusis. The present study examined central correlates of presbycusis by measuring response properties of primary auditory cortex (AI) layer V neurons in the Fischer Brown Norway rat model. Layer V neurons represent the major output of AI to other cortical and subcortical regions (primarily the inferior colliculus). In vivo single-unit extracellular recordings were obtained from 114 neurons in aged animals (29-33 mo) and compared with 105 layer V neurons in young-adult rats (4-6 mo). Three consecutive repetitions of a pure-tone receptive field map were run for each neuron. Age was associated with fewer neurons exhibiting classic V/U-shaped receptive fields and a greater percentage of neurons with more Complex receptive fields. Receptive fields from neurons in aged rats were also less reliable on successive repetitions of the same stimulus set. Aging was also associated with less firing during the stimulus in V/U-shaped receptive field neurons and more firing during the stimulus in Complex neurons, which were generally associated with inhibited firing in young controls. Finally, neurons in aged rats with Complex receptive fields were more easily driven by current pulses delivered to the soma. Collectively, these findings provide support for the notion that age is associated with diminished signal-to-noise coding by AI layer V neurons and are consistent with other research suggesting that GABAergic neurotransmission in AI may be compromised by aging.


JME4: J Cogn Neurosci. 2008 Jan;20(1):135-52.

Linking cortical spike pattern codes to auditory perception.

Walker KM, Ahmed B, Schnupp JW.

Neurometric analysis has proven to be a powerful tool for studying links between neural activity and perception, especially in visual and somatosensory cortices, but conventional neurometrics are based on a simplistic rate-coding hypothesis that is clearly at odds with the rich and complex temporal spiking patterns evoked by many natural stimuli. In this study, we investigated the possible relationships between temporal spike pattern codes in the primary auditory cortex (A1) and the perceptual detection of subtle changes in the temporal structure of a natural sound. Using a two-alternative forced-choice oddity task, we measured the ability of human listeners to detect local time reversals in a marmoset twitter call. We also recorded responses of neurons in A1 of anesthetized and awake ferrets to these stimuli, and analyzed these responses using a novel neurometric approach that is sensitive to temporal discharge patterns. We found that although spike count-based neurometrics were inadequate to account for behavioral performance on this auditory task, neurometrics based on the temporal discharge patterns of populations of A1 units closely matched the psychometric performance curve, but only if the spiking patterns were resolved at temporal resolutions of 20 msec or better. These results demonstrate that neurometric discrimination curves can be calculated for temporal spiking patterns, and they suggest that such an extension of previous spike count-based approaches is likely to be essential for understanding the neural correlates of the perception of stimuli with a complex temporal structure.


MC1: Cereb Cortex. 2001 Oct;11(10):946-53.

Spectral and temporal processing in human auditory cortex.

Zatorre RJ, Belin P.

We used positron emission tomography to examine the response of human auditory cortex to spectral and temporal variation. Volunteers listened to sequences derived from a standard stimulus, consisting of two pure tones separated by one octave alternating with a random duty cycle. In one series of five scans, spectral information (tone spacing) remained constant while speed of alternation was doubled at each level. In another five scans, speed was kept constant while the number of tones sampled within the octave was doubled at each level, resulting in increasingly fine frequency differences. Results indicated that (i) the core auditory cortex in both hemispheres responded to temporal variation, while the anterior superior temporal areas bilaterally responded to the spectral variation; and (ii) responses to the temporal features were weighted towards the left, while responses to the spectral features were weighted towards the right. These findings confirm the specialization of the left-hemisphere auditory cortex for rapid temporal processing, and indicate that core areas are especially involved in these processes. The results also indicate a complementary hemispheric specialization in right-hemisphere belt cortical areas for spectral processing. The data provide a unifying framework to explain hemispheric asymmetries in processing speech and tonal patterns. We propose that differences exist in the temporal and spectral resolution of corresponding fields in the two hemispheres, and that they may be related to anatomical hemispheric asymmetries in myelination and spacing of cortical columns.


MC2: PLoS Biol. 2007 Oct 23;5(11):e288.

An information theoretic characterisation of auditory encoding.

Overath T, Cusack R, Kumar S, von Kriegstein K, Warren JD, Grube M, Carlyon RP, Griffiths TD.

The entropy metric derived from information theory provides a means to quantify the amount of information transmitted in acoustic streams like speech or music. By systematically varying the entropy of pitch sequences, we sought brain areas where neural activity and energetic demands increase as a function of entropy. Such a relationship is predicted to occur in an efficient encoding mechanism that uses less computational resource when less information is present in the signal: we specifically tested the hypothesis that such a relationship is present in the planum temporale (PT). In two convergent functional MRI studies, we demonstrated this relationship in PT for encoding, while furthermore showing that a distributed fronto-parietal network for retrieval of acoustic information is independent of entropy. The results establish PT as an efficient neural engine that demands less computational resource to encode redundant signals than those with high information content.
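
As a reminder of the metric involved: the first-order Shannon entropy of a pitch sequence is H = -sum_i p_i log2 p_i bits per tone, where p_i is the relative frequency of pitch i. The study manipulated entropy more systematically than this, but a minimal Python sketch conveys the idea:

    import numpy as np
    from collections import Counter

    def entropy_bits(pitches):
        # First-order Shannon entropy of a pitch sequence, in bits per tone.
        counts = np.array(list(Counter(pitches).values()), dtype=float)
        p = counts / counts.sum()
        return float(-np.sum(p * np.log2(p)))

    low = [60, 62, 60, 62, 60, 62, 60, 62]     # redundant alternation (MIDI numbers)
    high = [60, 67, 62, 71, 65, 59, 68, 63]    # every tone different
    print(entropy_bits(low), entropy_bits(high))   # 1.0 vs. 3.0 bits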


MC3: J Cogn Neurosci. 2007 Oct;19(10):1721-33.

Feature- and object-based attentional modulation in the human auditory where pathway.

Krumbholz K, Eickhoff SB, Fink GR.

Attending to a visual stimulus feature, such as color or motion, enhances the processing of that feature in the visual cortex. Moreover, the processing of the attended object's other, unattended, features is also enhanced. Here, we used functional magnetic resonance imaging to show that attentional modulation in the auditory system may also exhibit such feature- and object-specific effects. Specifically, we found that attending to auditory motion increases activity in nonprimary motion-sensitive areas of the auditory cortical "where" pathway. Moreover, activity in these motion-sensitive areas was also increased when attention was directed to a moving rather than a stationary sound object, even when motion was not the attended feature. An analysis of effective connectivity revealed that the motion-specific attentional modulation was brought about by an increase in connectivity between the primary auditory cortex and nonprimary motion-sensitive areas, which, in turn, may have been mediated by the paracingulate cortex in the frontal lobe. The current results indicate that auditory attention can select both objects and features. The finding of feature-based attentional modulation implies that attending to one feature of a sound object does not necessarily entail an exhaustive processing of the object's unattended features.


MC4: PLoS Biol. 2008 Jun 10;6(6):e138.

Neural correlates of auditory perceptual awareness under informational masking.

Gutschalk A, Micheyl C, Oxenham AJ.

Our ability to detect target sounds in complex acoustic backgrounds is often limited not by the ear's resolution, but by the brain's information-processing capacity. The neural mechanisms and loci of this "informational masking" are unknown. We combined magnetoencephalography with simultaneous behavioral measures in humans to investigate neural correlates of informational masking and auditory perceptual awareness in the auditory cortex. Cortical responses were sorted according to whether or not target sounds were detected by the listener in a complex, randomly varying multi-tone background known to produce informational masking. Detected target sounds elicited a prominent, long-latency response (50-250 ms), whereas undetected targets did not. In contrast, both detected and undetected targets produced equally robust auditory middle-latency, steady-state responses, presumably from the primary auditory cortex. These findings indicate that neural correlates of auditory awareness in informational masking emerge between early and late stages of processing within the auditory cortex.


MC5: Curr Biol. 2005 Jun 21;15(12):1108-13.

Directed attention eliminates 'change deafness' in complex auditory scenes.

Eramudugolla R, Irvine DR, McAnally KI, Martin RL, Mattingley JB.

In natural environments that contain multiple sound sources, acoustic energy arising from the different sources sums to produce a single complex waveform at each of the listener's ears. The auditory system must segregate this waveform into distinct streams to permit identification of the objects from which the signals emanate [1]. Although the processes involved in stream segregation are now reasonably well understood [1, 2 and 3], little is known about the nature of our perception of complex auditory scenes. Here, we examined complex scene perception by having listeners detect a discrete change to an auditory scene comprising multiple concurrent naturalistic sounds. We found that listeners were remarkably poor at detecting the disappearance of an individual auditory object when listening to scenes containing more than four objects, but they performed near perfectly when their attention was directed to the identity of a potential change. In the absence of directed attention, this "change deafness" [4] was greater for objects arising from a common location in space than for objects separated in azimuth. Change deafness was also observed for changes in object location, suggesting that it may reflect a general effect of the dependence of human auditory perception on attention.


MC6: Curr Biol. 2010 Jun 22;20(12):1128-32.

Direct recordings of pitch responses from human auditory cortex.

Griffiths TD, Kumar S, Sedley W, Nourski KV, Kawasaki H, Oya H, Patterson RD, Brugge JF, Howard MA.

Pitch is a fundamental percept with a complex relationship to the associated sound structure. Pitch perception requires brain representation of both the structure of the stimulus and the pitch that is perceived. We describe direct recordings of local field potentials from human auditory cortex made while subjects perceived the transition between noise and a noise with a regular repetitive structure in the time domain at the millisecond level called regular-interval noise (RIN). RIN is perceived to have a pitch when the rate is above the lower limit of pitch, at approximately 30 Hz. Sustained time-locked responses are observed to be related to the temporal regularity of the stimulus, commonly emphasized as a relevant stimulus feature in models of pitch perception (e.g., [1]). Sustained oscillatory responses are also demonstrated in the high gamma range (80-120 Hz). The regularity responses occur irrespective of whether the response is associated with pitch perception. In contrast, the oscillatory responses only occur for pitch. Both responses occur in primary auditory cortex and adjacent nonprimary areas. The research suggests that two types of pitch-related activity occur in humans in early auditory cortex: time-locked neural correlates of stimulus regularity and an oscillatory response related to the pitch percept.
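
Regular-interval noise is typically generated by iterated delay-and-add, which imposes temporal regularity at the delay without grossly changing the long-term spectrum. A minimal Python sketch (the iteration count and gain are illustrative, not the paper's settings):

    import numpy as np

    def regular_interval_noise(fs=44100, dur=1.0, pitch_hz=125.0, n_iter=16, gain=1.0):
        # Iterated delay-and-add: each pass adds a copy of the signal delayed by
        # 1/pitch_hz, imposing temporal regularity heard as pitch at pitch_hz.
        d = int(round(fs / pitch_hz))
        y = np.random.randn(int(fs * dur))
        for _ in range(n_iter):
            y = y + gain * np.concatenate([np.zeros(d), y[:-d]])
        return y / np.abs(y).max()

    pitched = regular_interval_noise(pitch_hz=125.0)   # above the ~30-Hz lower limit of pitch
    unpitched = regular_interval_noise(pitch_hz=16.0)  # regular, but below the pitch limit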


MC7: J Neurosci. 2009 Oct 7;29(40):12695-701.

Involvement of the thalamocortical loop in the spontaneous switching of percepts in auditory streaming.

Kondo HM, Kashino M.

Perceptual grouping of successive frequency components, namely, auditory streaming, is essential for auditory scene analysis. Prolonged listening to an unchanging triplet-tone sequence produces a series of illusory switches between a single coherent stream (S1) and two distinct streams (S2). The predominant percept depends on the frequency difference (Deltaf) between high and low tones. Here, we combined the use of different Deltafs with an event-related fMRI design to identify whether the temporal dynamics of brain activity differs depending on the direction of perceptual switches. The results demonstrated that the activity of the medial geniculate body (MGB) in the thalamus occurred earlier during switching from nonpredominant to predominant percepts, whereas that of the auditory cortex (AC) occurred earlier during switching from predominant to nonpredominant percepts, regardless of Deltaf. The asymmetry of temporal precedence indicates that the MGB and AC activations play different roles in perceptual switching and depend on perceptual predominance rather than on S1 and S2 percepts per se. Our results suggest that feedforward and feedback processes in the thalamocortical loop are involved in the formation of percepts in auditory streaming.