[P2 evaluation] Articles

Choose two articles from the list, from two different speakers (indicated by their initials). Don't forget to tell me which one is for the oral exam and which is for the written one. Since Barbara Tillmann will not be able to attend the oral, it would be preferable to pick her articles for the written exam.


AdC1: Neuroscientist. 2010 Aug;16(4):453-69.

Sensitivity and selectivity of neurons in auditory cortex to the pitch, timbre, and location of sounds.

Bizley JK, Walker KM.

We are able to rapidly recognize and localize the many sounds in our environment. We can describe any of these sounds in terms of various independent "features" such as their loudness, pitch, or position in space. However, we still know surprisingly little about how neurons in the auditory brain, specifically the auditory cortex, might form representations of these perceptual characteristics from the information that the ear provides about sound acoustics. In this article, the authors examine evidence that the auditory cortex is necessary for processing the pitch, timbre, and location of sounds, and document how neurons across multiple auditory cortical fields might represent these as trains of action potentials. They conclude by asking whether neurons in different regions of the auditory cortex are not merely sensitive to each of these three sound features but actually selective for one of them. The few studies that have examined neural sensitivity to multiple sound attributes provide only limited support for neural selectivity within auditory cortex. Explaining the neural basis of feature invariance is thus one of the major challenges for sensory neuroscience in attaining the ultimate goal of understanding how neural firing patterns in the brain give rise to perception.


AdC2: Neuroimage. 2010 Jun;51(2):808-16.

The effect of stimulus context on pitch representations in the human auditory cortex.

Garcia D, Hall DA, Plack CJ.

Neuroimaging studies of pitch coding seek to identify pitch-related responses separate from responses to other properties of the stimulus, such as its energy onset, and other general aspects of the listening context. The current study reports the first attempt to evaluate these modulatory influences using functional magnetic resonance imaging (fMRI) measures of cortical pitch representations. Stimulus context was manipulated using a 'classical stimulation paradigm' (whereby successive pitch stimuli were separated by gaps of silence) and a 'continuous stimulation paradigm' (whereby successive pitch stimuli were interspersed with noise to maintain a stable envelope). Pitch responses were measured for two types of pitch-evoking stimuli: a harmonic-complex tone and a complex Huggins pitch. Results for a group of 15 normally hearing listeners revealed that context effects were mostly observed in primary auditory regions, while the most significant pitch responses were localized to posterior nonprimary auditory cortex, specifically planum temporale. Sensitivity to pitch was greater for the continuous stimulation conditions, perhaps because they better controlled for concurrent responses to the noise energy onset and reduced the potential problem of a non-linear fMRI response becoming saturated. These results provide support for hierarchical processing within human auditory cortex, with some parts of primary auditory cortex engaged by general auditory energy, some parts of planum temporale specifically responsible for representing pitch information, and adjacent regions responsible for complex higher-level auditory processing such as representing pitch information as a function of listening context.


AdC3: Curr Opin Neurobiol. 2008 Aug;18(4):452-63.

Music perception, pitch, and the auditory system.

McDermott JH, Oxenham AJ.

The perception of music depends on many culture-specific factors, but is also constrained by properties of the auditory system. This has been best characterized for those aspects of music that involve pitch. Pitch sequences are heard in terms of relative as well as absolute pitch. Pitch combinations give rise to emergent properties not present in the component notes. In this review we discuss the basic auditory mechanisms contributing to these and other perceptual effects in music.


AdC4: Curr Biol. 2010 Jun 8;20(11):1035-41.

Individual differences reveal the basis of consonance.

McDermott JH, Lehr AJ, Oxenham AJ.

Some combinations of musical notes are consonant (pleasant), whereas others are dissonant (unpleasant), a distinction central to music. Explanations of consonance in terms of acoustics, auditory neuroscience, and enculturation have been debated for centuries. We utilized individual differences to distinguish the candidate theories. We measured preferences for musical chords as well as nonmusical sounds that isolated particular acoustic factors--specifically, the beating and the harmonic relationships between frequency components, two factors that have long been thought to potentially underlie consonance. Listeners preferred stimuli without beats and with harmonic spectra, but across more than 250 subjects, only the preference for harmonic spectra was consistently correlated with preferences for consonant over dissonant chords. Harmonicity preferences were also correlated with the number of years subjects had spent playing a musical instrument, suggesting that exposure to music amplifies preferences for harmonic frequencies because of their musical importance. Harmonic spectra are prominent features of natural sounds, and our results indicate that they also underlie the perception of consonance.
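
For concreteness, the two acoustic factors contrasted in this study (beating and harmonicity) are easy to simulate. The following minimal Python sketch generates a harmonic complex, an inharmonic complex with jittered partials, and a beating pair of close frequencies; all parameter values are illustrative assumptions, not the authors' actual stimuli.

    import numpy as np

    fs = 44100                            # sampling rate (Hz), assumed
    t = np.arange(int(0.5 * fs)) / fs     # 500-ms stimulus, assumed

    def complex_tone(freqs, t):
        """Sum of equal-amplitude sinusoids at the given frequencies."""
        return sum(np.sin(2 * np.pi * f * t) for f in freqs) / len(freqs)

    f0 = 220.0
    harmonic = complex_tone([f0 * k for k in range(1, 7)], t)

    # Inharmonic variant: jitter each partial by up to +/-5% (illustrative)
    rng = np.random.default_rng(0)
    jittered = [f0 * k * (1 + rng.uniform(-0.05, 0.05)) for k in range(1, 7)]
    inharmonic = complex_tone(jittered, t)

    # Beating: two components 15 Hz apart produce a slow amplitude fluctuation
    beating = complex_tone([440.0, 455.0], t)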


AdC5: Hear Res. 2010 May 10. [Epub ahead of print]

Cortical encoding of pitch: Recent results and open questions.

Walker KM, Bizley JK, King AJ, Schnupp JW.

It is widely appreciated that the key predictor of the pitch of a sound is its periodicity. Neural structures that support pitch perception must therefore be able to reflect the repetition rate of a sound, but this alone is not sufficient. Since pitch is a psychoacoustic property, a putative cortical code for pitch must also be able to account for the relationship between the degree to which a sound is periodic (i.e. its temporal regularity) and the perceived pitch salience, as well as limits in our ability to detect pitch changes or to discriminate rising from falling pitch. Pitch codes must also be robust in the presence of nuisance variables such as loudness or timbre. Here, we review a large body of work on the cortical basis of pitch perception, which illustrates that the distribution of cortical processes that give rise to pitch perception is likely to depend on both the acoustical features and functional relevance of a sound. While previous studies have greatly advanced our understanding, we highlight several open questions regarding the neural basis of pitch perception. These questions can begin to be addressed through the cooperation of investigative efforts across species and experimental techniques, and, critically, by examining the responses of single neurons in behaving animals.


AdC6: Proc Natl Acad Sci U S A. 2011 Oct 18;108(42):17516-20.

Frequency selectivity in Old-World monkeys corroborates sharp cochlear tuning in humans.

Joris PX, Bergevin C, Kalluri R, Mc Laughlin M, Michelet P, van der Heijden M, Shera CA.

Frequency selectivity in the inner ear is fundamental to hearing and is traditionally thought to be similar across mammals. Although direct measurements are not possible in humans, estimates of frequency tuning based on noninvasive recordings of sound evoked from the cochlea (otoacoustic emissions) have suggested substantially sharper tuning in humans but remain controversial. We report measurements of frequency tuning in macaque monkeys, Old-World primates phylogenetically closer to humans than the laboratory animals often taken as models of human hearing (e.g., cats, guinea pigs, chinchillas). We find that measurements of tuning obtained directly from individual auditory-nerve fibers and indirectly using otoacoustic emissions both indicate that at characteristic frequencies above about 500 Hz, peripheral frequency selectivity in macaques is significantly sharper than in these common laboratory animals, matching that inferred for humans above 4-5 kHz. Compared with the macaque, the human otoacoustic estimates thus appear neither prohibitively sharp nor exceptional. Our results validate the use of otoacoustic emissions for noninvasive measurement of cochlear tuning and corroborate the finding of sharp tuning in humans. The results have important implications for understanding the mechanical and neural coding of sound in the human cochlea, and thus for developing strategies to compensate for the degradation of tuning in the hearing-impaired.


AdC7: J Neurosci. 2011 Apr 27;31(17):6414-20.

State-dependent representation of amplitude-modulated noise stimuli in rat auditory cortex.

Marguet SL, Harris KD.

Cortical responses can vary greatly between repeated presentations of an identical stimulus. Here we report that both trial-to-trial variability and faithfulness of auditory cortical stimulus representations depend critically on brain state. A frozen amplitude-modulated white noise stimulus was repeatedly presented while recording neuronal populations and local field potentials (LFPs) in auditory cortex of urethane-anesthetized rats. An information-theoretic measure was used to predict neuronal spiking activity from either the stimulus envelope or simultaneously recorded LFP. Evoked LFPs and spiking more faithfully followed high-frequency temporal modulations when the cortex was in a desynchronized state. In the synchronized state, neural activity was poorly predictable from the stimulus envelope, but the spiking of individual neurons could still be predicted from the ongoing LFP. Our results suggest that although auditory cortical activity remains coordinated as a population in the synchronized state, the ability of continuous auditory stimuli to control this activity is greatly diminished.


BT1: Proc Natl Acad Sci U S A. 2005 Aug 30;102(35):12639-43.

Tuning in to musical rhythms: infants learn more readily than adults.

Hannon EE, Trehub SE.

Domain-general tuning processes may guide the acquisition of perceptual knowledge in infancy. Here, we demonstrate that 12-month-old infants show an adult-like, culture-specific pattern of responding to musical rhythms, in contrast to the culture-general responding that is evident at 6 months of age. Nevertheless, brief exposure to foreign music enables 12-month-olds, but not adults, to perceive rhythmic distinctions in foreign musical contexts. These findings may indicate a sensitive period early in life for acquiring rhythm in particular or socially and biologically important structures more generally.


BT2: J Neurosci. 2009 Aug 19;29(33):10215-20.

Tone deafness: a new disconnection syndrome?

Loui P, Alsop D, Schlaug G.

Communicating with one's environment requires efficient neural interaction between action and perception. Neural substrates of sound perception and production are connected by the arcuate fasciculus (AF). Although AF is known to be involved in language, its roles in non-linguistic functions are unexplored. Here, we show that tone-deaf people, with impaired sound perception and production, have reduced AF connectivity. Diffusion tensor tractography and psychophysics were assessed in tone-deaf individuals and matched controls. Abnormally reduced AF connectivity was observed in the tone deaf. Furthermore, we observed relationships between AF and auditory-motor behavior: superior and inferior AF branches predict psychophysically assessed pitch discrimination and sound production-perception abilities, respectively. This neural abnormality suggests that tone deafness leads to a reduction in connectivity resulting in pitch-related impairments. Results support a dual-stream anatomy of sound production and perception implicated in vocal communications. By identifying white matter differences and their psychophysical correlates, results contribute to our understanding of how neural connectivity subserves behavior.


BT3: Cognition. 2008 Feb;106(2):975-83.

Songs as an aid for language acquisition.

Schön D, Boyer M, Moreno S, Besson M, Peretz I, Kolinsky R.

In previous research, Saffran and colleagues [Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926-1928; Saffran, J. R., Newport, E. L., & Aslin, R. N. (1996). Word segmentation: The role of distributional cues. Journal of Memory and Language, 35, 606-621.] have shown that adults and infants can use the statistical properties of syllable sequences to extract words from continuous speech. They also showed that a similar learning mechanism operates with musical stimuli [Saffran, J. R., Johnson, E. K., Aslin, R. N., & Newport, E. L. (1999). Statistical learning of tone sequences by human infants and adults. Cognition, 70, 27-52.]. In this work we combined linguistic and musical information and compared language learning based on speech sequences to language learning based on sung sequences. We hypothesized that, compared to speech sequences, a consistent mapping of linguistic and musical information would enhance learning. Results confirmed the hypothesis, showing a strong learning facilitation of song compared to speech. Most importantly, the present results show that learning a new language, especially in the first learning phase wherein one needs to segment new words, may largely benefit from the motivational and structuring properties of music in song.
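
The segmentation cue at stake in this line of work is the transitional probability between adjacent syllables. Here is a minimal Python sketch of that statistic, using a toy two-word stream rather than the study's actual materials:

    import random
    from collections import Counter

    def transitional_probabilities(syllables):
        """TP(B|A) = count(A followed by B) / count(A): the statistic that
        Saffran-style segmentation exploits (toy helper, not this study's
        actual materials)."""
        pairs = Counter(zip(syllables[:-1], syllables[1:]))
        firsts = Counter(syllables[:-1])
        return {(a, b): c / firsts[a] for (a, b), c in pairs.items()}

    random.seed(0)
    words = [["tu", "pi", "ro"], ["go", "la", "bu"]]
    stream = [s for _ in range(50) for s in random.choice(words)]
    tps = transitional_probabilities(stream)
    # Within-word TPs (e.g., tu->pi) come out at 1.0; across word boundaries
    # (e.g., ro->go) they are ~0.5, which is what marks the boundary.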


BT4: Psychon Bull Rev. 2009 Apr;16(2):374-81.

Making psycholinguistics musical: self-paced reading time evidence for shared processing of linguistic and musical syntax.

Slevc LR, Rosenberg JC, Patel AD.

Linguistic processing, especially syntactic processing, is often considered a hallmark of human cognition; thus, the domain specificity or domain generality of syntactic processing has attracted considerable debate. The present experiments address this issue by simultaneously manipulating syntactic processing demands in language and music. Participants performed self-paced reading of garden path sentences, in which structurally unexpected words cause temporary syntactic processing difficulty. A musical chord accompanied each sentence segment, with the resulting sequence forming a coherent chord progression. When structurally unexpected words were paired with harmonically unexpected chords, participants showed substantially enhanced garden path effects. No such interaction was observed when the critical words violated semantic expectancy or when the critical chords violated timbral expectancy. These results support a prediction of the shared syntactic integration resource hypothesis (Patel, 2003), which suggests that music and language draw on a common pool of limited processing resources for integrating incoming elements into syntactic structures. Notations of the stimuli from this study may be downloaded from pbr.psychonomic-journals.org/content/supplemental.


BT5: Cereb Cortex. 2009 Nov;19(11):2579-94.

The neural architecture of music-evoked autobiographical memories.

Janata P.

Erratum in Cereb Cortex. 2010 Jan;20(1):254-5.

The medial prefrontal cortex (MPFC) is regarded as a region of the brain that supports self-referential processes, including the integration of sensory information with self-knowledge and the retrieval of autobiographical information. I used functional magnetic resonance imaging and a novel procedure for eliciting autobiographical memories with excerpts of popular music dating to one's extended childhood to test the hypothesis that music and autobiographical memories are integrated in the MPFC. Dorsal regions of the MPFC (Brodmann area 8/9) were shown to respond parametrically to the degree of autobiographical salience experienced over the course of individual 30 s excerpts. Moreover, the dorsal MPFC also responded on a second, faster timescale corresponding to the signature movements of the musical excerpts through tonal space. These results suggest that the dorsal MPFC associates music and memories when we experience emotionally salient episodic memories that are triggered by familiar songs from our personal past. MPFC acted in concert with lateral prefrontal and posterior cortices both in terms of tonality tracking and overall responsiveness to familiar and autobiographically salient songs. These findings extend the results of previous autobiographical memory research by demonstrating the spontaneous activation of an autobiographical memory network in a naturalistic task with low retrieval demands.


BT6: Nat Neurosci. 2011 Feb;14(2):257-62.

Anatomically distinct dopamine release during anticipation and experience of peak emotion to music.

Salimpoor VN, Benovoy M, Larcher K, Dagher A, Zatorre RJ.

Music, an abstract stimulus, can arouse feelings of euphoria and craving, similar to tangible rewards that involve the striatal dopaminergic system. Using the neurochemical specificity of [(11)C]raclopride positron emission tomography scanning, combined with psychophysiological measures of autonomic nervous system activity, we found endogenous dopamine release in the striatum at peak emotional arousal during music listening. To examine the time course of dopamine release, we used functional magnetic resonance imaging with the same stimuli and listeners, and found a functional dissociation: the caudate was more involved during the anticipation and the nucleus accumbens was more involved during the experience of peak emotional responses to music. These results indicate that intense pleasure in response to music can lead to dopamine release in the striatal system. Notably, the anticipation of an abstract reward can result in dopamine release in an anatomical pathway distinct from that associated with the peak pleasure itself. Our results help to explain why music is of such high value across all human societies.


CL1: Nat Neurosci. 2012 Oct;15(10):1362-4.

Diminished temporal coding with sensorineural hearing loss emerges in background noise.

Henry KS, Heinz MG.

Behavioral studies in humans suggest that sensorineural hearing loss (SNHL) decreases sensitivity to the temporal structure of sound, but neurophysiological studies in mammals provide little evidence for diminished temporal coding. We found that SNHL in chinchillas degraded peripheral temporal coding in background noise substantially more than in quiet. These results resolve discrepancies between previous studies and help to explain why perceptual difficulties in hearing-impaired listeners often emerge in noisy situations.


CL2: Ear Hear. 2004 Jun;25(3):242-50.

Temporal fine-structure cues to speech and pure tone modulation in observers with sensorineural hearing loss.

Buss E, Hall JW 3rd, Grose JH.

OBJECTIVE: The purpose of this study was to examine the effect of sensorineural hearing loss on the ability to make use of fine temporal information and to evaluate the relation between this ability and the ability to recognize speech. DESIGN: Fourteen observers with normal hearing and 12 observers with sensorineural hearing loss were tested on open-set word recognition and on psychophysical tasks thought to reflect use of fine-structure cues: the detection of 2 Hz frequency modulation (FM) and the discrimination of the rate of amplitude modulation (AM) and quasifrequency modulation (QFM). RESULTS: The results showed relatively poor performance for observers with sensorineural hearing loss on both the speech recognition and psychoacoustical tasks. Of particular interest was the finding of significant correlations within the hearing-loss group between speech recognition performance and the psychoacoustical tasks based on frequency modulation, which are thought to reflect the quality of the coding of temporal fine structure. CONCLUSIONS: These results suggest that sensorineural hearing loss may be associated with a reduced ability to use fine temporal information that is coded by neural phase-locking to stimulus fine-structure and that this may contribute to poor speech recognition performance and to poor performance on psychoacoustical tasks that depend on temporal fine structure.


CL3: J Acoust Soc Am. 2005 Oct;118(4):2519-26.

Consequences of cochlear damage for the detection of interaural phase differences.

Lacher-Fougere S, Demany L.

Thresholds for detecting interaural phase differences (IPDs) in sinusoidally amplitude-modulated pure tones were measured in seven normal-hearing listeners and nine listeners with bilaterally symmetric hearing losses of cochlear origin. The IPDs were imposed either on the carrier signal alone--not the amplitude modulation--or vice versa. The carrier frequency was 250, 500, or 1000 Hz, the modulation frequency 20 or 50 Hz, and the sound pressure level was fixed at 75 dB. A three-interval two-alternative forced choice paradigm was used. For each type of IPD (carrier or modulation), thresholds were on average higher for the hearing-impaired than for the normal listeners. However, the impaired listeners' detection deficit was markedly larger for carrier IPDs than for modulation IPDs. This was not predictable from the effect of hearing loss on the sensation level of the stimuli since, for normal listeners, large reductions of sensation level appeared to be more deleterious to the detection of modulation IPDs than to the detection of carrier IPDs. The results support the idea that one consequence of cochlear damage is a deterioration in the perceptual sensitivity to the temporal fine structure of sounds.
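
To see the two stimulus manipulations concretely: a sinusoidally amplitude-modulated tone can carry an interaural phase difference either on the carrier (fine structure) or on the modulator (envelope). A minimal Python sketch follows; the 30-degree IPD is an illustrative assumption, not a threshold value from the paper.

    import numpy as np

    fs = 48000
    t = np.arange(int(1.0 * fs)) / fs

    def sam_tone(t, fc, fm, carrier_phase=0.0, mod_phase=0.0):
        """Sinusoidally amplitude-modulated tone:
        s(t) = [1 + sin(2*pi*fm*t + mod_phase)] * sin(2*pi*fc*t + carrier_phase)."""
        return (1 + np.sin(2 * np.pi * fm * t + mod_phase)) * \
               np.sin(2 * np.pi * fc * t + carrier_phase)

    fc, fm, ipd = 500.0, 20.0, np.deg2rad(30)   # carrier/modulation rates from the abstract

    # Carrier IPD: interaural phase difference on the fine structure only
    left_c, right_c = sam_tone(t, fc, fm), sam_tone(t, fc, fm, carrier_phase=ipd)

    # Modulation IPD: interaural phase difference on the envelope only
    left_m, right_m = sam_tone(t, fc, fm), sam_tone(t, fc, fm, mod_phase=ipd)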


CL4: J Acoust Soc Am. 1989 Dec;86(6):2103-6.

Apparent auditory deprivation effects of late onset: the role of presentation level.

Gatehouse S.

Silman and colleagues [J. Acoust. Soc. Am. 76, 1347-1362 (1984)] have reported an apparent effect of late auditory deprivation; this presents as loss of discrimination over time in the unaided ear of individuals using a single hearing aid fitted in middle age. In a replication of the basic effect, the influence of presentation level was examined in 24 monaurally aided subjects. The effect was reversed at presentation levels below about 75 dB SPL. The ear that is normally aided performs better at high presentation levels, while, at lower presentation levels, the converse is true. Thus it appears that a form of selective adjustment takes place in a particular part of the dynamic range, at least in ears with a dynamic range limited by a sensory hearing loss. If this interpretation is correct, there are important implications for research on perceptual learning and for the time course of evaluation in hearing aid provision.


CL5: Proc Natl Acad Sci U S A. 2011 Sep 13;108(37):15516-21.

Normal hearing is not enough to guarantee robust encoding of suprathreshold features important in everyday communication.

Ruggles D, Bharadwaj H, Shinn-Cunningham BG.

"Normal hearing" is typically defined by threshold audibility, even though everyday communication relies on extracting key features of easily audible sound, not on sound detection. Anecdotally, many normal-hearing listeners report difficulty communicating in settings where there are competing sound sources, but the reasons for such difficulties are debated: Do these difficulties originate from deficits in cognitive processing, or differences in peripheral, sensory encoding? Here we show that listeners with clinically normal thresholds exhibit very large individual differences on a task requiring them to focus spatial selective auditory attention to understand one speech stream when there are similar, competing speech streams coming from other directions. These individual differences in selective auditory attention ability are unrelated to age, reading span (a measure of cognitive function), and minor differences in absolute hearing threshold; however, selective attention ability correlates with the ability to detect simple frequency modulation in a clearly audible tone. Importantly, we also find that selective attention performance correlates with physiological measures of how well the periodic, temporal structure of sounds above the threshold of audibility are encoded in early, subcortical portions of the auditory pathway. These results suggest that the fidelity of early sensory encoding of the temporal structure in suprathreshold sounds influences the ability to communicate in challenging settings. Tests like these may help tease apart how peripheral and central deficits contribute to communication impairments, ultimately leading to new approaches to combat the social isolation that often ensues.


CL6: Science. 1995 Oct 13;270(5234):303-4.

Speech recognition with primarily temporal cues.

Shannon RV, Zeng FG, Kamath V, Wygonski J, Ekelid M.

Nearly perfect speech recognition was observed under conditions of greatly reduced spectral information. Temporal envelopes of speech were extracted from broad frequency bands and were used to modulate noises of the same bandwidths. This manipulation preserved temporal envelope cues in each band but restricted the listener to severely degraded information on the distribution of spectral energy. The identification of consonants, vowels, and words in simple sentences improved markedly as the number of bands increased; high speech recognition performance was obtained with only three bands of modulated noise. Thus, the presentation of a dynamic temporal pattern in only a few broad spectral regions is sufficient for the recognition of speech.
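
The processing described here is the classic noise-vocoding technique. Below is a minimal Python sketch of a three-band version, under assumed band edges and with an analytic-envelope extraction (Hilbert transform) in place of the rectify-and-lowpass method; it illustrates the idea rather than reproducing Shannon et al.'s exact processing.

    import numpy as np
    from scipy.signal import butter, sosfiltfilt, hilbert

    def noise_vocoder(speech, fs, band_edges=(100, 800, 1500, 4000)):
        """Crude noise vocoder: filter speech into broad bands, extract each
        band's temporal envelope, and use it to modulate noise limited to the
        same band. Band edges are illustrative assumptions."""
        rng = np.random.default_rng(0)
        out = np.zeros_like(speech)
        for lo, hi in zip(band_edges[:-1], band_edges[1:]):
            sos = butter(4, [lo, hi], btype='bandpass', fs=fs, output='sos')
            band = sosfiltfilt(sos, speech)
            envelope = np.abs(hilbert(band))          # temporal envelope
            noise = sosfiltfilt(sos, rng.standard_normal(len(speech)))
            out += envelope * noise                   # envelope-modulated noise
        return out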


CL7: J Acoust Soc Am. 2001 Jul;110(1):529-42.

Effects of degradation of intensity, time, or frequency content on speech intelligibility for normal-hearing and hearing-impaired listeners.

van Schijndel NH, Houtgast T, Festen JM.

Many hearing-impaired listeners suffer from distorted auditory processing capabilities. This study examines which aspects of auditory coding (i.e., intensity, time, or frequency) are distorted and how this affects speech perception. The distortion-sensitivity model is used: The effect of distorted auditory coding of a speech signal is simulated by an artificial distortion, and the sensitivity of speech intelligibility to this artificial distortion is compared for normal-hearing and hearing-impaired listeners. Stimuli (speech plus noise) are wavelet coded using a complex sinusoidal carrier with a Gaussian envelope (1/4 octave bandwidth). Intensity information is distorted by multiplying the modulus of each wavelet coefficient by a random factor. Temporal and spectral information are distorted by randomly shifting the wavelet positions along the temporal or spectral axis, respectively. Measured were (1) detection thresholds for each type of distortion, and (2) speech-reception thresholds for various degrees of distortion. For spectral distortion, hearing-impaired listeners showed increased detection thresholds and were also less sensitive to the distortion with respect to speech perception. For intensity and temporal distortion, this was not observed. Results indicate that a distorted coding of spectral information may be an important factor underlying reduced speech intelligibility for the hearing impaired.


DP1: Cereb Cortex. 2011 Sep 30. [Epub ahead of print]

Separability and Commonality of Auditory and Visual Bistable Perception.

Kondo HM, Kitagawa N, Kitamura MS, Koizumi A, Nomura M, Kashino M.

It is unclear what neural processes induce individual differences in perceptual organization in different modalities. To examine this issue, the present study used different forms of bistable perception: auditory streaming, verbal transformations, visual plaids, and reversible figures. We performed factor analyses on the number of perceptual switches in the tasks. A 3-factor model provided a better fit to the data than the other possible models. These factors, namely the "auditory," "shape," and "motion" factors, were separable but correlated with each other. We compared the number of perceptual switches among genotype groups to identify the effects of neurotransmitter functions on the factors. We focused on polymorphisms of catechol-O-methyltransferase (COMT) Val(158)Met and serotonin 2A receptor (HTR2A) -1438G/A genes, which are involved in the modulation of dopamine and serotonin, respectively. The number of perceptual switches in auditory streaming and verbal transformations differed among COMT genotype groups, whereas that in reversible figures differed among HTR2A genotype groups. The results indicate that the auditory and shape factors reflect the functions of the dopamine and serotonin systems, respectively. Our findings suggest that the formation and selection of percepts involve neural processes in cortical and subcortical areas.


DP2: PLoS Comput Biol. 2012 Oct;8(10):e1002731.

How recent history affects perception: the normative approach and its heuristic approximation.

Raviv O, Ahissar M, Loewenstein Y.

There is accumulating evidence that prior knowledge about expectations plays an important role in perception. The Bayesian framework is the standard computational approach to explain how prior knowledge about the distribution of expected stimuli is incorporated with noisy observations in order to improve performance. However, it is unclear what information about the prior distribution is acquired by the perceptual system over short periods of time and how this information is utilized in the process of perceptual decision making. Here we address this question using a simple two-tone discrimination task. We find that the "contraction bias", in which small magnitudes are overestimated and large magnitudes are underestimated, dominates the pattern of responses of human participants. This contraction bias is consistent with the Bayesian hypothesis in which the true prior information is available to the decision-maker. However, a trial-by-trial analysis of the pattern of responses reveals that the contribution of the most recent trials to performance is overweighted compared with the predictions of a standard Bayesian model. Moreover, we study participants' performance with atypical distributions of stimuli and demonstrate substantial deviations from the ideal Bayesian detector, suggesting that the brain utilizes a heuristic approximation of the Bayesian inference. We propose a biologically plausible model, in which the decision in the two-tone discrimination task is based on a comparison between the second tone and an exponentially decaying average of the first tone and past tones. We show that this model accounts for both the contraction bias and the deviations from the ideal Bayesian detector hypothesis. These findings demonstrate the power of Bayesian-like heuristics in the brain, as well as their limitations, reflected here in a failure to fully adapt to novel environments.
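
The proposed heuristic is simple enough to state in a few lines: the second tone is compared not with the current trial's first tone alone but with an exponentially decaying average over the first tones of the current and past trials. A sketch follows; the decay weight is arbitrary here, whereas the paper fits it to data.

    import numpy as np

    def heuristic_responses(f1_trials, f2_trials, decay=0.6):
        """Sketch of the comparison rule described in the abstract: judge the
        second tone against an exponentially decaying trace of first tones.
        'decay' is an illustrative weight, not a fitted parameter."""
        trace = None
        responses = []
        for f1, f2 in zip(f1_trials, f2_trials):
            # fold the current first tone into the running trace
            trace = f1 if trace is None else decay * trace + (1 - decay) * f1
            responses.append('second higher' if f2 > trace else 'second lower')
        return responses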


DP3: J Acoust Soc Am. 2010 Dec;128(6):3634-41

Effects of the use of personal music players on amplitude modulation detection and frequency discrimination.

Vinay SN, Moore BCJM.

Measures of auditory performance were compared for an experimental group who listened regularly to music via personal music players (PMP) and a control group who did not. Absolute thresholds were similar for the two groups for frequencies up to 2 kHz, but the experimental group had slightly but significantly higher thresholds at higher frequencies. Thresholds for the frequency discrimination of pure tones were measured for a sensation level (SL) of 20 dB and center frequencies of 0.25, 0.5, 1, 2, 3, 4, 5, 6, and 8 kHz. Thresholds were significantly higher (worse) for the experimental than for the control group for frequencies from 3 to 8 kHz, but not for lower frequencies. Thresholds for detecting sinusoidal amplitude modulation (AM) were measured for SLs of 10 and 20 dB, using four carrier frequencies 0.5, 3, 4, and 6 kHz, and three modulation frequencies 4, 16, and 50 Hz. Thresholds were significantly lower (better) for the experimental than for the control group for the 4- and 6-kHz carriers, but not for the other carriers. It is concluded that listening to music via PMP can have subtle effects on frequency discrimination and AM detection.


DP4: J Neurophysiol. 2011 May;105(5):1977-83.

Transient BOLD activity locked to perceptual reversals of auditory streaming in human auditory cortex and inferior colliculus.

Schadwinkel S, Gutschalk A.

Our auditory system separates and tracks temporally interleaved sound sources by organizing them into distinct auditory streams. This streaming phenomenon is partly determined by physical stimulus properties but additionally depends on the internal state of the listener. As a consequence, streaming perception is often bistable and reversals between one- and two-stream percepts may occur spontaneously or be induced by a change of the stimulus. Here, we used functional MRI to investigate perceptual reversals in streaming based on interaural time differences (ITD) that produce a lateralized stimulus perception. Listeners were continuously presented with two interleaved streams, which slowly moved apart and together again. This paradigm produced longer intervals between reversals than stationary bistable stimuli but preserved temporal independence between perceptual reversals and physical stimulus transitions. Results showed prominent transient activity synchronized with the perceptual reversals in and around the auditory cortex. Sustained activity in the auditory cortex was observed during intervals where the ΔITD could potentially produce streaming, similar to previous studies. A localizer-based analysis additionally revealed transient activity time locked to perceptual reversals in the inferior colliculus. These data suggest that neural activity associated with streaming reversals is not limited to the thalamo-cortical system but already involves early binaural processing in the auditory midbrain.


DP5: J Exp Psychol Hum Percept Perform. 2011 Aug;37(4):1253-62.

An objective measurement of the build-up of auditory streaming and of its modulation by attention.

Thompson SK, Carlyon RP, Cusack R.

Three experiments studied auditory streaming using sequences of alternating "ABA" triplets, where "A" and "B" were 50-ms tones differing in frequency by Δf semitones and separated by 75-ms gaps. Experiment 1 showed that detection of a short increase in the gap between a B tone and the preceding A tone, imposed on one ABA triplet, was better when the delay occurred early versus late in the sequence, and for Δf = 4 vs. Δf = 8. The results of this experiment were consistent with those of a subjective streaming judgment task. Experiment 2 showed that the detection of a delay 12.5 s into a 13.5-s sequence could be improved by requiring participants to perform a task on competing stimuli presented to the other ear for the first 10 s of that sequence. Hence, adding an additional task demand could improve performance via its effect on the perceptual organization of a sound sequence. The results demonstrate that attention affects streaming in an objective task and that the effects of build-up are not completely under voluntary control. In particular, even though build-up can impair performance in an objective task, participants are unable to prevent this from happening.
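
The ABA stimulus used in such streaming experiments can be reconstructed from the abstract: 50-ms A and B tones separated by 75-ms gaps, with B placed Δf semitones above A. A minimal Python sketch (the A frequency and sequence length are illustrative assumptions):

    import numpy as np

    def aba_sequence(f_a=500.0, df_semitones=4, n_triplets=10, fs=44100):
        """ABA triplet sequence: 50-ms A and B tones separated by 75-ms gaps,
        with B df semitones above A (durations from the abstract; f_a and
        triplet count are illustrative)."""
        f_b = f_a * 2 ** (df_semitones / 12)          # semitone spacing
        tone_len, gap_len = int(0.05 * fs), int(0.075 * fs)
        t = np.arange(tone_len) / fs
        tone = lambda f: np.sin(2 * np.pi * f * t)
        gap = np.zeros(gap_len)
        triplet = np.concatenate([tone(f_a), gap, tone(f_b), gap,
                                  tone(f_a), gap])
        return np.tile(triplet, n_triplets)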


DP6: Proc Natl Acad Sci U S A. 2011 Aug 2;108(31):12961-6.

Stimulus-specific suppression preserves information in auditory short-term memory.

Linke AC, Vicente-Grabovetsky A, Cusack R.

Philosophers and scientists have puzzled for millennia over how perceptual information is stored in short-term memory. Some have suggested that early sensory representations are involved, but their precise role has remained unclear. The current study asks whether auditory cortex shows sustained frequency-specific activation while sounds are maintained in short-term memory using high-resolution functional MRI (fMRI). Investigating short-term memory representations within regions of human auditory cortex with fMRI has been difficult because of their small size and high anatomical variability between subjects. However, we overcame these constraints by using multivoxel pattern analysis. It clearly revealed frequency-specific activity during the encoding phase of a change detection task, and the degree of this frequency-specific activation was positively related to performance in the task. Although the sounds had to be maintained in memory, activity in auditory cortex was significantly suppressed. Strikingly, patterns of activity in this maintenance period correlated negatively with the patterns evoked by the same frequencies during encoding. Furthermore, individuals who used a rehearsal strategy to remember the sounds showed reduced frequency-specific suppression during the maintenance period. Although negative activations are often disregarded in fMRI research, our findings imply that decreases in blood oxygenation level-dependent response carry important stimulus-specific information and can be related to cognitive processes. We hypothesize that, during auditory change detection, frequency-specific suppression protects short-term memory representations from being overwritten by inhibiting the encoding of interfering sounds.


DP7: Psychol Sci. 2005 Apr;16(4):305-12.

Temporally nonadjacent nonlinguistic sounds affect speech categorization.

Holt LL.

Speech perception is an ecologically important example of the highly context-dependent nature of perception; adjacent speech, and even nonspeech, sounds influence how listeners categorize speech. Some theories emphasize linguistic or articulation-based processes in speech-elicited context effects and peripheral (cochlear) auditory perceptual interactions in non-speech-elicited context effects. The present studies challenge this division. Results of three experiments indicate that acoustic histories composed of sine-wave tones drawn from spectral distributions with different mean frequencies robustly affect speech categorization. These context effects were observed even when the acoustic context temporally adjacent to the speech stimulus was held constant and when more than a second of silence or multiple intervening sounds separated the nonlinguistic acoustic context and speech targets. These experiments indicate that speech categorization is sensitive to statistical distributions of spectral information, even if the distributions are composed of nonlinguistic elements. Acoustic context need be neither linguistic nor local to influence speech perception.


DP8: J Exp Psychol. 1940;27:339-68.

The role of head movements and vestibular and visual cues in sound localization.

Wallach H.

No abstract available.


DP9: Neuron. 2011 Sep 8;71(5):926-40.

Sound texture perception via statistics of the auditory periphery: evidence from sound synthesis.

McDermott JH, Simoncelli EP.

Rainstorms, insect swarms, and galloping horses produce "sound textures"--the collective result of many similar acoustic events. Sound textures are distinguished by temporal homogeneity, suggesting they could be recognized with time-averaged statistics. To test this hypothesis, we processed real-world textures with an auditory model containing filters tuned for sound frequencies and their modulations, and measured statistics of the resulting decomposition. We then assessed the realism and recognizability of novel sounds synthesized to have matching statistics. Statistics of individual frequency channels, capturing spectral power and sparsity, generally failed to produce compelling synthetic textures; however, combining them with correlations between channels produced identifiable and natural-sounding textures. Synthesis quality declined if statistics were computed from biologically implausible auditory models. The results suggest that sound texture perception is mediated by relatively simple statistics of early auditory representations, presumably computed by downstream neural populations. The synthesis methodology offers a powerful tool for their further investigation.
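
The core idea, time-averaged statistics of a cochlear-like decomposition, can be sketched compactly. The Python fragment below computes per-channel marginal moments (power and sparsity) and cross-channel correlations from an array of band envelopes; the paper's full statistic set (which includes modulation-band statistics) is considerably richer, so this is only an illustration.

    import numpy as np

    def channel_statistics(envelopes):
        """Time-averaged texture statistics of cochlear-channel envelopes:
        per-channel power and sparsity (marginal moments) plus pairwise
        channel correlations. 'envelopes' is an (n_channels, n_samples)
        array of non-negative band envelopes."""
        mean = envelopes.mean(axis=1)
        var = envelopes.var(axis=1)
        z = (envelopes - mean[:, None]) / np.sqrt(var[:, None] + 1e-12)
        skew = (z ** 3).mean(axis=1)      # skewness and kurtosis capture
        kurt = (z ** 4).mean(axis=1)      # the sparsity of each channel
        corr = np.corrcoef(envelopes)     # cross-channel correlations
        return mean, var, skew, kurt, corr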


DP10: Psychol Sci. 2008 Jan;19(1):85-91.

Auditory change detection: simple sounds are not memorized better than complex sounds.

Demany L, Trost W, Serman M, Semal C.

Previous research has shown that the detectability of a local change in a visual image is essentially independent of the complexity of the image when the interstimulus interval (ISI) is very short, but is limited by a low-capacity memory system when the ISI exceeds 100 ms. In the study reported here, listeners made same/different judgments on pairs of successive "chords" (sums of pure tones with random frequencies). The change to be detected was always a frequency shift in one of the tones, and which tone would change was unpredictable. Performance worsened as the number of tones increased, but this effect was not larger for 2-s ISIs than for 0-ms ISIs. Similar results were obtained when a chord was followed by a single tone that had to be judged as higher or lower than the closest component of the chord. Overall, our data suggest that change detection is based on different mechanisms in audition and vision.
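
One trial of the chord task can be sketched as follows: draw a "chord" of pure tones at random frequencies, then shift one unpredictable component to form the comparison chord. Tone count, frequency range, and shift size in this Python sketch are illustrative assumptions, not the study's parameters.

    import numpy as np

    def chord_pair(n_tones=4, shift_semitones=1.0, seed=0):
        """One same/different trial: a chord of pure tones with random
        frequencies, and a second chord in which one unpredictable tone is
        shifted in frequency (all parameter values are illustrative)."""
        rng = np.random.default_rng(seed)
        freqs = rng.uniform(300.0, 3000.0, n_tones)   # random components
        changed = freqs.copy()
        i = rng.integers(n_tones)                     # which tone changes
        changed[i] *= 2 ** (rng.choice([-1, 1]) * shift_semitones / 12)
        return freqs, changed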


DP11: Proc Natl Acad Sci U S A. 2012 Nov 27;109(48):19858-63.

The basis of musical consonance as revealed by congenital amusia.

Cousineau M, McDermott JH, Peretz I.

Some combinations of musical notes sound pleasing and are termed "consonant," but others sound unpleasant and are termed "dissonant." The distinction between consonance and dissonance plays a central role in Western music, and its origins have posed one of the oldest and most debated problems in perception. In modern times, dissonance has widely been believed to be the product of "beating": interference between frequency components in the cochlea, thought to be more pronounced in dissonant than consonant sounds. However, harmonic frequency relations, a higher-order sound attribute closely related to pitch perception, have also been proposed to account for consonance. To tease apart theories of musical consonance, we tested sound preferences in individuals with congenital amusia, a neurogenetic disorder characterized by abnormal pitch perception. We assessed amusics' preferences for musical chords as well as for the isolated acoustic properties of beating and harmonicity. In contrast to control subjects, amusic listeners showed no preference for consonance, rating the pleasantness of consonant chords no higher than that of dissonant chords. Amusics also failed to exhibit the normally observed preference for harmonic over inharmonic tones, nor could they discriminate such tones from each other. Despite these abnormalities, amusics exhibited normal preferences and discrimination for stimuli with and without beating. This dissociation indicates that, contrary to classic theories, beating is unlikely to underlie consonance. Our results instead suggest the need to integrate harmonicity as a foundation of music preferences, and illustrate how amusia may be used to investigate normal auditory function.


DP12: Psychol Res. 2010 Sep;74(5):437-56.

Context sensitivity and invariance in perception of octave-ambiguous tones.

Repp BH, Thompson JM.

Three experiments investigated the influence of unambiguous (UA) context tones on the perception of octave-ambiguous (OA) tones. In Experiment 1, pairs of OA tones spanning a tritone interval were preceded by pairs of UA tones instantiating a rising or falling interval between the same pitch classes. Despite the inherent ambiguity of OA tritone pairs, most participants showed little or no priming when judging the OA tritone as rising or falling. In Experiments 2 and 3, participants compared the pitch heights of single OA and UA tones representing either the same pitch class or being a tritone apart. These judgments were strongly influenced by the pitch range of the UA tones, but only slightly by the spectral center of the OA tones. Thus, the perceived pitch height of single OA tones is context sensitive, but the perceived relative pitch height of two OA tones, as described in previous research on the "tritone paradox," is largely invariant in UA tone contexts.


DP13: Neuron. 2010 Jun 24;66(6):937-48.

Adaptation to stimulus statistics in the perception and neural representation of auditory space.

Dahmen JC, Keating P, Nodal FR, Schulz AL, King AJ.

Sensory systems are known to adapt their coding strategies to the statistics of their environment, but little is known about the perceptual implications of such adjustments. We investigated how auditory spatial processing adapts to stimulus statistics by presenting human listeners and anesthetized ferrets with noise sequences in which interaural level differences (ILD) rapidly fluctuated according to a Gaussian distribution. The mean of the distribution biased the perceived laterality of a subsequent stimulus, whereas the distribution's variance changed the listeners' spatial sensitivity. The responses of neurons in the inferior colliculus changed in line with these perceptual phenomena. Their ILD preference adjusted to match the stimulus distribution mean, resulting in large shifts in rate-ILD functions, while their gain adapted to the stimulus variance, producing pronounced changes in neural sensitivity. Our findings suggest that processing of auditory space is geared toward emphasizing relative spatial differences rather than the accurate representation of absolute position.
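
The stimulus statistics manipulated here are simply the mean and variance of a Gaussian ILD distribution. A minimal Python sketch of such a sequence, splitting each ILD symmetrically across the two ears (burst length and count are assumptions):

    import numpy as np

    def ild_sequence(mean_db, sd_db, n_bursts=100, burst_ms=50, fs=44100):
        """Noise-burst sequence whose interaural level differences fluctuate
        according to a Gaussian distribution, in the spirit of the adaptation
        paradigm (burst length and count are illustrative)."""
        rng = np.random.default_rng(0)
        n = int(burst_ms / 1000 * fs)
        left, right = [], []
        for ild in rng.normal(mean_db, sd_db, n_bursts):
            burst = rng.standard_normal(n)
            # split the ILD (in dB) symmetrically between the two ears
            left.append(burst * 10 ** (+ild / 40))
            right.append(burst * 10 ** (-ild / 40))
        return np.concatenate(left), np.concatenate(right)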


DP14: J Assoc Res Otolaryngol. 2010 Dec;11(4):709-24.

Objective and subjective psychophysical measures of auditory stream integration and segregation.

Micheyl C, Oxenham AJ.

The perceptual organization of sound sequences into auditory streams involves the integration of sounds into one stream and the segregation of sounds into separate streams. "Objective" psychophysical measures of auditory streaming can be obtained using behavioral tasks where performance is facilitated by segregation and hampered by integration, or vice versa. Traditionally, these two types of tasks have been tested in separate studies involving different listeners, procedures, and stimuli. Here, we tested subjects in two complementary temporal-gap discrimination tasks involving similar stimuli and procedures. One task was designed so that performance in it would be facilitated by perceptual integration; the other, so that performance would be facilitated by perceptual segregation. Thresholds were measured in both tasks under a wide range of conditions produced by varying three stimulus parameters known to influence stream formation: frequency separation, tone-presentation rate, and sequence length. In addition to these performance-based measures, subjective judgments of perceived segregation were collected in the same listeners under corresponding stimulus conditions. The patterns of results obtained in the two temporal-discrimination tasks, and the relationships between thresholds and perceived-segregation judgments, were mostly consistent with the hypothesis that stream segregation helped performance in one task and impaired performance in the other task. The tasks and stimuli described here may prove useful in future behavioral or neurophysiological experiments, which seek to manipulate and measure neural correlates of auditory streaming while minimizing differences between the physical stimuli.


DP15: Neuron. 2012 Nov 8;76(3):603-15

Sensitivity to complex statistical regularities in rat auditory cortex.

Yaron A, Hershenhoren I, Nelken I.

Neurons in auditory cortex are sensitive to the probability of stimuli: responses to rare stimuli tend to be stronger than responses to common ones. Here, intra- and extracellular recordings from the auditory cortex of halothane-anesthetized rats revealed the existence of a finer sensitivity to the structure of sound sequences. Using oddball sequences in which the order of stimulus presentations is periodic, we found that tones in periodic sequences evoked smaller responses than the same tones in random sequences. A significant reduction in the responses to the common tones in periodic relative to random sequences occurred even when these tones constituted 95% of the stimuli in the sequence. The reduction in responses paralleled the complexity of the sound sequences and could not be explained by short-term effects of clusters of deviants on succeeding standards. We conclude that neurons in auditory cortex are sensitive to the detailed structure of sound sequences over timescales of minutes.
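
The contrast at the heart of this study, identical deviant probability but periodic versus random deviant placement, can be sketched in a few lines of Python (sequence length and probability are illustrative):

    import numpy as np

    def oddball_sequence(n=400, p_deviant=0.05, periodic=True, seed=0):
        """Standard/deviant tone order with a fixed deviant probability but
        different structure: deviants at a fixed period vs. at random
        positions (all parameter values are illustrative)."""
        seq = np.zeros(n, dtype=int)               # 0 = standard, 1 = deviant
        n_dev = int(round(n * p_deviant))
        if periodic:
            period = n // n_dev
            seq[period - 1::period][:n_dev] = 1    # e.g., every 20th tone
        else:
            rng = np.random.default_rng(seed)
            seq[rng.choice(n, n_dev, replace=False)] = 1
        return seq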


DP16: Curr Biol. 2005 Nov 8;15(21):1943-7.

Mechanisms for allocating auditory attention: an auditory saliency map.

Kayser C, Petkov CI, Lippert M, Logothetis NK.

Our nervous system is confronted with a barrage of sensory stimuli, but neural resources are limited and not all stimuli can be processed to the same extent. Mechanisms exist to bias attention toward the particularly salient events, thereby providing a weighted representation of our environment. Our understanding of these mechanisms is still limited, but theoretical models can replicate such a weighting of sensory inputs and provide a basis for understanding the underlying principles. Here, we describe such a model for the auditory system--an auditory saliency map. We experimentally validate the model on natural acoustical scenarios, demonstrating that it reproduces human judgments of auditory saliency and predicts the detectability of salient sounds embedded in noisy backgrounds. In addition, it also predicts the natural orienting behavior of naive macaque monkeys to the same salient stimuli. The structure of the suggested model is identical to that of successfully used visual saliency maps. Hence, we conclude that saliency is determined either by implementing similar mechanisms in different unisensory pathways or by the same mechanism in multisensory areas. In any case, our results demonstrate that different primate sensory systems rely on common principles for extracting relevant sensory events.


DP17: J Acoust Soc Am. 2008 Feb;123(2):899-909.

Phoneme representation and classification in primary auditory cortex.

Mesgarani N, David SV, Fritz JB, Shamma SA.

A controversial issue in neurolinguistics is whether basic neural auditory representations found in many animals can account for human perception of speech. This question was addressed by examining how a population of neurons in the primary auditory cortex (A1) of the naive awake ferret encodes phonemes and whether this representation could account for the human ability to discriminate them. When neural responses were characterized and ordered by spectral tuning and dynamics, perceptually significant features including formant patterns in vowels and place and manner of articulation in consonants, were readily visualized by activity in distinct neural subpopulations. Furthermore, these responses faithfully encoded the similarity between the acoustic features of these phonemes. A simple classifier trained on the neural representation was able to simulate human phoneme confusion when tested with novel exemplars. These results suggest that A1 responses are sufficiently rich to encode and discriminate phoneme classes and that humans and animals may build upon the same general acoustic representations to learn boundaries for categorical and robust sound classification.


DP18: J Acoust Soc Am. 2011 Nov;130(5):2891-901.

The effect of hearing loss on the resolution of partials and fundamental frequency discrimination.

Moore BC, Glasberg BR.

The relationship between the ability to hear out partials in complex tones, discrimination of the fundamental frequency (F0) of complex tones, and frequency selectivity was examined for subjects with mild-to-moderate cochlear hearing loss. The ability to hear out partials was measured using a two-interval task. Each interval included a sinusoid followed by a complex tone; one complex contained a partial with the same frequency as the sinusoid, whereas in the other complex that partial was missing. Subjects had to indicate the interval in which the partial was present in the complex. The components in the complex were uniformly spaced on the ERB(N)-number scale. Performance was generally good for the two "edge" partials, but poorer for the inner partials. Performance for the latter improved with increasing spacing. F0 discrimination was measured for a bandpass-filtered complex tone containing low harmonics. The equivalent rectangular bandwidth (ERB) of the auditory filter was estimated using the notched-noise method for center frequencies of 0.5, 1, and 2 kHz. Significant correlations were found between the ability to hear out inner partials, F0 discrimination, and the ERB. The results support the idea that F0 discrimination of tones with low harmonics depends on the ability to resolve the harmonics.


MC1: J Neurosci. 2009 Jul 1;29(26):8447-51.

I heard that coming: event-related potential evidence for stimulus-driven prediction in the auditory system.

Bendixen A, Schröger E, Winkler I.

The auditory system has been shown to detect predictability in a tone sequence, but does it use the extracted regularities for actually predicting the continuation of the sequence? The present study sought to find evidence for the generation of such predictions. Predictability was manipulated in an isochronous series of tones in which every other tone was a repetition of its predecessor. The existence of predictions was probed by occasionally omitting either the first (unpredictable) or the second (predictable) tone of a same-frequency tone pair. Event-related electrical brain activity elicited by the omission of an unpredictable tone differed from the response to the actual tone right from the tone onset. In contrast, early electrical brain activity elicited by the omission of a predictable tone was quite similar to the response to the actual tone. This suggests that the auditory system preactivates the neural circuits for expected input, using sequential predictions to specifically prepare for future acoustic events.


MC2: PLoS Biol. 2007 Oct 23;5(11):e288.

An information theoretic characterisation of auditory encoding.

Overath T, Cusack R, Kumar S, von Kriegstein K, Warren JD, Grube M, Carlyon RP, Griffiths TD.

The entropy metric derived from information theory provides a means to quantify the amount of information transmitted in acoustic streams like speech or music. By systematically varying the entropy of pitch sequences, we sought brain areas where neural activity and energetic demands increase as a function of entropy. Such a relationship is predicted to occur in an efficient encoding mechanism that uses less computational resource when less information is present in the signal: we specifically tested the hypothesis that such a relationship is present in the planum temporale (PT). In two convergent functional MRI studies, we demonstrated this relationship in PT for encoding, while furthermore showing that a distributed fronto-parietal network for retrieval of acoustic information is independent of entropy. The results establish PT as an efficient neural engine that demands less computational resource to encode redundant signals than those with high information content.
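
As a reminder of the quantity being manipulated: the first-order Shannon entropy of a pitch sequence can be computed as below. The paper's entropy manipulation of pitch sequences was more elaborate, so this Python sketch is only an illustration.

    import numpy as np

    def sequence_entropy(pitches):
        """First-order Shannon entropy (bits) of a pitch sequence, computed
        from the relative frequencies of its pitch values."""
        _, counts = np.unique(pitches, return_counts=True)
        p = counts / counts.sum()
        return float(-(p * np.log2(p)).sum())

    # A repetitive sequence has low entropy; a non-repeating one, high entropy
    low = sequence_entropy([60, 60, 60, 60, 62, 60, 60, 60])    # ~0.54 bits
    high = sequence_entropy([60, 62, 64, 65, 67, 69, 71, 72])   # 3.0 bits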


MC3: J Cogn Neurosci. 2007 Oct;19(10):1721-33.

Feature- and object-based attentional modulation in the human auditory where pathway.

Krumbholz K, Eickhoff SB, Fink GR.

Attending to a visual stimulus feature, such as color or motion, enhances the processing of that feature in the visual cortex. Moreover, the processing of the attended object's other, unattended, features is also enhanced. Here, we used functional magnetic resonance imaging to show that attentional modulation in the auditory system may also exhibit such feature- and object-specific effects. Specifically, we found that attending to auditory motion increases activity in nonprimary motion-sensitive areas of the auditory cortical "where" pathway. Moreover, activity in these motion-sensitive areas was also increased when attention was directed to a moving rather than a stationary sound object, even when motion was not the attended feature. An analysis of effective connectivity revealed that the motion-specific attentional modulation was brought about by an increase in connectivity between the primary auditory cortex and nonprimary motion-sensitive areas, which, in turn, may have been mediated by the paracingulate cortex in the frontal lobe. The current results indicate that auditory attention can select both objects and features. The finding of feature-based attentional modulation implies that attending to one feature of a sound object does not necessarily entail an exhaustive processing of the object's unattended features.


MC4: PLoS Biol. 2008 Jun 10;6(6):e138.

Neural correlates of auditory perceptual awareness under informational masking.

Gutschalk A, Micheyl C, Oxenham AJ.

Our ability to detect target sounds in complex acoustic backgrounds is often limited not by the ear's resolution, but by the brain's information-processing capacity. The neural mechanisms and loci of this "informational masking" are unknown. We combined magnetoencephalography with simultaneous behavioral measures in humans to investigate neural correlates of informational masking and auditory perceptual awareness in the auditory cortex. Cortical responses were sorted according to whether or not target sounds were detected by the listener in a complex, randomly varying multi-tone background known to produce informational masking. Detected target sounds elicited a prominent, long-latency response (50-250 ms), whereas undetected targets did not. In contrast, both detected and undetected targets produced equally robust auditory middle-latency, steady-state responses, presumably from the primary auditory cortex. These findings indicate that neural correlates of auditory awareness in informational masking emerge between early and late stages of processing within the auditory cortex.


MC5: Proc Natl Acad Sci U S A. 2012 Jul 17;109(29):11854-9.

Emergence of neural encoding of auditory objects while listening to competing speakers.

Ding N, Simon JZ.

A visual scene is perceived in terms of visual objects. Similar ideas have been proposed for the analogous case of auditory scene analysis, although their hypothesized neural underpinnings have not yet been established. Here, we address this question by recording from subjects selectively listening to one of two competing speakers, either of different or the same sex, using magnetoencephalography. Individual neural representations are seen for the speech of the two speakers, each selectively phase-locked to the rhythm of the corresponding speech stream, such that the temporal envelope of that stream can be reconstructed exclusively from its own neural representation. The neural representation of the attended speech dominates responses (with latency near 100 ms) in posterior auditory cortex. Furthermore, when the intensity of the attended and background speakers is separately varied over an 8-dB range, the neural representation of the attended speech adapts only to the intensity of that speaker, not to the intensity of the background speaker, suggesting an object-level intensity gain control. In summary, these results indicate that concurrent auditory objects, even if spectrotemporally overlapping and not resolvable at the auditory periphery, are neurally encoded individually in auditory cortex and emerge as fundamental representational units for top-down attentional modulation and bottom-up neural adaptation.
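
The envelope-reconstruction logic can be sketched as a regularized linear decoder that maps time-lagged neural data back onto the speech envelope. A toy simulation with made-up signals, not the authors' MEG pipeline:

    import numpy as np

    rng = np.random.default_rng(0)
    T, n_lags = 2000, 20
    # toy "speech envelope"; the neural signal is a filtered, noisy copy of it
    envelope = np.convolve(rng.random(T), np.ones(50) / 50, mode="same")
    neural = np.convolve(envelope, [0.5, 0.3, 0.2], mode="same") \
             + 0.1 * rng.standard_normal(T)

    # design matrix of time-lagged copies of the neural signal
    X = np.zeros((T, n_lags))
    for k in range(n_lags):
        X[k:, k] = neural[:T - k]

    # ridge-regularized decoder, then correlate reconstruction with the envelope
    w = np.linalg.solve(X.T @ X + np.eye(n_lags), X.T @ envelope)
    print(np.corrcoef(X @ w, envelope)[0, 1])  # reconstruction accuracy r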


MC6: Neuron. 2007 Sep 20;55(6):985-96.

Cerebral responses to change in spatial location of unattended sounds.

Deouell LY, Heller AS, Malach R, D'Esposito M, Knight RT.

The neural basis of spatial processing in the auditory cortex has been controversial. Human fMRI studies suggest that a part of the planum temporale (PT) is involved in auditory spatial processing, but it was recently argued that this region is active only when the task requires voluntary spatial localization. If this is the case, then this region cannot harbor an ongoing spatial representation of the acoustic environment. In contrast, we show in three fMRI experiments that a region in the human medial PT is sensitive to background auditory spatial changes, even when subjects are not engaged in a spatial localization task, and in fact attend the visual modality. During such times, this area responded to rare location shifts, and even more so when spatial variation increased, consistent with spatially selective adaptation. Thus, acoustic space is represented in the human PT even when sound processing is not required by the ongoing task.


MC7: Nat Neurosci. 2004 Jul;7(7):773-8.

Recalibration of audiovisual simultaneity.

Fujisaki W, Shimojo S, Kashino M, Nishida S.

To perceive the auditory and visual aspects of a physical event as occurring simultaneously, the brain must adjust for differences between the two modalities in both physical transmission time and sensory processing time. One possible strategy to overcome this difficulty is to adaptively recalibrate the simultaneity point from daily experience of audiovisual events. Here we report that after exposure to a fixed audiovisual time lag for several minutes, human participants showed shifts in their subjective simultaneity responses toward that particular lag. This 'lag adaptation' also altered the temporal tuning of an auditory-induced visual illusion, suggesting that adaptation occurred via changes in sensory processing, rather than as a result of a cognitive shift while making task responses. Our findings suggest that the brain attempts to adjust subjective simultaneity across different modalities by detecting and reducing time lags between inputs that likely arise from the same physical events.
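
The "shift in subjective simultaneity" is conventionally quantified as a change in the point of subjective simultaneity (PSS) fitted from simultaneity judgments at each audiovisual lag. A hedged sketch with fabricated response rates (the Gaussian shape, lag values, and 40-ms shift are all assumptions for illustration):

    import numpy as np
    from scipy.optimize import curve_fit

    def simultaneity_curve(soa, pss, width):
        """p(judged simultaneous) modeled as a Gaussian peaking at the PSS."""
        return np.exp(-0.5 * ((soa - pss) / width) ** 2)

    soas = np.linspace(-300, 300, 13)  # audiovisual lag in ms
    rng = np.random.default_rng(0)
    p_before = simultaneity_curve(soas, 0, 120) + 0.03 * rng.standard_normal(13)
    # hypothetical partial shift after adapting to a +100 ms lag
    p_after = simultaneity_curve(soas, 40, 120) + 0.03 * rng.standard_normal(13)

    (pss_b, _), _ = curve_fit(simultaneity_curve, soas, p_before, p0=[0, 100])
    (pss_a, _), _ = curve_fit(simultaneity_curve, soas, p_after, p0=[0, 100])
    print(f"PSS shift after lag adaptation: {pss_a - pss_b:.0f} ms")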


MC8: Proc Natl Acad Sci U S A. 2011 Jan 18;108(3):1188-93.

Recovering sound sources from embedded repetition.

McDermott JH, Wrobleski D, Oxenham AJ.

Cocktail parties and other natural auditory environments present organisms with mixtures of sounds. Segregating individual sound sources is thought to require prior knowledge of source properties, yet these presumably cannot be learned unless the sources are segregated first. Here we show that the auditory system can bootstrap its way around this problem by identifying sound sources as repeating patterns embedded in the acoustic input. Due to the presence of competing sounds, source repetition is not explicit in the input to the ear, but it produces temporal regularities that listeners detect and use for segregation. We used a simple generative model to synthesize novel sounds with naturalistic properties. We found that such sounds could be segregated and identified if they occurred more than once across different mixtures, even when the same sounds were impossible to segregate in single mixtures. Sensitivity to the repetition of sound sources can permit their recovery in the absence of other segregation cues or prior knowledge of sounds, and could help solve the cocktail party problem.
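
The statistical intuition can be reproduced in a few lines: if the same source recurs across mixtures while the distractors change, simply averaging the mixtures makes the repeating source emerge. A toy Gaussian-signal sketch, not the authors' generative model:

    import numpy as np

    rng = np.random.default_rng(0)
    n_samples, n_mixtures = 1000, 10
    source = rng.standard_normal(n_samples)  # the repeating target source
    mixtures = [source + rng.standard_normal(n_samples)  # + a fresh distractor
                for _ in range(n_mixtures)]

    estimate = np.mean(mixtures, axis=0)         # distinct distractors average out
    print(np.corrcoef(estimate, source)[0, 1])   # ~0.95 with these settings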


SAS1: Nature. 2007 Nov 15;450(7168):425-9.

A synaptic memory trace for cortical receptive field plasticity.

Froemke RC, Merzenich MM, Schreiner CE.

Receptive fields of sensory cortical neurons are plastic, changing in response to alterations of neural activity or sensory experience. In this way, cortical representations of the sensory environment can incorporate new information about the world, depending on the relevance or value of particular stimuli. Neuromodulation is required for cortical plasticity, but it is uncertain how subcortical neuromodulatory systems, such as the cholinergic nucleus basalis, interact with and refine cortical circuits. Here we determine the dynamics of synaptic receptive field plasticity in the adult primary auditory cortex (also known as AI) using in vivo whole-cell recording. Pairing sensory stimulation with nucleus basalis activation shifted the preferred stimuli of cortical neurons by inducing a rapid reduction of synaptic inhibition within seconds, which was followed by a large increase in excitation, both specific to the paired stimulus. Although nucleus basalis was stimulated only for a few minutes, reorganization of synaptic tuning curves progressed for hours thereafter: inhibition slowly increased in an activity-dependent manner to rebalance the persistent enhancement of excitation, leading to a retuned receptive field with new preference for the paired stimulus. This restricted period of disinhibition may be a fundamental mechanism for receptive field plasticity, and could serve as a memory trace for stimuli or episodes that have acquired new behavioural significance.


SAS2: Nat Neurosci. 2004 Sep;7(9):974-81.

Temporal plasticity in the primary auditory cortex induced by operant perceptual learning.

Bao S, Chang EF, Woods J, Merzenich MM.

Processing of rapidly successive acoustic stimuli can be markedly improved by sensory training. To investigate the cortical mechanisms underlying such temporal plasticity, we trained rats in a 'sound maze' in which navigation using only auditory cues led to a target location paired with food reward. In this task, the repetition rate of noise pulses increased as the distance between the rat and target location decreased. After training in the sound maze, neurons in the primary auditory cortex (A1) showed greater responses to high-rate noise pulses and stronger phase-locking of responses to the stimuli; they also showed shorter post-stimulation suppression and stronger rebound activation. These improved temporal dynamics transferred to trains of pure-tone pips. Control animals that received identical sound stimulation but were given free access to food showed the same results as naive rats. We conclude that this auditory perceptual learning results in improvements in temporal processing, which may be mediated by enhanced cortical response dynamics.


SAS3: Nat Neurosci. 2010 Mar;13(3):353-60.

Functional organization and population dynamics in the mouse primary auditory cortex.

Rothschild G, Nelken I, Mizrahi A.

Cortical processing of auditory stimuli involves large populations of neurons with distinct individual response profiles. However, the functional organization and dynamics of local populations in the auditory cortex have remained largely unknown. Using in vivo two-photon calcium imaging, we examined the response profiles and network dynamics of layer 2/3 neurons in the primary auditory cortex (A1) of mice in response to pure tones. We found that local populations in A1 were highly heterogeneous within the large-scale tonotopic organization. Despite this spatial heterogeneity, the tendency of neurons to respond together (measured as noise correlation) was high on average. This functional organization and these high levels of noise correlation are consistent with the existence of partially overlapping cortical subnetworks. Our findings may account for apparent discrepancies between ordered large-scale organization and local heterogeneity.
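
"Noise correlation" here means the correlation of trial-to-trial response fluctuations between two neurons across repeats of the same stimulus. A minimal sketch with simulated firing rates (all parameters invented):

    import numpy as np

    rng = np.random.default_rng(0)
    n_trials = 100
    shared = rng.standard_normal(n_trials)  # shared trial-to-trial fluctuation
    rate_a = 5.0 + shared + 0.5 * rng.standard_normal(n_trials)
    rate_b = 3.0 + shared + 0.5 * rng.standard_normal(n_trials)

    # Pearson correlation of trial-by-trial responses to the identical tone
    noise_corr = np.corrcoef(rate_a, rate_b)[0, 1]
    print(noise_corr)  # ~0.8 with these settings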


SAS4: Nat Neurosci. 2011 Jan;14(1):108-14.

Auditory cortex spatial sensitivity sharpens during task performance.

Lee CC, Middlebrooks JC.

Activity in the primary auditory cortex (A1) is essential for normal sound localization behavior, but previous studies of the spatial sensitivity of neurons in A1 have found broad spatial tuning. We tested the hypothesis that spatial tuning sharpens when an animal engages in an auditory task. Cats performed a task that required evaluation of the locations of sounds and one that required active listening, but in which sound location was irrelevant. Some 26-44% of the units recorded in A1 showed substantially sharpened spatial tuning during the behavioral tasks as compared with idle conditions, with the greatest sharpening occurring during the location-relevant task. Spatial sharpening occurred on a scale of tens of seconds and could be replicated multiple times in ∼1.5-h test sessions. Sharpening resulted primarily from increased suppression of responses to sounds at the least-preferred locations. That finding, together with an observed increase in latencies, suggests an important role for inhibitory mechanisms.
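
One common way to quantify such sharpening is the width of the spatial tuning curve at half of its peak response. A toy comparison using Gaussian tuning curves with invented widths, not the recorded data:

    import numpy as np

    azimuths = np.arange(-180, 180, 20)  # sound location (degrees)

    def tuning(az, best, sigma):
        return np.exp(-0.5 * ((az - best) / sigma) ** 2)

    def halfmax_width(rates, step=20):
        """Span of azimuths where the response exceeds half of the peak."""
        return (rates > rates.max() / 2).sum() * step

    idle = tuning(azimuths, 40, 80)   # broad tuning while the animal is idle
    task = tuning(azimuths, 40, 40)   # off-peak responses suppressed during the task
    print(halfmax_width(idle), halfmax_width(task))  # 180 vs 100 degrees here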


SAS5: Nature. 2012 May 10;485(7397):233-6.

Selective cortical representation of attended speaker in multi-talker speech perception.

Mesgarani N, Chang EF.

Humans possess a remarkable ability to attend to a single speaker's voice in a multi-talker background. How the auditory system manages to extract intelligible speech under such acoustically complex and adverse listening conditions is not known, and, indeed, it is not clear how attended speech is internally represented. Here, using multi-electrode surface recordings from the cortex of subjects engaged in a listening task with two simultaneous speakers, we demonstrate that population responses in non-primary human auditory cortex encode critical features of attended speech: speech spectrograms reconstructed based on cortical responses to the mixture of speakers reveal the salient spectral and temporal features of the attended speaker, as if subjects were listening to that speaker alone. A simple classifier trained solely on examples of single speakers can decode both attended words and speaker identity. We find that task performance is well predicted by a rapid increase in attention-modulated neural selectivity across both single-electrode and population-level cortical responses. These findings demonstrate that the cortical representation of speech does not merely reflect the external acoustic environment, but instead gives rise to the perceptual aspects relevant for the listener's intended goal.
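
The decoding step can be caricatured as nearest-template classification: build templates from responses to each speaker alone, then ask which template a mixture response resembles. A toy sketch with random vectors standing in for cortical responses (not the authors' classifier):

    import numpy as np

    rng = np.random.default_rng(0)
    n_features = 50
    template_a = rng.standard_normal(n_features)  # mean response to speaker A alone
    template_b = rng.standard_normal(n_features)  # mean response to speaker B alone

    # a mixture trial whose response is dominated by the attended speaker (A)
    trial = 0.8 * template_a + 0.3 * template_b \
            + 0.3 * rng.standard_normal(n_features)

    corr_a = np.corrcoef(trial, template_a)[0, 1]
    corr_b = np.corrcoef(trial, template_b)[0, 1]
    print("decoded attended speaker:", "A" if corr_a > corr_b else "B")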