[P2 evaluation] Articles

Choose two articles from the list, from two different speakers. Don't forget to tell me which one is for the oral exam and which one is for the written exam. Since Barbara Tillmann will not be able to attend the oral, it would be preferable to choose her articles for the written exam.


AdC1: Nature. 2008 Jan 10;451(7175):197-201.

Ultra-fine frequency tuning revealed in single neurons of human auditory cortex.

Bitterman Y, Mukamel R, Malach R, Fried I, Nelken I.

Just-noticeable differences of physical parameters are often limited by the resolution of the peripheral sensory apparatus. Thus, two-point discrimination in vision is limited by the size of individual photoreceptors. Frequency selectivity is a basic property of neurons in the mammalian auditory pathway. However, just-noticeable differences of frequency are substantially smaller than the bandwidth of the peripheral sensors. Here we report that frequency tuning in single neurons recorded from human auditory cortex in response to random-chord stimuli is far narrower than that typically described in any other mammalian species (besides bats), and substantially exceeds that attributed to the human auditory periphery. Interestingly, simple spectral filter models failed to predict the neuronal responses to natural stimuli, including speech and music. Thus, natural sounds engage additional processing mechanisms beyond the exquisite frequency tuning probed by the random-chord stimuli.


AdC2: J Acoust Soc Am. 2001 Sep;110(3 Pt 1):1498-504.

Melody recognition using three types of dichotic-pitch stimulus.

Akeroyd MA, Moore BC, Moore GA.

The recognition of 10 different 16-note melodies, constructed using either dichotic-pitch stimuli or diotic pure-tone stimuli, was measured. The dichotic pitches were created by placing a frequency-dependent transition in the interaural phase of a noise burst. Three different configurations for the transition were used in order to give Huggins pitch, binaural-edge pitch, and binaural-coherence-edge pitch. Forty-nine inexperienced listeners participated. The melodies evoked by the dichotic stimuli were consistently identified well in the first block of trials, indicating that the sensation of dichotic pitch was relatively immediate and did not require prolonged listening experience. There were only small improvements across blocks of trials. The mean scores were 97% (pure tones), 93% (Huggins pitch), 89% (binaural-edge pitch), and 77% (binaural-coherence-edge pitch). All pairwise differences were statistically significant, indicating that Huggins pitch was the most salient of the dichotic pitches and binaural-coherence-edge pitch was weakest. To account for these differences in salience, a simulation of lateral inhibition was applied to the recovered spectrum generated by the modified equalization cancellation model [J. F. Culling, A. Q. Summerfield, and D. H. Marshall, J. Acoust. Soc. Am. 103, 3509-3526 (1998)]. The height of the peak in the resulting "edge-enhanced" recovered spectrum reflected the relative strength of the different dichotic pitches.
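
A minimal Python sketch (my own illustration, not the authors' code) of how a Huggins-pitch stimulus of the kind used here can be synthesized, assuming numpy and headphone presentation; the sampling rate, pitch, and 16% transition bandwidth are illustrative choices:

    import numpy as np

    fs = 44100                 # sampling rate (Hz)
    dur = 0.5                  # duration of the noise burst (s)
    f0 = 600.0                 # nominal pitch of the dichotic stimulus (Hz)
    bw = 0.16 * f0             # width of the interaural phase transition

    n = int(fs * dur)
    noise = np.random.randn(n)

    # Left ear: the noise as-is. Right ear: the same noise, with an interaural
    # phase that ramps from 0 to 2*pi across a narrow band centred on f0
    # (the Huggins configuration; binaural-edge pitch uses a phase step instead).
    spec = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    phase = np.zeros_like(freqs)
    lo, hi = f0 - bw / 2, f0 + bw / 2
    in_band = (freqs >= lo) & (freqs <= hi)
    phase[in_band] = 2 * np.pi * (freqs[in_band] - lo) / bw
    phase[freqs > hi] = 2 * np.pi          # a full cycle, i.e. back in phase

    left = noise
    right = np.fft.irfft(spec * np.exp(1j * phase), n)
    stereo = np.stack([left, right], axis=1)   # pitch heard near f0 over headphones

Each ear alone receives only noise; the pitch exists only in the interaural comparison, which is what makes such stimuli useful for probing binaural mechanisms.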


AdC3: Hear Res. 2007 Nov;233(1-2):108-16. Epub 2007 Sep 5.

Temporal integration in absolute identification of musical pitch.

Hsieh IH, Saberi K.

The effect of stimulus duration on absolute identification of musical pitch was measured in a single-interval 12-alternative forced-choice task. Stimuli consisted of pure tones selected randomly on each trial from a set of 60 logarithmically spaced musical note frequencies from 65.4 to 1975.5 Hz (C2-B6). Stimulus durations were 5, 10, 25, 50, 100, and 1000 ms. Six absolute-pitch musicians identified the pitch of pure tones without feedback, reference sounds, or practice trials. Results showed that a 5 ms stimulus is sufficient for producing statistically significant above-chance performance. Performance monotonically increased up to the longest duration tested (1000 ms). Higher octave stimuli produced better performance, though the rate of improvement declined with increasing octave number. Normalization by the number of waveform cycles showed that 4 cycles are sufficient for absolute-pitch identification. Restricting stimuli to a fixed-cycle waveform instead of a fixed duration still produced monotonic improvements in performance as a function of stimulus octave, demonstrating that better performance at higher frequencies does not exclusively result from a larger number of waveform cycles. Several trends in the data were well predicted by an autocorrelation model of pitch extraction, though the model outperformed observed performance at short durations, suggesting an inability to make optimal use of available periodicity information in very brief tones.
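
The autocorrelation idea can be sketched in a few lines. This toy version (assuming numpy; `acf_pitch`, the frequency limits, and the unbiased normalization are my own illustrative choices, not the paper's model) estimates pitch from the lag of the autocorrelation peak and shows that a handful of waveform cycles already suffices:

    import numpy as np

    def acf_pitch(x, fs, fmin=60.0, fmax=2000.0):
        """Estimate pitch as the lag of the main autocorrelation peak."""
        x = np.asarray(x, float) - np.mean(x)
        full = np.correlate(x, x, mode='full')[len(x) - 1:]
        acf = full / np.arange(len(x), 0, -1)   # unbiased: divide by overlap
        lag_min = int(fs / fmax)
        lag_max = min(int(fs / fmin), int(0.9 * len(x)))
        lag = lag_min + np.argmax(acf[lag_min:lag_max])
        return fs / lag

    fs = 44100
    f = 261.6                                   # C4
    for n_cycles in (2, 4, 8):
        t = np.arange(int(fs * n_cycles / f)) / fs   # fixed number of cycles
        tone = np.sin(2 * np.pi * f * t)
        print(n_cycles, 'cycles ->', round(acf_pitch(tone, fs), 1), 'Hz')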


AdC4: J Acoust Soc Am. 2006 Dec;120(6):3907-15.

Individual differences in the sensitivity to pitch direction.

Semal C, Demany L.

It is commonly assumed that one can always assign a direction (upward or downward) to a percept of pitch change. The present study shows that this is true for some, but not all, listeners. Frequency difference limens (FDLs, in cents) for pure tones roved in frequency were measured in two conditions. In one condition, the task was to detect frequency changes; in the other condition, the task was to identify the direction of frequency changes. For three listeners, the identification FDL was about 1.5 times smaller than the detection FDL, as predicted (counterintuitively) by signal detection theory under the assumption that performance in the two conditions was limited by one and the same internal noise. For three other listeners, however, the identification FDL was much larger than the detection FDL. The latter listeners had relatively high detection FDLs. They had no difficulty in identifying the direction of just-detectable changes in intensity, or in the frequency of amplitude modulation. Their difficulty in perceiving the direction of small frequency/pitch changes showed up not only when the task required absolute judgments of direction, but also when the directions of two successive frequency changes had to be judged as identical or different.
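
The counterintuitive signal-detection-theory prediction can be checked with a toy Monte Carlo simulation (my own construction, not the authors' analysis), assuming numpy and that both tasks are limited by the same Gaussian internal noise on each tone's pitch trace; the simplified decision rules below are assumptions, and the exact predicted ratio depends on such details:

    import numpy as np

    rng = np.random.default_rng(0)
    sigma, n_trials = 1.0, 200_000
    deltas = np.linspace(0.1, 4.0, 40)     # frequency change, in units of sigma

    def pc_identification(d):
        # One tone pair; report direction from the sign of the noisy difference.
        return np.mean(d + rng.normal(0, sigma * np.sqrt(2), n_trials) > 0)

    def pc_detection(d):
        # Two pairs, one containing the change; pick the pair whose absolute
        # pitch difference is larger (same internal noise as above).
        changed = np.abs(d + rng.normal(0, sigma * np.sqrt(2), n_trials))
        unchanged = np.abs(rng.normal(0, sigma * np.sqrt(2), n_trials))
        return np.mean(changed > unchanged)

    def threshold(pc_fn, target=0.75):
        return np.interp(target, [pc_fn(d) for d in deltas], deltas)

    t_id, t_det = threshold(pc_identification), threshold(pc_detection)
    print('identification FDL:', round(t_id, 2))
    print('detection FDL     :', round(t_det, 2))
    print('ratio             :', round(t_det / t_id, 2))

Under this single-internal-noise assumption, the detection threshold comes out roughly 1.5 times the identification threshold, the pattern shown by the first group of listeners.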


AdC5: J Acoust Soc Am. 2006 Aug;120(2):957-65.

Effect of noise on the detectability and fundamental frequency discrimination of complex tones.

Gockel H, Moore BC, Plack CJ, Carlyon RP.

Percent correct performance for discrimination of the fundamental frequency (F0) of a complex tone was measured as a function of the level of a background pink noise (using fixed values of the difference in F0, deltaF0) and compared with percent correct performance for detection of the complex tone in noise, again as a function of noise level. The tone included some low, resolvable components, but not the fundamental component. The results were used to test the hypothesis that the worsening in F0 discrimination with increasing noise level was caused by the reduced detectability of the tone rather than by reduced precision of the internal representation of F0. For small values of deltaF0, the hypothesis was rejected because measured performance fell below that predicted by the hypothesis. However, this was true only for high noise levels, within 2-4.5 dB of the level required for masked threshold. The results indicate that the mechanism for extracting the F0 of a complex tone with resolved harmonics is remarkably robust. They also indicate that adding a background noise to a complex tone containing resolved harmonics is not a good means for equating its pitch salience with that of a complex tone containing only unresolved harmonics.


AdC6: PLoS Biol. 2008 Jan;6(1):e16.

Sparse representation of sounds in the unanesthetized auditory cortex.

Hromádka T, Deweese MR, Zador AM.

How do neuronal populations in the auditory cortex represent acoustic stimuli? Although sound-evoked neural responses in the anesthetized auditory cortex are mainly transient, recent experiments in the unanesthetized preparation have emphasized subpopulations with other response properties. To quantify the relative contributions of these different subpopulations in the awake preparation, we have estimated the representation of sounds across the neuronal population using a representative ensemble of stimuli. We used cell-attached recording with a glass electrode, a method for which single-unit isolation does not depend on neuronal activity, to quantify the fraction of neurons engaged by acoustic stimuli (tones, frequency modulated sweeps, white-noise bursts, and natural stimuli) in the primary auditory cortex of awake head-fixed rats. We find that the population response is sparse, with stimuli typically eliciting high firing rates (>20 spikes/second) in less than 5% of neurons at any instant. Some neurons had very low spontaneous firing rates (<0.01 spikes/second). At the other extreme, some neurons had driven rates in excess of 50 spikes/second. Interestingly, the overall population response was well described by a lognormal distribution, rather than the exponential distribution that is often reported. Our results represent, to our knowledge, the first quantitative evidence for sparse representations of sounds in the unanesthetized auditory cortex. Our results are compatible with a model in which most neurons are silent much of the time, and in which representations are composed of small dynamic subsets of highly active neurons.
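
The contrast between lognormal and exponential population statistics is easy to make concrete. A sketch (assuming numpy/scipy; the distribution parameters are purely illustrative, since the paper reports the shape of the distribution rather than these numbers):

    import numpy as np
    from scipy import stats

    mu, sigma = np.log(0.5), 2.0               # lognormal rates, median 0.5 spk/s
    lognorm = stats.lognorm(s=sigma, scale=np.exp(mu))
    expon = stats.expon(scale=lognorm.mean())  # exponential with the same mean

    for name, d in [('lognormal', lognorm), ('exponential', expon)]:
        print(f"{name:12s} P(rate > 20 spk/s) = {d.sf(20):.3f}   "
              f"P(rate < 0.01 spk/s) = {d.cdf(0.01):.3f}")

With these values the lognormal puts a few percent of neurons above 20 spikes/second and a similar fraction below 0.01 spikes/second, capturing the coexistence of mostly silent neurons with a small, highly active subset that an exponential with the same mean understates.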


AdC7: Curr Opin Neurobiol. 2009 Aug;19(4):430-3.

Representations in auditory cortex.

Hromádka T, Zador AM.

How does auditory cortex represent auditory stimuli, and how do these representations contribute to behavior? Recent experimental evidence suggests that activity in auditory cortex consists of sparse and highly synchronized volleys of activity, observed both in anesthetized and awake animals. Many neurons are capable of remarkably precise activity with very low jitter or spike count variability. Most importantly, animals are capable of exploiting such precise neuronal activity in making sensory decisions. Whether the ability of auditory cortex to exploit fine temporal differences in cortical activity is unique to the auditory modality, or represents a general strategy used by cortical circuits, remains an open question.


AdC8: J Acoust Soc Am. 2009 Jun;125(6):3865-70.

Interaural correlation and the binaural summation of loudness.

Edmonds BA, Culling JF.

The effect of interaural correlation (rho) on the loudness of noisebands was measured using a loudness-matching task in naive listeners. The task involved a sequence of loudness comparisons for which the intensity of one stimulus in a given comparison was varied using a one-up-one-down adaptive rule. The task provided an estimate of the level difference (in decibels) for which two stimulus conditions have equal loudness, giving measures of loudness difference in equivalent decibel units (dB(equiv)). Concurrent adaptive tracks measured loudness differences between rho=1, 0, and -1 and between these binaural stimuli and the monaural case for various noisebands. For all noisebands, monaural stimuli required approximately 6 dB higher levels than rho=1 for equal loudness. For most noisebands, rho=1 and rho=-1 were almost equal in loudness, with rho=-1 being slightly louder in the majority of measurements, while rho=0 was about 2 dB(equiv) louder than rho=1 or rho=-1. However, noisebands with significant high-frequency energy showed smaller differences: for 3745-4245 Hz, rho=0 was only about 0.85 dB(equiv) louder than rho=+/-1, and for 100-5000 Hz it was non-significantly louder (perhaps 0.7 dB(equiv)).
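
A sketch of the one-up-one-down adaptive rule driving a single loudness-matching track (assuming numpy; the simulated listener, step sizes, and reversal counts are illustrative, and the actual study interleaved several such tracks):

    import numpy as np

    rng = np.random.default_rng(1)
    pse_db = 2.0          # simulated "true" loudness difference to be estimated
    sigma = 1.0           # trial-to-trial internal noise (dB)
    level, step = 8.0, 2.0
    reversals, prev_dir = [], 0

    while len(reversals) < 12:
        # The listener calls the adjustable stimulus louder when its level
        # difference exceeds the point of subjective equality, plus noise.
        louder = level + rng.normal(0, sigma) > pse_db
        direction = -1 if louder else +1          # one-up-one-down rule
        if prev_dir and direction != prev_dir:
            reversals.append(level)
            if len(reversals) == 4:
                step = 0.5                        # finer steps late in the track
        level += direction * step
        prev_dir = direction

    print('PSE estimate:', round(np.mean(reversals[4:]), 2), 'dB (true: 2.0)')

The rule converges on the level at which "louder" responses occur 50% of the time, i.e. the equal-loudness point expressed in the paper's dB(equiv) units.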


AdC9: Hear Res. 2008 Apr;238(1-2):49-57.

Variations on a Dexterous theme: peripheral time-intensity trading.

Joris PX, Michelet P, Franken TP, McLaughlin M.

Sound pressure level changes can affect the timing of spiketrains. Timing of spiketrains is critical for sensitivity to interaural timing differences (ITDs). Interaural level differences (ILDs) can therefore affect the ITD cue. It has been hypothesized that ILDs may be coded indirectly through a peripheral conversion of level to time (Jeffress, L.A., 1948. A place theory of sound localization. J. Comp. Physiol. Psychol. 41, 35-39), although it should be cautioned that the changes in phase with SPL in low-CF AN fibers of the cat are more complicated. We tested this conversion by recording responses of auditory nerve fibers to broadband noise at different SPLs. For each fiber, correlograms were constructed to compare timing to fine-structure across SPLs. We find generally a decrease in the time delay between spikes and the stimulus with increasing SPL. However, the magnitudes of the shift in time are surprisingly small, and dependent on characteristic frequency (CF): the largest shifts are approximately 10 μs/dB and occur at the lowest CFs. Nevertheless, the effects of level on spike timing are systematic and of a magnitude to which the binaural system is sensitive. Thus, even though the results indicate that ILD is not traded for ITD in a simple way, the possibility that low-frequency ILDs affect the binaural percept via a peripheral level-to-time conversion cannot be excluded.


BT1: Proc Natl Acad Sci U S A. 2005 Aug 30;102(35):12639-43.

Tuning in to musical rhythms: infants learn more readily than adults.

Hannon EE, Trehub SE.

Domain-general tuning processes may guide the acquisition of perceptual knowledge in infancy. Here, we demonstrate that 12-month-old infants show an adult-like, culture-specific pattern of responding to musical rhythms, in contrast to the culture-general responding that is evident at 6 months of age. Nevertheless, brief exposure to foreign music enables 12-month-olds, but not adults, to perceive rhythmic distinctions in foreign musical contexts. These findings may indicate a sensitive period early in life for acquiring rhythm in particular or socially and biologically important structures more generally.


BT2: J Neurosci. 2009 Aug 19;29(33):10215-20.

Tone deafness: a new disconnection syndrome?

Loui P, Alsop D, Schlaug G.

Communicating with one's environment requires efficient neural interaction between action and perception. Neural substrates of sound perception and production are connected by the arcuate fasciculus (AF). Although AF is known to be involved in language, its roles in non-linguistic functions are unexplored. Here, we show that tone-deaf people, with impaired sound perception and production, have reduced AF connectivity. Diffusion tensor tractography and psychophysics were assessed in tone-deaf individuals and matched controls. Abnormally reduced AF connectivity was observed in the tone deaf. Furthermore, we observed relationships between AF and auditory-motor behavior: superior and inferior AF branches predict psychophysically assessed pitch discrimination and sound production perception abilities, respectively. This neural abnormality suggests that tone deafness leads to a reduction in connectivity resulting in pitch-related impairments. Results support a dual-stream anatomy of sound production and perception implicated in vocal communications. By identifying white matter differences and their psychophysical correlates, results contribute to our understanding of how neural connectivity subserves behavior.


BT3: Cognition. 2008 Feb;106(2):975-83.

Songs as an aid for language acquisition.

Schön D, Boyer M, Moreno S, Besson M, Peretz I, Kolinsky R.

In previous research, Saffran and colleagues [Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926-1928; Saffran, J. R., Newport, E. L., & Aslin, R. N. (1996). Word segmentation: The role of distributional cues. Journal of Memory and Language, 35, 606-621.] have shown that adults and infants can use the statistical properties of syllable sequences to extract words from continuous speech. They also showed that a similar learning mechanism operates with musical stimuli [Saffran, J. R., Johnson, E. K., Aslin, R. N., & Newport, E. L. (1999). Statistical learning of tone sequences by human infants and adults. Cognition, 70, 27-52.]. In this work we combined linguistic and musical information and we compared language learning based on speech sequences to language learning based on sung sequences. We hypothesized that, compared to speech sequences, a consistent mapping of linguistic and musical information would enhance learning. Results confirmed the hypothesis, showing a strong learning facilitation of song compared to speech. Most importantly, the present results show that learning a new language, especially in the first learning phase wherein one needs to segment new words, may largely benefit from the motivational and structuring properties of music in song.


BT4: Psychon Bull Rev. 2009 Apr;16(2):374-81.

Making psycholinguistics musical: self-paced reading time evidence for shared processing of linguistic and musical syntax.

Slevc LR, Rosenberg JC, Patel AD.

Linguistic processing, especially syntactic processing, is often considered a hallmark of human cognition; thus, the domain specificity or domain generality of syntactic processing has attracted considerable debate. The present experiments address this issue by simultaneously manipulating syntactic processing demands in language and music. Participants performed self-paced reading of garden path sentences, in which structurally unexpected words cause temporary syntactic processing difficulty. A musical chord accompanied each sentence segment, with the resulting sequence forming a coherent chord progression. When structurally unexpected words were paired with harmonically unexpected chords, participants showed substantially enhanced garden path effects. No such interaction was observed when the critical words violated semantic expectancy or when the critical chords violated timbral expectancy. These results support a prediction of the shared syntactic integration resource hypothesis (Patel, 2003), which suggests that music and language draw on a common pool of limited processing resources for integrating incoming elements into syntactic structures. Notations of the stimuli from this study may be downloaded from pbr.psychonomic-journals.org/content/supplemental.


CL1: J Acoust Soc Am. 1994 Apr;95(4):2277-2280.

Effects of spectral smearing on the intelligibility of sentences in the presence of interfering speech.

Baer T, Moore BCJ.

In a previous study [T. Baer and B. C. J. Moore, J. Acoust. Soc. Am. 94, 1229–1241 (1993)], a spectral smearing technique was used to simulate some of the effects of impaired frequency selectivity so as to assess its influence on speech intelligibility. Results showed that spectral smearing to simulate broadening of the auditory filters by a factor of 3 or 6 had little effect on the intelligibility of speech in quiet but had a large effect on the intelligibility of speech in noise. The present study examines the effect of spectral smearing on the intelligibility of speech in the presence of a single interfering talker. The results were generally consistent with those of the previous study, suggesting that impaired frequency selectivity contributes significantly to the problems experienced by people with cochlear hearing loss when they listen to speech in the presence of interfering sounds.
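
The spirit of the spectral-smearing manipulation can be conveyed with a single-frame toy (assuming numpy; Baer and Moore smear short-term spectra through ERB-scaled auditory-filter matrices, whereas this sketch uses a Gaussian kernel of `broaden` times the normal ERB, and `smear_frame` is my own illustrative function):

    import numpy as np

    def smear_frame(frame, fs, broaden=3.0):
        """Smear one windowed frame's power spectrum, keeping its phase."""
        spec = np.fft.rfft(frame * np.hanning(len(frame)))
        power, phase = np.abs(spec) ** 2, np.angle(spec)
        freqs = np.fft.rfftfreq(len(frame), 1.0 / fs)
        smeared = np.zeros_like(power)
        for i, fc in enumerate(freqs):
            erb = 24.7 * (4.37 * fc / 1000 + 1)   # normal auditory filter width
            kernel = np.exp(-0.5 * ((freqs - fc) / (broaden * erb / 2)) ** 2)
            smeared[i] = kernel @ power / kernel.sum()
        return np.fft.irfft(np.sqrt(smeared) * np.exp(1j * phase), len(frame))

Applied frame by frame with overlap-add, this blurs spectral contrasts (formant peaks and valleys) much as broadened auditory filters would, while leaving the temporal envelope largely intact.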


CL2: Ear Hear. 2004 Jun;25(3):242-50.

Temporal fine-structure cues to speech and pure tone modulation in observers with sensorineural hearing loss.

Buss E, Hall JW 3rd, Grose JH.

OBJECTIVE: The purpose of this study was to examine the effect of sensorineural hearing loss on the ability to make use of fine temporal information and to evaluate the relation between this ability and the ability to recognize speech. DESIGN: Fourteen observers with normal hearing and 12 observers with sensorineural hearing loss were tested on open-set word recognition and on psychophysical tasks thought to reflect use of fine-structure cues: the detection of 2 Hz frequency modulation (FM) and the discrimination of the rate of amplitude modulation (AM) and quasifrequency modulation (QFM). RESULTS: The results showed relatively poor performance for observers with sensorineural hearing loss on both the speech recognition and psychoacoustical tasks. Of particular interest was the finding of significant correlations within the hearing-loss group between speech recognition performance and the psychoacoustical tasks based on frequency modulation, which are thought to reflect the quality of the coding of temporal fine structure. CONCLUSIONS: These results suggest that sensorineural hearing loss may be associated with a reduced ability to use fine temporal information that is coded by neural phase-locking to stimulus fine-structure and that this may contribute to poor speech recognition performance and to poor performance on psychoacoustical tasks that depend on temporal fine structure.


CL3: J Acoust Soc Am. 2005 Oct;118(4):2519-26.

Consequences of cochlear damage for the detection of interaural phase differences.

Lacher-Fougere S, Demany L.

Thresholds for detecting interaural phase differences (IPDs) in sinusoidally amplitude-modulated pure tones were measured in seven normal-hearing listeners and nine listeners with bilaterally symmetric hearing losses of cochlear origin. The IPDs were imposed either on the carrier signal alone (not the amplitude modulation) or vice versa. The carrier frequency was 250, 500, or 1000 Hz, the modulation frequency 20 or 50 Hz, and the sound pressure level was fixed at 75 dB. A three-interval two-alternative forced choice paradigm was used. For each type of IPD (carrier or modulation), thresholds were on average higher for the hearing-impaired than for the normal listeners. However, the impaired listeners' detection deficit was markedly larger for carrier IPDs than for modulation IPDs. This was not predictable from the effect of hearing loss on the sensation level of the stimuli since, for normal listeners, large reductions of sensation level appeared to be more deleterious to the detection of modulation IPDs than to the detection of carrier IPDs. The results support the idea that one consequence of cochlear damage is a deterioration in the perceptual sensitivity to the temporal fine structure of sounds.


CL4: J Acoust Soc Am. 1989 Dec;86(6):2103-6.

Apparent auditory deprivation effects of late onset: the role of presentation level.

Gatehouse S.

Silman and colleagues [J. Acoust. Soc. Am. 76, 1347-1362 (1984)] have reported an apparent effect of late auditory deprivation; this presents as loss of discrimination over time in the unaided ear of individuals using a single hearing aid fitted in middle age. In a replication of the basic effect, the influence of presentation level was examined in 24 monaurally aided subjects. The effect was reversed at presentation levels below about 75 dB SPL. The ear that is normally aided performs better at high presentation levels, while, at lower presentation levels, the converse is true. Thus it appears that a form of selective adjustment takes place in a particular part of the dynamic range, at least in ears with a dynamic range limited by a sensory hearing loss. If this interpretation is correct, there are important implications for research on perceptual learning and for the time course of evaluation in hearing aid provision.


CL5: J Acoust Soc Am. 1994 Jan;95(1):518-29.

Masking of speech by amplitude-modulated noise.

Gustafsson HA, Arlinger SD.

The masking of speech by amplitude-modulated and unmodulated speech-spectrum noise has been evaluated by the measurement of monaural speech recognition in such noise in young and elderly subjects with normal hearing and in elderly hearing-impaired subjects with and without a hearing aid. Sinusoidal modulation with frequencies covering the range 2-100 Hz, as well as an irregular modulation generated by the sum of four sinusoids in random phase relation, was used. Modulation degrees were 100%, +/- 6 dB, and +/- 12 dB. Root-mean-square sound pressure level was equal for modulated and unmodulated maskers. For the normal-hearing subjects, essentially all types of modulated noise provided some release of speech masking as compared to unmodulated noise. Sinusoidal modulation provided more release of masking than the irregular modulation. The release of masking increased with modulation depth. It is proposed that the number and duration of low-level intervals are essential factors for the degree of masking. The release of masking was found to reach a maximum at a modulation frequency between 10 and 20 Hz for sinusoidal modulation. For elderly hearing-impaired subjects, the release of masking obtained from amplitude modulation was consistently smaller than in the normal-hearing groups, presumably related to changes in auditory temporal resolution caused by the hearing loss. The average speech-to-noise ratio required for 30% correct speech recognition varied greatly between the groups: for young normal-hearing subjects it was -15 dB, for elderly normal-hearing subjects it was -9 dB, and for elderly hearing-impaired subjects it was +2 dB in the unaided listening condition and +3 dB in the aided condition. The results support the conclusion that, within the methodological context of the study, age as well as sensorineural hearing loss, as such, influence speech recognition in noise more than can be explained by the loss of audibility, according to the audiogram and the masking noise spectrum.


CL6: Science. 1995 Oct 13;270(5234):303-4.

Speech recognition with primarily temporal cues.

Shannon RV, Zeng FG, Kamath V, Wygonski J, Ekelid M.

Nearly perfect speech recognition was observed under conditions of greatly reduced spectral information. Temporal envelopes of speech were extracted from broad frequency bands and were used to modulate noises of the same bandwidths. This manipulation preserved temporal envelope cues in each band but restricted the listener to severely degraded information on the distribution of spectral energy. The identification of consonants, vowels, and words in simple sentences improved markedly as the number of bands increased; high speech recognition performance was obtained with only three bands of modulated noise. Thus, the presentation of a dynamic temporal pattern in only a few broad spectral regions is sufficient for the recognition of speech.
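
The processing chain described here is essentially a noise-band vocoder, and it is short enough to sketch (assuming numpy/scipy; the band edges and the 16 Hz envelope smoothing below are illustrative choices, not the paper's exact parameters):

    import numpy as np
    from scipy.signal import butter, sosfiltfilt, hilbert

    def noise_vocode(speech, fs, edges=(100, 800, 1500, 4000)):
        """Replace spectral detail with band-limited noise carrying the
        temporal envelope of each analysis band."""
        out = np.zeros(len(speech))
        env_sos = butter(4, 16 / (fs / 2), output='sos')   # envelope smoother
        for lo, hi in zip(edges[:-1], edges[1:]):
            sos = butter(4, [lo / (fs / 2), hi / (fs / 2)],
                         btype='band', output='sos')
            band = sosfiltfilt(sos, speech)
            env = sosfiltfilt(env_sos, np.abs(hilbert(band)))
            carrier = sosfiltfilt(sos, np.random.randn(len(speech)))
            out += np.clip(env, 0, None) * carrier         # modulated noise band
        return out

With three or four such bands, the output preserves the within-band temporal envelopes while discarding fine spectral structure, which is exactly the cue trade-off the study exploits.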


CL7: J Acoust Soc Am. 2001 Jul;110(1):529-42.

Effects of degradation of intensity, time, or frequency content on speech intelligibility for normal-hearing and hearing-impaired listeners.

van Schijndel NH, Houtgast T, Festen JM.

Many hearing-impaired listeners suffer from distorted auditory processing capabilities. This study examines which aspects of auditory coding (i.e., intensity, time, or frequency) are distorted and how this affects speech perception. The distortion-sensitivity model is used: The effect of distorted auditory coding of a speech signal is simulated by an artificial distortion, and the sensitivity of speech intelligibility to this artificial distortion is compared for normal-hearing and hearing-impaired listeners. Stimuli (speech plus noise) are wavelet coded using a complex sinusoidal carrier with a Gaussian envelope (1/4 octave bandwidth). Intensity information is distorted by multiplying the modulus of each wavelet coefficient by a random factor. Temporal and spectral information are distorted by randomly shifting the wavelet positions along the temporal or spectral axis, respectively. Measured were (1) detection thresholds for each type of distortion, and (2) speech-reception thresholds for various degrees of distortion. For spectral distortion, hearing-impaired listeners showed increased detection thresholds and were also less sensitive to the distortion with respect to speech perception. For intensity and temporal distortion, this was not observed. Results indicate that a distorted coding of spectral information may be an important factor underlying reduced speech intelligibility for the hearing impaired.


DP1: J Acoust Soc Am. 2008 Oct;124(4):2251-62.

On the ability to discriminate Gaussian-noise tokens or random tone-burst complexes.

Goossens T, van de Par S, Kohlrausch A.

This study investigated factors that influence a listener's ability to discriminate Gaussian-noise stimuli in a same-different discrimination paradigm. The first experiment showed that discrimination ability increased with bandwidth for noise durations up to 100 ms. Duration had a nonmonotonic influence on performance, with a decrease in discriminability for stimuli longer than 40 ms. Further experiments investigated the cause for this performance decrease. They showed that discriminability could be improved when using frozen-noise tokens and by instructing listeners to focus on the stimulus endings. A final experiment, using a stimulus consisting of 5 ms Hanning-windowed tone-bursts randomly distributed over time, investigated whether stimulus duration and amount of information differently affect the processing capacity of the auditory system. Results showed that the number of degrees of freedom in the stimulus, not its duration, predominantly influenced the ability to discriminate. Overall, the results suggest that the discrimination performance for acoustic stimuli depends strongly on the amount of information per critical band and the capacity to process this information. This capacity seems to be limited in the temporal dimension, while extending the signal over more auditory filters does have a positive effect on performance.


DP2: J Acoust Soc Am. 2009 Feb;125(2):1082-90.

Continuous versus discrete frequency changes: different detection mechanisms?

Demany L, Carlyon RP, Semal C.

Sek and Moore [J. Acoust. Soc. Am. 106, 351-359 (1999)] and Lyzenga et al. [J. Acoust. Soc. Am. 116, 491-501 (2004)] found that the just-noticeable frequency difference between two pure tones relatively close in time is smaller when these tones are smoothly connected by a frequency glide than when they are separated by a silent interval. This "glide effect" was interpreted as evidence that frequency glides can be detected by a specific auditory mechanism, not involved in the detection of discrete, time-delayed frequency changes. Lyzenga et al. argued in addition that the glide-detection mechanism provides little information on the direction of frequency changes near their detection threshold. The first experiment reported here confirms the existence of the glide effect, but also shows that it disappears when the glide is not connected smoothly to the neighboring steady tones. A second experiment demonstrates that the direction of a 750 ms frequency glide can be perceptually identified as soon as the glide is detectable. These results, and some other observations, lead to a new interpretation of the glide effect, and to the conclusion that continuous frequency changes may be detected in the same manner as discrete frequency changes.


DP3: Psychol Sci. 2008 Jan;19(1):85-91.

Auditory change detection: simple sounds are not memorized better than complex sounds.

Demany L, Trost W, Serman M, Semal C.

Previous research has shown that the detectability of a local change in a visual image is essentially independent of the complexity of the image when the interstimulus interval (ISI) is very short, but is limited by a low-capacity memory system when the ISI exceeds 100 ms. In the study reported here, listeners made same/different judgments on pairs of successive "chords" (sums of pure tones with random frequencies). The change to be detected was always a frequency shift in one of the tones, and which tone would change was unpredictable. Performance worsened as the number of tones increased, but this effect was not larger for 2-s ISIs than for 0-ms ISIs. Similar results were obtained when a chord was followed by a single tone that had to be judged as higher or lower than the closest component of the chord. Overall, our data suggest that change detection is based on different mechanisms in audition and vision.


DP4: J Acoust Soc Am. 2008 Sep;124(3):1653-67.

Harmonic segregation through mistuning can improve fundamental frequency discrimination.

Bernstein JG, Oxenham AJ.

This study investigated the relationship between harmonic frequency resolution and fundamental frequency (f(0)) discrimination. Consistent with earlier studies, f(0) discrimination of a diotic bandpass-filtered harmonic complex deteriorated sharply as the f(0) decreased to the point where only harmonics above the tenth were presented. However, when the odd harmonics were mistuned by 3%, performance improved dramatically, such that performance nearly equaled that found with only even harmonics present. Mistuning also improved performance when alternating harmonics were presented to opposite ears (dichotic condition). In a task involving frequency discrimination of individual harmonics within the complexes, mistuning the odd harmonics yielded no significant improvement in the resolution of individual harmonics. Pitch matches to the mistuned complexes suggested that the even harmonics dominated the pitch for f(0)'s at which a benefit of mistuning was observed. The results suggest that f(0) discrimination performance can benefit from perceptual segregation based on inharmonicity, and that poor performance when only high-numbered harmonics are present is not due to limited peripheral harmonic resolvability. Taken together with earlier results, the findings suggest that f(0) discrimination may depend on auditory filter bandwidths, but that spectral resolution of individual harmonics is neither necessary nor sufficient for accurate f(0) discrimination.


DP5: Psychol Sci. 2008 Dec;19(12):1263-71.

Is relative pitch specific to pitch?

McDermott JH, Lehr AJ, Oxenham AJ.

Melodies, speech, and other stimuli that vary in pitch are processed largely in terms of the relative pitch differences between sounds. Relative representations permit recognition of pitch patterns despite variations in overall pitch level between instruments or speakers. A key component of relative pitch is the sequence of pitch increases and decreases from note to note, known as the melodic contour. Here we report that contour representations are also produced by patterns in loudness and brightness (an aspect of timbre), and that contours in one dimension can be readily recognized in other dimensions, implicating similar or common representations. Most surprisingly, contours in loudness and brightness are nearly as useful as pitch contours for recognizing familiar melodies that are normally conveyed via pitch. Our results indicate that relative representations via contour extraction are a general feature of the auditory system, and may have a common central locus.


DP6: PLoS Biol. 2008 May 20;6(5):e126.

Low-level information and high-level perception: the case of speech in noise.

Nahum M, Nelken I, Ahissar M.

Auditory information is processed in a fine-to-crude hierarchical scheme, from low-level acoustic information to high-level abstract representations, such as phonological labels. We now ask whether fine acoustic information, which is not retained at high levels, can still be used to extract speech from noise. Previous theories suggested either full availability of low-level information or availability that is limited by task difficulty. We propose a third alternative, based on the Reverse Hierarchy Theory (RHT), originally derived to describe the relations between the processing hierarchy and visual perception. RHT asserts that only the higher levels of the hierarchy are immediately available for perception. Direct access to low-level information requires specific conditions, and can be achieved only at the cost of concurrent comprehension. We tested the predictions of these three views in a series of experiments in which we measured the benefits from utilizing low-level binaural information for speech perception, and compared it to that predicted from a model of the early auditory system. Only auditory RHT could account for the full pattern of the results, suggesting that similar defaults and tradeoffs underlie the relations between hierarchical processing and perception in the visual and auditory modalities.


DP7: J Exp Psychol Hum Percept Perform. 2008 Aug;34(4):1007-16.

Effects of context on auditory stream segregation.

Snyder JS, Carter OL, Lee SK, Hannon EE, Alain C.

The authors examined the effect of preceding context on auditory stream segregation. Low tones (A), high tones (B), and silences (-) were presented in an ABA- pattern. Participants indicated whether they perceived 1 or 2 streams of tones. The A tone frequency was fixed, and the B tone was the same as the A tone or had 1 of 3 higher frequencies. Perception of 2 streams in the current trial increased with greater frequency separation between the A and B tones (Delta f). Larger Delta f in previous trials modified this pattern, causing less streaming in the current trial. This occurred even when listeners were asked to bias their perception toward hearing 1 stream or 2 streams. The effect of previous Delta f was not due to response bias because simply perceiving 2 streams in the previous trial did not cause less streaming in the current trial. Finally, the effect of previous Delta f was diminished, though still present, when the silent duration between trials was increased to 5.76 s. The time course of this context effect on streaming implicates the involvement of auditory sensory memory or neural adaptation.


DP8: Nat Neurosci. 2009 Jun;12(6):692-7.

On hearing with more than one ear: lessons from evolution.

Schnupp JW, Carr CE.

Although ears capable of detecting airborne sound have arisen repeatedly and independently in different species, most animals that are capable of hearing have a pair of ears. We review the advantages that arise from having two ears and discuss recent research on the similarities and differences in the binaural processing strategies adopted by birds and mammals. We also ask how these different adaptations for binaural and spatial hearing might inform and inspire the development of techniques for future auditory prosthetic devices.


DP9: Brain Res. 2008 Jul 18;1220:224-33.

Sparse gammatone signal model optimized for English speech does not match the human auditory filters.

Strahl S, Mertins A.

Evidence that neurosensory systems use sparse signal representations, as well as improved performance of signal processing algorithms using sparse signal models, has raised interest in sparse signal coding in recent years. For natural audio signals like speech and environmental sounds, gammatone atoms have been derived as expansion functions that generate a nearly optimal sparse signal model (Smith, E., Lewicki, M., 2006. Efficient auditory coding. Nature 439, 978-982). Furthermore, gammatone functions are established models for the human auditory filters. Thus far, a practical application of a sparse gammatone signal model has been prevented by the fact that deriving the sparsest representation is, in general, computationally intractable. In this paper, we applied an accelerated version of the matching pursuit algorithm for gammatone dictionaries, allowing real-time and large data set applications. We show that a sparse signal model in general has advantages in audio coding and that a sparse gammatone signal model encodes speech more efficiently, in terms of sparseness, than a sparse modified discrete cosine transform (MDCT) signal model. We also show that the optimal gammatone parameters derived for English speech do not match the human auditory filters, suggesting that signal processing applications should derive the parameters individually for each signal class instead of using psychometrically derived parameters. For brain research, it means that care should be taken when directly transferring findings of optimality from technical to biological systems.
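
A plain (non-accelerated) matching pursuit over a gammatone dictionary can be sketched as follows, assuming numpy; the atom duration, dictionary size, and iteration count are illustrative, and the paper's contribution is precisely a faster variant of this greedy loop:

    import numpy as np

    def gammatone(fc, fs, dur=0.02, order=4, b=1.019):
        """Unit-norm gammatone atom at centre frequency fc."""
        t = np.arange(int(dur * fs)) / fs
        erb = 24.7 * (4.37 * fc / 1000 + 1)
        g = (t ** (order - 1) * np.exp(-2 * np.pi * b * erb * t)
             * np.cos(2 * np.pi * fc * t))
        return g / np.linalg.norm(g)

    def matching_pursuit(x, atoms, n_iter=100):
        """Greedily decompose x onto all time shifts of the atoms."""
        residual = np.asarray(x, float).copy()
        code = []                              # (atom index, shift, gain)
        for _ in range(n_iter):
            gain, k_best, s_best = 0.0, 0, 0
            for k, g in enumerate(atoms):
                corr = np.correlate(residual, g, mode='valid')
                s = int(np.argmax(np.abs(corr)))
                if abs(corr[s]) > abs(gain):
                    gain, k_best, s_best = corr[s], k, s
            residual[s_best:s_best + len(atoms[k_best])] -= gain * atoms[k_best]
            code.append((k_best, s_best, gain))
        return code, residual

    fs = 16000
    dictionary = [gammatone(fc, fs) for fc in np.geomspace(100, 6000, 32)]
    # code, res = matching_pursuit(signal, dictionary)  # signal: 1-D array

Sparseness can then be read off as the number of atoms needed to reach a given residual energy, the quantity compared against the MDCT model.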


DP10: J Neurosci. 2009 Nov 4;29(44):13797-808.

Dynamic range adaptation to sound level statistics in the auditory nerve.

Wen B, Wang GI, Dean I, Delgutte B.

The auditory system operates over a vast range of sound pressure levels (100-120 dB) with nearly constant discrimination ability across most of the range, well exceeding the dynamic range of most auditory neurons (20-40 dB). Dean et al. (2005) have reported that the dynamic range of midbrain auditory neurons adapts to the distribution of sound levels in a continuous, dynamic stimulus by shifting toward the most frequently occurring level. Here, we show that dynamic range adaptation, distinct from classic firing rate adaptation, also occurs in primary auditory neurons in anesthetized cats for tone and noise stimuli. Specifically, the range of sound levels over which the firing rates of auditory nerve (AN) fibers grow rapidly with level shifts nearly linearly with the most probable levels in a dynamic sound stimulus. This dynamic range adaptation was observed for fibers with all characteristic frequencies and spontaneous discharge rates. As in the midbrain, dynamic range adaptation improved the precision of level coding by the AN fiber population for the prevailing sound levels in the stimulus. However, dynamic range adaptation in the AN was weaker than in the midbrain and not sufficient (0.25 dB/dB, on average, for broadband noise) to prevent a significant degradation of the precision of level coding by the AN population above 60 dB SPL. These findings suggest that adaptive processing of sound levels first occurs in the auditory periphery and is enhanced along the auditory pathway.
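
The reported adaptation can be caricatured as a sigmoidal rate-level function whose steep region tracks the prevailing level with a gain well below 1 dB/dB. A sketch (assuming numpy; the sigmoid shape and baseline are illustrative, with only the ~0.25 dB/dB gain taken from the abstract):

    import numpy as np

    def rate_level(level_db, midpoint_db, slope=0.3, max_rate=200.0):
        """Illustrative sigmoid rate-level function of an AN fiber."""
        return max_rate / (1 + np.exp(-slope * (level_db - midpoint_db)))

    def adapted_midpoint(baseline_db, prevailing_db, gain=0.25):
        """Dynamic range adaptation: the steep region shifts toward the most
        probable stimulus level at ~0.25 dB/dB (broadband-noise estimate)."""
        return baseline_db + gain * (prevailing_db - baseline_db)

    for prevailing in (40, 60, 80):
        mid = adapted_midpoint(30.0, prevailing)
        print(f"prevailing level {prevailing} dB SPL -> steep region near "
              f"{mid:.0f} dB SPL ({rate_level(prevailing, mid):.0f} spk/s there)")

Because the gain is only ~0.25, an 80 dB SPL environment shifts the fiber's dynamic range by roughly 12 dB rather than the 50 dB needed to re-center it, which is why level coding still degrades at high levels.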


DP11: J Acoust Soc Am. 2009 Nov;126(5):2390-412.

A phenomenological model of the synapse between the inner hair cell and auditory nerve: long-term adaptation with power-law dynamics.

Zilany MS, Bruce IC, Nelson PC, Carney LH.

There is growing evidence that the dynamics of biological systems that appear to be exponential over short time courses are in some cases better described over the long-term by power-law dynamics. A model of rate adaptation at the synapse between inner hair cells and auditory-nerve (AN) fibers that includes both exponential and power-law dynamics is presented here. Exponentially adapting components with rapid and short-term time constants, which are mainly responsible for shaping onset responses, are followed by two parallel paths with power-law adaptation that provide slowly and rapidly adapting responses. The slowly adapting power-law component significantly improves predictions of the recovery of the AN response after stimulus offset. The faster power-law adaptation is necessary to account for the "additivity" of rate in response to stimuli with amplitude increments. The proposed model is capable of accurately predicting several sets of AN data, including amplitude-modulation transfer functions, long-term adaptation, forward masking, and adaptation to increments and decrements in the amplitude of an ongoing stimulus.
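
The qualitative difference between exponential and power-law components is easy to see numerically (assuming numpy; the amplitudes and time constants are illustrative, not the published model's fitted values):

    import numpy as np

    t = np.linspace(0.001, 2.0, 2000)       # time after stimulus onset (s)

    rapid = 1.0 * np.exp(-t / 0.002)        # rapid exponential (~2 ms)
    short = 0.5 * np.exp(-t / 0.060)        # short-term exponential (~60 ms)
    power = 0.05 / (t + 0.1)                # power-law, ~1/t at long times:
                                            # no fixed time constant, it keeps
                                            # adapting throughout the stimulus
    rate = 50.0 + 100.0 * (rapid + short + power)

    for probe in (0.01, 0.1, 2.0):
        i = np.searchsorted(t, probe)
        print(f"t = {probe:5.2f} s -> {rate[i]:6.1f} spk/s")

The exponentials are gone within a few hundred milliseconds, while the power-law term still contributes at 2 s; it is this slowly decaying tail that underlies the long-term effects listed in the abstract.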


DP12: J Acoust Soc Am. 2009 Oct;126(4):1975-87.

Auditory stream segregation in cochlear implant listeners: measures based on temporal discrimination and interleaved melody recognition.

Cooper HR, Roberts B.

The evidence that cochlear implant listeners routinely experience stream segregation is limited and equivocal. Streaming in these listeners was explored using tone sequences matched to the center frequencies of the implant's 22 electrodes. Experiment 1 measured temporal discrimination for short (ABA triplet) and longer (12 AB cycles) sequences (tone/silence durations = 60/40 ms). Tone A stimulated electrode 11; tone B stimulated one of 14 electrodes. On each trial, one sequence remained isochronous, and tone B was delayed in the other; listeners had to identify the anisochronous interval. The delay was introduced in the second half of the longer sequences. Prior build-up of streaming should cause thresholds to rise more steeply with increasing electrode separation, but no interaction with sequence length was found. Experiment 2 required listeners to identify which of two target sequences was present when interleaved with distractors (tone/silence durations = 120/80 ms). Accuracy was high for isolated targets, but most listeners performed near chance when loudness-matched distractors were added, even when remote from the target. Only a substantial reduction in distractor level improved performance, and this effect did not interact with target-distractor separation. These results indicate that implantees often do not achieve stream segregation, even in relatively unchallenging tasks.


JME1: Proc Natl Acad Sci U S A. 2003 Feb 4;100(3):1405-8.

Suppression of cortical representation through backward conditioning.

Bao S, Chan VT, Zhang LI, Merzenich MM.

Temporal stimulus reinforcement sequences have been shown to determine the directions of synaptic plasticity and behavioral learning. Here, we examined whether they also control the direction of cortical reorganization. Pairing ventral tegmental area stimulation with a sound in a backward conditioning paradigm specifically reduced representations of the paired sound in the primary auditory cortex (AI). This temporal sequence-dependent bidirectional cortical plasticity modulated by dopamine release hypothetically serves to prevent the over-representation of frequently occurring stimuli resulting from their random pairing with unrelated rewards.


JME2: Nature. 2007 Nov 15;450(7168):425-9.

A synaptic memory trace for cortical receptive field plasticity.

Froemke RC, Merzenich MM, Schreiner CE.

Receptive fields of sensory cortical neurons are plastic, changing in response to alterations of neural activity or sensory experience. In this way, cortical representations of the sensory environment can incorporate new information about the world, depending on the relevance or value of particular stimuli. Neuromodulation is required for cortical plasticity, but it is uncertain how subcortical neuromodulatory systems, such as the cholinergic nucleus basalis, interact with and refine cortical circuits. Here we determine the dynamics of synaptic receptive field plasticity in the adult primary auditory cortex (also known as AI) using in vivo whole-cell recording. Pairing sensory stimulation with nucleus basalis activation shifted the preferred stimuli of cortical neurons by inducing a rapid reduction of synaptic inhibition within seconds, which was followed by a large increase in excitation, both specific to the paired stimulus. Although nucleus basalis was stimulated only for a few minutes, reorganization of synaptic tuning curves progressed for hours thereafter: inhibition slowly increased in an activity-dependent manner to rebalance the persistent enhancement of excitation, leading to a retuned receptive field with new preference for the paired stimulus. This restricted period of disinhibition may be a fundamental mechanism for receptive field plasticity, and could serve as a memory trace for stimuli or episodes that have acquired new behavioural significance.


JME3: Eur J Neurosci. 2006 Aug;24(3):857-66.

Neonatal nicotine exposure impairs nicotinic enhancement of central auditory processing and auditory learning in adult rats.

Liang K, Poytress BS, Chen Y, Leslie FM, Weinberger NM, Metherate R.

Children of women who smoke cigarettes during pregnancy display cognitive deficits in the auditory-verbal domain. Clinical studies have implicated developmental exposure to nicotine, the main psychoactive ingredient of tobacco, as a probable cause of subsequent auditory deficits. To test for a causal link, we have developed an animal model to determine how neonatal nicotine exposure affects adult auditory function. In adult control rats, nicotine administered systemically (0.7 mg/kg, s.c.) enhanced the sensitivity to sound of neural responses recorded in primary auditory cortex. The effect was strongest in cortical layers 3 and 4, where there is a dense concentration of nicotinic acetylcholine receptors (nAChRs) that has been hypothesized to regulate thalamocortical inputs. In support of the hypothesis, microinjection into layer 4 of the nonspecific nAChR antagonist mecamylamine (10 microM) strongly reduced sound-evoked responses. In contrast to the effects of acute nicotine and mecamylamine in adult control animals, neither drug was as effective in adult animals that had been treated with 5 days of chronic nicotine exposure (CNE) shortly after birth. Neonatal CNE also impaired performance on an auditory-cued active avoidance task, while having little effect on basic auditory or motor functions. Thus, neonatal CNE impairs nicotinic regulation of cortical function, and auditory learning, in the adult. Our results provide evidence that developmental nicotine exposure is responsible for auditory-cognitive deficits in the offspring of women who smoke during pregnancy, and suggest a potential underlying mechanism, namely diminished function of cortical nAChRs.


JME4: J Cogn Neurosci. 2008 Jan;20(1):135-52.

Linking cortical spike pattern codes to auditory perception.

Walker KM, Ahmed B, Schnupp JW.

Neurometric analysis has proven to be a powerful tool for studying links between neural activity and perception, especially in visual and somatosensory cortices, but conventional neurometrics are based on a simplistic rate-coding hypothesis that is clearly at odds with the rich and complex temporal spiking patterns evoked by many natural stimuli. In this study, we investigated the possible relationships between temporal spike pattern codes in the primary auditory cortex (A1) and the perceptual detection of subtle changes in the temporal structure of a natural sound. Using a two-alternative forced-choice oddity task, we measured the ability of human listeners to detect local time reversals in a marmoset twitter call. We also recorded responses of neurons in A1 of anesthetized and awake ferrets to these stimuli, and analyzed these responses using a novel neurometric approach that is sensitive to temporal discharge patterns. We found that although spike count-based neurometrics were inadequate to account for behavioral performance on this auditory task, neurometrics based on the temporal discharge patterns of populations of A1 units closely matched the psychometric performance curve, but only if the spiking patterns were resolved at temporal resolutions of 20 msec or better. These results demonstrate that neurometric discrimination curves can be calculated for temporal spiking patterns, and they suggest that such an extension of previous spike count-based approaches is likely to be essential for understanding the neural correlates of the perception of stimuli with a complex temporal structure.
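
The core of such a temporal neurometric can be illustrated with simulated spike trains (assuming numpy; the spike patterns, 5 ms jitter, and nearest-neighbour classifier are my own stand-ins for the paper's stimuli and decoder):

    import numpy as np

    rng = np.random.default_rng(2)
    dur, jitter, n_rep = 0.5, 0.005, 50      # s; spike-time jitter; trials/class

    # Two stimuli evoking the same spike count but different temporal patterns
    # (cf. a twitter call vs. its locally time-reversed version).
    patterns = (np.array([0.05, 0.12, 0.20, 0.31, 0.40]),
                np.array([0.08, 0.15, 0.24, 0.33, 0.44]))

    def binned(spikes, width):
        return np.histogram(spikes, bins=np.arange(0, dur + width, width))[0]

    for width in (0.02, 0.25):               # 20 ms vs 250 ms resolution
        data = [(binned(p + rng.normal(0, jitter, p.size), width), lab)
                for lab, p in enumerate(patterns) for _ in range(n_rep)]
        correct = 0
        for i, (v, lab) in enumerate(data):  # leave-one-out nearest neighbour
            rest = [(np.sum((v - w) ** 2), l)
                    for j, (w, l) in enumerate(data) if j != i]
            correct += min(rest)[1] == lab
        print(f"bin width {width * 1000:.0f} ms: "
              f"{100 * correct / len(data):.0f}% correct")

At coarse resolution the two responses are indistinguishable (both are simply "five spikes"), while 20 ms bins recover near-perfect discrimination, mirroring the finding that spike counts alone cannot account for behavior.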


MC1: J Neurosci. 2004 Apr 7;24(14):3637-42.

Sensitivity to auditory object features in human temporal neocortex.

Zatorre RJ, Bouffard M, Belin P.

This positron emission tomography study examined the hemodynamic response of the human brain to auditory object feature processing. A continuum of object feature variation was created by combining different numbers of stimuli drawn from a diverse sample of 45 environmental sounds. In each 60 sec scan condition, subjects heard either a distinct individual sound on each trial or simultaneous combinations of sounds that varied systematically in their similarity or distinctiveness across conditions. As more stimuli are combined they become more similar and less distinct from one another; the limiting case is when all 45 are added together to form a noise that is repeated on each trial. Analysis of covariation of cerebral blood flow elicited by this parametric manipulation revealed a response in the upper bank of the right anterior superior temporal sulcus (STS): when sounds were identical across trials (i.e., a noise made up of 45 sounds), activity was at a minimum; when stimuli were different from one another, activity was maximal. A right inferior frontal area was also revealed. The results are interpreted as reflecting sensitivity of this region of temporal neocortex to auditory object features, as predicted by neurophysiological and anatomical models implicating an anteroventral functional stream in object processing. The findings also fit with evidence that voice processing may involve regions within the anterior STS. The data are discussed in light of these models and are related to the concept that this functional stream is sensitive to invariant sound features that characterize individual auditory objects.


MC2: Neuron. 2007 Sep 20;55(6):985-96.

Cerebral responses to change in spatial location of unattended sounds.

Deouell LY, Heller AS, Malach R, D'Esposito M, Knight RT.

The neural basis of spatial processing in the auditory cortex has been controversial. Human fMRI studies suggest that a part of the planum temporale (PT) is involved in auditory spatial processing, but it was recently argued that this region is active only when the task requires voluntary spatial localization. If this is the case, then this region cannot harbor an ongoing spatial representation of the acoustic environment. In contrast, we show in three fMRI experiments that a region in the human medial PT is sensitive to background auditory spatial changes, even when subjects are not engaged in a spatial localization task, and in fact attend the visual modality. During such times, this area responded to rare location shifts, and even more so when spatial variation increased, consistent with spatially selective adaptation. Thus, acoustic space is represented in the human PT even when sound processing is not required by the ongoing task.


MC3: J Cogn Neurosci. 2007 Oct;19(10):1721-33.

Feature- and object-based attentional modulation in the human auditory where pathway.

Krumbholz K, Eickhoff SB, Fink GR.

Attending to a visual stimulus feature, such as color or motion, enhances the processing of that feature in the visual cortex. Moreover, the processing of the attended object's other, unattended, features is also enhanced. Here, we used functional magnetic resonance imaging to show that attentional modulation in the auditory system may also exhibit such feature- and object-specific effects. Specifically, we found that attending to auditory motion increases activity in nonprimary motion-sensitive areas of the auditory cortical "where" pathway. Moreover, activity in these motion-sensitive areas was also increased when attention was directed to a moving rather than a stationary sound object, even when motion was not the attended feature. An analysis of effective connectivity revealed that the motion-specific attentional modulation was brought about by an increase in connectivity between the primary auditory cortex and nonprimary motion-sensitive areas, which, in turn, may have been mediated by the paracingulate cortex in the frontal lobe. The current results indicate that auditory attention can select both objects and features. The finding of feature-based attentional modulation implies that attending to one feature of a sound object does not necessarily entail an exhaustive processing of the object's unattended features.


MC4: PLoS Biol. 2008 Jun 10;6(6):e138.

Neural correlates of auditory perceptual awareness under informational masking.

Gutschalk A, Micheyl C, Oxenham AJ.

Our ability to detect target sounds in complex acoustic backgrounds is often limited not by the ear's resolution, but by the brain's information-processing capacity. The neural mechanisms and loci of this "informational masking" are unknown. We combined magnetoencephalography with simultaneous behavioral measures in humans to investigate neural correlates of informational masking and auditory perceptual awareness in the auditory cortex. Cortical responses were sorted according to whether or not target sounds were detected by the listener in a complex, randomly varying multi-tone background known to produce informational masking. Detected target sounds elicited a prominent, long-latency response (50-250 ms), whereas undetected targets did not. In contrast, both detected and undetected targets produced equally robust auditory middle-latency, steady-state responses, presumably from the primary auditory cortex. These findings indicate that neural correlates of auditory awareness in informational masking emerge between early and late stages of processing within the auditory cortex.


MC5: PLoS Biol. 2007 Oct 23;5(11):e288.

An information theoretic characterisation of auditory encoding.

Overath T, Cusack R, Kumar S, von Kriegstein K, Warren JD, Grube M, Carlyon RP, Griffiths TD.

The entropy metric derived from information theory provides a means to quantify the amount of information transmitted in acoustic streams like speech or music. By systematically varying the entropy of pitch sequences, we sought brain areas where neural activity and energetic demands increase as a function of entropy. Such a relationship is predicted to occur in an efficient encoding mechanism that uses less computational resource when less information is present in the signal: we specifically tested the hypothesis that such a relationship is present in the planum temporale (PT). In two convergent functional MRI studies, we demonstrated this relationship in PT for encoding, while furthermore showing that a distributed fronto-parietal network for retrieval of acoustic information is independent of entropy. The results establish PT as an efficient neural engine that demands less computational resource to encode redundant signals than those with high information content.
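
The entropy metric itself is straightforward to compute. A first-order sketch for symbolic pitch sequences (assuming numpy; the example sequences are mine, and the study's actual entropy manipulation of pitch sequences may be computed differently):

    import numpy as np

    def entropy_bits(seq):
        """First-order Shannon entropy of a symbol sequence, in bits/symbol."""
        _, counts = np.unique(seq, return_counts=True)
        p = counts / counts.sum()
        return float(-(p * np.log2(p)).sum())

    rng = np.random.default_rng(3)
    notes = list('CDEFGAB')
    redundant = ['C', 'G'] * 16                    # low entropy: 1 bit/note
    random_seq = rng.choice(notes, 32).tolist()    # near log2(7) = 2.8 bits/note

    print('redundant :', round(entropy_bits(redundant), 2), 'bits/note')
    print('random    :', round(entropy_bits(random_seq), 2), 'bits/note')

The PT hypothesis is then that encoding-related activity should track this quantity: redundant (low-entropy) sequences demand less neural resource than high-entropy ones.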


MC6: J Neurosci. 2009 Jun 17;29(24):7686-93.

Abnormal cortical processing of the syllable rate of speech in poor readers.

Abrams DA, Nicol T, Zecker S, Kraus N.

Children with reading impairments have long been associated with impaired perception for rapidly presented acoustic stimuli and recently have shown deficits for slower features. It is not known whether impairments for low-frequency acoustic features negatively impact processing of speech in reading-impaired individuals. Here we provide neurophysiological evidence that poor readers have impaired representation of the speech envelope, the acoustical cue that provides syllable pattern information in speech. We measured cortical-evoked potentials in response to sentence stimuli and found that good readers indicated consistent right-hemisphere dominance in auditory cortex for all measures of speech envelope representation, including the precision, timing, and magnitude of cortical responses. Poor readers showed abnormal patterns of cerebral asymmetry for all measures of speech envelope representation. Moreover, cortical measures of speech envelope representation predicted up to 41% of the variability in standardized reading scores and 50% in measures of phonological processing across a wide range of abilities. Our findings strongly support a relationship between acoustic-level processing and higher-level language abilities, and are the first to link reading ability with cortical processing of low-frequency acoustic features in the speech signal. Our results also support the hypothesis that asymmetric routing between cerebral hemispheres represents an important mechanism for temporal encoding in the human auditory system, and the need for an expansion of the temporal processing hypothesis for reading disabilities to encompass impairments for a wider range of speech features than previously acknowledged.


MC7: J Neurosci. 2009 Jul 1;29(26):8447-51.

I heard that coming: event-related potential evidence for stimulus-driven prediction in the auditory system.

Bendixen A, Schröger E, Winkler I.

The auditory system has been shown to detect predictability in a tone sequence, but does it use the extracted regularities for actually predicting the continuation of the sequence? The present study sought to find evidence for the generation of such predictions. Predictability was manipulated in an isochronous series of tones in which every other tone was a repetition of its predecessor. The existence of predictions was probed by occasionally omitting either the first (unpredictable) or the second (predictable) tone of a same-frequency tone pair. Event-related electrical brain activity elicited by the omission of an unpredictable tone differed from the response to the actual tone right from the tone onset. In contrast, early electrical brain activity elicited by the omission of a predictable tone was quite similar to the response to the actual tone. This suggests that the auditory system preactivates the neural circuits for expected input, using sequential predictions to specifically prepare for future acoustic events.


MC8: Cereb Cortex. 2007 Nov;17(11):2544-52.

Working memory specific activity in auditory cortex: potential correlates of sequential processing and maintenance.

Brechmann A, Gaschler-Markefski B, Sohr M, Yoneda K, Kaulisch T, Scheich H.

Working memory (WM) tasks involve several interrelated processes during which past information must be transiently maintained, recalled, and compared with test items according to previously instructed rules. It is not clear whether the rule-specific comparisons of perceptual with memorized items are only performed in previously identified frontal and parietal WM areas or whether these areas orchestrate such comparisons by feedback to sensory cortex. We tested the latter hypothesis by focusing on auditory cortex (AC) areas with low-noise functional magnetic resonance imaging in a 2-back WM task involving frequency-modulated (FM) tones. The control condition was a 0-back task on the same stimuli. Analysis of the group data identified an area on right planum temporale equally activated by both tasks and an area on the left planum temporale specifically involved in the 2-back task. A region of interest analysis in each individual revealed that activation on the left planum temporale in the 2-back task positively correlated with the task performance of the subjects. This strongly suggests a prominent role of the AC in 2-back WM tasks. In conjunction with previous findings on FM processing, the left lateralized effect presumably reflects the complex sequential processing demand of the 2-back matching to sample task.