[P2 evaluation] Articles

Choose two articles from the list, from two different speakers. Don't forget to tell me which one is for the oral exam and which is for the written one. Since Barbara Tillmann will not be able to attend the oral session, it would be preferable to choose her articles for the written exam.


AdC1 : Nat Neurosci. 2006 Nov;9(11):1446-8. Epub 2006 Oct 8.

Discrimination learning induced by training with identical stimuli.

Amitay S, Irwin A, Moore DR.

Sensory stimuli become easier to detect or distinguish with practice. It is generally assumed that the task-relevant stimulus dimension becomes increasingly salient as a result of attentively performing the task at a level that is neither too easy nor too difficult. However, here we show improved auditory frequency discrimination following training with physically identical tones that were impossible to discriminate. We also show that learning transfers across tone frequencies and across modalities: training on a silent visuospatial computer game improved thresholds on the auditory discrimination task. We suggest that three processes are necessary for optimal perceptual learning: sensitization through exposure to the stimulus, modality- and dimension-specific attention, and general arousal.


AdC2 : Pitch comparisons of acoustically and electrically evoked auditory sensations.

Blamey PJ, Dooley GJ, Parisi ES, Clark GM.

Cochlear implant users with some residual hearing in the non-implanted ear compared the pitch sensations produced by acoustic pure tones and pulsatile electric stimuli. Pitch comparisons were obtained for pure tones and electrical stimuli presented at different positions (electrodes) in the scala tympani, keeping the electric pulse rate fixed at 100, 250, or 800 pps. Similarly, pitch comparisons were obtained for electrical stimuli with variable pulse rates presented to two fixed electrode positions (apical and basal) in the cochlea. Both electrode position and pulse rate influenced the perceived pitch of the electrical signal and 'matched' electric and acoustic signals were found over a wide range of frequencies. There was a large variation between listeners. For some stimuli, listeners had difficulty in deciding whether the acoustic or electric stimulus was higher in pitch. Despite the variability, consistent trends were obtained from the data: higher frequencies tended to be matched by more basal electrodes for all pulse rates. Higher frequencies tended to be matched by higher pulse rates for both electrode positions. The electrode positions that 'matched' pure tones were more basal than predicted from the characteristic frequency coordinates of the basilar membrane in a normal human cochlea.


AdC3 : Neuron. 2005 Nov 3;48(3):479-88.

Reliability and representational bandwidth in the auditory cortex.

DeWeese MR, Hromádka T, Zador AM.

It is unclear why there are so many more neurons in sensory cortex than in the sensory periphery. One possibility is that these "extra" neurons are used to overcome cortical noise and faithfully represent the acoustic stimulus. Another possibility is that even after overcoming cortical noise, there is "excess representational bandwidth" available and that this bandwidth is used to represent conjunctions of auditory and nonauditory information for computation. Here, we discuss recent data about neuronal reliability in auditory cortex showing that cortical noise may not be as high as was previously believed. Although, at present, the data suggest that auditory cortex neurons can be more reliable than those in the visual cortex, we speculate that the principles governing cortical computation are universal and that visual and other cortical areas can also exploit strategies based on similarly high-fidelity activity.


AdC4 : Nat Neurosci. 1999 Oct;2(10):863-5.

A contingent aftereffect in the auditory system.

Dong C, Swindale NV, Cynader MS.

Pairs of stimulus attributes, such as color and orientation, that are normally uncorrelated in the real world are generally perceived independently; that is, the perception of color is usually uninfluenced by orientation and vice versa. Yet this independence can be altered by relatively brief exposure to artificially correlated stimuli, as has been shown for vision [1]. Here we report an analogous contingent aftereffect in the auditory system that can persist for four hours after the initial adaptation.


AdC5 : Nat Neurosci. 2007 Jul;10(7):915-21. Epub 2007 Jun 24.

Amusia is associated with deficits in spatial processing.

Douglas KM, Bilkey DK.

Amusia (commonly referred to as tone-deafness) is a difficulty in discriminating pitch changes in melodies that affects around 4% of the human population. Amusia cannot be explained as a simple sensory impairment. Here we show that amusia is strongly related to a deficit in spatial processing in adults. Compared to two matched control groups (musicians and non-musicians), participants in the amusic group were significantly impaired on a visually presented mental rotation task. Amusic subjects were also less prone to interference in a spatial stimulus-response incompatibility task and performed significantly faster than controls in an interference task in which they were required to make simple pitch discriminations while concurrently performing a mental rotation task. This indicates that the processing of pitch in music normally depends on the cognitive mechanisms that are used to process spatial representations in other modalities.


AdC6 : Proc Natl Acad Sci U S A. 2000 Oct 24;97(22):11793-9.

Subdivisions of auditory cortex and processing streams in primates.

Kaas JH, Hackett TA.

The auditory system of monkeys includes a large number of interconnected subcortical nuclei and cortical areas. At subcortical levels, the structural components of the auditory system of monkeys resemble those of nonprimates, but the organization at cortical levels is different. In monkeys, the ventral nucleus of the medial geniculate complex projects in parallel to a core of three primary-like auditory areas, AI, R, and RT, constituting the first stage of cortical processing. These areas interconnect and project to the homotopic and other locations in the opposite cerebral hemisphere and to a surrounding array of eight proposed belt areas as a second stage of cortical processing. The belt areas in turn project in overlapping patterns to a lateral parabelt region with at least rostral and caudal subdivisions as a third stage of cortical processing. The divisions of the parabelt distribute to adjoining auditory and multimodal regions of the temporal lobe and to four functionally distinct regions of the frontal lobe. Histochemically, chimpanzees and humans have an auditory core that closely resembles that of monkeys. The challenge for future researchers is to understand how this complex system in monkeys analyzes and utilizes auditory information.


AdC7 : J Acoust Soc Am. 2003 Sep;114(3):1543-9.

Informational masking and musical training.

Oxenham AJ, Fligor BJ, Mason CR, Kidd G Jr.

The relationship between musical training and informational masking was studied for 24 young adult listeners with normal hearing. The listeners were divided into two groups based on musical training. In one group, the listeners had little or no musical training; the other group consisted of highly trained, currently active musicians. The hypothesis was that musicians may be less susceptible to informational masking, which is thought to reflect central, rather than peripheral, limitations on the processing of sound. Masked thresholds were measured in two conditions, similar to those used by Kidd et al. [J. Acoust. Soc. Am. 95, 3475-3480 (1994)]. In both conditions the signal consisted of a series of repeated tone bursts at 1 kHz. The masker consisted of a series of multitone bursts, gated with the signal. In one condition the frequencies of the masker were selected randomly for each burst; in the other condition the masker frequencies were selected randomly for the first burst of each interval and then remained constant throughout the interval. The difference in thresholds between the two conditions was taken as a measure of informational masking. Frequency selectivity, using the notched-noise method, was also estimated in the two groups. The results showed no difference in frequency selectivity between the two groups, but showed a large and significant difference in the amount of informational masking between musically trained and untrained listeners. This informational masking task, which requires no knowledge specific to musical training (such as note or interval names) and is generally not susceptible to systematic short- or medium-term training effects, may provide a basis for further studies of analytic listening abilities in different populations.
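
To make the two masker conditions concrete, here is a minimal Python sketch of multitone-burst maskers whose component frequencies are either redrawn for every burst or fixed within an interval. All names and parameter values (burst count, component count, protected region around the signal) are my own illustrative assumptions, not those of the original study.

```python
import numpy as np

FS = 16000                    # sampling rate (Hz); illustrative
BURST_DUR = 0.06              # duration of one masker burst (s)
N_BURSTS = 8                  # bursts per observation interval
PROTECTED = (800.0, 1250.0)   # region around the 1-kHz signal kept masker-free

rng = np.random.default_rng(0)

def tone_burst(freqs, dur=BURST_DUR, fs=FS):
    """One multitone burst: equal-amplitude tones with random phases."""
    t = np.arange(int(dur * fs)) / fs
    phases = rng.uniform(0, 2 * np.pi, size=len(freqs))
    return sum(np.sin(2 * np.pi * f * t + p) for f, p in zip(freqs, phases))

def draw_masker_freqs(n=6, lo=200.0, hi=5000.0):
    """Random masker components, excluding the protected region."""
    freqs = []
    while len(freqs) < n:
        f = rng.uniform(lo, hi)
        if not (PROTECTED[0] < f < PROTECTED[1]):
            freqs.append(f)
    return freqs

# Condition 1: frequencies redrawn for every burst (high informational masking).
random_masker = np.concatenate(
    [tone_burst(draw_masker_freqs()) for _ in range(N_BURSTS)])

# Condition 2: frequencies drawn once per interval, constant across bursts.
fixed_freqs = draw_masker_freqs()
fixed_masker = np.concatenate(
    [tone_burst(fixed_freqs) for _ in range(N_BURSTS)])
```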


BT1 : Proc Natl Acad Sci U S A. 2005 Aug 30;102(35):12639-43.

Tuning in to musical rhythms: infants learn more readily than adults.

Hannon EE, Trehub SE.

Domain-general tuning processes may guide the acquisition of perceptual knowledge in infancy. Here, we demonstrate that 12-month-old infants show an adult-like, culture-specific pattern of responding to musical rhythms, in contrast to the culture-general responding that is evident at 6 months of age. Nevertheless, brief exposure to foreign music enables 12-month-olds, but not adults, to perceive rhythmic distinctions in foreign musical contexts. These findings may indicate a sensitive period early in life for acquiring rhythm in particular or socially and biologically important structures more generally.


BT2 : J Cogn Neurosci. 2005 Oct;17(10):1565-77.

Interaction between syntax processing in language and in music: an ERP study.

Koelsch S, Gunter TC, Wittfoth M, Sammler D.

The present study investigated simultaneous processing of language and music using visually presented sentences and auditorily presented chord sequences. Music-syntactically regular and irregular chord functions were presented synchronously with syntactically correct or incorrect words, or with words that had either a high or a low semantic cloze probability. Music-syntactically irregular chords elicited an early right anterior negativity (ERAN). Syntactically incorrect words elicited a left anterior negativity (LAN). The LAN was clearly reduced when words were presented simultaneously with music-syntactically irregular chord functions. Processing of high and low cloze-probability words as indexed by the N400 was not affected by the presentation of irregular chord functions. In a control experiment, the LAN was not affected by physically deviant tones that elicited a mismatch negativity (MMN). Results demonstrate that processing of musical syntax (as reflected in the ERAN) interacts with the processing of linguistic syntax (as reflected in the LAN), and that this interaction is not due to a general effect of deviance-related negativities that precede an LAN. Findings thus indicate a strong overlap of neural resources involved in the processing of syntax in language and music.


BT3 : Cognition. 2000 Jul 14;76(1):13-58.

Cross-cultural music cognition: cognitive methodology applied to North Sami yoiks.

Krumhansl CL, Toivanen P, Eerola T, Toiviainen P, Järvinen T, Louhivuori J.

This article is a study of melodic expectancy in North Sami yoiks, a style of music quite distinct from Western tonal music. Three different approaches were taken. The first approach was a statistical style analysis of tones in a representative corpus of 18 yoiks. The analysis determined the relative frequencies of tone onsets and two- and three-tone transitions. It also identified style characteristics, such as pentatonic orientation, the presence of two reference pitches, the frequency of large consonant intervals, and a relatively large set of possible melodic continuations. The second approach was a behavioral experiment in which listeners made judgments about melodic continuations. Three groups of listeners participated. One group was from the Sami culture, the second group consisted of Finnish music students who had learned some yoiks, and the third group consisted of Western musicians unfamiliar with yoiks. Expertise was associated with stronger veridical expectations (for the correct next tone) than schematic expectations (based on general style characteristics). Familiarity with the particular yoiks was found to compensate for lack of experience with the musical culture. The third approach simulated melodic expectancy with neural network models of the self-organizing map (SOM) type (Kohonen, T. (1997). Self-organizing maps (2nd ed.). Berlin: Springer). One model was trained on the excerpts of yoiks used in the behavioral experiment including the correct continuation tone, while another was trained with a set of Finnish folk songs and Lutheran hymns. The convergence of the three approaches showed that both listeners and the SOM model are influenced by the statistical distributions of tones and tone sequences. The listeners and SOM models also provided evidence supporting a core set of psychological principles underlying melody formation whose relative weights appear to differ across musical styles.
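
The first, statistical approach reduces to counting tone onsets and two-tone transitions in a corpus. A minimal Python sketch under that reading, with an invented two-melody corpus (MIDI note numbers) standing in for the 18 transcribed yoiks:

```python
from collections import Counter, defaultdict

# Stand-in corpus; the study used 18 transcribed yoiks.
corpus = [
    [60, 62, 64, 62, 60, 57, 60],
    [64, 62, 60, 62, 64, 64, 64],
]

onsets = Counter()
transitions = Counter()
for melody in corpus:
    onsets.update(melody)
    transitions.update(zip(melody, melody[1:]))  # ordered two-tone pairs

# Conditional probability of the next tone given the current one: a simple
# schematic-expectancy profile of the kind listeners were compared against.
totals = defaultdict(int)
for (a, b), n in transitions.items():
    totals[a] += n
cond = {(a, b): n / totals[a] for (a, b), n in transitions.items()}

print(cond[(60, 62)])   # estimate of P(next = 62 | current = 60)
```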


BT4 : Psychophysiology. 2004 May;41(3):341-9.

The music of speech: music training facilitates pitch processing in both music and language.

Schön D, Magne C, Besson M.

The main aim of the present experiment was to determine whether extensive musical training facilitates pitch contour processing not only in music but also in language. We used a parametric manipulation of final notes' or words' fundamental frequency (F0), and we recorded behavioral and electrophysiological data to examine the precise time course of pitch processing. We compared professional musicians and nonmusicians. Results revealed that within both domains, musicians detected weak F0 manipulations better than nonmusicians. Moreover, F0 manipulations within both music and language elicited similar variations in brain electrical potentials, with overall shorter onset latency for musicians than for nonmusicians. Finally, the scalp distribution of an early negativity in the linguistic task varied with musical expertise, being largest over temporal sites bilaterally for musicians and largest centrally and over left temporal sites for nonmusicians. These results are taken as evidence that extensive musical training influences the perception of pitch contour in spoken language.


CL1 : Eur J Neurosci. 2006 Oct;24(7):2003-10.

The effects of intense sound exposure on phase locking in the chick (Gallus domesticus) cochlear nerve.

Furman AC, Avissar M, Saunders JC.

Little is known about changes that occur to phase locking in the auditory nerve following exposure to intense and damaging levels of sound. The present study evaluated synchronization in the discharge patterns of cochlear nerve units collected from two groups of young chicks (Gallus domesticus), one shortly after removal from an exposure to a 120-dB, 900-Hz pure tone for 48 h and the other from a group of non-exposed control animals. Spontaneous activity, the characteristic frequency (CF), CF threshold and a phase-locked peri-stimulus time histogram were obtained for every unit in each group. Vector strength and temporal dispersion were calculated from these peri-stimulus time histograms, and plotted against the unit's CF. All parameters of unit responses were then compared between control and exposed units. The results in exposed units revealed that CF thresholds were elevated by 30-35 dB whereas spontaneous activity declined by 24%. In both control and exposed units a high degree of synchronization was observed in the low frequencies. The level of synchronization above approximately 0.5 kHz then systematically declined. The vector strengths in units recorded shortly after removal from the exposure were identical to those seen in control chicks. The deterioration in discharge activity of exposed units, seen in CF threshold and spontaneous activity, contrasted with the total absence of any overstimulation effect on synchronization. This suggested that synchronization arises from mechanisms unscathed by the acoustic trauma induced by the exposure.
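
Vector strength, the synchronization index used here, has a standard definition (Goldberg & Brown, 1969): each spike is mapped to a phase of the stimulus cycle and the resulting unit vectors are averaged. A short Python sketch; the spike data below are simulated purely for illustration.

```python
import numpy as np

def vector_strength(spike_times, freq):
    """spike_times in seconds, freq = stimulus frequency in Hz.
    Returns 1 for perfect phase locking, values near 0 for none."""
    phases = 2 * np.pi * freq * np.asarray(spike_times)
    return np.abs(np.mean(np.exp(1j * phases)))

# Simulated example: spikes jittered around one phase of a 900-Hz tone.
rng = np.random.default_rng(1)
period = 1.0 / 900.0
locked = np.arange(200) * period + rng.normal(0, 0.05 * period, 200)
print(vector_strength(locked, 900.0))                      # close to 1
print(vector_strength(rng.uniform(0.0, 1.0, 200), 900.0))  # near 0
```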


CL2 : J Acoust Soc Am. 2003 Feb;113(2):961-8.

Understanding speech in modulated interference: cochlear implant users and normal-hearing listeners.

Nelson PB, Jin SH, Carney AE, Nelson DA.

Many competing noises in real environments are modulated or fluctuating in level. Listeners with normal hearing are able to take advantage of temporal gaps in fluctuating maskers. Listeners with sensorineural hearing loss show less benefit from modulated maskers. Cochlear implant users may be more adversely affected by modulated maskers because of their limited spectral resolution and their reliance on the envelope-based signal-processing strategies of implant processors. The current study evaluated cochlear implant users' ability to understand sentences in the presence of modulated speech-shaped noise. Normal-hearing listeners served as a comparison group. Listeners repeated IEEE sentences in quiet, steady noise, and modulated noise maskers. Maskers were presented at varying signal-to-noise ratios (SNRs) at six modulation rates varying from 1 to 32 Hz. Results suggested that normal-hearing listeners obtain significant release from masking from modulated maskers, especially at the 8-Hz masker modulation rate. In contrast, cochlear implant users experience very little release from masking from modulated maskers. The data suggest, in fact, that they may show negative effects of modulated maskers at syllabic modulation rates (2-4 Hz). Similar patterns of results were obtained from implant listeners using three different devices with different speech-processor strategies. The lack of release from masking occurs in implant listeners independent of their device characteristics, and may be attributable to the nature of implant processing strategies and/or the lack of spectral detail in processed stimuli.
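
As a rough illustration of the masker manipulation, the sketch below applies sinusoidal amplitude modulation at the rates used in the study to a steady noise. The flat-spectrum noise and 100% modulation depth are my simplifying assumptions; the actual masker was speech-shaped.

```python
import numpy as np

FS = 16000
DUR = 2.0
t = np.arange(int(DUR * FS)) / FS
rng = np.random.default_rng(2)

# Stand-in for speech-shaped noise (the real masker is spectrally shaped).
noise = rng.standard_normal(len(t))

def modulated(masker, rate_hz, depth=1.0):
    """Sinusoidally amplitude-modulate a steady masker at rate_hz."""
    return masker * (1.0 + depth * np.sin(2 * np.pi * rate_hz * t))

maskers = {rate: modulated(noise, rate) for rate in (1, 2, 4, 8, 16, 32)}

# Given SRTs measured in each condition, release from masking is simply
# release_dB = srt_steady_dB - srt_modulated_dB   (positive = benefit).
```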


CL3 : J Acoust Soc Am. 2005 Oct;118(4):2519-26.

Consequences of cochlear damage for the detection of interaural phase differences.

Lacher-Fougere S, Demany L.

Thresholds for detecting interaural phase differences (IPDs) in sinusoidally amplitude-modulated pure tones were measured in seven normal-hearing listeners and nine listeners with bilaterally symmetric hearing losses of cochlear origin. The IPDs were imposed either on the carrier signal alone (not the amplitude modulation) or vice versa. The carrier frequency was 250, 500, or 1000 Hz, the modulation frequency 20 or 50 Hz, and the sound pressure level was fixed at 75 dB. A three-interval two-alternative forced choice paradigm was used. For each type of IPD (carrier or modulation), thresholds were on average higher for the hearing-impaired than for the normal listeners. However, the impaired listeners' detection deficit was markedly larger for carrier IPDs than for modulation IPDs. This was not predictable from the effect of hearing loss on the sensation level of the stimuli since, for normal listeners, large reductions of sensation level appeared to be more deleterious to the detection of modulation IPDs than to the detection of carrier IPDs. The results support the idea that one consequence of cochlear damage is a deterioration in the perceptual sensitivity to the temporal fine structure of sounds.
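
A minimal sketch of the two stimulus types, assuming a simple left/right two-channel representation and illustrative parameter values: a SAM tone whose interaural phase difference sits either on the carrier (fine-structure cue) or on the modulator (envelope cue). The function name and defaults are mine.

```python
import numpy as np

FS = 48000
DUR = 0.5
t = np.arange(int(DUR * FS)) / FS

def sam_tone(fc, fm, carrier_ipd=0.0, mod_ipd=0.0):
    """Return (left, right) SAM tones; the IPD (radians) is applied to
    either the carrier or the modulator of the right ear only."""
    left = (1 + np.sin(2 * np.pi * fm * t)) * np.sin(2 * np.pi * fc * t)
    right = ((1 + np.sin(2 * np.pi * fm * t + mod_ipd))
             * np.sin(2 * np.pi * fc * t + carrier_ipd))
    return left, right

# Carrier IPD (temporal fine-structure cue): 500-Hz carrier, 20-Hz modulation.
L, R = sam_tone(fc=500.0, fm=20.0, carrier_ipd=0.2)
# Modulation IPD (envelope cue), carrier diotic.
L2, R2 = sam_tone(fc=500.0, fm=20.0, mod_ipd=0.2)
```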


CL4 : J Acoust Soc Am. 1998 Jan;103(1):577-87.

Speech reception thresholds in noise with and without spectral and temporal dips for hearing-impaired and normally hearing people.

Peters RW, Moore BC, Baer T.

People with cochlear hearing loss often have considerable difficulty in understanding speech in the presence of background sounds. In this paper the relative importance of spectral and temporal dips in the background sounds is quantified by varying the degree to which they contain such dips. Speech reception thresholds in a 65-dB SPL noise were measured for four groups of subjects: (a) young with normal hearing; (b) elderly with near-normal hearing; (c) young with moderate to severe cochlear hearing loss; and (d) elderly with moderate to severe cochlear hearing loss. The results indicate that both spectral and temporal dips are important. In a background that contained both spectral and temporal dips, groups (c) and (d) performed much more poorly than group (a). The signal-to-background ratio required for 50% intelligibility was about 19 dB higher for group (d) than for group (a). Young hearing-impaired subjects showed a slightly smaller deficit, but still a substantial one. Linear amplification combined with appropriate frequency-response shaping (NAL amplification), as would be provided by a well-fitted "conventional" hearing aid, only partially compensated for these deficits. For example, group (d) still required a speech-to-background ratio that was 15 dB higher than for group (a). Calculations of the articulation index indicated that NAL amplification did not restore audibility of the whole of the speech spectrum when the speech-to-background ratio was low. For unamplified stimuli, the SRTs in background sounds were highly correlated with absolute thresholds, but not with age. For stimuli with NAL amplification, the correlations of SRTs with absolute thresholds were lower, but SRTs in backgrounds with spectral and/or temporal dips were significantly correlated with age. It is proposed that noise with spectral and temporal dips may be especially useful in evaluating possible benefits of multi-channel compression.


CL5 : Proc Natl Acad Sci U S A. 2005 Feb 15;102(7):2293-8.

Speech recognition with amplitude and frequency modulations.

Zeng FG, Nie K, Stickney GS, Kong YY, Vongphoe M, Bhargave A, Wei C, Cao K.

Amplitude modulation (AM) and frequency modulation (FM) are commonly used in communication, but their relative contributions to speech recognition have not been fully explored. To bridge this gap, we derived slowly varying AM and FM from speech sounds and conducted listening tests using stimuli with different modulations in normal-hearing and cochlear-implant subjects. We found that although AM from a limited number of spectral bands may be sufficient for speech recognition in quiet, FM significantly enhances speech recognition in noise, as well as speaker and tone recognition. Additional speech reception threshold measures revealed that FM is particularly critical for speech recognition with a competing voice and is independent of spectral resolution and similarity. These results suggest that AM and FM provide independent yet complementary contributions to support robust speech recognition under realistic listening situations. Encoding FM may improve auditory scene analysis, cochlear-implant, and audio-coding performance.
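
The AM/FM decomposition can be illustrated in a single band with the analytic signal: the Hilbert envelope gives the AM, and the lowpassed instantaneous frequency gives the slowly varying FM. A sketch under these assumptions; the study's actual multiband processing details differ, and the cutoff value here is mine.

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

FS = 16000

def am_fm(band_signal, fs=FS, cutoff=50.0):
    """Split one narrow band into a slow envelope (AM) and a slowly varying
    instantaneous-frequency track (FM), each lowpassed below `cutoff` Hz."""
    analytic = hilbert(band_signal)
    am = np.abs(analytic)                        # Hilbert envelope
    phase = np.unwrap(np.angle(analytic))
    fm = np.diff(phase) * fs / (2 * np.pi)       # instantaneous frequency (Hz)
    b, a = butter(2, cutoff / (fs / 2))          # keep only slow variation
    return filtfilt(b, a, am), filtfilt(b, a, fm)

# Example: a 1-kHz tone with slow vibrato (FM) and a slow level change (AM).
t = np.arange(int(0.5 * FS)) / FS
x = ((1 + 0.5 * np.sin(2 * np.pi * 3 * t))
     * np.sin(2 * np.pi * (1000 * t + 10 * np.sin(2 * np.pi * 5 * t))))
am, fm = am_fm(x)
```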


CL6 : Ear Hear. 2004 Jun;25(3):242-50.

Temporal fine-structure cues to speech and pure tone modulation in observers with sensorineural hearing loss.

Buss E, Hall JW 3rd, Grose JH.

OBJECTIVE: The purpose of this study was to examine the effect of sensorineural hearing loss on the ability to make use of fine temporal information and to evaluate the relation between this ability and the ability to recognize speech. DESIGN: Fourteen observers with normal hearing and 12 observers with sensorineural hearing loss were tested on open-set word recognition and on psychophysical tasks thought to reflect use of fine-structure cues: the detection of 2 Hz frequency modulation (FM) and the discrimination of the rate of amplitude modulation (AM) and quasifrequency modulation (QFM). RESULTS: The results showed relatively poor performance for observers with sensorineural hearing loss on both the speech recognition and psychoacoustical tasks. Of particular interest was the finding of significant correlations within the hearing-loss group between speech recognition performance and the psychoacoustical tasks based on frequency modulation, which are thought to reflect the quality of the coding of temporal fine structure. CONCLUSIONS: These results suggest that sensorineural hearing loss may be associated with a reduced ability to use fine temporal information that is coded by neural phase-locking to stimulus fine-structure and that this may contribute to poor speech recognition performance and to poor performance on psychoacoustical tasks that depend on temporal fine structure.


DP1 : J Acoust Soc Am. 1995 Oct;98:1858-65.

The stimulus duration required to identify vowels, their octave, and their pitch chroma.

Robinson K, Patterson RD

Computational models of sound segregation typically include the assumption that pitch plays a key role in timbre identification. This hypothesis was investigated by presenting listeners with short segments of static vowel sounds and asking them to identify the vowel quality (timbre), the octave (tone height), or the note (tone chroma) of the sound. There were four vowel categories (/a/, /i/, /u/, and /eh/), four octave categories (centered on C1, C2, C3, and C4) and four note categories (C, D, E, and F), and performance was measured as a function of the number of glottal periods of the vowel sound. The results show that at all stimulus durations, it was easiest to identify the vowel quality (mean 94% correct), followed by the octave (71%), and finally the note (52%). The results indicate that timbre can be extracted reliably from segments of vowels that are too short to support equivalent pitch judgments, be they note identification or the less precise judgment of the octave of the sound. Thus it is unlikely that pitch plays a key role in timbre extraction, at least at short durations.


DP2 : J Acoust Soc Am. 1994 Jun;95(6):3529-40.

The role of resolved and unresolved harmonics in pitch perception and frequency modulation discrimination.

Shackleton TM, Carlyon RP.

A series of experiments investigated the influence of harmonic resolvability on the pitch of, and the discriminability of differences in fundamental frequency (F0) between, frequency-modulated (FM) harmonic complexes. Both F0 (62.5 to 250 Hz) and spectral region (LOW: 125-625 Hz, MID: 1375-1875 Hz, and HIGH: 3900-5400 Hz) were varied orthogonally. The harmonics that comprised each complex could be summed in either sine (0 degree) phase (SINE) or alternating sine-cosine (0 degree-90 degrees) phase (ALT). Stimuli were presented in a continuous pink-noise background. Pitch-matching experiments revealed that the pitch of ALT-phase stimuli, relative to SINE-phase stimuli, was increased by an octave in the HIGH region, for all F0's, but was the same as that of SINE-phase stimuli when presented in the LOW region. In the MID region, the pitch of ALT-phase relative to SINE-phase stimuli depended on F0, being an octave higher at low F0's, equal at high F0's, and unclear at intermediate F0's. The same stimuli were then used in three measures of discriminability: FM detection thresholds (FMTs), frequency difference limens (FDLs), and FM direction discrimination thresholds (FMDDTs, defined as the minimum FM depth necessary for listeners to discriminate between two complexes modulated 180 degrees out of phase with each other). For all three measures, at all F0's, thresholds were low (< 4% for FMTs, < 5% for FMDDTs, and < 1.5% for FDLs) when stimuli were presented in the LOW region, and high (> 10% for FMTs, > 7% for FMDDTs, and > 2.5% for FDLs) when presented in the HIGH region. When stimuli were presented in the MID region, thresholds were low for low F0's, and high for high F0's. Performance was not markedly affected by the phase relationship between the components of a complex, except for stimuli with intermediate F0's in the MID spectral region, where FDLs and FMDDTs were much higher for ALT-phase stimuli than for SINE-phase stimuli, consistent with their unclear pitch. This difference was much smaller when FMTs were measured. The interaction between F0 and spectral region for both sets of experiments can be accounted for by a single definition of resolvability.
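
A minimal sketch of the SINE versus ALT phase manipulation, assuming (as one plausible reading of "alternating sine-cosine") that the even harmonics carry the 90-degree cosine phase. For unresolved harmonics the ALT complex has an envelope rate of 2*F0, consistent with the octave-raised pitch matches reported above. Parameter choices are illustrative.

```python
import numpy as np

FS = 48000
t = np.arange(int(0.4 * FS)) / FS

def complex_tone(f0, lo, hi, alt_phase=False):
    """Sum the harmonics of f0 that fall between lo and hi (Hz).
    SINE: all harmonics in sine phase. ALT: even harmonics in cosine phase."""
    x = np.zeros_like(t)
    for n in range(1, int(hi // f0) + 1):
        f = n * f0
        if f < lo:
            continue
        if alt_phase and n % 2 == 0:
            x += np.cos(2 * np.pi * f * t)
        else:
            x += np.sin(2 * np.pi * f * t)
    return x

# MID region (1375-1875 Hz), F0 = 125 Hz: harmonics 11-15, unresolved.
sine_mid = complex_tone(125.0, 1375.0, 1875.0, alt_phase=False)
alt_mid = complex_tone(125.0, 1375.0, 1875.0, alt_phase=True)
# The ALT version has an envelope periodicity near 2*F0, matching the
# octave-higher pitch reported for unresolved harmonics.
```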


DP3 : J Exp Psychol Hum Percept Perform. 2007 Jun;33(3):743-51.

Tone sequences with conflicting fundamental pitch and timbre changes are heard differently by musicians and nonmusicians.

Seither-Preisler A, Johnson L, Krumbholz K, Nobbe A, Patterson R, Seither S, Lütkenhöner B.

An Auditory Ambiguity Test (AAT) was taken twice by nonmusicians, musical amateurs, and professional musicians. The AAT comprised different tone pairs, presented in both within-pair orders, in which overtone spectra rising in pitch were associated with missing fundamental frequencies (F0) falling in pitch, and vice versa. The F0 interval ranged from 2 to 9 semitones. The participants were instructed to decide whether the perceived pitch went up or down; no information was provided on the ambiguity of the stimuli. The majority of professionals classified the pitch changes according to F0, even at the smallest interval. By contrast, most nonmusicians classified according to the overtone spectra, except in the case of the largest interval. Amateurs ranged in between. A plausible explanation for the systematic group differences is that musical practice systematically shifted the perceptual focus from spectral toward missing-F0 pitch, although alternative explanations such as different genetic dispositions of musicians and nonmusicians cannot be ruled out.
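
A sketch of one ambiguous AAT-style pair, with invented frequencies and harmonic numbers: from the first tone to the second, every spectral component rises while the missing fundamental falls, so "up" versus "down" depends on whether the listener weights spectral or fundamental pitch.

```python
import numpy as np

FS = 44100
t = np.arange(int(0.5 * FS)) / FS

def missing_f0_tone(f0, harmonic_numbers):
    """Sum the listed harmonics of f0; the fundamental itself is absent."""
    return sum(np.sin(2 * np.pi * n * f0 * t) for n in harmonic_numbers)

tone_1 = missing_f0_tone(300.0, [2, 3])  # components at 600 and 900 Hz
tone_2 = missing_f0_tone(250.0, [3, 4])  # components at 750 and 1000 Hz

# From tone_1 to tone_2 the missing F0 falls by ~3 semitones (300 -> 250 Hz)
# while every spectral component rises, so the perceived direction depends
# on whether spectral or missing-F0 pitch dominates.
```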


DP4 : Ear Hear. 2004 Apr;25(2):173-85.

Music perception with temporal cues in acoustic and electric hearing.

Kong YY, Cruz R, Jones JA, Zeng FG.

OBJECTIVE: The first specific aim of the present study is to compare the ability of normal-hearing and cochlear implant listeners to use temporal cues in three music perception tasks: tempo discrimination, rhythmic pattern identification, and melody identification. The second aim is to identify the relative contribution of temporal and spectral cues to melody recognition in acoustic and electric hearing. DESIGN: Both normal-hearing and cochlear implant listeners participated in the experiments. Tempo discrimination was measured in a two-interval forced-choice procedure in which subjects were asked to choose the faster tempo at four standard tempo conditions (60, 80, 100, and 120 beats per minute). For rhythmic pattern identification, seven different rhythmic patterns were created and subjects were asked to read and choose the musical notation displayed on the screen that corresponded to the rhythmic pattern presented. Melody identification was evaluated with two sets of 12 familiar melodies. One set contained both rhythm and melody information (rhythm condition), whereas the other set contained only melody information (no-rhythm condition). Melody stimuli were also processed to extract the slowly varying temporal envelope from 1, 2, 4, 8, 16, 32, and 64 frequency bands, to create cochlear implant simulations. Subjects listened to a melody and had to respond by choosing one of the 12 names corresponding to the melodies displayed on a computer screen. RESULTS: In tempo discrimination, the cochlear implant listeners performed similarly to the normal-hearing listeners with rate discrimination difference limens obtained at 4-6 beats per minute. In rhythmic pattern identification, the cochlear implant listeners performed 5-25 percentage points poorer than the normal-hearing listeners. The normal-hearing listeners achieved perfect scores in melody identification with and without the rhythmic cues. However, the cochlear implant listeners performed significantly poorer than the normal-hearing listeners in both rhythm and no-rhythm conditions. The simulation results from normal-hearing listeners showed a relatively high level of performance for all numbers of frequency bands in the rhythm condition but required as many as 32 bands in the no-rhythm condition. CONCLUSIONS: Cochlear-implant listeners performed normally in tempo discrimination, but significantly poorer than normal-hearing listeners in rhythmic pattern identification and melody recognition. While both temporal (rhythmic) and spectral (pitch) cues contribute to melody recognition, cochlear-implant listeners mostly relied on the rhythmic cues for melody recognition. Without the rhythmic cues, high spectral resolution with as many as 32 bands was needed for melody recognition for normal-hearing listeners. This result indicates that the present cochlear implants provide sufficient spectral cues to support speech recognition in quiet, but they are not adequate to support music perception. Increasing the number of functional channels and improved encoding of the fine structure information are necessary to improve music perception for cochlear implant listeners.
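
The cochlear-implant simulation described in DESIGN is essentially a noise vocoder: divide the signal into N bands, extract each band's slow envelope, and use it to modulate band-limited noise. A compact sketch; the filter orders, band edges, and envelope cutoff are my illustrative choices, not the study's exact values.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

FS = 16000

def vocode(x, n_bands, fs=FS, lo=100.0, hi=7000.0, env_cut=50.0):
    """Noise vocoder: per-band slow envelopes modulating band-limited noise."""
    edges = np.geomspace(lo, hi, n_bands + 1)      # log-spaced band edges
    env_b, env_a = butter(2, env_cut / (fs / 2))   # envelope smoothing filter
    rng = np.random.default_rng(3)
    out = np.zeros_like(x)
    for k in range(n_bands):
        band_b, band_a = butter(2, [edges[k] / (fs / 2),
                                    edges[k + 1] / (fs / 2)], btype='band')
        band = filtfilt(band_b, band_a, x)
        env = filtfilt(env_b, env_a, np.abs(hilbert(band)))
        carrier = filtfilt(band_b, band_a, rng.standard_normal(len(x)))
        out += np.clip(env, 0.0, None) * carrier
    return out

# e.g. vocode(melody, 1) vs. vocode(melody, 32): with rhythm removed, around
# 32 bands were needed for normal-hearing listeners to identify melodies.
```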


DP5 : J Physiol Paris. 2006 Jul-Sep;100(1-3):154-70. Epub 2006 Nov 3.

The role of predictive models in the formation of auditory streams.

Denham SL, Winkler I.

Sounds provide us with useful information about our environment which complements that provided by other senses, but also poses specific processing problems. How does the auditory system disentangle sounds from different sound sources? And what is it that allows intermittent sound events from the same source to be associated with each other? Here we review findings from a wide range of studies using the auditory streaming paradigm in order to formulate a unified account of the processes underlying auditory perceptual organization. We present new computational modelling results which replicate responses in primary auditory cortex [Fishman, Y.I., Arezzo, J.C., Steinschneider, M., 2004. Auditory stream segregation in monkey auditory cortex: effects of frequency separation, presentation rate, and tone duration. J. Acoust. Soc. Am. 116, 1656-1670; Fishman, Y.I., Reser, D.H., Arezzo, J.C., Steinschneider, M., 2001. Neural correlates of auditory stream segregation in primary auditory cortex of the awake monkey. Hear. Res. 151, 167-187] to tone sequences. We also present the results of a perceptual experiment which confirm the bi-stable nature of auditory streaming, and the proposal that the gradual build-up of streaming may be an artefact of averaging across many subjects [Pressnitzer, D., Hupé, J.M., 2006. Temporal dynamics of auditory and visual bi-stability reveal common principles of perceptual organization. Curr. Biol. 16(13), 1351-1357]. Finally we argue that in order to account for all of the experimental findings, computational models of auditory stream segregation require four basic processing elements: segregation, predictive modelling, competition and adaptation; and that it is the formation of effective predictive models which allows the system to keep track of different sound sources in a complex auditory environment.


DP6 : Psychol Sci. 2008 Jan;19(1):85-91.

Auditory change detection: simple sounds are not memorized better than complex sounds.

Demany L, Trost W, Serman M, Semal C.

Previous research has shown that the detectability of a local change in a visual image is essentially independent of the complexity of the image when the interstimulus interval (ISI) is very short, but is limited by a low-capacity memory system when the ISI exceeds 100 ms. In the study reported here, listeners made same/different judgments on pairs of successive "chords" (sums of pure tones with random frequencies). The change to be detected was always a frequency shift in one of the tones, and which tone would change was unpredictable. Performance worsened as the number of tones increased, but this effect was not larger for 2-s ISIs than for 0-ms ISIs. Similar results were obtained when a chord was followed by a single tone that had to be judged as higher or lower than the closest component of the chord. Overall, our data suggest that change detection is based on different mechanisms in audition and vision.
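
A sketch of one same/different trial as described, with invented parameter values: a chord of pure tones at random (log-uniform) frequencies, followed by a copy in which one unpredictable component is shifted by a semitone.

```python
import numpy as np

FS = 44100
t = np.arange(int(0.5 * FS)) / FS
rng = np.random.default_rng(4)

def chord(freqs):
    """Sum of pure tones with random starting phases."""
    return sum(np.sin(2 * np.pi * f * t + rng.uniform(0, 2 * np.pi))
               for f in freqs)

n_tones = 6
freqs = np.exp(rng.uniform(np.log(300.0), np.log(3000.0), n_tones))
standard = chord(freqs)

shifted = freqs.copy()
which = rng.integers(n_tones)             # which tone changes is unpredictable
shifted[which] *= 2.0 ** (rng.choice([-1, 1]) / 12.0)  # one-semitone shift
comparison = chord(shifted)               # the "different" stimulus of the pair
```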


DP7 : J Cogn Neurosci. 2007 Oct 5; [Epub ahead of print]

Linking Cortical Spike Pattern Codes to Auditory Perception.

Walker KM, Ahmed B, Schnupp JW.

Neurometric analysis has proven to be a powerful tool for studying links between neural activity and perception, especially in visual and somatosensory cortices, but conventional neurometrics are based on a simplistic rate-coding hypothesis that is clearly at odds with the rich and complex temporal spiking patterns evoked by many natural stimuli. In this study, we investigated the possible relationships between temporal spike pattern codes in the primary auditory cortex (A1) and the perceptual detection of subtle changes in the temporal structure of a natural sound. Using a two-alternative forced-choice oddity task, we measured the ability of human listeners to detect local time reversals in a marmoset twitter call. We also recorded responses of neurons in A1 of anesthetized and awake ferrets to these stimuli, and analyzed these responses using a novel neurometric approach that is sensitive to temporal discharge patterns. We found that although spike count-based neurometrics were inadequate to account for behavioral performance on this auditory task, neurometrics based on the temporal discharge patterns of populations of A1 units closely matched the psychometric performance curve, but only if the spiking patterns were resolved at temporal resolutions of 20 msec or better. These results demonstrate that neurometric discrimination curves can be calculated for temporal spiking patterns, and they suggest that such an extension of previous spike count-based approaches is likely to be essential for understanding the neural correlates of the perception of stimuli with a complex temporal structure.
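
A toy version of a temporal-pattern neurometric, to make the idea concrete: each response is a binned spike-count vector, and held-out trials are classified by the nearest class template. Shrinking bin_ms moves the code from overall spike counts toward temporal patterns. This leave-one-out scheme is my simplification, far simpler than the authors' actual analysis.

```python
import numpy as np

def binned(spike_times, t_max=1.0, bin_ms=20.0):
    """Represent one trial as a vector of spike counts in fixed time bins."""
    edges = np.arange(0.0, t_max + bin_ms / 1000.0, bin_ms / 1000.0)
    counts, _ = np.histogram(spike_times, bins=edges)
    return counts.astype(float)

def percent_correct(trials_a, trials_b, bin_ms=20.0):
    """Leave-one-out nearest-template classification of two stimulus classes.
    trials_a / trials_b are lists of spike-time arrays (in seconds)."""
    labelled = [(tr, 0) for tr in trials_a] + [(tr, 1) for tr in trials_b]
    correct = 0
    for i, (trial, label) in enumerate(labelled):
        rest = [x for j, x in enumerate(labelled) if j != i]
        templates = [np.mean([binned(r, bin_ms=bin_ms)
                              for r, lab in rest if lab == c], axis=0)
                     for c in (0, 1)]
        v = binned(trial, bin_ms=bin_ms)
        guess = int(np.linalg.norm(v - templates[1])
                    < np.linalg.norm(v - templates[0]))
        correct += int(guess == label)
    return correct / len(labelled)
```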


JME1 : Proc Natl Acad Sci U S A. 2003 Feb 4;100(3):1405-8.

Suppression of cortical representation through backward conditioning.

Bao S, Chan VT, Zhang LI, Merzenich MM.

Temporal stimulus reinforcement sequences have been shown to determine the directions of synaptic plasticity and behavioral learning. Here, we examined whether they also control the direction of cortical reorganization. Pairing ventral tegmental area stimulation with a sound in a backward conditioning paradigm specifically reduced representations of the paired sound in the primary auditory cortex (AI). This temporal sequence-dependent bidirectional cortical plasticity modulated by dopamine release hypothetically serves to prevent the over-representation of frequently occurring stimuli resulting from their random pairing with unrelated rewards.


JME2 : J Neurosci. 2005 Mar 9;25(10):2490-503.

Plasticity in primary auditory cortex of monkeys with altered vocal production.

Cheung SW, Nagarajan SS, Schreiner CE, Bedenbaugh PH, Wong A.

Response properties of primary auditory cortical neurons in the adult common marmoset monkey (Callithrix jacchus) were modified by extensive exposure to altered vocalizations that were self-generated and rehearsed frequently. A laryngeal apparatus modification procedure permanently lowered the frequency content of the native twitter call, a complex communication vocalization consisting of a series of frequency modulation (FM) sweeps. Monkeys vocalized shortly after this procedure and maintained voicing efforts until physiological evaluation 5-15 months later. The altered twitter calls improved over time, with FM sweeps approaching but never reaching the normal spectral range. Neurons with characteristic frequencies <4.3 kHz that had been weakly activated by native twitter calls were recruited to encode self-uttered altered twitter vocalizations. These neurons showed a decrease in response magnitude and an increase in temporal dispersion of response timing to twitter call and parametric FM stimuli but a normal response profile to pure tone stimuli. Tonotopic maps in voice-modified monkeys were not distorted. These findings suggest a previously unrecognized form of cortical plasticity that is specific to higher-order processes involved in the discrimination of more complex sounds, such as species-specific vocalizations.


JME3 : Eur J Neurosci. 2006 Aug;24(3):857-66.

Neonatal nicotine exposure impairs nicotinic enhancement of central auditory processing and auditory learning in adult rats.

Liang K, Poytress BS, Chen Y, Leslie FM, Weinberger NM, Metherate R.

Children of women who smoke cigarettes during pregnancy display cognitive deficits in the auditory-verbal domain. Clinical studies have implicated developmental exposure to nicotine, the main psychoactive ingredient of tobacco, as a probable cause of subsequent auditory deficits. To test for a causal link, we have developed an animal model to determine how neonatal nicotine exposure affects adult auditory function. In adult control rats, nicotine administered systemically (0.7 mg/kg, s.c.) enhanced the sensitivity to sound of neural responses recorded in primary auditory cortex. The effect was strongest in cortical layers 3 and 4, where there is a dense concentration of nicotinic acetylcholine receptors (nAChRs) that has been hypothesized to regulate thalamocortical inputs. In support of the hypothesis, microinjection into layer 4 of the nonspecific nAChR antagonist mecamylamine (10 microM) strongly reduced sound-evoked responses. In contrast to the effects of acute nicotine and mecamylamine in adult control animals, neither drug was as effective in adult animals that had been treated with 5 days of chronic nicotine exposure (CNE) shortly after birth. Neonatal CNE also impaired performance on an auditory-cued active avoidance task, while having little effect on basic auditory or motor functions. Thus, neonatal CNE impairs nicotinic regulation of cortical function, and auditory learning, in the adult. Our results provide evidence that developmental nicotine exposure is responsible for auditory-cognitive deficits in the offspring of women who smoke during pregnancy, and suggest a potential underlying mechanism, namely diminished function of cortical nAChRs.


JME4 : Eur J Neurosci. 2006 Jun;23(11):3087-97.

Improved cortical entrainment to infant communication calls in mothers compared with virgin mice.

Liu RC, Linden JF, Schreiner CE.

There is a growing interest in the use of mice as a model system for species-specific communication. In particular, ultrasonic calls emitted by mouse pups communicate distress, and elicit a search and retrieval response from mothers. Behaviorally, mothers prefer and recognize these calls in two-alternative choice tests, in contrast to pup-naive females that do not have experience with pups. Here, we explored whether one particular acoustic feature that defines these calls, the repetition rate of calls within a bout, is represented differently in the auditory cortex of these two animal groups. Multiunit recordings in anesthetized CBA/CaJ mice revealed that: (i) neural entrainment to repeated stimuli extended up to the natural pup call repetition rate (5 Hz) in mothers; but (ii) neurons in naive females followed repeated stimuli well only at slower repetition rates; and (iii) entrained responses to repeated pup calls were less sensitive to natural pup call variability in mothers than in pup-naive females. In the broader context, our data suggest that auditory cortical responses to communication sounds are plastic, and that communicative significance is correlated with an improved cortical representation.


JME5 : Proc Natl Acad Sci U S A. 2004 Nov 16;101(46):16351-6.

Associative learning shapes the neural code for stimulus magnitude in primary auditory cortex.

Polley DB, Heiser MA, Blake DT, Schreiner CE, Merzenich MM.

Since the dawn of experimental psychology, researchers have sought an understanding of the fundamental relationship between the amplitude of sensory stimuli and the magnitudes of their perceptual representations. Contemporary theories support the view that magnitude is encoded by a linear increase in firing rate established in the primary afferent pathways. In the present study, we have investigated sound intensity coding in the rat primary auditory cortex (AI) and describe its plasticity following paired stimulus-reinforcement and instrumental conditioning paradigms. In trained animals, population-response strengths in AI became more strongly nonlinear with increasing stimulus intensity. Individual AI responses became selective to more restricted ranges of sound intensities and, as a population, represented a broader range of preferred sound levels. These experiments demonstrate that the representation of stimulus magnitude can be powerfully reshaped by associative learning processes and suggest that the code for sound intensity within AI can be derived from intensity-tuned neurons that change, rather than simply increase, their firing rates in proportion to increases in sound intensity.


JME6 : J Neurosci. 2006 May 3;26(18):4785-95.

Plasticity of temporal pattern codes for vocalization stimuli in primary auditory cortex.

Schnupp JW, Hall TM, Kokelaar RF, Ahmed B.

It has been suggested that "call-selective" neurons may play an important role in the encoding of vocalizations in primary auditory cortex (A1). For example, marmoset A1 neurons often respond more vigorously to natural than to time-reversed twitter calls, although the spectral energy distribution in the natural and time-reversed signals is the same. Neurons recorded in cat A1, in contrast, showed no such selectivity for natural marmoset calls. To investigate whether call selectivity in A1 can arise purely as a result of auditory experience, we recorded responses to marmoset calls in A1 of naive ferrets, as well as in ferrets that had been trained to recognize these natural marmoset calls. We found that training did not induce call selectivity for the trained vocalizations in A1. However, although ferret A1 neurons were not call selective, they efficiently represented the vocalizations through temporal pattern codes, and trained animals recognized marmoset twitters with a high degree of accuracy. These temporal patterns needed to be analyzed at timescales of 10-50 ms to ensure efficient decoding. Training led to a substantial increase in the amount of information transmitted by these temporal discharge patterns, but the fundamental nature of the temporal pattern code remained unaltered. These results emphasize the importance of temporal discharge patterns and cast doubt on the functional significance of call-selective neurons in the processing of animal communication sounds at the level of A1.


JME8 : J Neurosci. 2007 Sep 26;27(39):10372-82.

Auditory cortical receptive fields: stable entities with plastic abilities.

Elhilali M, Fritz JB, Chi TS, Shamma SA.

To form a reliable, consistent, and accurate representation of the acoustic scene, a reasonable conjecture is that cortical neurons maintain stable receptive fields after an early period of developmental plasticity. However, recent studies suggest that cortical neurons can be modified throughout adulthood and may change their response properties quite rapidly to reflect changing behavioral salience of certain sensory features. Because claims of adaptive receptive field plasticity could be confounded by intrinsic, labile properties of receptive fields themselves, we sought to gauge spontaneous changes in the responses of auditory cortical neurons. In the present study, we examined changes in a series of spectrotemporal receptive fields (STRFs) gathered from single neurons in successive recordings obtained over time scales of 30-120 min in primary auditory cortex (A1) in the quiescent, awake ferret. We used a global analysis of STRF shape based on a large database of A1 receptive fields. By clustering this STRF space in a data-driven manner, STRF sequences could be classified as stable or labile. We found that >73% of A1 neurons exhibited stable receptive field attributes over these time scales. In addition, we found that the extent of intrinsic variation in STRFs during the quiescent state was insignificant compared with behaviorally induced STRF changes observed during performance of spectral auditory tasks. Our results confirm that task-related changes induced by attentional focus on specific acoustic features were indeed confined to behaviorally salient acoustic cues and could be convincingly attributed to learning-induced plasticity when compared with "spontaneous" receptive field variability.


JME9 : Network. 2007 Sep;18(3):191-212. Epub 2007 Sep 7.

Estimating sparse spectro-temporal receptive fields with natural stimuli.

David SV, Mesgarani N, Shamma SA.

Several algorithms have been proposed to characterize the spectro-temporal tuning properties of auditory neurons during the presentation of natural stimuli. Algorithms designed to work at realistic signal-to-noise levels must make some prior assumptions about tuning in order to produce accurate fits, and these priors can introduce bias into estimates of tuning. We compare a new, computationally efficient algorithm for estimating tuning properties, boosting, to a more commonly used algorithm, normalized reverse correlation. These algorithms employ the same functional model and cost function, differing only in their priors. We use both algorithms to estimate spectro-temporal tuning properties of neurons in primary auditory cortex during the presentation of continuous human speech. Models estimated using either algorithm have similar predictive power, although fits by boosting are slightly more accurate. More strikingly, neurons characterized with boosting appear tuned to narrower spectral bandwidths and higher temporal modulation rates than when characterized with normalized reverse correlation. These differences have little impact on responses to speech, which is spectrally broadband and modulated at low rates. However, we find that models estimated by boosting also predict responses to non-speech stimuli more accurately. These findings highlight the crucial role of priors in characterizing neuronal response properties with natural stimuli.
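
Both algorithms fit the same linear model, in which firing rate is a weighted sum of the recent stimulus spectrogram. The sketch below implements the normalized-reverse-correlation side, with a plain ridge term standing in for the prior (boosting would fit the same model by greedy coordinate updates, yielding sparser weights). All data, dimensions, and function names here are invented for illustration.

```python
import numpy as np

def lagged_design(spec, n_lags):
    """Stack delayed copies of a (freq x time) spectrogram so that each row
    holds the stimulus history preceding one time bin."""
    n_f, n_t = spec.shape
    X = np.zeros((n_t, n_f * n_lags))
    for lag in range(n_lags):
        X[lag:, lag * n_f:(lag + 1) * n_f] = spec[:, :n_t - lag].T
    return X

def strf_normalized_revcorr(spec, rate, n_lags=15, ridge=1.0):
    """Stimulus-response cross-correlation normalized by the stimulus
    autocorrelation, with a ridge term standing in for the prior."""
    X = lagged_design(spec, n_lags)
    w = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ rate)
    return w.reshape(n_lags, spec.shape[0]).T     # (freq x lag) STRF

# Synthetic check: recover a sparse ground-truth STRF from noisy responses.
rng = np.random.default_rng(5)
spec = rng.standard_normal((18, 2000))            # 18 channels, 2000 bins
true_strf = rng.standard_normal((18, 15)) * (rng.random((18, 15)) < 0.1)
rate = (lagged_design(spec, 15) @ true_strf.T.ravel()
        + rng.standard_normal(2000))
est = strf_normalized_revcorr(spec, rate)
```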


MC1 : J Neurosci. 2004 Apr 7;24(14):3637-42.

Sensitivity to auditory object features in human temporal neocortex.

Zatorre RJ, Bouffard M, Belin P.

This positron emission tomography study examined the hemodynamic response of the human brain to auditory object feature processing. A continuum of object feature variation was created by combining different numbers of stimuli drawn from a diverse sample of 45 environmental sounds. In each 60 sec scan condition, subjects heard either a distinct individual sound on each trial or simultaneous combinations of sounds that varied systematically in their similarity or distinctiveness across conditions. As more stimuli are combined they become more similar and less distinct from one another; the limiting case is when all 45 are added together to form a noise that is repeated on each trial. Analysis of covariation of cerebral blood flow elicited by this parametric manipulation revealed a response in the upper bank of the right anterior superior temporal sulcus (STS): when sounds were identical across trials (i.e., a noise made up of 45 sounds), activity was at a minimum; when stimuli were different from one another, activity was maximal. A right inferior frontal area was also revealed. The results are interpreted as reflecting sensitivity of this region of temporal neocortex to auditory object features, as predicted by neurophysiological and anatomical models implicating an anteroventral functional stream in object processing. The findings also fit with evidence that voice processing may involve regions within the anterior STS. The data are discussed in light of these models and are related to the concept that this functional stream is sensitive to invariant sound features that characterize individual auditory objects.


MC2 : Eur J Neurosci. 2006 Jul;24(2):625-34. Epub 2006 Jul 12.

Object representation in the human auditory system.

Winkler I, van Zuijen TL, Sussman E, Horváth J, Näätänen R.

One important principle of object processing is exclusive allocation. Any part of the sensory input, including the border between two objects, can only belong to one object at a time. We tested whether tones forming a spectro-temporal border between two sound patterns can belong to both patterns at the same time. Sequences were composed of low-, intermediate- and high-pitched tones. Tones were delivered with short onset-to-onset intervals causing the high and low tones to automatically form separate low and high sound streams. The intermediate-pitch tones could be perceived as part of either one or the other stream, but not both streams at the same time. Thus these tones formed a pitch 'border' between the two streams. The tones were presented in a fixed, cyclically repeating order. Linking the intermediate-pitch tones with the high or the low tones resulted in the perception of two different repeating tonal patterns. Participants were instructed to maintain perception of one of the two tone patterns throughout the stimulus sequences. Occasional changes violated either the selected or the alternative tone pattern, but not both at the same time. We found that only violations of the selected pattern elicited the mismatch negativity event-related potential, indicating that only this pattern was represented in the auditory system. This result suggests that individual sounds are processed as part of only one auditory pattern at a time. Thus tones forming a spectro-temporal border are exclusively assigned to one sound object at any given time, as are spatio-temporal borders in vision.


MC3 : Cereb Cortex. 2007 Nov;17(11):2544-52. Epub 2007 Jan 4.

Working memory specific activity in auditory cortex: potential correlates of sequential processing and maintenance.

Brechmann A, Gaschler-Markefski B, Sohr M, Yoneda K, Kaulisch T, Scheich H.

Working memory (WM) tasks involve several interrelated processes during which past information must be transiently maintained, recalled, and compared with test items according to previously instructed rules. It is not clear whether the rule-specific comparisons of perceptual with memorized items are only performed in previously identified frontal and parietal WM areas or whether these areas orchestrate such comparisons by feedback to sensory cortex. We tested the latter hypothesis by focusing on auditory cortex (AC) areas with low-noise functional magnetic resonance imaging in a 2-back WM task involving frequency-modulated (FM) tones. The control condition was a 0-back task on the same stimuli. Analysis of the group data identified an area on right planum temporale equally activated by both tasks and an area on the left planum temporale specifically involved in the 2-back task. A region of interest analysis in each individual revealed that activation on the left planum temporale in the 2-back task positively correlated with the task performance of the subjects. This strongly suggests a prominent role of the AC in 2-back WM tasks. In conjunction with previous findings on FM processing, the left lateralized effect presumably reflects the complex sequential processing demand of the 2-back matching to sample task.


MC4 : Nat Neurosci. 2001 Aug;4(8):839-44.

Human pitch perception is reflected in the timing of stimulus-related cortical activity.

Patel AD, Balaban E.

'Pitch' refers to a sound's subjective highness or lowness, as distinct from 'frequency,' which refers to a sound's physical structure. In speech, music and other natural contexts, complex tones are often perceived with a single pitch. Using whole-head magnetoencephalography (MEG) and stimuli that dissociate pitch from frequency, we studied cortical dynamics in normal individuals who extracted different pitches from the same tone complexes. Whereas all subjects showed similar spatial distributions in the magnitude of their brain responses to the stimuli, subjects who heard different pitches exhibited contrasting temporal patterns of brain activity in their right but not their left hemispheres. These data demonstrate a specific relationship between pitch perception and the timing (phase) of dynamic patterns of cortical activity.


MC5 : Proc Natl Acad Sci U S A. 2006 Sep 26;103(39):14608-13. Epub 2006 Sep 18.

Task-modulated "what" and "where" pathways in human auditory cortex.

Ahveninen J, Jääskeläinen IP, Raij T, Bonmassar G, Devore S, Hämäläinen M, Levänen S, Lin FH, Sams M, Shinn-Cunningham BG, Witzel T, Belliveau JW.

Human neuroimaging studies suggest that localization and identification of relevant auditory objects are accomplished via parallel parietal-to-lateral-prefrontal "where" and anterior-temporal-to-inferior-frontal "what" pathways, respectively. Using combined hemodynamic (functional MRI) and electromagnetic (magnetoencephalography) measurements, we investigated whether such dual pathways exist already in the human nonprimary auditory cortex, as suggested by animal models, and whether selective attention facilitates sound localization and identification by modulating these pathways in a feature-specific fashion. We found a double dissociation in response adaptation to sound pairs with phonetic vs. spatial sound changes, demonstrating that the human nonprimary auditory cortex indeed processes speech-sound identity and location in parallel anterior "what" (in anterolateral Heschl's gyrus, anterior superior temporal gyrus, and posterior planum polare) and posterior "where" (in planum temporale and posterior superior temporal gyrus) pathways as early as approximately 70-150 ms from stimulus onset. Our data further show that the "where" pathway is activated approximately 30 ms earlier than the "what" pathway, possibly enabling the brain to use top-down spatial information in auditory object perception. Notably, selectively attending to phonetic content modulated response adaptation in the "what" pathway, whereas attending to sound location produced analogous effects in the "where" pathway. This finding suggests that selective-attention effects are feature-specific in the human nonprimary auditory cortex and that they arise from enhanced tuning of receptive fields of task-relevant neuronal populations.


MC6 : Nat Neurosci. 2003 Apr;6(4):391-8.

Processing of low-probability sounds by cortical neurons.

Ulanovsky N, Las L, Nelken I.

The ability to detect rare auditory events can be critical for survival. We report here that neurons in cat primary auditory cortex (A1) responded more strongly to a rarely presented sound than to the same sound when it was common. For the rare stimuli, we used both frequency and amplitude deviants. Moreover, some A1 neurons showed hyperacuity for frequency deviants--a frequency resolution one order of magnitude better than receptive field widths in A1. In contrast, auditory thalamic neurons were insensitive to the probability of frequency deviants. These phenomena resulted from stimulus-specific adaptation in A1, which may be a single-neuron correlate of an extensively studied cortical potential--mismatch negativity--that is evoked by rare sounds. Our results thus indicate that A1 neurons, in addition to processing the acoustic features of sounds, may also be involved in sensory memory and novelty detection.