[P2 evaluation] Articles

Choose two articles from the list, from two different instructors (indicated by their initials). The BPC articles may only be chosen for the written exam.


AdC1: J Assoc Res Otolaryngol. 2016 Feb;17(1):69-79.

Pitch Discrimination in Musicians and Non-Musicians: Effects of Harmonic Resolvability and Processing Effort.

Bianchi F, Santurette S, Wendt D, Dau T.

Musicians typically show enhanced pitch discrimination abilities compared to non-musicians. The present study investigated this perceptual enhancement behaviorally and objectively for resolved and unresolved complex tones to clarify whether the enhanced performance in musicians can be ascribed to increased peripheral frequency selectivity and/or to a different processing effort in performing the task. In a first experiment, pitch discrimination thresholds were obtained for harmonic complex tones with fundamental frequencies (F0s) between 100 and 500 Hz, filtered in either a low- or a high-frequency region, leading to variations in the resolvability of audible harmonics. The results showed that pitch discrimination performance in musicians was enhanced for resolved and unresolved complexes to a similar extent. Additionally, the harmonics became resolved at a similar F0 in musicians and non-musicians, suggesting similar peripheral frequency selectivity in the two groups of listeners. In a follow-up experiment, listeners' pupil dilations were measured as an indicator of the required effort in performing the same pitch discrimination task for conditions of varying resolvability and task difficulty. Pupillometry responses indicated a lower processing effort in the musicians versus the non-musicians, although the processing demand imposed by the pitch discrimination task was individually adjusted according to the behavioral thresholds. Overall, these findings indicate that the enhanced pitch discrimination abilities in musicians are unlikely to be related to higher peripheral frequency selectivity and may suggest an enhanced pitch representation at more central stages of the auditory system in musically trained listeners.
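
A quick way to get a feel for the stimuli described above is to synthesize them. The sketch below (not the authors' code; sampling rate, harmonic count, and filter bands are illustrative assumptions) builds an equal-amplitude harmonic complex and band-passes it into a low- or a high-frequency region, which is what pushes the audible harmonics towards resolved or unresolved.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 44100                                   # sampling rate (Hz), assumed
t = np.arange(int(0.5 * fs)) / fs            # 500-ms tone, assumed duration

def complex_tone(f0, n_harmonics=40):
    """Equal-amplitude harmonic complex with fundamental f0 (Hz)."""
    freqs = f0 * np.arange(1, n_harmonics + 1)
    freqs = freqs[freqs < fs / 2]            # keep components below Nyquist
    return sum(np.sin(2 * np.pi * f * t) for f in freqs)

def bandpass(x, lo, hi):
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

f0 = 200                                      # within the 100-500 Hz range studied
low_region = bandpass(complex_tone(f0), 300, 1500)    # low harmonics: mostly resolved
high_region = bandpass(complex_tone(f0), 3000, 7500)  # high harmonics: mostly unresolved
```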


AdC2: J Assoc Res Otolaryngol. 2014 Jun;15(3):465-82.

Implications of within-fiber temporal coding for perceptual studies of F0 discrimination and discrimination of harmonic and inharmonic tone complexes.

Kale S, Micheyl C, Heinz MG.

Recent psychophysical studies suggest that normal-hearing (NH) listeners can use acoustic temporal-fine-structure (TFS) cues for accurately discriminating shifts in the fundamental frequency (F0) of complex tones, or equal shifts in all component frequencies, even when the components are peripherally unresolved. The present study quantified both envelope (ENV) and TFS cues in single auditory-nerve (AN) fiber responses (henceforth referred to as neural ENV and TFS cues) from NH chinchillas in response to harmonic and inharmonic complex tones similar to those used in recent psychophysical studies. The lowest component in the tone complex (i.e., harmonic rank N) was systematically varied from 2 to 20 to produce various resolvability conditions in chinchillas (partially resolved to completely unresolved). Neural responses to different pairs of TEST (F0 or frequency shifted) and standard or reference (REF) stimuli were used to compute shuffled cross-correlograms, from which cross-correlation coefficients representing the degree of similarity between responses were derived separately for TFS and ENV. For a given F0 shift, the dissimilarity (TEST vs. REF) was greater for neural TFS than ENV. However, this difference was stimulus-based; the sensitivities of the neural TFS and ENV metrics were equivalent for equal absolute shifts of their relevant frequencies (center component and F0, respectively). For the F0-discrimination task, both ENV and TFS cues were available and could in principle be used for task performance. However, in contrast to human performance, neural TFS cues quantified with our cross-correlation coefficients were unaffected by phase randomization, suggesting that F0 discrimination for unresolved harmonics does not depend solely on TFS cues. For the frequency-shift (harmonic-versus-inharmonic) discrimination task, neural ENV cues were not available. Neural TFS cues were available and could in principle support performance in this task; however, in contrast to human-listeners' performance, these TFS cues showed no dependence on N. We conclude that while AN-fiber responses contain TFS-related cues, which can in principle be used to discriminate changes in F0 or equal shifts in component frequencies of peripherally unresolved harmonics, performance in these two psychophysical tasks appears to be limited by other factors (e.g., central processing noise).


AdC3: J Neurosci. 2016 Jan 27;36(4):1416-28.

Functional Topography of Human Auditory Cortex.

Leaver AM, Rauschecker JP.

Functional and anatomical studies have clearly demonstrated that auditory cortex is populated by multiple subfields. However, functional characterization of those fields has been largely the domain of animal electrophysiology, limiting the extent to which human and animal research can inform each other. In this study, we used high-resolution functional magnetic resonance imaging to characterize human auditory cortical subfields using a variety of low-level acoustic features in the spectral and temporal domains. Specifically, we show that topographic gradients of frequency preference, or tonotopy, extend along two axes in human auditory cortex, thus reconciling historical accounts of a tonotopic axis oriented medial to lateral along Heschl's gyrus and more recent findings emphasizing tonotopic organization along the anterior-posterior axis. Contradictory findings regarding topographic organization according to temporal modulation rate in acoustic stimuli, or "periodotopy," are also addressed. Although isolated subregions show a preference for high rates of amplitude-modulated white noise (AMWN) in our data, large-scale "periodotopic" organization was not found. Organization by AM rate was correlated with dominant pitch percepts in AMWN in many regions. In short, our data expose early auditory cortex chiefly as a frequency analyzer, and spectral frequency, as imposed by the sensory receptor surface in the cochlea, seems to be the dominant feature governing large-scale topographic organization across human auditory cortex.
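
As a toy illustration of how a tonotopic map is usually read out of such data, the sketch below assigns each voxel the stimulus frequency that drives it most strongly; the array shapes, frequency list, and random "responses" are assumptions for illustration only, not the study's analysis.

```python
import numpy as np

frequencies = np.array([200, 400, 800, 1600, 3200, 6400])   # Hz, illustrative set
rng = np.random.default_rng(0)
responses = rng.normal(size=(2000, frequencies.size))        # fake voxel-by-frequency responses

best_frequency = frequencies[np.argmax(responses, axis=1)]   # one best frequency per voxel
# Projecting best_frequency onto the cortical surface exposes tonotopic gradients;
# repeating the same argmax over AM-rate conditions would give a "periodotopic" map.
```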


AdC4: Hear Res. 2016 Jun;336:53-62.

Frequency selectivity of the human cochlea: Suppression tuning of spontaneous otoacoustic emissions.

Manley GA, van Dijk P.

Frequency selectivity is a key functional property of the inner ear, and since hearing research began, the frequency resolution of the human ear has been a central question. In contrast to animal studies, which permit invasive recording of neural activity, human studies must rely on indirect methods to determine hearing selectivity. Psychophysical studies, which used masking of a tone by other sounds, indicate a modest frequency selectivity in humans. By contrast, estimates using the phase delays of stimulus-frequency otoacoustic emissions (SFOAE) predict a remarkably high selectivity, unique among mammals. An alternative measure of cochlear frequency selectivity is the suppression tuning curve of spontaneous otoacoustic emissions (SOAE). Several animal studies show that these measures are in excellent agreement with neural frequency selectivity. Here we contribute a large data set from normal-hearing young humans on suppression tuning curves (STC) of spontaneous otoacoustic emissions (SOAE). The frequency selectivities of human STC measured near threshold levels agree with the earlier, much lower, psychophysical estimates. They differ, however, from the typical patterns seen in animal auditory nerve data in that the selectivity is remarkably independent of frequency. In addition, SOAE are suppressed by higher-level tones in narrow frequency bands clearly above the main suppression frequencies. These narrow suppression bands suggest interactions between the suppressor tone and a cochlear standing wave corresponding to the SOAE frequency being suppressed. The data show that the relationship between pre-neural mechanical processing in the cochlea and neural coding at the hair-cell/auditory nerve synapse needs to be reconsidered.


AdC5: J Neurosci. 2015 Feb 4;35(5):2058-73.

State-dependent population coding in primary auditory cortex.

Pachitariu M, Lyamzin DR, Sahani M, Lesica NA.

Sensory function is mediated by interactions between external stimuli and intrinsic cortical dynamics that are evident in the modulation of evoked responses by cortical state. A number of recent studies across different modalities have demonstrated that the patterns of activity in neuronal populations can vary strongly between synchronized and desynchronized cortical states, i.e., in the presence or absence of intrinsically generated up and down states. Here we investigated the impact of cortical state on the population coding of tones and speech in the primary auditory cortex (A1) of gerbils, and found that responses were qualitatively different in synchronized and desynchronized cortical states. Activity in synchronized A1 was only weakly modulated by sensory input, and the spike patterns evoked by tones and speech were unreliable and constrained to a small range of patterns. In contrast, responses to tones and speech in desynchronized A1 were temporally precise and reliable across trials, and different speech tokens evoked diverse spike patterns with extremely weak noise correlations, allowing responses to be decoded with nearly perfect accuracy. Restricting the analysis of synchronized A1 to activity within up states yielded similar results, suggesting that up states are not equivalent to brief periods of desynchronization. These findings demonstrate that the representational capacity of A1 depends strongly on cortical state, and suggest that cortical state should be considered as an explicit variable in all studies of sensory processing.
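
To make the decoding claim concrete, here is a minimal leave-one-trial-out template decoder of the kind commonly applied to such population data (a sketch on simulated spike counts, not the authors' analysis): each test trial is assigned to the stimulus whose average training pattern is nearest.

```python
import numpy as np

rng = np.random.default_rng(1)
n_stimuli, n_trials, n_neurons = 10, 20, 50        # illustrative sizes
# Assumed toy data: spike counts per stimulus, trial and neuron
counts = rng.poisson(lam=rng.uniform(1, 10, (n_stimuli, 1, n_neurons)),
                     size=(n_stimuli, n_trials, n_neurons))

correct = 0
for s in range(n_stimuli):
    for tr in range(n_trials):
        test = counts[s, tr]
        # Templates: mean response to each stimulus, excluding the held-out trial
        templates = np.array([
            np.delete(counts[s2], tr, axis=0).mean(axis=0) if s2 == s
            else counts[s2].mean(axis=0)
            for s2 in range(n_stimuli)])
        correct += np.argmin(np.linalg.norm(templates - test, axis=1)) == s
print("decoding accuracy:", correct / (n_stimuli * n_trials))
```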


AdC6: J Neurosci. 2016 Nov 23;36(47):12010-12026.

Attenuation of Responses to Self-Generated Sounds in Auditory Cortical Neurons.

Rummell BP, Klee JL, Sigurdsson T.

Many of the sounds that we perceive are caused by our own actions, for example when speaking or moving, and must be distinguished from sounds caused by external events. Studies using macroscopic measurements of brain activity in human subjects have consistently shown that responses to self-generated sounds are attenuated in amplitude. However, the underlying manifestation of this phenomenon at the cellular level is not well understood. To address this, we recorded the activity of neurons in the auditory cortex of mice in response to sounds generated by their own behavior. We found that the responses of auditory cortical neurons to these self-generated sounds were consistently attenuated, compared with the same sounds generated independently of the animals' behavior. This effect was observed in both putative pyramidal neurons and in interneurons and was stronger in lower layers of auditory cortex. Downstream of the auditory cortex, we found that responses of hippocampal neurons to self-generated sounds were almost entirely suppressed. Responses to self-generated optogenetic stimulation of auditory thalamocortical terminals were also attenuated, suggesting a cortical contribution to this effect. Further analyses revealed that the attenuation of self-generated sounds was not simply due to the nonspecific effects of movement or behavioral state on auditory responsiveness. However, the strength of attenuation depended on the degree to which self-generated sounds were expected to occur, in a cell-type-specific manner. Together, these results reveal the cellular basis underlying attenuated responses to self-generated sounds and suggest that predictive processes contribute to this effect.


AdC7: Neuroscience. 2015 Aug 6;300:325-37.

Descending and tonotopic projection patterns from the auditory cortex to the inferior colliculus.

Straka MM, Hughes R, Lee P, Lim HH.

The inferior colliculus (IC) receives many corticofugal projections, which can mediate plastic changes such as shifts in frequency tuning or excitability of IC neurons. While the densest projections are found in the IC's external cortices, fibers originating from the primary auditory cortex (AI) have been observed throughout the IC's central nucleus (ICC), and these projections have been shown to be organized tonotopically. Some studies have also found projections from other core and non-core cortical regions, though the organization and function of these projections are less well known. In guinea pig, there exists a non-core ventrorostral belt (VRB) region that has primary-like properties and has often been mistaken for AI, with the clearest differentiating characteristic being VRB's longer response latencies. To better understand the auditory corticofugal descending system beyond AI, we investigated whether there are projections from VRB to the ICC and whether they exhibit a different projection pattern than those from AI. In this study, we performed experiments in ketamine-anesthetized guinea pigs, in which we positioned 32-site electrode arrays within AI, VRB, and ICC. We identified the monosynaptic connections between AI-to-ICC and VRB-to-ICC using an antidromic stimulation method, and we analyzed their locations across the midbrain using three-dimensional histological techniques. Compared to the corticocollicular projections to the ICC from AI, there were fewer projections to the ICC from VRB, and these projections had a weaker tonotopic organization. The majority of VRB projections were observed in the caudal-medial versus the rostral-lateral region along an isofrequency lamina of the ICC, which is in contrast to the AI projections that were scattered throughout an ICC lamina. These findings suggest that the VRB directly modulates sound information within the ascending lemniscal pathway with a different or complementary role compared to the modulatory effects of AI, which may have implications for treating hearing disorders.


AdC8: Hear Res. 2016 Dec;342:112-123.

Musicians' edge: A comparison of auditory processing, cognitive abilities and statistical learning.

Mandikal Vasuki PR, Sharma M, Demuth K, Arciuli J.

It has been hypothesized that musical expertise is associated with enhanced auditory processing and cognitive abilities. Recent research has examined the relationship between musicians' advantage and implicit statistical learning skills. In the present study, we assessed a variety of auditory processing skills, cognitive processing skills, and statistical learning (auditory and visual forms) in age-matched musicians (N = 17) and non-musicians (N = 18). Musicians had significantly better performance than non-musicians on frequency discrimination and backward digit span. A key finding was that musicians had better auditory, but not visual, statistical learning than non-musicians. Performance on the statistical learning tasks was not correlated with performance on auditory and cognitive measures. Musicians' superior performance on auditory (but not visual) statistical learning suggests that musical expertise is associated with an enhanced ability to detect statistical regularities in auditory stimuli.


AdC9: J Neurosci. 2015 Mar 4;35(9):3815-24.

Attending to pitch information inhibits processing of pitch information: the curious case of amusia.

Zendel BR, Lagrois ME, Robitaille N, Peretz I.

In normal listeners, the tonal rules of music guide musical expectancy. In a minority of individuals, known as amusics, the processing of tonality is disordered, which results in severe musical deficits. It has been shown that the tonal rules of music are neurally encoded, but not consciously available in amusics. Previous neurophysiological studies have not explicitly controlled the level of attention in tasks where participants ignored the tonal structure of the stimuli. Here, we test whether access to tonal knowledge can be demonstrated in congenital amusia when attention is controlled. Electric brain responses were recorded while asking participants to detect an individually adjusted near-threshold click in a melody. In half the melodies, a note was inserted that violated the tonal rules of music. In a second task, participants were presented with the same melodies but were required to detect the tonal deviation. Both tasks required sustained attention, thus conscious access to the rules of tonality was manipulated. In the click-detection task, the pitch deviants evoked an early right anterior negativity (ERAN) in both groups. In the pitch-detection task, the pitch deviants evoked an ERAN and P600 in controls but not in amusics. These results indicate that pitch regularities are represented in the cortex of amusics, but are not consciously available. Moreover, performing a pitch-judgment task eliminated the ERAN in amusics, suggesting that attending to pitch information interferes with perception of pitch. We propose that an impaired top-down frontotemporal projection is responsible for this disorder.


BPC1: Cereb Cortex. 2013 Sep;23(9):2038-43.

Music training for the development of speech segmentation.

François C, Chobert J, Besson M, Schön D.

The role of music training in fostering brain plasticity and developing high cognitive skills, notably linguistic abilities, is of great interest from both a scientific and a societal perspective. Here, we report results of a longitudinal study over 2 years using both behavioral and electrophysiological measures and a test-training-retest procedure to examine the influence of music training on speech segmentation in 8-year-old children. Children were pseudo-randomly assigned to either music or painting training and were tested on their ability to extract meaningless words from a continuous flow of nonsense syllables. While no between-group differences were found before training, both behavioral and electrophysiological measures showed improved speech segmentation skills across testing sessions for the music group only. These results show that music training directly causes facilitation in speech segmentation, thereby pointing to the importance of music for speech perception and more generally for children's language development. Finally these results have strong implications for promoting the development of music-based remediation strategies for children with language-based learning impairments.


BPC2: Sci Rep. 2016 Jan 19;6:19064.

Effects of veridical expectations on syntax processing in music: Event-related potential evidence.

Guo S, Koelsch S.

Numerous past studies have investigated neurophysiological correlates of music-syntactic processing. However, little is known about how prior knowledge about an upcoming syntactically irregular event modulates brain correlates of music-syntactic processing. Two versions of a short chord sequence were presented repeatedly to non-musicians (n = 20) and musicians (n = 20). One sequence version ended on a syntactically regular chord, and the other one ended on a syntactically irregular chord. Participants were either informed (cued condition), or not informed (non-cued condition) about whether the sequence would end on the regular or the irregular chord. Results indicate that in the cued condition (compared to the non-cued condition) the peak latency of the early right anterior negativity (ERAN), elicited by irregular chords, was earlier in both non-musicians and musicians. However, the expectations due to the knowledge about the upcoming event (veridical expectations) did not influence the amplitude of the ERAN. These results suggest that veridical expectations modulate only the speed, but not the principal mechanisms, of music-syntactic processing.


BPC3: Neuroimage. 2015 Jan 1;104:386-97.

Electrophysiological evidence for a specific neural correlate of musical violation expectation in primary-school children.

James CE, Cereghetti DM, Roullet Tribes E, Oechslin MS.

The majority of studies on music processing in children used simple musical stimuli. Here, primary schoolchildren judged the appropriateness of musical closure in expressive polyphonic music, while high-density electroencephalography was recorded. Stimuli ended either regularly or contained refined in-key harmonic transgressions at closure. The children discriminated the transgressions well above chance. Regular and transgressed endings evoked opposite scalp voltage configurations peaking around 400 ms after stimulus onset with bilateral frontal negativity for regular and centro-posterior negativity (CPN) for transgressed endings. A positive correlation could be established between strength of the CPN response and rater sensitivity (d-prime). We also investigated whether the capacity to discriminate the transgressions was supported by auditory domain specific or general cognitive mechanisms, and found that working memory capacity predicted transgression discrimination. Latency and distribution of the CPN are reminiscent of the N400, typically observed in response to semantic incongruities in language. Therefore, our observation is intriguing, as the CPN occurred here within an intra-musical context, without any symbols referring to the external world. Moreover, the harmonic in-key transgressions that we implemented may be considered syntactical as they transgress structural rules. Such structural incongruities in music are typically followed by an early right anterior negativity (ERAN) and an N5, but not so here. Putative contributive sources of the CPN were localized in left pre-motor, mid-posterior cingulate and superior parietal regions of the brain that can be linked to integration processing. These results suggest that, at least in children, processing of syntax and meaning may coincide in complex intra-musical contexts.


BPC4: R Soc Open Sci. 2016 Feb 3;3(2):150685.

Language influences music harmony perception: effects of shared syntactic integration resources beyond attention.

Kunert R, Willems RM, Hagoort P.

Many studies have revealed shared music-language processing resources by finding an influence of music harmony manipulations on concurrent language processing. However, the nature of the shared resources has remained ambiguous. They have been argued to be syntax specific and thus due to shared syntactic integration resources. An alternative view regards them as related to general attention and, thus, not specific to syntax. The present experiments evaluated these accounts by investigating the influence of language on music. Participants were asked to provide closure judgements on harmonic sequences in order to assess the appropriateness of sequence endings. At the same time participants read syntactic garden-path sentences. Closure judgements revealed a change in harmonic processing as the result of reading a syntactically challenging word. We found no influence of an arithmetic control manipulation (experiment 1) or semantic garden-path sentences (experiment 2). Our results provide behavioural evidence for a specific influence of linguistic syntax processing on musical harmony judgements. A closer look reveals that the shared resources appear to be needed to hold a harmonic key online in some form of syntactic working memory or unification workspace related to the integration of chords and words. Overall, our results support the syntax specificity of shared music-language processing resources.


BPC5: Neuropsychol Rehabil. 2014;24(6):894-917.

Learning sung lyrics aids retention in normal ageing and Alzheimer's disease.

Moussard A, Bigand E, Belleville S, Peretz I.

Previous studies have suggested that presenting to-be-memorised lyrics in a singing mode, instead of a speaking mode, may facilitate learning and retention in normal adults. In this study, seven healthy older adults and eight participants with mild Alzheimer's disease (AD) learned and memorised lyrics that were either sung or spoken. We measured the percentage of words recalled from these lyrics immediately and after 10 minutes. Moreover, in AD participants, we tested the effect of successive learning episodes for one spoken and one sung excerpt, as well as long-term retention after a four-week delay. Sung presentation did not influence immediate recall of the lyrics but increased delayed recall for both groups. In AD, learning slopes for sung and spoken lyrics did not show a significant difference across successive learning episodes. However, sung lyrics showed a slight advantage over spoken ones after a four-week delay. These results suggest that singing may increase the load of initial learning but improve long-term retention of newly acquired verbal information. We further propose some recommendations on how to maximise these effects and make them relevant for therapeutic applications.


BPC6: J Neurosci. 2011 Mar 9;31(10):3843-52.

Functional anatomy of language and music perception: temporal and structural factors investigated using functional magnetic resonance imaging.

Rogalsky C, Rong F, Saberi K, Hickok G.

Language and music exhibit similar acoustic and structural properties, and both appear to be uniquely human. Several recent studies suggest that speech and music perception recruit shared computational systems, and a common substrate in Broca's area for hierarchical processing has recently been proposed. However, this claim has not been tested by directly comparing the spatial distribution of activations to speech and music processing within subjects. In the present study, participants listened to sentences, scrambled sentences, and novel melodies. As expected, large swaths of activation for both sentences and melodies were found bilaterally in the superior temporal lobe, overlapping in portions of auditory cortex. However, substantial nonoverlap was also found: sentences elicited more ventrolateral activation, whereas the melodies elicited a more dorsomedial pattern, extending into the parietal lobe. Multivariate pattern classification analyses indicate that even within the regions of blood oxygenation level-dependent response overlap, speech and music elicit distinguishable patterns of activation. Regions involved in processing hierarchical aspects of sentence perception were identified by contrasting sentences with scrambled sentences, revealing a bilateral temporal lobe network. Music perception showed no overlap whatsoever with this network. Broca's area was not robustly activated by any stimulus type. Overall, these findings suggest that basic hierarchical processing for music and speech recruits distinct cortical networks, neither of which involves Broca's area. We suggest that previous claims are based on data from tasks that tap higher-order cognitive processes, such as working memory and/or cognitive control, which can operate in both speech and music domains.


CL1: Trends Hear. 2016 Sep 7;20. pii: 2331216516655793.

Complex-Tone Pitch Discrimination in Listeners With Sensorineural Hearing Loss.

Bianchi F, Fereczkowski M, Zaar J, Santurette S, Dau T.

Physiological studies have shown that noise-induced sensorineural hearing loss (SNHL) enhances the amplitude of envelope coding in auditory-nerve fibers. As pitch coding of unresolved complex tones is assumed to rely on temporal envelope coding mechanisms, this study investigated pitch-discrimination performance in listeners with SNHL. Pitch-discrimination thresholds were obtained for 14 normal-hearing (NH) and 10 hearing-impaired (HI) listeners for sine-phase (SP) and random-phase (RP) complex tones. When all harmonics were unresolved, the HI listeners performed, on average, worse than NH listeners in the RP condition but similarly to NH listeners in the SP condition. The increase in pitch-discrimination performance for the SP relative to the RP condition (F0DL ratio) was significantly larger in the HI as compared with the NH listeners. Cochlear compression and auditory-filter bandwidths were estimated in the same listeners. The estimated reduction of cochlear compression was significantly correlated with the increase in the F0DL ratio, while no correlation was found with filter bandwidth. The effects of degraded frequency selectivity and loss of compression were considered in a simplified peripheral model as potential factors in envelope enhancement. The model revealed that reducing cochlear compression significantly enhanced the envelope of an unresolved SP complex tone, while not affecting the envelope of a RP complex tone. This envelope enhancement in the SP condition was significantly correlated with the increased pitch-discrimination performance for the SP relative to the RP condition in the HI listeners.
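
The role of component starting phase is easy to visualize with a short sketch (equal-amplitude harmonics in an arbitrary unresolved range; not the study's exact stimuli): a sine-phase complex has a strongly peaked temporal envelope, while randomizing the phases flattens it, which is why compression and envelope coding matter for the SP/RP comparison.

```python
import numpy as np
from scipy.signal import hilbert

fs = 48000
t = np.arange(int(0.2 * fs)) / fs
f0, harmonics = 100, np.arange(10, 21)        # harmonic ranks 10-20, assumed unresolved

def complex_tone(phases):
    return np.sum([np.sin(2 * np.pi * h * f0 * t + p)
                   for h, p in zip(harmonics, phases)], axis=0)

rng = np.random.default_rng(0)
sp = complex_tone(np.zeros(len(harmonics)))                    # sine phase
rp = complex_tone(rng.uniform(0, 2 * np.pi, len(harmonics)))   # random phase

env_sp = np.abs(hilbert(sp))
env_rp = np.abs(hilbert(rp))
print("peak-to-mean envelope ratio, SP vs RP:",
      env_sp.max() / env_sp.mean(), env_rp.max() / env_rp.mean())
```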


CL2: Nat Neurosci. 2012 Oct;15(10):1362-4.

Diminished temporal coding with sensorineural hearing loss emerges in background noise.

Henry KS, Heinz MG.

Behavioral studies in humans suggest that sensorineural hearing loss (SNHL) decreases sensitivity to the temporal structure of sound, but neurophysiological studies in mammals provide little evidence for diminished temporal coding. We found that SNHL in chinchillas degraded peripheral temporal coding in background noise substantially more than in quiet. These results resolve discrepancies between previous studies and help to explain why perceptual difficulties in hearing-impaired listeners often emerge in noisy situations.


CL3: J Neurosci. 2016 Feb 17;36(7):2227-37.

Distorted Tonotopic Coding of Temporal Envelope and Fine Structure with Noise-Induced Hearing Loss.

Henry KS, Kale S, Heinz MG.

People with cochlear hearing loss have substantial difficulty understanding speech in real-world listening environments (e.g., restaurants), even with amplification from a modern digital hearing aid. Unfortunately, a disconnect remains between human perceptual studies implicating diminished sensitivity to fast acoustic temporal fine structure (TFS) and animal studies showing minimal changes in neural coding of TFS or slower envelope (ENV) structure. Here, we used general system-identification (Wiener kernel) analyses of chinchilla auditory nerve fiber responses to Gaussian noise to reveal pronounced distortions in tonotopic coding of TFS and ENV following permanent, noise-induced hearing loss. In basal fibers with characteristic frequencies (CFs) >1.5 kHz, hearing loss introduced robust nontonotopic coding (i.e., at the wrong cochlear place) of low-frequency TFS, while ENV responses typically remained at CF. As a consequence, the highest dominant frequency of TFS coding in response to Gaussian noise was 2.4 kHz in noise-overexposed fibers compared with 4.5 kHz in control fibers. Coding of ENV also became nontonotopic in more pronounced cases of cochlear damage. In apical fibers, more classical hearing-loss effects were observed, i.e., broadened tuning without a significant shift in best frequency. Because these distortions and dissociations of TFS/ENV disrupt tonotopicity, a fundamental principle of auditory processing necessary for robust signal coding in background noise, these results have important implications for understanding communication difficulties faced by people with hearing loss. Further, hearing aids may benefit from distinct amplification strategies for apical and basal cochlear regions to address fundamentally different coding deficits.
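
The first step of a Wiener-kernel analysis amounts to a spike-triggered average of the Gaussian-noise stimulus; the sketch below runs that step on a simulated "fiber" (everything here, including the toy response model and the 1 kHz band, is an illustrative assumption rather than the authors' pipeline).

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 10000                                   # Hz, assumed analysis rate
rng = np.random.default_rng(0)
noise = rng.normal(size=10 * fs)             # 10 s of Gaussian-noise stimulus

# Toy "fiber": spiking is driven by a 1 kHz band of the stimulus
sos = butter(4, [800, 1200], btype="bandpass", fs=fs, output="sos")
drive = sosfiltfilt(sos, noise)
p_spike = 0.02 * (1 + np.tanh(3 * drive / drive.std()))
spike_times = np.flatnonzero(rng.random(noise.size) < p_spike)

# First-order Wiener kernel ~ spike-triggered average of the preceding 10 ms
kernel_len = int(0.010 * fs)
valid = spike_times[spike_times >= kernel_len]
h1 = np.mean([noise[s - kernel_len:s] for s in valid], axis=0)[::-1]
# The spectrum of h1 gives the dominant frequency of TFS coding; with noise-induced
# hearing loss that dominant frequency can end up below the fiber's characteristic
# frequency, i.e. coding at the wrong tonotopic place.
```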


CL4: Trends Hear. 2016 Sep 7;20. pii: 2331216516641055.

The Influence of Cochlear Mechanical Dysfunction, Temporal Processing Deficits, and Age on the Intelligibility of Audible Speech in Noise for Hearing-Impaired Listeners.

Johannesen PT, Pérez-González P, Kalluri S, Blanco JL, Lopez-Poveda EA.

The aim of this study was to assess the relative importance of cochlear mechanical dysfunction, temporal processing deficits, and age on the ability of hearing-impaired listeners to understand speech in noisy backgrounds. Sixty-eight listeners took part in the study. They were provided with linear, frequency-specific amplification to compensate for their audiometric losses, and intelligibility was assessed for speech-shaped noise (SSN) and a time-reversed two-talker masker (R2TM). Behavioral estimates of cochlear gain loss and residual compression were available from a previous study and were used as indicators of cochlear mechanical dysfunction. Temporal processing abilities were assessed using frequency modulation detection thresholds. Age, audiometric thresholds, and the difference between audiometric threshold and cochlear gain loss were also included in the analyses. Stepwise multiple linear regression models were used to assess the relative importance of the various factors for intelligibility. Results showed that (a) cochlear gain loss was unrelated to intelligibility, (b) residual cochlear compression was related to intelligibility in SSN but not in a R2TM, (c) temporal processing was strongly related to intelligibility in a R2TM and much less so in SSN, and (d) age per se impaired intelligibility. In summary, all factors affected intelligibility, but their relative importance varied across maskers.
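
The statistical backbone here, stepwise multiple linear regression, can be sketched as a simple forward-selection loop: at each step add the predictor that most improves R², and stop when the gain falls below a threshold. Predictor names, data, and the stopping rule below are illustrative assumptions, not the study's.

```python
import numpy as np

def r_squared(X, y):
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return 1 - np.var(y - X1 @ beta) / np.var(y)

def forward_stepwise(predictors, y, min_gain=0.02):
    """predictors: dict name -> 1-D array. Returns names in order of selection."""
    selected, best_r2 = [], 0.0
    while len(selected) < len(predictors):
        candidates = {}
        for name in predictors:
            if name not in selected:
                X = np.column_stack([predictors[k] for k in selected + [name]])
                candidates[name] = r_squared(X, y)
        name = max(candidates, key=candidates.get)
        if candidates[name] - best_r2 < min_gain:
            break
        selected.append(name)
        best_r2 = candidates[name]
    return selected

# Illustrative use with made-up data (names echo the study's factors)
rng = np.random.default_rng(0)
n = 68
data = {"FM detection": rng.normal(size=n), "age": rng.normal(size=n),
        "cochlear gain loss": rng.normal(size=n)}
intelligibility = (0.6 * data["FM detection"] + 0.3 * data["age"]
                   + rng.normal(scale=0.5, size=n))
print(forward_stepwise(data, intelligibility))
```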


CL5: J Assoc Res Otolaryngol. 2015 Jun;16(3):389-99.

Acoustic temporal modulation detection in normal-hearing and cochlear implanted listeners: effects of hearing mechanism and development.

Park MH, Won JH, Horn DL, Rubinstein JT.

Temporal modulation detection ability matures over many years after birth and may be particularly sensitive to experience during this period. Profound hearing loss during early childhood might result in greater perceptual deficits than a similar loss beginning in adulthood. We tested this idea by measuring performance in temporal modulation detection in profoundly deaf children and adults fitted with cochlear implants (CIs). At least two independent variables could constrain temporal modulation detection performance in children with CIs: altered encoding of modulation information due to the CI-auditory nerve interface, and atypical development of central processing of sound information provided by CIs. The effect of altered encoding was investigated by testing subjects with one of two different hearing mechanisms (normal hearing vs. CI) and the effect of atypical development was studied by testing two different age groups. All subjects were tested for their ability to detect acoustic temporal modulations of sound amplitude. A comparison of the slope, or cutoff frequency, of the temporal modulation transfer functions (TMTFs) among the four subject groups revealed that temporal resolution was mainly constrained by hearing mechanism: normal-hearing listeners could detect smaller amplitude modulations at high modulation frequencies than CI users. In contrast, a comparison of the height of the TMTFs revealed a significant interaction between hearing mechanism and age group on overall sensitivity to temporal modulation: sensitivity was significantly poorer in children with CIs, relative to the other three groups. Results suggest that there is an age-specific vulnerability of intensity discrimination or non-sensory factors, which subsequently affects sensitivity to temporal modulation in prelingually deaf children who use CIs.
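
A TMTF is commonly summarized by a "height" (overall sensitivity) and a cutoff frequency obtained by fitting a low-pass characteristic to the modulation-detection thresholds; below is a sketch of that fit on made-up thresholds (the functional form and numbers are assumptions, not the study's data).

```python
import numpy as np
from scipy.optimize import curve_fit

def lowpass_tmtf(fm, sensitivity, cutoff):
    """Threshold (dB re full modulation) is flat below the cutoff and rises above it."""
    return sensitivity + 10 * np.log10(1 + (fm / cutoff) ** 2)

# Illustrative modulation frequencies (Hz) and detection thresholds (dB)
fm = np.array([10, 20, 50, 100, 200, 400])
thresholds = np.array([-22, -21, -19, -15, -10, -4])          # made-up data

(sens, cutoff), _ = curve_fit(lowpass_tmtf, fm, thresholds, p0=[-22, 80])
print(f"sensitivity ~{sens:.1f} dB, cutoff ~{cutoff:.0f} Hz")
# Comparing the height (sensitivity) and the cutoff (resolution) across groups is
# the kind of contrast drawn between NH and CI listeners in the study.
```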


DP1: Atten Percept Psychophys. 2015 Apr;77(3):896-906.

Auditory frequency perception adapts rapidly to the immediate past.

Alais D, Orchard-Mills E, Van der Burg E.

Frequency modulation is critical to human speech. Evidence from psychophysics, neurophysiology, and neuroimaging suggests that there are neuronal populations tuned to this property of speech. Consistent with this, extended exposure to frequency change produces direction specific aftereffects in frequency change detection. We show that this aftereffect occurs extremely rapidly, requiring only a single trial of just 100-ms duration. We demonstrate this using a long, randomized series of frequency sweeps (both upward and downward, by varying amounts) and analyzing intertrial adaptation effects. We show the point of constant frequency is shifted systematically towards the previous trial's sweep direction (i.e., a frequency sweep aftereffect). Furthermore, the perception of glide direction is also independently influenced by the glide presented two trials previously. The aftereffect is frequency tuned, as exposure to a frequency sweep from a set centered on 1,000 Hz does not influence a subsequent trial drawn from a set centered on 400 Hz. More generally, the rapidity of adaptation suggests the auditory system is constantly adapting and "tuning" itself to the most recent environmental conditions.


DP2: PLoS One. 2015 Dec 15;10(12):e0144788.

Auditory Streaming as an Online Classification Process with Evidence Accumulation.

Barniv D, Nelken I.

When human subjects hear a sequence of two alternating pure tones, they often perceive it in one of two ways: as one integrated sequence (a single "stream" consisting of the two tones), or as two segregated sequences, one sequence of low tones perceived separately from another sequence of high tones (two "streams"). Perception of this stimulus is thus bistable. Moreover, subjects report on-going switching between the two percepts: unless the frequency separation is large, initial perception tends to be of integration, followed by toggling between integration and segregation phases. The process of stream formation is loosely named "auditory streaming". Auditory streaming is believed to be a manifestation of human ability to analyze an auditory scene, i.e. to attribute portions of the incoming sound sequence to distinct sound generating entities. Previous studies suggested that the durations of the successive integration and segregation phases are statistically independent. This independence plays an important role in current models of bistability. Contrary to this, we show here, by analyzing a large set of data, that subsequent phase durations are positively correlated. To account together for bistability and positive correlation between subsequent durations, we suggest that streaming is a consequence of an evidence accumulation process. Evidence for segregation is accumulated during the integration phase and vice versa; a switch to the opposite percept occurs stochastically based on this evidence. During a long phase, a large amount of evidence for the opposite percept is accumulated, resulting in a long subsequent phase. In contrast, a short phase is followed by another short phase. We implement these concepts using a probabilistic model that shows both bistability and correlations similar to those observed experimentally.
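
The proposed mechanism can be caricatured in a few lines: evidence for the opposite percept builds up during the current phase, the switch is stochastic with a hazard that grows once that evidence outweighs the support carried over from the previous switch, and the evidence present at the switch becomes the next phase's support, which yields positively correlated successive durations. Everything below (hazard shape, rates, time units) is an illustrative assumption, not the authors' fitted model.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_phase_durations(n_phases=2000, dt=0.01, rate=1.0):
    durations, support = [], 1.0          # support: evidence backing the current percept
    for _ in range(n_phases):
        opposing, t = 0.0, 0.0
        while True:
            t += dt
            opposing += rate * dt * rng.exponential()          # noisy evidence accumulation
            hazard = 0.5 * max(opposing - 0.5 * support, 0.0)  # switch hazard (per time unit)
            if rng.random() < hazard * dt:
                break
        durations.append(t)
        support = opposing                # long phase -> strong support for the next percept
    return np.array(durations)

d = simulate_phase_durations()
lag1 = np.corrcoef(d[:-1], d[1:])[0, 1]
print(f"median duration {np.median(d):.2f}, lag-1 correlation {lag1:+.2f}")
```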


DP3: Brain Res. 2016 Sep 1;1646:84-90.

Auditory perceptual restoration and illusory continuity correlates in the human brainstem.

Bidelman GM, Patro C.

When noise obstructs portions of target sounds, the auditory system fills in missing information, a phenomenon known as auditory restoration or induction. Previous work in animal models demonstrates that neurons in primary auditory cortex (A1) are capable of restoring occluded target signals, suggesting that early auditory cortex is capable of inducing continuity in discontinuous signals (i.e., endogenous restoration). Current consensus is that the neural correlates of auditory induction and perceptual restoration emerge no earlier than A1. Moreover, the neural mechanisms supporting induction in humans are poorly understood. Here, we show that in human listeners, auditory brainstem nuclei support illusory auditory continuity well before engagement of cerebral cortex. We recorded brainstem responses to modulated target tones that did or did not promote illusory auditory percepts. Auditory continuity was manipulated by introducing masking noise or brief temporal interruptions in otherwise continuous tones. We found that auditory brainstem responses paralleled illusory continuity by tagging target sounds even when they were occluded by the auditory scene. Our results reveal (i) a pre-attentive, subcortical origin to a presumed cortical function and (ii) that brainstem signal processing helps partially cancel the negative effects of masking by restoring missing portions of auditory objects that are fragmented in the soundscape.


DP4: J Acoust Soc Am. 2016 Aug;140(2):866.

Multistable perception of ambiguous melodies and the role of musical expertise.

Brosowsky NP, Mondor TA.

Whereas visual demonstrations of multistability are ubiquitous, there are few auditory examples. The purpose of the current study was to determine whether simultaneously presented melodies, such as underlie the scale illusion [Deutsch (1975). J. Acoust. Soc. Am. 57(5), 1156-1160], can elicit multiple mutually exclusive percepts, and whether reported perceptions are mediated by musical expertise. Participants listened to target melodies and reported whether the target was embedded in subsequent test melodies. Target sequences were created such that they would only be heard if the listener interpreted the test melody according to various perceptual cues. Critically, and in contrast with previous examinations of the scale illusion, an objective measure of target detection was obtained by including target-absent test melodies. As a result, listeners could reliably identify target sequences from different perceptual organizations when presented with the same test melody on different trials. This result demonstrates an ability to alternate between mutually exclusive percepts of an unchanged stimulus. However, only perceptual organizations consistent with frequency and spatial cues were available and musical expertise did mediate target detection, limiting the organizations available to non-musicians. The current study provides the first known demonstration of auditory multistability using simultaneously presented melodies and provides a unique experimental method for measuring auditory perceptual competition.


DP5: Exp Brain Res. 2014 Dec;232(12):3813-20.

The influence of horizontally rotating sound on standing balance.

Gandemer L, Parseihian G, Kronland-Martinet R, Bourdin C.

Postural control is known to be the result of the integration and processing of various sensory inputs by the central nervous system. Among the various afferent inputs, the role of auditory information in postural regulation has been addressed in relatively few studies, which led to conflicting results. The purpose of the present study was to investigate the influence of a rotating auditory stimulus, delivered by an immersive 3D sound spatialization system, on the standing posture of young subjects. The postural sway of 20 upright, blindfolded subjects was recorded using a force platform. Use of various sound source rotation velocities followed by sudden immobilization of the sound was compared with two control conditions: no sound and a stationary sound source. The experiment showed that subjects reduced their body sway amplitude and velocity in the presence of rotating sound compared with the control conditions. The faster the sound source was rotating, the greater the reduction in subject body sway. Moreover, disruption of subject postural regulation was observed as soon as the sound source was immobilized. These results suggest that auditory information cannot be neglected in postural control and that it acts as additional information influencing postural regulation.


DP6: J Assoc Res Otolaryngol. 2016 Sep 29. [Epub ahead of print]

Temporal Regularity Detection and Rate Discrimination in Cochlear-Implant Listeners.

Gaudrain E, Deeks JM, Carlyon RP.

Cochlear implants (CIs) convey fundamental-frequency information using primarily temporal cues. However, temporal pitch perception in CI users is weak and, when measured using rate discrimination tasks, deteriorates markedly as the rate increases beyond 300 pulses-per-second. Rate pitch may be weak because the electrical stimulation of the surviving neural population of the implant recipient may not allow accurate coding of inter-pulse time intervals. If so, this phenomenon should prevent listeners from detecting when a pulse train is physically temporally jittered. Performance in a jitter detection task was compared to that in a rate-pitch discrimination task. Stimuli were delivered using direct stimulation in cochlear implants, on a mid-array and an apical electrode, and at two different rates (100 and 300 pps). Average performance on both tasks was worse at the higher pulse rate and did not depend on electrode. However, there was a large variability across and within listeners that did not correlate between the two tasks, suggesting that rate-pitch judgement and regularity detection are to some extent limited by task-specific processes. Simulations with filtered pulse trains presented to NH listeners yielded broadly similar results, except that, for the rate discrimination task, the difference between performance with 100- and 300-pps base rates was smaller than observed for CI users.


DP7: Proc. R. Soc. A. 2015. 471: 20150309.

A single microphone noise reduction algorithm based on the detection and reconstruction of spectro-temporal features.

Lee T, Theunissen F.

Animals throughout the animal kingdom excel at extracting individual sounds from competing background sounds, yet current state-of-the-art signal processing algorithms struggle to process speech in the presence of even modest background noise. Recent psychophysical experiments in humans and electrophysiological recordings in animal models suggest that the brain is adapted to process sounds within the restricted domain of spectro-temporal modulations found in natural sounds. Here, we describe a novel single microphone noise reduction algorithm called spectro-temporal detection-reconstruction (STDR) that relies on an artificial neural network trained to detect, extract and reconstruct the spectro-temporal features found in speech. STDR can significantly reduce the level of the background noise while preserving the foreground speech quality and improving estimates of speech intelligibility. In addition, by leveraging the strong temporal correlations present in speech, the STDR algorithm can also operate on predictions of upcoming speech features, retaining similar performance levels while minimizing inherent throughput delays. STDR performs better than a competing state-of-the-art algorithm for a wide range of signal-to-noise ratios and has the potential for real-time applications such as hearing aids and automatic speech recognition.


DP8: Nature. 2016 Jul 28;535(7613):547-50.

Indifference to dissonance in native Amazonians reveals cultural variation in music perception.

McDermott JH, Schultz AF, Undurraga EA, Godoy RA.

Music is present in every culture, but the degree to which it is shaped by biology remains debated. One widely discussed phenomenon is that some combinations of notes are perceived by Westerners as pleasant, or consonant, whereas others are perceived as unpleasant, or dissonant. The contrast between consonance and dissonance is central to Western music and its origins have fascinated scholars since the ancient Greeks. Aesthetic responses to consonance are commonly assumed by scientists to have biological roots, and thus to be universally present in humans. Ethnomusicologists and composers, in contrast, have argued that consonance is a creation of Western musical culture. The issue has remained unresolved, partly because little is known about the extent of cross-cultural variation in consonance preferences. Here we report experiments with the Tsimane', a native Amazonian society with minimal exposure to Western culture, and comparison populations in Bolivia and the United States that varied in exposure to Western music. Participants rated the pleasantness of sounds. Despite exhibiting Western-like discrimination abilities and Western-like aesthetic responses to familiar sounds and acoustic roughness, the Tsimane' rated consonant and dissonant chords and vocal harmonies as equally pleasant. By contrast, Bolivian city- and town-dwellers exhibited significant preferences for consonance, albeit to a lesser degree than US residents. The results indicate that consonance preferences can be absent in cultures sufficiently isolated from Western music, and are thus unlikely to reflect innate biases or exposure to harmonic natural sounds. The observed variation in preferences is presumably determined by exposure to musical harmony, suggesting that culture has a dominant role in shaping aesthetic responses to music.


DP9: J Acoust Soc Am. 1990 Apr;87(4):1695-701.

Perception of temporal patterns defined by tonal sequences.

Sorkin RD.

This experiment tested how listeners discriminate between the temporal patterns defined by two sequences of tones. Two arrhythmic sequences of n tones were played successively (n = 8, 12, or 16, tone duration = 35 ms, frequency = 1000 Hz), and the listener reported whether the sequences had the same or different temporal patterns. In the first sequence, the durations of the intertone gaps were chosen at random; in the second sequence, the gaps were either (a) the same as the first sequence or (b) chosen at random. Discrimination performance increased with the variability of the gap sequences and decreased with the size of the correlation between the sequences. A discrimination model based on computation of the sample correlation between the sequences of gaps, but limited by an internal variability of approximately 15 ms, described observer performance in a variety of conditions.
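
The model in the last sentence is simple enough to simulate directly: represent each gap with ~15 ms of internal noise, compute the sample correlation between the two remembered gap sequences, and respond "same" when it exceeds a criterion. The gap range and criterion below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def trial(n=12, same=True, internal_sd=0.015, criterion=0.5):
    gaps1 = rng.uniform(0.02, 0.12, n)                    # random inter-tone gaps (s)
    gaps2 = gaps1.copy() if same else rng.uniform(0.02, 0.12, n)
    # Internal (encoding/memory) noise on each represented gap, ~15 ms SD
    rep1 = gaps1 + rng.normal(scale=internal_sd, size=n)
    rep2 = gaps2 + rng.normal(scale=internal_sd, size=n)
    r = np.corrcoef(rep1, rep2)[0, 1]
    return (r > criterion) == same                        # correct decision?

pc = np.mean([trial(same=rng.random() < 0.5) for _ in range(5000)])
print(f"proportion correct: {pc:.2f}")
# Increasing the variability of the gaps raises the true correlation on "same" trials
# relative to the internal noise, reproducing the reported benefit of gap variability.
```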


DP10: Sci Rep. 2015 Jun 26;5:11628.

Musical training, individual differences and the cocktail party problem.

Swaminathan J, Mason CR, Streeter TM, Best V, Kidd G Jr, Patel AD.

Are musicians better able to understand speech in noise than non-musicians? Recent findings have produced contradictory results. Here we addressed this question by asking musicians and non-musicians to understand target sentences masked by other sentences presented from different spatial locations, the classical 'cocktail party problem' in speech science. We found that musicians obtained a substantial benefit in this situation, with thresholds ~6 dB better than non-musicians. Large individual differences in performance were noted particularly for the non-musically trained group. Furthermore, in different conditions we manipulated the spatial location and intelligibility of the masking sentences, thus changing the amount of 'informational masking' (IM) while keeping the amount of 'energetic masking' (EM) relatively constant. When the maskers were unintelligible and spatially separated from the target (low in IM), musicians and non-musicians performed comparably. These results suggest that the characteristics of speech maskers and the amount of IM can influence the magnitude of the differences found between musicians and non-musicians in multiple-talker "cocktail party" environments. Furthermore, considering the task in terms of the EM-IM distinction provides a conceptual framework for future behavioral and neuroscientific studies which explore the underlying sensory and cognitive mechanisms contributing to enhanced "speech-in-noise" perception by musicians.


DP11: J Acoust Soc Am. 2016 Aug;140(2):1072. doi: 10.1121/1.4960544.

Auditory "bubbles": Efficient classification of the spectrotemporal modulations essential for speech intelligibility.

Venezia JH, Hickok G, Richards VM.

Speech intelligibility depends on the integrity of spectrotemporal patterns in the signal. The current study is concerned with the speech modulation power spectrum (MPS), which is a two-dimensional representation of energy at different combinations of temporal and spectral (i.e., spectrotemporal) modulation rates. A psychophysical procedure was developed to identify the regions of the MPS that contribute to successful reception of auditory sentences. The procedure, based on the two-dimensional image classification technique known as "bubbles" (Gosselin and Schyns (2001). Vision Res. 41, 2261-2271), involves filtering (i.e., degrading) the speech signal by removing parts of the MPS at random, and relating filter patterns to observer performance (keywords identified) over a number of trials. The result is a classification image (CImg) or "perceptual map" that emphasizes regions of the MPS essential for speech intelligibility. This procedure was tested using normal-rate and 2×-time-compressed sentences. The results indicated: (a) CImgs could be reliably estimated in individual listeners in relatively few trials, (b) CImgs tracked changes in spectrotemporal modulation energy induced by time compression, though not completely, indicating that "perceptual maps" deviated from physical stimulus energy, and (c) the bubbles method captured variance in intelligibility not reflected in a common modulation-based intelligibility metric (spectrotemporal modulation index or STMI).
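
At its core the "bubbles" bookkeeping is an average, over many trials, of which modulation-power-spectrum regions were retained on correct versus incorrect trials; here is a toy sketch with a simulated observer (grid size, bubble count, and the observer's "important" region are all assumptions, not the study's parameters).

```python
import numpy as np

rng = np.random.default_rng(0)
n_temp, n_spec = 20, 15            # MPS grid: temporal x spectral modulation bins
n_trials, n_bubbles = 400, 30

# Hidden region the simulated observer relies on; in a real experiment this is
# exactly what the classification image is meant to recover.
important = np.zeros((n_temp, n_spec), bool)
important[2:6, 0:4] = True

cimg_num = np.zeros((n_temp, n_spec))
cimg_den = np.zeros((n_temp, n_spec))
for _ in range(n_trials):
    mask = np.zeros((n_temp, n_spec), bool)
    idx = rng.choice(n_temp * n_spec, n_bubbles, replace=False)
    mask.flat[idx] = True                              # randomly retained MPS regions
    # Simulated observer: accuracy depends on how much of the key region survives
    p_correct = 0.2 + 0.8 * mask[important].mean()
    correct = rng.random() < p_correct
    cimg_num += mask * (1.0 if correct else -1.0)
    cimg_den += mask
classification_image = cimg_num / np.maximum(cimg_den, 1)
# High values mark spectrotemporal modulations that were present on correct trials
# more often than on incorrect ones, i.e. the regions essential for intelligibility.
```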


DP12: Front Neurosci. 2016 Nov 24;10:490.

Long Term Memory for Noise: Evidence of Robust Encoding of Very Short Temporal Acoustic Patterns.

Viswanathan J, Rémy F, Bacon-Macé N, Thorpe SJ.

Recent research has demonstrated that humans are able to implicitly encode and retain repeating patterns in meaningless auditory noise. Our study aimed at testing the robustness of long-term implicit recognition memory for these learned patterns. Participants performed a cyclic/non-cyclic discrimination task, during which they were presented with either 1-s cyclic noises (CNs) (the two halves of the noise were identical) or 1-s plain random noises (Ns). Among CNs and Ns presented once, target CNs were implicitly presented multiple times within a block, and implicit recognition of these target CNs was tested 4 weeks later using a similar cyclic/non-cyclic discrimination task. Furthermore, robustness of implicit recognition memory was tested by presenting participants with looped (shifting the origin) and scrambled (chopping sounds into 10- and 20-ms bits before shuffling) versions of the target CNs. We found that participants had robust implicit recognition memory for learned noise patterns after 4 weeks, right from the first presentation. Additionally, this memory was remarkably resistant to acoustic transformations, such as looping and scrambling of the sounds. Finally, implicit recognition of sounds was dependent on participants' discrimination performance during learning. Our findings suggest that meaningless temporal features as short as 10 ms can be implicitly stored in long-term auditory memory. Moreover, successful encoding and storage of such fine features may vary between participants, possibly depending on individual attention and auditory discrimination abilities. Significance statement: Meaningless auditory patterns could be implicitly encoded and stored in long-term memory. Acoustic transformations of learned meaningless patterns could be implicitly recognized after 4 weeks. Implicit long-term memories can be formed for meaningless auditory features as short as 10 ms. Successful encoding and long-term implicit recognition of meaningless patterns may strongly depend on individual attention and auditory discrimination abilities.
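
The stimulus constructions are straightforward to sketch (sampling rate and chunk sizes are assumptions): a cyclic noise is one 0.5-s noise token repeated back to back, a looped version shifts the cycle's origin, and a scrambled version shuffles it in 10- or 20-ms chunks.

```python
import numpy as np

fs = 44100
rng = np.random.default_rng(0)

half = rng.normal(size=int(0.5 * fs))          # 0.5 s of Gaussian noise
cyclic_noise = np.concatenate([half, half])    # 1-s CN: two identical halves
plain_noise = rng.normal(size=fs)              # 1-s N: no internal repetition

looped = np.roll(cyclic_noise, int(0.17 * fs)) # same cycle, shifted origin

def scramble(x, chunk_ms=10):
    """Chop into chunk_ms pieces and shuffle their order."""
    chunk = int(chunk_ms / 1000 * fs)
    n_chunks = len(x) // chunk
    pieces = x[:n_chunks * chunk].reshape(n_chunks, chunk)
    return pieces[rng.permutation(n_chunks)].ravel()

scrambled_10ms = scramble(cyclic_noise, 10)
```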


DP13: Curr Biol. 2015 Aug 3;25(15):2051-6.

Human screams occupy a privileged niche in the communication soundscape.

Arnal LH, Flinker A, Kleinschmidt A, Giraud AL, Poeppel D.

Screaming is arguably one of the most relevant communication signals for survival in humans. Despite their practical relevance and their theoretical significance as innate [1] and virtually universal [2, 3] vocalizations, what makes screams a unique signal and how they are processed is not known. Here, we use acoustic analyses, psychophysical experiments, and neuroimaging to isolate those features that confer to screams their alarming nature, and we track their processing in the human brain. Using the modulation power spectrum (MPS [4, 5]), a recently developed, neurally informed characterization of sounds, we demonstrate that human screams cluster within a restricted portion of the acoustic space (between ∼30 and 150 Hz modulation rates) that corresponds to a well-known perceptual attribute, roughness. In contrast to the received view that roughness is irrelevant for communication [6], our data reveal that the acoustic space occupied by the rough vocal regime is segregated from other signals, including speech, a pre-requisite to avoid false alarms in normal vocal communication. We show that roughness is present in natural alarm signals as well as in artificial alarms and that the presence of roughness in sounds boosts their detection in various tasks. Using fMRI, we show that acoustic roughness engages subcortical structures critical to rapidly appraise danger. Altogether, these data demonstrate that screams occupy a privileged acoustic niche that, being separated from other communication signals, ensures their biological and ultimately social efficiency.
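
A crude proxy for the "rough" regime highlighted above is the fraction of envelope-modulation power falling between roughly 30 and 150 Hz; the sketch below compares a 70 Hz amplitude-modulated tone (rough) with a 4 Hz one (a speech-like rate). This is an illustrative measure under simple assumptions, not the modulation power spectrum analysis of the paper.

```python
import numpy as np
from scipy.signal import hilbert

fs = 16000
t = np.arange(fs) / fs                         # 1 s of signal

def roughness_fraction(x, band=(30, 150)):
    """Share of envelope-modulation power in the 30-150 Hz 'roughness' band."""
    env = np.abs(hilbert(x))
    env = env - env.mean()
    power = np.abs(np.fft.rfft(env)) ** 2
    freqs = np.fft.rfftfreq(len(env), 1 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return power[in_band].sum() / power[freqs > 0].sum()

carrier = np.sin(2 * np.pi * 500 * t)
rough = carrier * (1 + 0.9 * np.sin(2 * np.pi * 70 * t))   # 70 Hz AM: rough, scream-like
smooth = carrier * (1 + 0.9 * np.sin(2 * np.pi * 4 * t))   # 4 Hz AM: speech-like rate
print(roughness_fraction(rough), roughness_fraction(smooth))
```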


MC1: Neuron. 2013 Mar 6;77(5):980-91.

Mechanisms underlying selective neuronal tracking of attended speech at a "cocktail party".

Zion Golumbic EM, Ding N, Bickel S, Lakatos P, Schevon CA, McKhann GM, Goodman RR, Emerson R, Mehta AD, Simon JZ, Poeppel D, Schroeder CE.

The ability to focus on and understand one talker in a noisy social environment is a critical social-cognitive capacity, whose underlying neuronal mechanisms are unclear. We investigated the manner in which speech streams are represented in brain activity and the way that selective attention governs the brain's representation of speech using a Cocktail Party paradigm, coupled with direct recordings from the cortical surface in surgical epilepsy patients. We find that brain activity dynamically tracks speech streams using both low-frequency phase and high-frequency amplitude fluctuations and that optimal encoding likely combines the two. In and near low-level auditory cortices, attention modulates the representation by enhancing cortical tracking of attended speech streams, but ignored speech remains represented. In higher-order regions, the representation appears to become more selective, in that there is no detectable tracking of ignored speech. This selectivity itself seems to sharpen as a sentence unfolds.
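
The two neural features named in the abstract (low-frequency phase and high-frequency amplitude) are often quantified along the lines sketched below. This is an illustrative sketch, not the authors' pipeline; it assumes the neural recording and the attended talker's envelope share a sampling rate high enough (e.g. 500 Hz or more) to resolve high gamma, and a single assumed neural lag of 100 ms.

    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def bandpass(x, fs, lo, hi, order=4):
        b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        return filtfilt(b, a, x)

    def tracking_scores(neural, speech_env, fs, lag_s=0.1):
        # correlate the attended-speech envelope with (i) the delta/theta component of the
        # neural signal (proxy for phase tracking) and (ii) its high-gamma amplitude envelope
        lag = int(lag_s * fs)
        low = bandpass(neural, fs, 1, 8)
        hg_env = np.abs(hilbert(bandpass(neural, fs, 70, 150)))
        env = speech_env[:len(neural) - lag]          # assumes equal-length, aligned signals
        return (np.corrcoef(env, low[lag:])[0, 1],
                np.corrcoef(env, hg_env[lag:])[0, 1])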


MC2: Elife. 2016 Mar 7;5:e11476.

Neural signatures of perceptual inference.

Sedley W, Gander PE, Kumar S, Kovach CK, Oya H, Kawasaki H, Howard MA, Griffiths TD.

Generative models, such as predictive coding, posit that perception results from a combination of sensory input and prior prediction, each weighted by its precision (inverse variance), with incongruence between these termed prediction error (deviation from prediction) or surprise (negative log probability of the sensory input). However, direct evidence for such a system, and the physiological basis of its computations, is lacking. Using an auditory stimulus whose pitch value changed according to specific rules, we controlled and separated the three key computational variables underlying perception, and discovered, using direct recordings from human auditory cortex, that surprise due to prediction violations is encoded by local field potential oscillations in the gamma band (>30 Hz), changes to predictions in the beta band (12-30 Hz), and that the precision of predictions appears to quantitatively relate to alpha band oscillations (8-12 Hz). These results confirm oscillatory codes for critical aspects of generative models of perception.
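
The three computational variables named in the abstract have simple closed forms for a Gaussian predictive distribution; the toy sketch below just spells out those definitions (the Gaussian choice and the example numbers are illustrative assumptions, not the study's stimuli).

    import numpy as np

    def gaussian_update(x, mu, sigma2):
        prediction_error = x - mu                     # deviation from the prediction
        precision = 1.0 / sigma2                      # inverse variance of the prediction
        surprise = 0.5 * (np.log(2 * np.pi * sigma2)
                          + prediction_error ** 2 / sigma2)   # -log p(x | mu, sigma2)
        return prediction_error, precision, surprise

    # e.g. a pitch of 220 Hz heard when 200 Hz was predicted with variance 100 Hz^2
    print(gaussian_update(220.0, 200.0, 100.0))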


MC3: J Neurosci. 2016 Sep 21;36(38):9888-95.

Eye Can Hear Clearly Now: Inverse Effectiveness in Natural Audiovisual Speech Processing Relies on Long-Term Crossmodal Temporal Integration.

Crosse MJ, Di Liberto GM, Lalor EC.

Speech comprehension is improved by viewing a speaker's face, especially in adverse hearing conditions, a principle known as inverse effectiveness. However, the neural mechanisms that help to optimize how we integrate auditory and visual speech in such suboptimal conversational environments are not yet fully understood. Using human EEG recordings, we examined how visual speech enhances the cortical representation of auditory speech at a signal-to-noise ratio that maximized the perceptual benefit conferred by multisensory processing relative to unisensory processing. We found that the influence of visual input on the neural tracking of the audio speech signal was significantly greater in noisy than in quiet listening conditions, consistent with the principle of inverse effectiveness. Although envelope tracking during audio-only speech was greatly reduced by background noise at an early processing stage, it was markedly restored by the addition of visual speech input. In background noise, multisensory integration occurred at much lower frequencies and was shown to predict the multisensory gain in behavioral performance at a time lag of ∼250 ms. Critically, we demonstrated that inverse effectiveness, in the context of natural audiovisual (AV) speech processing, relies on crossmodal integration over long temporal windows. Our findings suggest that disparate integration mechanisms contribute to the efficient processing of AV speech in background noise.
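
Inverse effectiveness is usually quantified as some form of multisensory gain relative to unisensory performance. The sketch below shows one common formulation (not necessarily the metric used in the paper), with made-up proportions correct purely to illustrate how the relative gain grows as unisensory performance degrades in noise.

    def multisensory_gain(av, a, v):
        # gain of audiovisual performance over the better unisensory condition
        best_uni = max(a, v)
        return (av - best_uni) / best_uni

    # Illustrative (made-up) proportions correct:
    print(multisensory_gain(av=0.95, a=0.92, v=0.60))   # quiet: modest relative gain
    print(multisensory_gain(av=0.70, a=0.45, v=0.55))   # noisy: larger gain (inverse effectiveness)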


MC4: J Neurosci. 2016 Sep 14;36(37):9572-9.

Transitional Probabilities Are Prioritized over Stimulus/Pattern Probabilities in Auditory Deviance Detection: Memory Basis for Predictive Sound Processing.

Mittag M, Takegata R, Winkler I.

Representations encoding the probabilities of auditory events do not directly support predictive processing. In contrast, information about the probability with which a given sound follows another (transitional probability) allows predictions of upcoming sounds. We tested whether behavioral and cortical auditory deviance detection (the latter indexed by the mismatch negativity event-related potential) relies on probabilities of sound patterns or on transitional probabilities. We presented healthy adult volunteers with three types of rare tone-triplets among frequent standard triplets of high-low-high (H-L-H) or L-H-L pitch structure: proximity deviant (H-H-H/L-L-L), reversal deviant (L-H-L/H-L-H), and first-tone deviant (L-L-H/H-H-L). If deviance detection was based on pattern probability, reversal and first-tone deviants should be detected with similar latency because both differ from the standard at the first pattern position. If deviance detection was based on transitional probabilities, then reversal deviants should be the most difficult to detect because, unlike the other two deviants, they contain no low-probability pitch transitions. The data clearly showed that both behavioral and cortical auditory deviance detection uses transitional probabilities. Thus, the memory traces underlying cortical deviance detection may provide a link between stimulus probability-based change/novelty detectors operating at lower levels of the auditory system and higher auditory cognitive functions that involve predictive processing.
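
The logic behind the reversal-deviant prediction can be checked on a toy stimulus stream: the L-H-L triplet is rare as a pattern, yet every pairwise transition it contains is common in an H-L-H standard stream. A small sketch (the deviant rate and sequence length are arbitrary choices):

    import random
    from collections import Counter

    random.seed(0)
    triplets = [("L", "H", "L") if random.random() < 0.1 else ("H", "L", "H")
                for _ in range(1000)]                       # ~10% reversal deviants
    stream = [tone for triplet in triplets for tone in triplet]

    pattern_counts = Counter(triplets)                      # triplet (pattern) probabilities
    transition_counts = Counter(zip(stream, stream[1:]))    # first-order transitional probabilities

    n_pat, n_tr = sum(pattern_counts.values()), sum(transition_counts.values())
    print({p: c / n_pat for p, c in pattern_counts.items()})
    print({t: c / n_tr for t, c in transition_counts.items()})

The pattern probabilities single out L-H-L as rare, but the transitional probabilities do not: H-to-L and L-to-H dominate the stream whether or not a deviant is present, which is why a purely transition-based detector is expected to find reversal deviants hardest, as reported.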


MC5: Nat Neurosci. 2016 Jan;19(1):158-64.

Cortical tracking of hierarchical linguistic structures in connected speech.

Ding N, Melloni L, Zhang H, Tian X, Poeppel D.

The most critical attribute of human language is its unbounded combinatorial nature: smaller elements can be combined into larger structures on the basis of a grammatical system, resulting in a hierarchy of linguistic units, such as words, phrases and sentences. Mentally parsing and representing such structures, however, poses challenges for speech comprehension. In speech, hierarchical linguistic structures do not have boundaries that are clearly defined by acoustic cues and must therefore be internally and incrementally constructed during comprehension. We found that, during listening to connected speech, cortical activity of different timescales concurrently tracked the time course of abstract linguistic structures at different hierarchical levels, such as words, phrases and sentences. Notably, the neural tracking of hierarchical linguistic structures was dissociated from the encoding of acoustic cues and from the predictability of incoming words. Our results indicate that a hierarchy of neural processing timescales underlies grammar-based internal construction of hierarchical linguistic structure.
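
A common way to read out this kind of concurrent tracking is frequency tagging: if words, phrases and sentences occur at fixed rates, tracking of each level appears as a spectral peak at the corresponding rate in the neural response. The sketch below assumes illustrative rates of 4, 2 and 1 Hz (the abstract does not give the actual presentation rates).

    import numpy as np

    def structure_peaks(response, fs, rates=(1.0, 2.0, 4.0)):
        # power of the trial-averaged response spectrum at assumed
        # sentence (1 Hz), phrase (2 Hz) and word (4 Hz) rates
        spec = np.abs(np.fft.rfft(response)) ** 2
        freqs = np.fft.rfftfreq(len(response), d=1.0 / fs)
        return {r: spec[np.argmin(np.abs(freqs - r))] for r in rates}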


YB1: Neuron. 2012 Oct 18;76(2):435-49.

Discrete neocortical dynamics predict behavioral categorization of sounds.

Bathellier B, Ushakova L, Rumpel S.

The ability to group stimuli into perceptual categories is essential for efficient interaction with the environment. Discrete dynamics that emerge in brain networks are believed to be the neuronal correlate of category formation. Observations of such dynamics have recently been made; however, it is still unresolved if they actually match perceptual categories. Using in vivo two-photon calcium imaging in the auditory cortex of mice, we show that local network activity evoked by sounds is constrained to few response modes. Transitions between response modes are characterized by an abrupt switch, indicating attractor-like, discrete dynamics. Moreover, we show that local cortical responses quantitatively predict discrimination performance and spontaneous categorization of sounds in behaving mice. Our results therefore demonstrate that local nonlinear dynamics in the auditory cortex generate spontaneous sound categories which can be selected for behavioral or perceptual decisions.
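
One simple way to ask whether population responses fall into a few discrete modes is to cluster trial-wise response vectors and inspect how many clusters capture the data; k-means below is an illustrative choice, not the authors' analysis.

    import numpy as np
    from sklearn.cluster import KMeans

    def response_modes(responses, n_modes=4):
        # responses: array of shape (trials, neurons), e.g. deconvolved calcium responses
        km = KMeans(n_clusters=n_modes, n_init=10, random_state=0).fit(responses)
        return km.labels_, km.cluster_centers_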


YB2: J Neurosci. 2004 Nov 17;24(46):10440-53.

Multiple time scales of adaptation in auditory cortex neurons.

Ulanovsky N, Las L, Farkas D, Nelken I.

Neurons in primary auditory cortex (A1) of cats show strong stimulus-specific adaptation (SSA). In probabilistic settings, in which one stimulus is common and another is rare, responses to common sounds adapt more strongly than responses to rare sounds. This SSA could be a correlate of auditory sensory memory at the level of single A1 neurons. Here we studied adaptation in A1 neurons, using three different probabilistic designs. We showed that SSA has several time scales concurrently, spanning many orders of magnitude, from hundreds of milliseconds to tens of seconds. Similar time scales are known for the auditory memory span of humans, as measured both psychophysically and using evoked potentials. A simple model, with linear dependence on both short-term and long-term stimulus history, provided a good fit to A1 responses. Auditory thalamus neurons did not show SSA, and their responses were poorly fitted by the same model. In addition, SSA increased the proportion of failures in the responses of A1 neurons to the adapting stimulus. Finally, SSA caused a bias in the neuronal responses to unbiased stimuli, enhancing the responses to eccentric stimuli. Therefore, we propose that a major function of SSA in A1 neurons is to encode auditory sensory memory on multiple time scales. This SSA might play a role in stream segregation and in binding of auditory objects over many time scales, a property that is crucial for processing of natural auditory scenes in cats and of speech and music in humans.
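
The "simple model, with linear dependence on both short-term and long-term stimulus history" can be caricatured as follows: the response to each tone is reduced in proportion to exponentially weighted counts of earlier presentations of the same tone, with one short and one long time constant. All parameter values below are assumptions chosen for illustration.

    import numpy as np

    def ssa_model(stimuli, times, w_short=0.3, w_long=0.05,
                  tau_short=1.0, tau_long=30.0, r0=1.0):
        # stimuli: list of tone identities; times: presentation times in seconds
        resp = []
        for i, (s, t) in enumerate(zip(stimuli, times)):
            prev = [tj for sj, tj in zip(stimuli[:i], times[:i]) if sj == s]
            h_short = sum(np.exp(-(t - tj) / tau_short) for tj in prev)
            h_long = sum(np.exp(-(t - tj) / tau_long) for tj in prev)
            resp.append(max(r0 - w_short * h_short - w_long * h_long, 0.0))
        return np.array(resp)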


YB3: Nat Neurosci. 2003 Apr;6(4):391-8.

Processing of low-probability sounds by cortical neurons.

Ulanovsky N, Las L, Nelken I.

The ability to detect rare auditory events can be critical for survival. We report here that neurons in cat primary auditory cortex (A1) responded more strongly to a rarely presented sound than to the same sound when it was common. For the rare stimuli, we used both frequency and amplitude deviants. Moreover, some A1 neurons showed hyperacuity for frequency deviants--a frequency resolution one order of magnitude better than receptive field widths in A1. In contrast, auditory thalamic neurons were insensitive to the probability of frequency deviants. These phenomena resulted from stimulus-specific adaptation in A1, which may be a single-neuron correlate of an extensively studied cortical potential--mismatch negativity--that is evoked by rare sounds. Our results thus indicate that A1 neurons, in addition to processing the acoustic features of sounds, may also be involved in sensory memory and novelty detection.


YB4: J Neurosci. 2006 May 3;26(18):4970-82.

Perceptual learning directs auditory cortical map reorganization through top-down influences.

Polley DB, Steinberg EE, Merzenich MM.

The primary sensory cortex is positioned at a confluence of bottom-up dedicated sensory inputs and top-down inputs related to higher-order sensory features, attentional state, and behavioral reinforcement. We tested whether topographic map plasticity in the adult primary auditory cortex and a secondary auditory area, the suprarhinal auditory field, was controlled by the statistics of bottom-up sensory inputs or by top-down task-dependent influences. Rats were trained to attend to independent parameters, either frequency or intensity, within an identical set of auditory stimuli, allowing us to vary task demands while holding the bottom-up sensory inputs constant. We observed a clear double-dissociation in map plasticity in both cortical fields. Rats trained to attend to frequency cues exhibited an expanded representation of the target frequency range within the tonotopic map but no change in sound intensity encoding compared with controls. Rats trained to attend to intensity cues expressed an increased proportion of nonmonotonic intensity response profiles preferentially tuned to the target intensity range but no change in tonotopic map organization relative to controls. The degree of topographic map plasticity within the task-relevant stimulus dimension was correlated with the degree of perceptual learning for rats in both tasks. These data suggest that enduring receptive field plasticity in the adult auditory cortex may be shaped by task-specific top-down inputs that interact with bottom-up sensory inputs and reinforcement-based neuromodulator release. Top-down inputs might confer the selectivity necessary to modify a single feature representation without affecting other spatially organized feature representations embedded within the same neural circuitry.


YB5: Nature. 2011 Dec 7;480(7377):331-5.

A disinhibitory microcircuit for associative fear learning in the auditory cortex.

Letzkus JJ, Wolff SB, Meyer EM, Tovote P, Courtin J, Herry C, Lüthi A.

Learning causes a change in how information is processed by neuronal circuits. Whereas synaptic plasticity, an important cellular mechanism, has been studied in great detail, we know much less about how learning is implemented at the level of neuronal circuits and, in particular, how interactions between distinct types of neurons within local networks contribute to the process of learning. Here we show that acquisition of associative fear memories depends on the recruitment of a disinhibitory microcircuit in the mouse auditory cortex. Fear-conditioning-associated disinhibition in auditory cortex is driven by foot-shock-mediated cholinergic activation of layer 1 interneurons, in turn generating inhibition of layer 2/3 parvalbumin-positive interneurons. Importantly, pharmacological or optogenetic block of pyramidal neuron disinhibition abolishes fear learning. Together, these data demonstrate that stimulus convergence in the auditory cortex is necessary for associative fear learning to complex tones, define the circuit elements mediating this convergence and suggest that layer-1-mediated disinhibition is an important mechanism underlying learning and information processing in neocortical circuits.


YB6: Nat Neurosci. 2011 Jan;14(1):108-14.

Auditory cortex spatial sensitivity sharpens during task performance.

Lee CC, Middlebrooks JC.

Activity in the primary auditory cortex (A1) is essential for normal sound localization behavior, but previous studies of the spatial sensitivity of neurons in A1 have found broad spatial tuning. We tested the hypothesis that spatial tuning sharpens when an animal engages in an auditory task. Cats performed a task that required evaluation of the locations of sounds and one that required active listening, but in which sound location was irrelevant. Some 26-44% of the units recorded in A1 showed substantially sharpened spatial tuning during the behavioral tasks as compared with idle conditions, with the greatest sharpening occurring during the location-relevant task. Spatial sharpening occurred on a scale of tens of seconds and could be replicated multiple times in ∼1.5-h test sessions. Sharpening resulted primarily from increased suppression of responses to sounds at least-preferred locations. That and an observed increase in latencies suggest an important role of inhibitory mechanisms.


YB7: J Neurosci. 2012 Feb 29;32(9):3193-210.

Activity related to perceptual judgment and action in primary auditory cortex.

Niwa M, Johnson JS, O'Connor KN, Sutter ML.

Recent evidence is reshaping the view of primary auditory cortex (A1) from a unisensory area to one more involved in dynamically integrating multisensory- and task-related information. We found A1 single- (SU) and multiple-unit (MU) activity correlated with macaques' choices in an amplitude modulation (AM) discrimination task. Animals were trained to discriminate AM noise from unmodulated noise by releasing a lever for AM noise and holding down the lever for unmodulated noise. Activity for identical stimuli was compared between trials where the animals reported AM and trials where they did not. We found 47.4% of MUs and 22.8% of SUs significantly increased firing shortly before the animal's behavioral response to report AM when compared to the equivalent time period on trials where AM was not reported. Activity was also linked to lever release in a different task context, suggesting A1 modulation by somatosensory, or efference copy, input. When spikes were counted only during the stimulus, 19.6% of MUs and 13.8% of SUs increased firing rate when animals reported AM compared to when they did not, suggesting an attentional effect, or that A1 activity can be used by higher decision areas, or that such areas provide feedback to A1. Activity associated with AM reporting was correlated with a unit's AM sensitivity, suggesting AM sensitive neurons' involvement in task performance. A1 neurons' phase locking to AM correlated more weakly (compared to firing rate) with the animals' report of AM, suggesting a preferential role for rate-codes in A1 for this AM discrimination task.
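
Comparing activity for identical stimuli between report and no-report trials is typically done with a rate difference plus a rank-based statistic, from which an ROC-style "choice probability" can also be derived; the sketch below is one standard way to do this, not necessarily the authors' exact statistics.

    import numpy as np
    from scipy.stats import mannwhitneyu

    def choice_related(rates_report, rates_no_report):
        # spike counts for identical stimuli, split by whether AM was reported
        u, p = mannwhitneyu(rates_report, rates_no_report, alternative="two-sided")
        cp = u / (len(rates_report) * len(rates_no_report))   # ROC area ("choice probability")
        return np.mean(rates_report) - np.mean(rates_no_report), cp, p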


YB8: Nat Neurosci. 2009 May;12(5):646-54.

Engaging in an auditory task suppresses responses in auditory cortex.

Otazu GH, Tai LH, Yang Y, Zador AM.

Although systems that are involved in attentional selection have been studied extensively, much less is known about nonselective systems. To study these preparatory mechanisms, we compared activity in auditory cortex that was elicited by sounds while rats performed an auditory task ('engaged') with activity that was elicited by identical stimuli while subjects were awake but not performing a task ('passive'). We found that engagement suppressed responses, an effect that was opposite in sign to that elicited by selective attention. In the auditory thalamus, however, engagement enhanced spontaneous firing rates but did not affect evoked responses. These results indicate that neural activity in auditory cortex cannot be viewed simply as a limited resource that is allocated in greater measure as the state of the animal passes from somnolent to passively listening to engaged and attentive. Instead, the engaged condition possesses a characteristic and distinct neural signature in which sound-evoked responses are paradoxically suppressed.


YB9: Neuron. 2014 Jun 4;82(5):1157-70.

Neural correlates of task switching in prefrontal cortex and primary auditory cortex in a novel stimulus selection task for rodents.

Rodgers CC, DeWeese MR.

Animals can selectively respond to a target sound despite simultaneous distractors, just as humans can respond to one voice at a crowded cocktail party. To investigate the underlying neural mechanisms, we recorded single-unit activity in primary auditory cortex (A1) and medial prefrontal cortex (mPFC) of rats selectively responding to a target sound from a mixture. We found that prestimulus activity in mPFC encoded the selection rule-which sound from the mixture the rat should select. Moreover, electrically disrupting mPFC significantly impaired performance. Surprisingly, prestimulus activity in A1 also encoded selection rule, a cognitive variable typically considered the domain of prefrontal regions. Prestimulus changes correlated with stimulus-evoked changes, but stimulus tuning was not strongly affected. We suggest a model in which anticipatory activation of a specific network of neurons underlies the selection of a sound from a mixture, giving rise to robust and widespread rule encoding in both brain regions.
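
"Prestimulus activity encoded the selection rule" is the kind of claim usually backed by a cross-validated decoder; a minimal sketch (logistic regression and 5-fold cross-validation are illustrative choices, not the authors' method):

    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    def rule_decoding_accuracy(prestim_counts, rule_labels):
        # prestim_counts: array of shape (trials, units); rule_labels: one label per trial
        clf = LogisticRegression(max_iter=1000)
        return cross_val_score(clf, prestim_counts, rule_labels, cv=5).mean()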


YB10: J Neurosci. 2011 Aug 17;31(33):11867-78.

Extra-classical tuning predicts stimulus-dependent receptive fields in auditory neurons.

Schneider DM, Woolley SM.

The receptive fields of many sensory neurons are sensitive to statistical differences among classes of complex stimuli. For example, excitatory spectral bandwidths of midbrain auditory neurons and the spatial extent of cortical visual neurons differ during the processing of natural stimuli compared to the processing of artificial stimuli. Experimentally characterizing neuronal nonlinearities that contribute to stimulus-dependent receptive fields is important for understanding how neurons respond to different stimulus classes in multiple sensory modalities. Here we show that in the zebra finch, many auditory midbrain neurons have extra-classical receptive fields, consisting of sideband excitation and sideband inhibition. We also show that the presence, degree, and asymmetry of stimulus-dependent receptive fields during the processing of complex sounds are predicted by the presence, valence, and asymmetry of extra-classical tuning. Neurons for which excitatory bandwidth expands during the processing of song have extra-classical excitation. Neurons for which frequency tuning is static and for which excitatory bandwidth contracts during the processing of song have extra-classical inhibition. Simulation experiments further demonstrate that stimulus-dependent receptive fields can arise from extra-classical tuning with a static spike threshold nonlinearity. These findings demonstrate that a common neuronal nonlinearity can account for the stimulus dependence of receptive fields estimated from the responses of auditory neurons to stimuli with natural and non-natural statistics.
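
The simulation argument in the final sentences can be reproduced in miniature: give a model neuron a classical excitatory centre plus weaker extra-classical sidebands, pass the filter output through a static spike threshold, and the receptive field estimated from narrowband (tone-like) versus broadband (song-like) stimuli comes out different even though the underlying filter never changes. All weights, the threshold and the stimulus statistics below are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(1)
    n_freq = 21
    center = 10

    # Linear weights: classical excitatory centre plus weaker extra-classical sidebands
    w = np.zeros(n_freq)
    w[center] = 1.0
    w[center - 4] = w[center + 4] = 0.4
    theta = 0.8                                    # static spike threshold

    def respond(stim):
        # LN model: linear filter followed by a static threshold nonlinearity
        return np.maximum(stim @ w - theta, 0.0)

    # "Classical" RF from pure tones: sideband tones alone stay below threshold
    tones = np.eye(n_freq)
    tone_rf = respond(tones)                       # nonzero only at the centre frequency

    # RF estimated from broadband, song-like spectra: sidebands now modulate spiking
    stim = rng.random((20000, n_freq))
    resp = respond(stim)
    broad_rf = (stim - stim.mean(0)).T @ (resp - resp.mean()) / len(resp)

    # tone_rf is narrow, whereas broad_rf carries weight at the sideband frequencies:
    # the estimated receptive field depends on the stimulus class despite a fixed filter.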


YB11: Nat Neurosci. 2014 Jun;17(6):841-50.

Scaling down of balanced excitation and inhibition by active behavioral states in auditory cortex.

Zhou M, Liang F, Xiong XR, Li L, Li H, Xiao Z, Tao HW, Zhang LI.

Cortical sensory processing is modulated by behavioral and cognitive states. How this modulation is achieved by changing synaptic circuits remains largely unknown. In awake mouse auditory cortex, we found that sensory-evoked spike responses of layer 2/3 (L2/3) excitatory cells were scaled down with preserved sensory tuning when mice transitioned from quiescence to active behaviors, including locomotion, whereas L4 and thalamic responses were unchanged. Whole-cell voltage-clamp recordings revealed that tone-evoked synaptic excitation and inhibition exhibited a robust functional balance. The change to active states caused scaling down of excitation and inhibition at approximately equal levels in L2/3 cells, but resulted in no synaptic changes in L4 cells. This lamina-specific gain control could be attributed to an enhancement of L1-mediated inhibitory tone, with L2/3 parvalbumin inhibitory neurons also being suppressed. Thus, L2/3 circuits can adjust the salience of output in accordance with momentary behavioral demands while maintaining the sensitivity and quality of sensory processing.