Paris Workshop on Decoding of Sound and Brain
Paris, 5-6 November 2015
Follow-up:
Jonathan Simon's slides and references are at http://www.isr.umd.edu/Labs/CSSL/simonlab/Publications.html.
Source code for Stéphane Mallat's reconstruction examples is at https://github.com/lostanlen/decoding-workshop-2015.
Alexandre Gramfort's slides are at https://dl.dropboxusercontent.com/u/2140486/workshop_decoding_ens_2015.pdf.
Chris Honey's slides are at https://www.dropbox.com/s/1t58lkywr34ndjr/Speech_Timescales_ENS_Paris_2015_v2.pdf?dl=0.
Michael Tangermann's slides are at https://www.dropbox.com/s/g09q3vogzye7usx/Paris_DecodingSoundBrain_2015_11_05_short.pdf?dl=0.
Malcolm Slaney's toolbox slides are at https://www.dropbox.com/s/7c367vxoa462qg4/Telluride%20Decoding%20Toolbox%20Announcement.pptx?dl=0.
Benjamin Blankertz's slides are at https://owncloud.tu-berlin.de/public.php?service=files&t=6bd3debc4d8fc71ced0af31185eb186b&download; data sets are at http://bnci-horizon-2020.eu/database/data-sets.
Thomas Andrillon's slides are at https://www.dropbox.com/s/wvuq1x55wjl4ucu/Andrillon_DecodingWkshop_WNL.pdf?dl=0.
Christophe Micheyl's overview is at https://www.dropbox.com/s/g0b3wjozw7mgutp/BrainDecodingSpeechAttention_Overview.pptx?dl=0.
Martin McKinney's overview of Ed Lalor's talk is at https://www.dropbox.com/s/g5ck2t9p0cyugcx/Overview%20of%20Ed%20Lalor%20Talk.pptx?dl=0.
Jeff Cruckley's overview of Maarten de Vos's talk is at https://www.dropbox.com/s/edr942ei6a2ryjl/Paris_Maarten_de_Vos.pptx?dl=0.
Photos: here, here, here.
At the Centre Culturel Irlandais. Organized by Alain de Cheveigné, Shihab Shamma, Malcolm Slaney, Daniel Wong, and Daniel Pressnitzer.
Map. Hotels: Hotel des Grandes Ecoles, Hotel des Nations St Germain, Hotel St Christophe.
Transportation: from Charles de Gaulle (Roissy) airport, take the RER (suburban train) to Luxembourg station (all RER trains from the airport stop there). From there you can walk to your hotel (don't forget the map!) or take a cab. Alternatively, get off at Saint-Michel station and change to the metro, direction Austerlitz, getting off at Cardinal Lemoine. Directions from the Gare du Nord train station and from Orly airport are similar.
Ongoing sounds at our ears affect the ongoing signals in our brain. Our workshop aims to understand how the two are related. The workshop will bring together: (a) people working on brain activity, in particular in response to sound; (b) people working on audio signal analysis, in particular long-term and higher-order structure; (c) experts in relevant signal processing and machine learning techniques; and (d) people working on applications such as brain-computer interfaces (BCI). The aim is to explore new ideas and tools for the joint analysis of sound and brain response patterns.
Invited discussants: Dorothée Arzounian, Nicolas Barascud, Daniel Bates, Adam Bednar, Laurent Bonnasse-Gahot, Michael Crosse, Jeff Cruckley, Torsten Dau, Giovanni Di Liberto, Emmanuel Dupoux, Søren Fuglsang, Hynek Hermansky, Jens Hjortkjær, Radu Horaud, Shih-Chii Liu, Thomas Lunner, Huan Luo, Martin McKinney, Nima Mesgarani, Christophe Micheyl, Jean-Pierre Nadal, Israel Nelken, Ulrich Pomper, Oiwi Parker Jones, Gaël Richard, Romain Serizel, Naftali Tishby, Anthony Wakulicz, Virginie van Wassenhove.
Speakers: Thomas Andrillon, Benjamin Blankertz, Maarten de Vos, Mounya Elhilali, Alexandre Gramfort, Sophie Herbst, Chris Honey, Sid Kouider, Ed Lalor, Stéphane Mallat, Sylvie Nozaradan, Jonathan Simon, Malcolm Slaney, Michael Tangermann.
Tentative schedule (subject to change):

Thursday 5:
9:30 INTRODUCTION
9:50 (50') Jonathan Simon
10:40 (30') COFFEE
11:10 (50') Thomas Andrillon
12:00 (50') Stéphane Mallat
12:50 (80') LUNCH
14:10 (50') Mounya Elhilali
15:00 (50') Michael Tangermann
15:50 (30') COFFEE
16:20 (50') Benjamin Blankertz
17:10 (50') Malcolm Slaney

Friday 6:
9:30 (50') Alexandre Gramfort
10:20 (50') Sylvie Nozaradan
11:10 (30') COFFEE
11:40 (50') Sid Kouider
12:30 (50') Maarten de Vos
13:20 (80') LUNCH
14:40 (50') Sophie Herbst
15:30 (50') Christopher Honey
16:20 (20') COFFEE
16:40 (50') Ed Lalor
17:30 END
Each 50' slot comprises a 20-30' talk followed by a complementary 30-20' discussion.
Speakers, titles, and abstracts:
Thomas Andrillon
Learning noise: how the brain catches acoustic regularities, and its implications for auditory processing
Experience continuously imprints on the brain at all stages of life. The traces it leaves behind can produce perceptual learning, which drives adaptive behavior toward previously encountered stimuli. Recently, it has been shown that even random noise, a type of sound devoid of acoustic structure, can trigger fast and robust perceptual learning after repeated exposure. Yet the mechanisms underlying such a baffling form of learning are unclear. In particular, what exactly is learnt within these repeated noise snippets? We recently found that noise learning was associated with the emergence of evoked potentials despite the absence of salient acoustic features. Interestingly, these evoked potentials were similar to classical auditory potentials and randomly distributed in time, suggesting idiosyncratic and local learning. We also showed that this form of learning is possible under conditions of diverted attention, and even in certain sleep stages. But how can such fragments of noise be learnt? Despite its apparent complexity, noise learning can be implemented with simple plasticity rules such as Spike-Timing-Dependent Plasticity (STDP). It has been shown that in silico networks endowed with STDP almost inevitably learn recurrent random inputs. The automaticity of such learning mechanisms could explain why noise can be learnt without attention or awareness. Finally, our results could also be relevant for more natural sounds. The formation of a sharp selectivity to rapidly learnt features could, for example, aid source discrimination. However, when noise levels are high, source identification could be shaped as much by idiosyncratic experience as by acoustic properties.
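The plasticity rule invoked here can be made concrete. Below is a minimal sketch of a pair-based STDP update in Python; the amplitudes and time constants are purely illustrative assumptions (the abstract gives no parameter values). With a repeated noise snippet, the same pre/post spike pairs recur at fixed delays, so these small updates accumulate into selectivity for the repeated pattern.

    import numpy as np

    def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau_plus=0.02, tau_minus=0.02):
        """Pair-based STDP: weight change as a function of the spike-time
        difference dt = t_post - t_pre (seconds).

        Pre-before-post (dt >= 0) potentiates; post-before-pre depresses.
        Amplitudes and time constants are illustrative, not from the talk.
        """
        dt = np.asarray(dt, dtype=float)
        return np.where(dt >= 0,
                        a_plus * np.exp(-dt / tau_plus),
                        -a_minus * np.exp(dt / tau_minus))

    # With repeated presentations of the same noise snippet, pre/post pairs
    # recur at fixed delays, so these updates accumulate over exposures.
    print(stdp_dw([-0.01, 0.0, 0.01]))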
Benjamin Blankertz
Analyzing continuous brain signals reflecting music listening
Broadening the techniques beyond event-related analysis, this talk will present multivariate methods for the analysis of neural processes during continuous music listening. Using a regression-based approach to optimize user-specific spatial filters, it is possible to obtain a mapping between the continuous brain signals and the music signal. This allows certain aspects of monophonic musical stimuli to be reconstructed from the listener's EEG. Furthermore, we explore the possibility of revealing separate neural correlates of the individual streams in polyphonic music.
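As a rough illustration of the regression idea, here is a minimal backward (stimulus-reconstruction) model: ridge regression from time-lagged multichannel EEG onto a music envelope. All data are synthetic, and the channel count, lag window, and regularization strength are assumptions, not values from the talk.

    import numpy as np

    rng = np.random.default_rng(0)
    n_samples, n_channels, n_lags, lam = 5000, 32, 16, 1e2

    envelope = np.abs(rng.standard_normal(n_samples))   # stand-in for the music envelope
    eeg = rng.standard_normal((n_samples, n_channels))  # stand-in for EEG
    eeg[:, 0] += envelope                               # plant a correlate to recover

    # Time-lagged design matrix: each row holds the EEG over a short window
    # (np.roll wrap-around at the edges is ignored for this sketch).
    X = np.column_stack([np.roll(eeg, lag, axis=0) for lag in range(n_lags)])

    # Ridge-regularized least squares: w = (X'X + lam*I)^-1 X'y
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ envelope)
    reconstruction = X @ w

    r = np.corrcoef(reconstruction, envelope)[0, 1]
    print(f"reconstruction accuracy r = {r:.2f}")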
Maarten de Vos
Towards unobtrusive real-life decoding of auditory attention
Traditionally, the electroencephalogram (EEG) is recorded with expensive hardware in a highly controlled lab environment. In order to enable brain monitoring in daily-life situations, we built a small, lightweight, wireless system that can be controlled with Android software. This makes fully mobile EEG-based brain monitoring and real-life brain-computer interfaces (BCI) realistic. We will demonstrate that we can reliably decode auditory attention in natural conditions (e.g. walking or cycling), and will also show preliminary results on decoding auditory attention from cEEGrid data, an unobtrusive way of recording EEG around the ear.
Mounya Elhilali
Investigating auditory bottom-up attention
In everyday life, we are surrounded by a cacophony of sounds that our brain has to sort through in order to focus on important information. The conspicuity of certain sound tokens relative to the scene allows them to emerge as separate within the context of their surrounding sounds. In this work, we hypothesize that as sound statistics change over time, deviations from these statistics drive bottom-up attention processes that direct our focus to certain objects in the scene. We investigate the perceptual space that renders certain sounds salient relative to their context, and examine the neural underpinnings of the emergence of these salient sounds using complex acoustic scenes as stimuli. Results reveal adaptive neural representations of the context and of salient objects, reflecting deviations from the statistical structure of the scene. We speculate that principles of predictive coding could explain a number of the observed neural and perceptual findings, and discuss the relevance of such a framework for understanding auditory scene analysis.
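One simple way to operationalize "deviation from scene statistics" is a running z-score on an acoustic feature; the toy sketch below flags a loud token against a stationary background. The window length and the feature choice are illustrative assumptions, not the model from the talk.

    import numpy as np

    def salience_trace(feature, win=200, eps=1e-8):
        """Salience as deviation from the running statistics of the scene:
        z-score of each frame against the mean/std of the preceding window."""
        z = np.zeros_like(feature)
        for t in range(win, len(feature)):
            ctx = feature[t - win:t]
            z[t] = (feature[t] - ctx.mean()) / (ctx.std() + eps)
        return np.abs(z)

    rng = np.random.default_rng(1)
    loudness = rng.standard_normal(2000) * 0.1 + 1.0  # stationary background
    loudness[1200:1220] += 2.0                        # a salient token
    print(int(np.argmax(salience_trace(loudness))))   # flags the deviation (~1200)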
Alexandre Gramfort
Temporal decoding of perceptual thresholds with M/EEG
Over the last decade, supervised learning, commonly referred to as decoding in the field of neuroscience, has emerged as a new tool to detect statistical effects in experimental data. Decoding techniques first appeared in functional MRI, under the name Multi-Variate (or Multi-Voxel) Pattern Analysis (MVPA). Over the last few years, MEG and EEG research has caught up on this trend, leveraging the ability of these modalities to reveal the fast temporal dynamics of neural processes. This talk aims to give a general introduction to MEG decoding methods and to present some recent neuroscience work in which MEG decoding has proven successful. I will then present in more depth some recent results on the decoding of perceptual thresholds in an audiovisual task involving coherent and incoherent visual motion. Due to the ordinal nature of the coherence levels parametrizing the stimuli, a novel decoding approach going beyond standard classification and regression will be introduced. Finally, a brief demo of MEG decoding with the MNE software will be presented.
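The MNE-Python package mentioned at the end provides exactly this kind of temporal decoding. The sketch below uses its SlidingEstimator, which fits one classifier per time point; the epoch counts, channel counts, and planted class effect are illustrative, not from the talk.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from mne.decoding import SlidingEstimator, cross_val_multiscore

    rng = np.random.default_rng(0)
    n_epochs, n_channels, n_times = 100, 30, 50

    X = rng.standard_normal((n_epochs, n_channels, n_times))  # stand-in for MEG epochs
    y = rng.integers(0, 2, n_epochs)                          # two stimulus classes
    X[y == 1, 0, 25:] += 1.0   # class difference after a simulated "stimulus onset"

    # One classifier per time point reveals *when* conditions become decodable.
    clf = make_pipeline(StandardScaler(), LogisticRegression())
    decoder = SlidingEstimator(clf, scoring="roc_auc")
    scores = cross_val_multiscore(decoder, X, y, cv=5).mean(axis=0)  # AUC per time point
    print(scores.round(2))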
Sophie Herbst
Variations in implicit temporal predictability are encoded in the electroencephalogram
The human brain automatically extracts temporal contingencies from the environment, but are these used to form temporal predictions even in strictly implicit timing scenarios? Using electroencephalography (EEG), we studied the neural mechanisms of temporally predictive processing in an auditory foreperiod paradigm combined with a forward encoding model of temporal hazard. Unbeknownst to participants (N=22), we induced a probabilistic variation of foreperiods in a pitch-discrimination task on a noise-embedded tone. Foreperiods were block-wise drawn either from a uniform distribution, yielding a monotonically increasing hazard of tone occurrence (nonpredictive), or from a normal distribution, yielding a mixture of increasing hazard with a peak in occurrence probability (predictive). Although participants did not detect the predictability manipulation, unpredictable foreperiods yielded slower response times. In the EEG, predictability resulted in enhanced delta (0.5-3 Hz) phase coherence over posterior channels prior to tone onset. We then constructed a forward encoding model, using the two hypothesized hazard functions as trial-wise regressors. The fit between modeled and measured time-domain EEG signals allowed us to quantify the representation of temporal hazard in the EEG. The nonpredictive, monotonically increasing hazard function was reflected in the EEG signal in all conditions, while the predictive-condition hazard function was encoded best in the predictive-condition EEG signals. This is the first attempt to quantify implicit temporal predictability from EEG data using a forward encoding model. Our data show that even when participants are unaware of temporal contingencies in their environment, these contingencies are used to form temporal predictions.
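In the same spirit, a stripped-down forward encoding analysis can be sketched in a few lines: regress the measured EEG on each hypothesized hazard regressor and compare cross-validated fits. The hazard shapes, data sizes, and noise level below are illustrative assumptions, not the study's actual model.

    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n_trials, n_times = 200, 100
    t = np.linspace(0, 1, n_times)

    # Two hypothesized hazard functions (shapes are illustrative assumptions):
    hazard_uniform = 1.0 / (1.0 - 0.9 * t)        # monotonically increasing hazard
    hazard_uniform /= hazard_uniform.max()
    hazard_normal = np.exp(-(t - 0.5) ** 2 / 0.02)  # peaked occurrence probability

    # Synthetic single-channel EEG that encodes the peaked hazard, plus noise.
    eeg = 0.8 * hazard_normal + rng.standard_normal((n_trials, n_times))

    # Forward encoding: predict EEG from the regressor; cross-validated R^2
    # quantifies how well each hazard model is represented in the signal.
    for name, regressor in [("uniform", hazard_uniform), ("normal", hazard_normal)]:
        X = np.tile(regressor, (n_trials, 1)).reshape(-1, 1)
        y = eeg.reshape(-1)
        score = cross_val_score(Ridge(alpha=1.0), X, y, cv=5).mean()
        print(f"{name} hazard: cross-validated R^2 = {score:.3f}")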
Chris Honey
Multi-scale processing of natural speech in distributed brain networks
Natural speech has a nested temporal structure. This nested structure -- in which each phoneme unfolds in the context of a word, each word unfolds in the context of a sentence, and each sentence unfolds in the context of a discourse -- has functional and physiological consequences for how our brains process speech. I will suggest that, in order to integrate information across the nested levels of spoken language, neural circuits are organized in a distributed and hierarchical manner, with regions at consecutive levels of the hierarchy exhibiting (i) longer processing timescales, i.e. more "process memory" and (ii) more temporally autocorrelated dynamics. This model makes predictions for how neural circuits respond to continuous and interrupted auditory narratives, and also for the effects of language expertise.
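The notion of a processing timescale can be quantified as the lag at which a signal's autocorrelation decays; the toy sketch below estimates this for AR(1) signals of increasing "memory". This is only one of several possible operationalizations and is not taken from the talk.

    import numpy as np

    def acf_timescale(x, dt=1.0):
        """Autocorrelation timescale: first lag at which the normalized
        autocorrelation drops below 1/e."""
        x = x - x.mean()
        acf = np.correlate(x, x, mode="full")[len(x) - 1:]
        acf /= acf[0]
        below = np.nonzero(acf < 1.0 / np.e)[0]
        return below[0] * dt if below.size else np.inf

    rng = np.random.default_rng(0)
    for phi in (0.5, 0.9, 0.99):   # stronger AR(1) memory -> longer timescale
        x = np.zeros(20000)
        for n in range(1, len(x)):
            x[n] = phi * x[n - 1] + rng.standard_normal()
        print(phi, acf_timescale(x))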
Sid Kouider
EEG decoding of auditory attention during sleep and mind-wandering
Ed Lalor
Studying natural speech processing at the phonemic level using EEG
In recent years it has been firmly established that EEG and MEG reliably entrain to the amplitude envelope of natural speech stimuli. This has facilitated the development of exciting new paradigms for investigating the neural mechanisms underlying natural speech processing. However, it has been unclear whether this envelope entrainment phenomenon simply reflects lower-level passive following of the spectrotemporal/acoustic stimulus dynamics or whether it indexes something specific to speech processing. In particular, there has been no evidence that EEG or MEG entrainment reflects processing at the level of categorical speech perception. In this talk I will attempt to convince you that EEG is sensitive not just to the low-level acoustic properties of speech, but also to higher-level phonetic features of this most important of signals. I will also outline a number of paradigms and methodological approaches for eliciting EEG indices of speech-specific processing that should be useful in advancing our understanding of receptive speech processing in particular populations.
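The standard forward modeling approach in this line of work predicts EEG from time-lagged stimulus features and asks whether adding phonetic features improves prediction beyond the envelope alone. Here is a minimal synthetic sketch of that comparison; the feature counts, lags, and regularization are assumptions, not the talk's actual analysis.

    import numpy as np

    def lagged(features, n_lags):
        """Stack time-lagged copies of stimulus features (time x features);
        np.roll wrap-around at the edges is ignored for this sketch."""
        return np.column_stack([np.roll(features, lag, axis=0)
                                for lag in range(n_lags)])

    rng = np.random.default_rng(0)
    n_samples, n_phonetic, n_lags, lam = 6000, 4, 20, 1e2

    envelope = np.abs(rng.standard_normal((n_samples, 1)))    # acoustic envelope
    phonetic = rng.random((n_samples, n_phonetic)) < 0.05     # binary phonetic features
    eeg = 0.5 * envelope[:, 0] + 0.5 * phonetic[:, 0] + rng.standard_normal(n_samples)

    def trf_fit_score(stim):
        """Fit a ridge forward model on the first half, score on the second."""
        X = lagged(stim.astype(float), n_lags)
        split = n_samples // 2
        w = np.linalg.solve(X[:split].T @ X[:split] + lam * np.eye(X.shape[1]),
                            X[:split].T @ eeg[:split])
        return np.corrcoef(X[split:] @ w, eeg[split:])[0, 1]

    print("envelope only       r =", round(trf_fit_score(envelope), 3))
    print("envelope + phonetic r =",
          round(trf_fit_score(np.column_stack([envelope, phonetic])), 3))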
Stéphane Mallat
Listening to deep network geometry and the high-level mystery
Deep convolutional networks are the winners of many audio classification problems. Over short time intervals, below 200 ms, these architectures provide geometric signal representations, which are connected to audio representations such as MFCCs and cortical transforms. They have strong similarities with representations of image geometry. We review their mathematical properties, illustrated by applications to classification, synthesis, and source separation. At larger time scales, beyond 500 ms, we enter a wild audio world, with many algorithms, little mathematics, and mostly open questions that we shall discuss.
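Below 200 ms, the representations referred to are wavelet-modulus cascades. A minimal first-order example (band-pass filtering, complex modulus, local averaging) is sketched below with illustrative Gabor parameters; the talk's actual reconstruction code is in the repository linked above.

    import numpy as np

    def gabor(n, xi, sigma):
        """Complex band-pass (Gabor) filter centered at frequency xi (cycles/sample)."""
        t = np.arange(n) - n // 2
        return np.exp(2j * np.pi * xi * t) * np.exp(-t ** 2 / (2 * sigma ** 2))

    def scatter1(x, filters, pool):
        """First-order scattering: wavelet modulus |x * psi|, then local averaging.
        The modulus discards phase, giving stability to small time shifts."""
        coeffs = []
        window = np.ones(pool) / pool
        for psi in filters:
            env = np.abs(np.convolve(x, psi, mode="same"))
            coeffs.append(np.convolve(env, window, mode="same")[::pool])
        return np.array(coeffs)

    rng = np.random.default_rng(0)
    x = np.sin(2 * np.pi * 0.05 * np.arange(4096)) + 0.1 * rng.standard_normal(4096)
    # Filter bank parameters are illustrative assumptions.
    filters = [gabor(256, xi, sigma=1 / (4 * xi)) for xi in (0.025, 0.05, 0.1, 0.2)]
    S1 = scatter1(x, filters, pool=128)
    print(S1.shape, S1.mean(axis=1).round(3))   # strongest channel: the 0.05 band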
Sylvie Nozaradan
From input-output transforms to perceptual templates: neural encoding of rhythms investigated with human depth-electrode recordings
This talk will present studies aimed at characterizing the mechanisms that allow humans to entrain mind and body to incoming rhythmic sensory inputs in real time. To this end, the transformations between rhythmic inputs and neural outputs were estimated using a frequency-tagging approach, a method increasingly used to investigate rhythm perception in a variety of contexts, including infants, brain-damaged patients, and non-human mammals. Results from human intracerebral EEG recordings in the temporal and frontal cortices will be presented and related to recent results gathered with surface EEG. Based on these results, the observed input-output transforms can be explained in part by properties of the ascending auditory pathway. However, the link between these transforms and perceptual templates of rhythms seems to actively involve auditory-motor cortical processing. Together, these results inform us about the neural encoding of sounds as performed in various cortical hubs in humans.
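Frequency tagging in essence: Fourier-transform the recording and compare the amplitude at the stimulation frequency with that of neighboring bins. The sketch below uses an illustrative sampling rate, beat frequency, and noise-bin convention, not the study's actual parameters.

    import numpy as np

    fs, dur, f_beat = 250.0, 60.0, 2.4   # Hz, s, Hz (illustrative values)
    n = int(fs * dur)
    t = np.arange(n) / fs

    rng = np.random.default_rng(0)
    eeg = 0.3 * np.cos(2 * np.pi * f_beat * t) + rng.standard_normal(n)  # entrained + noise

    amp = np.abs(np.fft.rfft(eeg)) / n
    freqs = np.fft.rfftfreq(n, 1 / fs)
    k = np.argmin(np.abs(freqs - f_beat))   # bin at the tagged frequency

    # Signal-to-noise: target bin vs mean of surrounding bins,
    # excluding the immediate neighbors to avoid spectral leakage.
    noise = np.r_[amp[k - 12:k - 2], amp[k + 3:k + 13]].mean()
    print(f"SNR at {freqs[k]:.2f} Hz: {amp[k] / noise:.1f}")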
Jonathan Simon
Neural Representations of the Cocktail Party in Human Auditory Cortex
An auditory scene is perceived in terms of its constituent auditory objects, which in a "Cocktail Party" scenario correspond to individual speech streams. Here, we investigate how these auditory objects are individually represented in human auditory cortex, using magnetoencephalography (MEG) to record the neural responses of listeners to speech streams in a variety of auditory scenes. In one experiment, subjects selectively listen to one of two competing speakers mixed into a single channel. Individual neural representations of each speaker's speech are observed, each selectively phase-locked to the rhythm of the corresponding speech stream, and each allowing exclusive reconstruction of that stream's temporal envelope. The neural representation of the attended speech, originating in posterior auditory cortex, dominates the responses. Critically, when the intensities of the attended and background speakers are separately varied over a wide range, the neural representation of the attended speech adapts only to the intensity of that speaker, not to the intensity of the background speaker. Additional acoustic scenes investigated include speech masked by noise, including the case where the sounds are processed by (simulated) cochlear implants. Overall, these results indicate that concurrent auditory objects, even if spectrally overlapping and not resolvable at the auditory periphery, are neurally separated and encoded individually as objects in higher-order auditory cortex.
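In the decoding setting, "which stream dominates the response" reduces to reconstructing an envelope from the neural data and asking which speaker it matches. The toy sketch below performs only that final comparison step, using a noisy copy of the attended envelope to stand in for the reconstruction (e.g. as produced by a backward model like the one sketched after the Blankertz abstract above).

    import numpy as np

    rng = np.random.default_rng(0)
    n = 4000
    env_a = np.abs(rng.standard_normal(n))   # attended speaker's envelope
    env_b = np.abs(rng.standard_normal(n))   # background speaker's envelope

    # Stand-in for an envelope reconstructed from MEG with a backward model.
    reconstructed = env_a + 0.8 * rng.standard_normal(n)

    r_a = np.corrcoef(reconstructed, env_a)[0, 1]
    r_b = np.corrcoef(reconstructed, env_b)[0, 1]
    print("attended:", "A" if r_a > r_b else "B", f"(r_a={r_a:.2f}, r_b={r_b:.2f})")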
Malcolm Slaney
The Telluride Decoding Toolbox
Michael Tangermann
Novel Applications for Auditory BCIs
Since BCI neurotechnology allows for single-trial analysis of attentional processes, it provides interesting tools for various clinical applications. In the first part of my talk, I will briefly review the latest research on auditory BCI approaches, with a special emphasis on applications outside the usual range of communication and control. In the second half, I will present an auditory ERP-based paradigm intended for use in the rehabilitation training of stroke patients with word-production deficits.
This workshop is supported by the COCOHA project, European Union's Horizon 2020 research and innovation programme, grant agreement No 644732, and by a PSL-UCL Collaborative Grant awarded by PSL.