Interspeech 2025 Special Session: Biosignal-enabled Spoken Communication

Organizers

  • Kevin Scheck (University of Bremen, Germany)
  • Dr. Siqi Cai (National University of Singapore, Singapore)
  • Prof. Tanja Schultz (University of Bremen, Germany)
  • Prof. Satoshi Nakamura (Nara Institute of Science and Technology, Japan; The Chinese University of
    Hong Kong, Shenzhen, China)
  • Prof. Haizhou Li (National University of Singapore, Singapore; The Chinese University of
    Hong Kong, Shenzhen, China)

Introduction

The main topic of this special session is speech-related biosignals, such as those arising from articulatory or neural activity during speech production or perception. For speech production, biosignals such as Electromagnetic Articulography (EMA) or Electromyography (EMG) provide information about articulatory trajectories or related muscle activity. Biosignals of neural activity, such as Electroencephalography (EEG) or Electrocorticography (ECoG), are also investigated in the context of speech production and perception. By analyzing these biosignals, researchers gain insights into the mechanisms underlying speech processes and can also explore individual differences in how speech processing tasks are performed.

Moreover, biosignals can serve as alternative modalities to the acoustic signal for speech-driven systems, enabling novel speech communication devices. For instance, silent speech interfaces are being developed to restore spoken communication for speech-impaired individuals, e.g., after a laryngectomy, by converting speech-related biosignals into intelligible speech. Likewise, biosignals related to speech perception are being explored for neuro-steered hearing aids. Progress in this field is expected to lead to novel biosignal-enabled assistive speech technologies, such as voice prostheses or hearing aids. With the special session “Biosignal-enabled Spoken Communication”, we aim to bring together researchers working on biosignals and speech processing to exchange ideas on interdisciplinary topics.

Topics

Topics of interest for this special session include, but are not limited to:

  • Processing of speech-related biosignals, such as those of articulatory activity, captured by, e.g., Electromagnetic Articulography (EMA), Electromyography (EMG), Ultrasound Tongue Imaging (UTI), or High-Speed Nasopharyngoscopy (HSN), or those of neural activity, measured by, e.g., Electroencephalography (EEG), Magnetoencephalography (MEG), Electrocorticography (ECoG), or functional magnetic resonance imaging (fMRI). Further biosignals can stem from respiratory, laryngeal, or other speech-related activities.
  • Analysis of biosignals for evaluating and/or explaining individual differences in human speech processing tasks, such as speech perception and production.
  • Integration of biosignals into acoustic speech processing systems as an additional modality to increase their general performance, user adaptability, or explainability.
  • Use of biosignals in speech processing tasks, e.g., speech recognition, synthesis, enhancement, voice conversion, and auditory attention detection.
  • Self-supervised pre-training of biosignal models for downstream speech processing tasks. By utilizing unlabeled biosignal data, these methods aim to learn robust feature representations that improve accuracy in downstream applications.
  • Development of novel machine learning algorithms, feature representations, model architectures, as well as training and evaluation strategies for biosignal processing.
  • Applications of speech-related biosignal processing, such as speech restoration, training, therapy, or mental health assessments. Further applications include speech-related brain-computer interfaces, voice prostheses, communication in noisy environments, or privacy-preserving spoken communication.

Paper Submission and Session Format

Paper submissions must conform to the Interspeech 2025 paper format. When submitting the paper in the Interspeech electronic paper submission system, please indicate that the paper should be included in the Special Session “Biosignal-enabled Spoken Communication”. All submissions will take part in the normal paper review process.

The session format will be either a poster session or oral presentations, depending on the number of accepted papers. We will therefore inform participants about the format shortly after the acceptance notification (May 21st, 2025).

Important Dates

Submission portal opened: December 18th, 2024
Paper submission deadline: February 12th, 2025
Paper update deadline: February 19th, 2025
Final list of special sessions: May 14th, 2025
Acceptance notification: May 21st, 2025

Contact

If you have any questions, please contact either Kevin Scheck or Siqi Cai.