Dataset
The official dataset of the challenge is described below.
- The dataset is designed to cover the core scenarios of emotional intelligence and full-duplex interaction, with sufficient diversity and authenticity to comprehensively evaluate participating models. It includes dialogue scenarios in both Chinese and English, spanning a wide range of emotional and conversational contexts. For each task in the challenge, we will provide a dedicated set of real-world recorded speech data as the development (Dev) and test (Test) sets. These sets are collected from natural human-human or human-machine interactions to ensure authenticity and to cover diverse scenarios aligned with the respective tasks.
- To support model training, a synthetic training dataset will be released. This dataset is generated through a standardized data synthesis pipeline, which includes the following stages (a minimal sketch is given after this list):
  - text generation from predefined task prompts;
  - emotion and dialogue context conditioning;
  - speech synthesis using state-of-the-art TTS models with controllable style and prosody;
  - augmentation with background noise, reverberation, and multi-speaker conditions where appropriate.
- In addition, we will release the complete data generation pipeline, enabling participants to reproduce or extend the synthetic dataset if desired. Participants are also free to use any publicly available speech or text datasets to train or fine-tune their models, provided they do not use any private or unauthorized data sources.
- All participants are strictly prohibited from using any part of the official test set for training or parameter tuning. The use of test labels or any test data leakage will result in disqualification.
- The dataset will be released according to the challenge schedule.
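The sketch below illustrates how the synthesis stages listed above could be chained. It is not the official pipeline: the function names, the 16 kHz sampling rate, and the text-generation and TTS stand-ins are illustrative assumptions; only the noise and reverberation augmentation is concrete, implemented with NumPy.

```python
# Hypothetical sketch of the described synthesis pipeline.
# Names, parameters, and stand-in stages are assumptions, not the official code.
import numpy as np

SAMPLE_RATE = 16000  # assumed sampling rate


def generate_text(task_prompt: str, emotion: str) -> str:
    """Stand-in for text generation conditioned on a task prompt and emotion."""
    # A real pipeline would query a text-generation model here.
    return f"[{emotion}] Response to: {task_prompt}"


def synthesize_speech(text: str, style: str = "neutral") -> np.ndarray:
    """Stand-in for a controllable TTS model; text/style would condition real synthesis."""
    # Placeholder waveform (1 s low-amplitude tone) in place of real TTS output.
    t = np.linspace(0.0, 1.0, SAMPLE_RATE, endpoint=False)
    return (0.1 * np.sin(2 * np.pi * 220.0 * t)).astype(np.float32)


def add_noise(speech: np.ndarray, snr_db: float = 20.0) -> np.ndarray:
    """Mix white noise at a target SNR; real data would use recorded background noise."""
    noise = np.random.randn(len(speech)).astype(np.float32)
    speech_power = np.mean(speech ** 2) + 1e-12
    noise_power = np.mean(noise ** 2) + 1e-12
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10.0)))
    return speech + scale * noise


def add_reverb(speech: np.ndarray, rt60: float = 0.3) -> np.ndarray:
    """Convolve with a synthetic exponentially decaying impulse response."""
    ir_len = int(rt60 * SAMPLE_RATE)
    t = np.arange(ir_len) / SAMPLE_RATE
    ir = np.exp(-6.9 * t / rt60) * np.random.randn(ir_len)  # ~60 dB decay at rt60
    ir /= np.max(np.abs(ir)) + 1e-12
    wet = np.convolve(speech, ir)[: len(speech)]
    return (0.7 * speech + 0.3 * wet).astype(np.float32)


if __name__ == "__main__":
    text = generate_text("Comfort a user who missed a deadline.", emotion="empathetic")
    clean = synthesize_speech(text, style="gentle")
    augmented = add_reverb(add_noise(clean, snr_db=15.0))
    print(text, augmented.shape, augmented.dtype)
```

In a real setup, the two stand-in functions would be replaced by calls to a text-generation model and a style-controllable TTS system, while the augmentation stages could additionally mix in recorded background noise and measured room impulse responses.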