Task 1: Emotion Intelligence

The Emotion Intelligence Track evaluates the emotional competence of spoken dialogue systems across five critical dimensions. These dimensions capture how well a system can perceive, interpret, express, and respond to human emotions in interactive scenarios.


  • Emotion Recognition: Identifying emotions such as happiness, sadness, anger, fear, surprise, and disgust, as well as a neutral state, from speech. This serves as the foundation of emotional intelligence in spoken interactions.
  • Textual Empathy Response: Evaluating the empathy of the system’s response text. The written content should reflect an accurate understanding of the user’s emotional state and offer supportive expression that aligns with the user’s feelings.
  • Auditory Empathetic Expression: Assessing the empathetic emotion conveyed by the system’s response audio. The speech should carry an emotional tone that matches the user’s mood so that the response achieves emotional resonance.
  • Emotion Causal Reasoning: Inferring the underlying causes of emotions in conversation, helping the model understand the user’s emotional context.
  • Emotion Dynamics Analysis: Detecting dynamic emotional shifts in the user’s speech, identifying significant changes in emotional state throughout the conversation.
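
To make the first and last dimensions concrete, the following is a minimal illustrative sketch and not the track's official scoring procedure. It assumes each user turn has already been assigned one of the seven emotion labels listed above; the names `EMOTIONS`, `recognition_accuracy`, and `emotion_shifts` are hypothetical and introduced only for this example.

```python
# Illustrative sketch only: the label set comes from the task description,
# but the functions and the per-turn label format are assumptions.
EMOTIONS = ("happy", "sad", "anger", "fear", "neutral", "surprise", "disgust")


def _check_labels(labels):
    """Raise ValueError if any label is outside the assumed label set."""
    for label in labels:
        if label not in EMOTIONS:
            raise ValueError(f"unknown emotion label: {label!r}")


def recognition_accuracy(predicted, reference):
    """Fraction of turns where the predicted label equals the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one turn is required")
    _check_labels(predicted)
    _check_labels(reference)
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


def emotion_shifts(labels):
    """Return (turn_index, previous_label, current_label) for each turn
    whose label differs from the label of the preceding turn."""
    _check_labels(labels)
    return [
        (i, labels[i - 1], labels[i])
        for i in range(1, len(labels))
        if labels[i] != labels[i - 1]
    ]


if __name__ == "__main__":
    turns = ["neutral", "neutral", "sad", "sad", "happy"]
    print(emotion_shifts(turns))
    print(recognition_accuracy(["happy", "sad"], ["happy", "anger"]))
```

Treating every label change as a shift is the simplest possible choice; deciding which changes count as significant would require an additional criterion that this sketch does not attempt to define.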