Dataset
The official dataset of the challenge
To support model development, we have curated the SongEval dataset, an open-source benchmark containing 2,399 full-length songs (approximately 140.3 hours of generated audio) covering a variety of genres and languages. Each song has been annotated along five aesthetic dimensions:
- Overall Coherence
- Memorability
- Naturalness of Vocal Breathing and Phrasing
- Clarity of Song Structure
- Overall Musicality
The dataset covers a wide range of genres, including Pop, Rock, Jazz, Hip-hop, Classical, and more. It includes songs in both English and Mandarin Chinese, making it a diverse resource for training models. For more details about the dataset, please refer to the dataset paper.
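
As an illustration of how the five annotated dimensions might be handled in code, the following is a minimal sketch. The class name, field names, and sample scores are hypothetical and are not taken from the SongEval release; the actual file format and rating scale should be checked against the dataset paper.

```python
from dataclasses import dataclass, fields
from statistics import mean


@dataclass(frozen=True)
class SongAnnotation:
    """Hypothetical record: one score per aesthetic dimension for one song.

    Field names are illustrative and may not match the released files.
    """
    coherence: float      # Overall Coherence
    memorability: float   # Memorability
    naturalness: float    # Naturalness of Vocal Breathing and Phrasing
    clarity: float        # Clarity of Song Structure
    musicality: float     # Overall Musicality


def average_by_dimension(annotations):
    """Return the mean score of each dimension over a list of annotations."""
    if not annotations:
        raise ValueError("at least one annotation is required")
    names = [f.name for f in fields(SongAnnotation)]
    return {name: mean(getattr(a, name) for a in annotations) for name in names}


# Made-up scores, for illustration only.
sample = [
    SongAnnotation(3.0, 2.0, 4.0, 3.0, 3.0),
    SongAnnotation(4.0, 3.0, 4.0, 2.0, 4.0),
]
averages = average_by_dimension(sample)
```

Keeping one field per dimension makes it straightforward to train or evaluate a predictor on each dimension separately, or on all five jointly.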