# Challenge Results
## Overall Rankings
### Objective Ranking
1. xlancelab
2. CUPAudioGroup
3. AC_DC
4. Hachimi
5. cp-jku
### Subjective Ranking
1. xlancelab
2. CUPAudioGroup
3. Hachimi
4. AC_DC
5. cp-jku
## Detailed Results
### Objective Results
| Team | Overall MMSNR | Overall Zimt | Overall FAD | MMSNR Rank | Zimt Rank | FAD Rank | Macro Rank |
|---|---|---|---|---|---|---|---|
| xlancelab | 4.4623 | 0.0137 | 0.1988 | 1 | 1 | 1 | 1.00 |
| CUPAudioGroup | 2.3405 | 0.0164 | 0.2253 | 2 | 2 | 2 | 2.00 |
| AC_DC | 1.4520 | 0.0182 | 0.2907 | 4 | 3 | 3 | 3.33 |
| Hachimi | 2.0016 | 0.0183 | 0.2939 | 3 | 4 | 4 | 3.67 |
| cp-jku | 0.8329 | 0.0189 | 0.3814 | 5 | 5 | 5 | 5.00 |
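The Macro Rank column appears to be the unweighted mean of the per-metric ranks, with teams ordered by ascending macro rank (e.g. AC_DC: (4 + 3 + 3) / 3 = 3.33). A minimal sketch of that aggregation, assuming a simple arithmetic mean is the rule used:

```python
# Per-metric ranks from the objective results table (MMSNR, Zimt, FAD).
per_metric_ranks = {
    "xlancelab": [1, 1, 1],
    "CUPAudioGroup": [2, 2, 2],
    "AC_DC": [4, 3, 3],
    "Hachimi": [3, 4, 4],
    "cp-jku": [5, 5, 5],
}

# Macro rank = mean of the per-metric ranks.
macro_rank = {team: sum(r) / len(r) for team, r in per_metric_ranks.items()}

# Final ordering: ascending macro rank.
ranking = sorted(macro_rank, key=macro_rank.get)
```

This reproduces the objective ranking above; ties, if any occurred, would need an explicit tie-breaking rule not stated in the source.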
### Subjective Results (MOS)
| System | MOS Sep | MOS Rest | MOS Overall | Sep Rank | Rest Rank | Overall Rank | Macro Rank |
|---|---|---|---|---|---|---|---|
| xlancelab | 4.2358 | 3.3892 | 3.4665 | 1 | 1 | 1 | 1.00 |
| CUPAudioGroup | 3.8360 | 2.9173 | 2.9253 | 2 | 2 | 2 | 2.00 |
| Hachimi | 3.5814 | 2.6331 | 2.7235 | 3 | 3 | 3 | 3.00 |
| AC_DC | 3.5425 | 2.4768 | 2.5412 | 5 | 4 | 4 | 4.33 |
| cp-jku | 3.5510 | 2.0838 | 2.1414 | 4 | 5 | 5 | 4.67 |
## System Descriptions
### xlancelab
xlancelab employs sequential BS-Roformer models, using pretrained checkpoints from ZFTurbo's MSS-Training repository (Roformer-SW, dereverb, denoise). Training combines an L1 loss with a multi-resolution STFT loss. Data: MoisesDB and manually cleaned RawStems.
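The combination of a time-domain L1 term with a multi-resolution STFT term is a standard recipe; a minimal NumPy/SciPy sketch of what such a loss could look like (FFT sizes and weighting are illustrative assumptions, not xlancelab's actual hyperparameters):

```python
import numpy as np
from scipy.signal import stft

def multi_res_stft_l1(pred, target, fft_sizes=(512, 1024, 2048)):
    """L1 distance between STFT magnitudes at several resolutions, averaged.
    Hypothetical sketch; the team's exact formulation is not specified."""
    loss = 0.0
    for n_fft in fft_sizes:
        _, _, P = stft(pred, nperseg=n_fft, noverlap=n_fft // 2)
        _, _, T = stft(target, nperseg=n_fft, noverlap=n_fft // 2)
        loss += np.mean(np.abs(np.abs(P) - np.abs(T)))
    return loss / len(fft_sizes)

def total_loss(pred, target, stft_weight=1.0):
    # Time-domain L1 plus the multi-resolution spectral term.
    return np.mean(np.abs(pred - target)) + stft_weight * multi_res_stft_l1(pred, target)
```

Multiple FFT sizes trade off time and frequency resolution, which is why this loss is popular for waveform-generation tasks.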
### CUPAudioGroup
CUPAudioGroup uses an ensemble of BSRNN, BS-Roformer, and MDX23 models. Pretrained parameters are sourced from ZFTurbo's open-source "Music-Source-Separation-Training" project. Data: RawStems, MUSDB18-HQ, MoisesDB.
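The source does not say how the three models' outputs are combined; a common choice is a (weighted) average of the per-stem waveform estimates. A hypothetical sketch under that assumption:

```python
import numpy as np

def ensemble_separate(mixture, models, weights=None):
    """Output-level ensembling: run each separator on the mixture and
    average the stem estimates. Assumes each model maps a mixture
    waveform to an estimate of the same shape; this is an illustrative
    guess, not CUPAudioGroup's confirmed combination rule."""
    outs = [m(mixture) for m in models]
    w = np.ones(len(outs)) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the ensemble preserves overall scale
    return sum(wi * o for wi, o in zip(w, outs))
```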
### AC_DC
AC_DC submits DTT-BSR, a generator based on DTTNet (a dual-path TFC-TDF U-Net). The architecture incorporates Band-Sequence Modeling from BandSplitRNN to model subband and temporal correlations, and uses a RoPE Transformer bottleneck for long-sequence handling and phase preservation. Training uses a Multi-Frequency Discriminator with a multi-scale mel reconstruction loss, an LSGAN adversarial loss, and a feature-matching loss. Data: RawStems at 48 kHz.
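LSGAN and feature-matching losses have standard forms; a minimal sketch of those two terms (the mel reconstruction term and all weightings are omitted, and nothing here is AC_DC's actual code):

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    # LSGAN discriminator loss: push scores on real audio toward 1,
    # scores on generated audio toward 0.
    return np.mean((d_real - 1.0) ** 2) + np.mean(d_fake ** 2)

def lsgan_g_loss(d_fake):
    # LSGAN generator loss: push discriminator scores on generated audio toward 1.
    return np.mean((d_fake - 1.0) ** 2)

def feature_matching_loss(feats_real, feats_fake):
    # L1 distance between discriminator intermediate feature maps,
    # averaged over layers; stabilizes adversarial training.
    return np.mean([np.mean(np.abs(r - f)) for r, f in zip(feats_real, feats_fake)])
```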
### Hachimi
Hachimi uses the "Max version of the backbone" from their recent work. Training combines a reconstruction loss with a GAN loss. Data: MUSDB25, MUSDB18-HQ, MoisesDB, MedleyDB, RawStems, URMP, MAESTRO.
### cp-jku
cp-jku proposes a two-stage pipeline: separation followed by restoration. Separation uses a BandSplit-RoFormer to extract eight stems (including "other"). Restoration uses the HiFi++ GAN bundle (SpectralUNet, Upsampler, WaveUNet, SpectralMaskNet). The separator is trained in three stages with LoRA fine-tuning; the restorer is trained in five stages, producing eight source-specific expert models. Data: MUSDB18, DSD100, MoisesDB, Slakh2100, MedleyDB v2, RawStems, and others.
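At inference time, a separate-then-restore pipeline with per-source experts can be sketched as follows (function and stem names are illustrative assumptions, not cp-jku's actual interface):

```python
def separate_then_restore(mixture, separator, restorers):
    """Two-stage inference sketch: the separator yields a dict of stem
    estimates, and each stem is passed through its source-specific
    restoration expert. cp-jku's real pipeline uses eight stems."""
    stems = separator(mixture)  # e.g. {"vocals": ..., "drums": ..., "other": ...}
    return {name: restorers[name](audio) for name, audio in stems.items()}
```

Routing each stem to its own restorer lets every expert specialize in the artifacts typical of one source, at the cost of maintaining one model per stem.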