Challenge Results

The MSR Challenge 2025 has concluded. We thank all participants for their submissions! The final results are below; a challenge overview paper will follow soon.

Overall Rankings

Objective Ranking

  1. xlancelab
  2. CUPAudioGroup
  3. AC_DC
  4. Hachimi
  5. cp-jku

Subjective Ranking

  1. xlancelab
  2. CUPAudioGroup
  3. Hachimi
  4. AC_DC
  5. cp-jku

Detailed Results

Objective Results

Team           Overall MMSNR  Overall Zimt  Overall FAD  MMSNR Rank  Zimt Rank  FAD Rank  Macro Rank
xlancelab      4.4623         0.0137        0.1988       1           1          1         1.00
CUPAudioGroup  2.3405         0.0164        0.2253       2           2          2         2.00
AC_DC          1.4520         0.0182        0.2907       4           3          3         3.33
Hachimi        2.0016         0.0183        0.2939       3           4          4         3.67
cp-jku         0.8329         0.0189        0.3814       5           5          5         5.00

Subjective Results (MOS)

System         Separation MOS  Restoration MOS  Overall MOS  Sep. Rank  Rest. Rank  Overall Rank  Macro Rank
xlancelab      4.2358          3.3892           3.4665       1          1           1             1.00
CUPAudioGroup  3.8360          2.9173           2.9253       2          2           2             2.00
Hachimi        3.5814          2.6331           2.7235       3          3           3             3.00
AC_DC          3.5425          2.4768           2.5412       5          4           4             4.33
cp-jku         3.5510          2.0838           2.1414       4          5           5             4.67
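
In both tables, the Macro Rank appears to be the unweighted mean of a team's per-metric ranks; this reading reproduces every value above (any tie-breaking rules are not specified). A minimal sketch of the computation, using the objective ranks:

```python
# Macro rank = unweighted mean of per-metric ranks (assumed; matches both tables).
# Example: AC_DC's objective macro rank = (4 + 3 + 3) / 3 = 3.33.

objective_ranks = {
    "xlancelab":     (1, 1, 1),   # (MMSNR rank, Zimt rank, FAD rank)
    "CUPAudioGroup": (2, 2, 2),
    "AC_DC":         (4, 3, 3),
    "Hachimi":       (3, 4, 4),
    "cp-jku":        (5, 5, 5),
}

for team, ranks in objective_ranks.items():
    print(f"{team}: {sum(ranks) / len(ranks):.2f}")
```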

System Descriptions

xlancelab

xlancelab employs a cascade of BS-Roformer models built on pretrained checkpoints from ZFTurbo's Music-Source-Separation-Training repository (Roformer-SW, dereverb, and denoise models). Training uses an L1 loss combined with a multi-resolution STFT loss. Data: MoisesDB and manually cleaned RawStems.
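
This combined objective is standard; a minimal sketch in PyTorch, assuming illustrative FFT sizes, hop lengths, and loss weight (not the team's actual hyperparameters):

```python
# Sketch of an L1 + multi-resolution STFT training objective.
# Resolutions and stft_weight are illustrative assumptions.
import torch
import torch.nn.functional as F

def stft_mag(x, n_fft, hop):
    """Magnitude spectrogram of a (batch, time) waveform at one resolution."""
    window = torch.hann_window(n_fft, device=x.device)
    return torch.stft(x, n_fft, hop_length=hop, window=window,
                      return_complex=True).abs()

def mr_stft_loss(pred, target,
                 resolutions=((512, 128), (1024, 256), (2048, 512))):
    """L1 between magnitude spectrograms, averaged over several resolutions."""
    losses = [F.l1_loss(stft_mag(pred, n, h), stft_mag(target, n, h))
              for n, h in resolutions]
    return sum(losses) / len(losses)

def training_loss(pred, target, stft_weight=1.0):
    """Waveform-domain L1 plus weighted multi-resolution STFT loss."""
    return F.l1_loss(pred, target) + stft_weight * mr_stft_loss(pred, target)
```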

CUPAudioGroup

CUPAudioGroup uses an ensemble of BSRNN, BSRoformer, and MDX23 models, with pretrained parameters sourced from ZFTurbo's open-source Music-Source-Separation-Training project. Data: RawStems, MUSDB18-HQ, and MoisesDB.
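
The submission does not specify how the three models' outputs are blended; a minimal sketch of one plausible scheme, a weighted waveform-domain average, with an assumed model interface and uniform default weights:

```python
# Sketch of a waveform-domain output ensemble (blending scheme assumed).
import torch

@torch.no_grad()
def ensemble_separate(models, mixture, weights=None):
    """Weighted average of per-model stem estimates.

    Each model is assumed to map a mixture of shape (batch, channels, time)
    to stems of shape (batch, stems, channels, time).
    """
    weights = weights or [1.0] * len(models)
    total = sum(w * m(mixture) for m, w in zip(models, weights))
    return total / sum(weights)
```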

AC_DC

AC_DC submits DTT-BSR, a generator based on DTTNet (a dual-path TFC-TDF U-Net). The architecture incorporates Band-Sequence Modeling from BandSplitRNN to capture subband and temporal correlations, and a RoPE Transformer bottleneck for long-sequence modeling and phase preservation. Training pairs the generator with a Multi-Frequency Discriminator, using a multi-scale mel reconstruction loss, an LSGAN adversarial loss, and a feature-matching loss. Data: RawStems at 48 kHz.
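
These three generator-side terms are a common GAN-vocoder recipe; a minimal sketch, where the loss weights follow HiFi-GAN-style defaults and the discriminator interface is an assumption (not AC_DC's reported configuration):

```python
# Sketch of a generator objective: multi-scale mel reconstruction
# + LSGAN adversarial + feature matching. Weights are assumed defaults.
import torch
import torch.nn.functional as F

def mel_reconstruction_loss(mels_pred, mels_target):
    """L1 over lists of mel spectrograms computed at several scales."""
    return sum(F.l1_loss(p, t) for p, t in zip(mels_pred, mels_target))

def lsgan_generator_loss(fake_logits):
    """LSGAN generator term: push each discriminator output toward 1."""
    return sum(((d - 1.0) ** 2).mean() for d in fake_logits)

def feature_matching_loss(real_feats, fake_feats):
    """L1 between intermediate discriminator features on real vs. fake audio."""
    return sum(F.l1_loss(f, r.detach())
               for r, f in zip(real_feats, fake_feats))

def generator_loss(mels_pred, mels_target, fake_logits, real_feats, fake_feats,
                   w_mel=45.0, w_fm=2.0):
    return (w_mel * mel_reconstruction_loss(mels_pred, mels_target)
            + lsgan_generator_loss(fake_logits)
            + w_fm * feature_matching_loss(real_feats, fake_feats))
```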

Hachimi

Hachimi uses the "Max" version of the backbone from their recent work. Training combines a reconstruction loss with a GAN loss. Data: MUSDB25, MUSDB18-HQ, MoisesDB, MedleyDB, RawStems, URMP, and MAESTRO.

cp-jku

cp-jku proposes a two-stage pipeline: separation followed by restoration. The separator, a BandSplit-RoFormer trained in three stages with LoRA fine-tuning, extracts eight stems (including "other"). The restorer uses the HiFi++ GAN bundle (SpectralUNet, Upsampler, WaveUNet, SpectralMaskNet) and is trained in five stages, producing eight source-specific expert models. Data: MUSDB18, DSD100, MoisesDB, Slakh2100, MedleyDB v2, RawStems, and others.
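
A minimal sketch of the inference-time routing this pipeline implies, with an assumed stem taxonomy and model interface (the actual stem names and APIs are not specified):

```python
# Sketch of two-stage inference: separate into eight stems, then route each
# stem through its source-specific restoration expert.
import torch

STEMS = ("vocals", "bass", "drums", "guitar", "piano",
         "strings", "wind", "other")  # assumed stem taxonomy

@torch.no_grad()
def separate_then_restore(mixture, separator, experts):
    """mixture: (batch, channels, time) waveform.
    separator: callable returning a dict of stem name -> waveform.
    experts: dict of stem name -> restoration model (HiFi++-style bundle).
    """
    stems = separator(mixture)
    return {name: experts[name](stems[name]) for name in STEMS}
```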