fix: lower RMS normalization target from -23/-20 to -27 dBFS

Training clips at -23 LUFS measure -25 to -31 dBFS RMS (avg ~-27).
Normalizing output to -23 dBFS was 4-8 dB too loud, causing saturation
on clips with high crest factor and peaks near 0 dBFS.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
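
The headroom argument in the message above can be checked with a few lines of arithmetic. The following is a minimal sketch in plain Python, not the repository's torch code: the helper names and the synthetic `clip` are hypothetical and chosen only for illustration. It normalizes a clip with a crest factor of roughly 25 dB to both targets and compares where the peak lands.

```python
import math

def dbfs_to_linear(db):
    """Convert a dBFS level to linear amplitude (full scale = 1.0)."""
    return 10.0 ** (db / 20.0)

def linear_to_dbfs(x):
    """Convert a linear amplitude to dBFS, guarding against log(0)."""
    return 20.0 * math.log10(max(x, 1e-8))

def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def normalize_rms(samples, target_db):
    """Scale samples so their RMS equals target_db dBFS.

    Returns the scaled samples and the resulting peak level in dBFS.
    """
    gain = dbfs_to_linear(target_db) / max(rms(samples), 1e-8)
    scaled = [s * gain for s in samples]
    peak = max(abs(s) for s in scaled)
    return scaled, linear_to_dbfs(peak)

# Hypothetical clip: a quiet bed with one strong transient (high crest factor).
clip = [0.05] * 999 + [1.0]

_, peak_old = normalize_rms(clip, -23.0)  # previous target
_, peak_new = normalize_rms(clip, -27.0)  # target introduced by this commit

print(f"peak at -23 dBFS RMS target: {peak_old:+.2f} dBFS")
print(f"peak at -27 dBFS RMS target: {peak_new:+.2f} dBFS")
```

For this synthetic clip the old target pushes the peak above 0 dBFS, so it would saturate, while the new target leaves it below full scale. The 4 dB gap between the two peaks is exactly the difference between the targets.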
2026-04-08 17:19:20 +02:00
parent 678c050f11
commit 9a47508d2d
2 changed files with 4 additions and 4 deletions
@@ -139,7 +139,7 @@ def _eval_sample(generator, feature_utils_orig, dataset, seq_cfg, device, dtype,
     elif audio.dim() == 3 and audio.shape[1] != 1:
         audio = audio.mean(dim=1, keepdim=True)
-    target_rms = 10 ** (-23.0 / 20.0)  # -23 dBFS matches training data
+    target_rms = 10 ** (-27.0 / 20.0)  # -27 dBFS matches measured RMS of training clips
     rms = audio.pow(2).mean().sqrt().clamp(min=1e-8)
     audio = audio * (target_rms / rms)
     peak = audio.abs().max().clamp(min=1e-8)