fix: clamp x0 std after each optimizer step to prevent OOD noise

The optimized x0 was reaching std=2.72, versus the ~1.0 that the flow matching ODE expects for x0 ~ N(0,1). An out-of-distribution initial condition like this produces white noise in the output. After each optimizer step, divide x0 by its std if that std exceeds 1.5, which brings it back to unit std.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:23:39 +02:00
parent 286681edff
commit fa6c4fa834
1 changed file with 8 additions and 0 deletions
@@ -456,6 +456,14 @@ def _do_optimize(net_generator, feature_utils, mel_converter,
        torch.nn.utils.clip_grad_norm_([x0], 1.0)
        optimizer.step()
        # Clamp x0 std to stay near unit Gaussian — flow matching ODE expects
        # x0 ~ N(0,1). Optimization can push std >> 1, which maps to an
        # out-of-distribution initial condition and produces white noise.
        with torch.no_grad():
            std = x0.std()
            if std > 1.5:
                x0.data.div_(std)
        pbar.update(1)
        if (opt_step + 1) % max(1, n_opt_steps // 10) == 0:
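
The rescale added in this hunk can be tried in isolation. The sketch below is a minimal standalone version, not code from the repository: the helper name `rescale_if_std_exceeds`, the `max_std` parameter, and the synthetic tensor shape are assumptions made for illustration. Only the 1.5 threshold, the division by the std, and the 2.72 starting value come from the commit.

```python
import torch


def rescale_if_std_exceeds(x0: torch.Tensor, max_std: float = 1.5) -> bool:
    """Divide x0 in place by its std when that std exceeds max_std.

    Hypothetical helper (not part of the commit). Returns True if x0 was rescaled.
    """
    with torch.no_grad():
        std = x0.std()
        if std > max_std:
            # After this division the std is 1.0. The mean is scaled by
            # 1/std as well, but it is not removed.
            x0.div_(std)
            return True
    return False


torch.manual_seed(0)
# Synthetic stand-in for an over-optimized x0 with std near 2.72.
x0 = (2.72 * torch.randn(4, 1024)).requires_grad_(True)
rescaled = rescale_if_std_exceeds(x0)
std_after = x0.std().item()
print(rescaled, round(std_after, 4))
```

Because the check is a hard threshold, an x0 whose std sits between 1.0 and 1.5 is left untouched, while anything above 1.5 is brought all the way back to unit std rather than to 1.5. Whether a softer rescale would suit the optimizer better is not something this commit addresses.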