feat: add AF-Vocoder GAFilter to BigVGAN trainer and loader

Implements AF-Vocoder GAFilter (Interspeech 2025): learnable per-channel
depthwise FIR filter inserted after each Snake/Activation1d in BigVGAN
residual blocks. Initialized as identity so training starts from pretrained
behaviour.

- inject_gafilters() walks resblocks.*.activations and wraps each Activation1d
  with _ActivationWithGAFilter — weights appear in vocoder.state_dict() automatically
- Trained alongside Snake alphas in snake_alpha_only mode
- Checkpoint saves has_gafilter + gafilter_kernel_size metadata
- Loader detects metadata and injects before load_state_dict so weights populate correctly
- Controlled by use_gafilter (default True) and gafilter_kernel_size (default 9)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-04-09 16:15:14 +02:00
parent c53ea5517c
commit db112394e8
2 changed files with 107 additions and 8 deletions
+6
View File
@@ -13,6 +13,7 @@ import torch
import folder_paths
from .utils import SELVA_CATEGORY
from .selva_bigvgan_trainer import inject_gafilters
class SelvaBigvganLoader:
@@ -60,6 +61,11 @@ class SelvaBigvganLoader:
else:
raise ValueError(f"[BigVGAN] Unknown mode: {mode}")
if ckpt.get("has_gafilter", False):
kernel_size = ckpt.get("gafilter_kernel_size", 9)
n_gaf = inject_gafilters(vocoder, kernel_size)
print(f"[BigVGAN] GAFilter injected: {n_gaf} filters kernel={kernel_size}", flush=True)
vocoder.load_state_dict(ckpt["generator"])
vocoder.eval()