ComfyUI-SelVA

Author	SHA1	Message	Date
Ethanfel	ecf828b007	fix: move vocoder to correct device after GAFilter injection. inject_gafilters creates Conv1d modules on CPU. load_state_dict preserves existing param devices, but GAFilter params stay on CPU, causing a device mismatch during vocode. Save the target device before injection, then move the entire vocoder after loading. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-09 20:28:55 +02:00
Ethanfel	db112394e8	feat: add AF-Vocoder GAFilter to BigVGAN trainer and loader. Implements AF-Vocoder GAFilter (Interspeech 2025): a learnable per-channel depthwise FIR filter inserted after each Snake/Activation1d in BigVGAN residual blocks. Initialized as identity so training starts from pretrained behaviour. - inject_gafilters() walks resblocks.*.activations and wraps each Activation1d with _ActivationWithGAFilter, so the weights appear in vocoder.state_dict() automatically - Trained alongside Snake alphas in snake_alpha_only mode - Checkpoint saves has_gafilter + gafilter_kernel_size metadata - Loader detects metadata and injects before load_state_dict so weights populate correctly - Controlled by use_gafilter (default True) and gafilter_kernel_size (default 9) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-09 16:15:14 +02:00
Ethanfel	790a53e3df	fix(bigvgan): add 44k/BigVGANv2 support to trainer and loader. 44k variants use BigVGANv2 directly as the vocoder (no wrapper, no @inference_mode decorator), accessible at feature_utils.tod.vocoder. 16k wraps BigVGANVocoder inside BigVGAN, accessed at .vocoder.vocoder. Both trainer and loader now branch on model["mode"]. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-09 01:28:32 +02:00
Ethanfel	9c784b4bdb	feat: add BigVGAN vocoder fine-tuner and loader nodes. Spectral-loss-only fine-tuning of the BigVGAN vocoder (mel→waveform) on BJ audio clips. DiT and VAE are completely frozen. Losses: mel L1 reconstruction + multi-resolution STFT magnitude L1 (same three resolutions as the BigVGAN discriminator config). Saves in {'generator': state_dict} format compatible with the original BigVGAN checkpoint. Loader replaces vocoder weights in the loaded SELVA_MODEL in place, so no full model reload is needed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-09 01:26:12 +02:00

4 Commits
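
Commit db112394e8 describes the GAFilter as a per-channel depthwise FIR filter that is initialized as identity so training starts from the pretrained behaviour. The sketch below illustrates only that property, in plain Python with no dependencies: a kernel that is a unit impulse at its centre tap returns each channel unchanged. It is not the repository's code. The names identity_kernel and depthwise_fir are hypothetical, and the zero padding and odd kernel size are assumptions made for the illustration; per the commit messages, the actual filter is built from Conv1d modules.

```python
def identity_kernel(kernel_size):
    """Return an FIR kernel that is a unit impulse at the centre tap."""
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd so the centre tap is unambiguous")
    kernel = [0.0] * kernel_size
    kernel[kernel_size // 2] = 1.0
    return kernel


def depthwise_fir(signal, kernels):
    """Filter each channel with its own kernel, zero-padded to keep the length.

    signal:  list of channels, each a list of floats
    kernels: one odd-length kernel (list of floats) per channel
    """
    if len(signal) != len(kernels):
        raise ValueError("need exactly one kernel per channel")
    out = []
    for channel, kernel in zip(signal, kernels):
        pad = len(kernel) // 2
        padded = [0.0] * pad + list(channel) + [0.0] * pad
        # Cross-correlation: slide the kernel over the padded channel.
        out.append([
            sum(k * padded[t + i] for i, k in enumerate(kernel))
            for t in range(len(channel))
        ])
    return out


# Two channels, each with its own identity kernel of size 9
# (9 is the default gafilter_kernel_size named in the commit message).
channels = [[0.5, -1.0, 2.0, 0.25], [1.0, 1.0, -3.0, 4.0]]
kernels = [identity_kernel(9) for _ in channels]
filtered = depthwise_fir(channels, kernels)
```

With identity kernels, filtered equals channels, which is why inserting such a filter leaves a pretrained vocoder's output unchanged until the taps are trained away from the impulse.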