Commit Graph

10 Commits

Author SHA1 Message Date
Ethanfel 48493a3f0d feat: add SelvaDatasetSaver node with NPZ sidecar copy
Saves all clips in an AUDIO_DATASET to FLAC. When npz_source_dir is
provided, copies the matching .npz for each clip so FLAC/NPZ pairs
stay in sync after the inspector filters out bad clips.
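
The sidecar copy described above could look roughly like the following. This is a minimal sketch, not the node's actual code: the helper name `copy_npz_sidecars`, its parameters, and the rule that a sidecar shares the clip's file stem (`clip_001.flac` pairs with `clip_001.npz`) are all assumptions made for illustration.

```python
import shutil
from pathlib import Path


def copy_npz_sidecars(clip_stems, npz_source_dir, output_dir):
    """Copy <stem>.npz for each surviving clip; return stems with no sidecar.

    Hypothetical helper: assumes each sidecar is named after the clip's stem.
    """
    src = Path(npz_source_dir)
    dst = Path(output_dir)
    dst.mkdir(parents=True, exist_ok=True)

    missing = []
    for stem in clip_stems:
        sidecar = src / f"{stem}.npz"
        if sidecar.is_file():
            # copy2 preserves timestamps along with the file contents
            shutil.copy2(sidecar, dst / sidecar.name)
        else:
            # Report rather than fail, so one missing sidecar does not abort the save
            missing.append(stem)
    return missing
```

Returning the missing stems instead of raising is a design choice of this sketch; the commit message does not say how the node handles a clip with no matching .npz.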

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 15:27:48 +02:00
Ethanfel becb38c27e fix: use soundfile for WAV/FLAC/OGG to bypass torchcodec/FFmpeg dependency
torchaudio was defaulting to the torchcodec backend which requires FFmpeg
shared libraries not present in the ComfyUI venv, silently skipping every
clip and producing an empty dataset.

Also add experiments/vocoder_finetune.json for the BJ vocoder LoRA run
(lr=3e-4, rank=128, 10k steps).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 15:16:22 +02:00
Ethanfel f50afa9796 fix: guard _estimate_snr against short clips, fix freqs device in _check_hf_shelf
Bug 1: mono.unfold(0, 2048, 512) returns an empty tensor for clips shorter
than 2048 samples (~46ms). torch.quantile on an empty tensor crashes with
"quantile() input tensor must be non-empty". Guard: return 60.0 (assume
clean) for clips too short to frame — the pipeline has no minimum-length
filter so any short file in the dataset folder would crash the Inspector.
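
The guard for Bug 1 can be sketched in plain Python. The frame size (2048), hop (512), and fallback value (60.0) come from the message above; the function names and the `estimator` callable are hypothetical stand-ins, since the real `_estimate_snr` body is not shown here.

```python
FRAME_SIZE = 2048          # frame length from the commit message
HOP_SIZE = 512             # hop length from the commit message
SHORT_CLIP_SNR_DB = 60.0   # "assume clean" fallback from the commit message


def num_frames(num_samples, frame_size=FRAME_SIZE, hop_size=HOP_SIZE):
    """Number of full frames a sliding window yields (0 if the clip is too short)."""
    if num_samples < frame_size:
        return 0
    return (num_samples - frame_size) // hop_size + 1


def estimate_snr_guarded(samples, estimator):
    """Return the fallback for clips that cannot be framed, else call `estimator`.

    `estimator` is a hypothetical placeholder for the real SNR computation.
    """
    if num_frames(len(samples)) == 0:
        return SHORT_CLIP_SNR_DB
    return estimator(samples)
```

Checking the frame count before any per-frame statistic is computed means the quantile step never sees an empty input.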

Bug 2: torch.linspace(...) in _check_hf_shelf created a CPU tensor, making
band_lo/band_hi CPU boolean masks. Indexing a GPU mag_sq tensor with CPU
masks crashes. Pass device=mono.device so freqs lands on the same device
as the audio.
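
The device fix for Bug 2 can be illustrated with a small band-energy function. This is a sketch under assumptions: `band_energy`, its parameters, and the single-frame FFT are illustrative and not the actual `_check_hf_shelf` implementation. It assumes `mono` is a 1-D float tensor with at least `n_fft` samples.

```python
import torch


def band_energy(mono, sample_rate, lo_hz, hi_hz, n_fft=2048):
    """Sum of squared FFT magnitudes in [lo_hz, hi_hz) for one windowed frame.

    Hypothetical helper: assumes mono is 1-D with at least n_fft samples.
    """
    # Create the window on the audio's device (CPU or GPU)
    window = torch.hann_window(n_fft, device=mono.device)
    frame = mono[:n_fft] * window
    mag_sq = torch.fft.rfft(frame).abs() ** 2

    # device=mono.device keeps freqs, and the boolean mask built from it,
    # on the same device as mag_sq
    freqs = torch.linspace(0, sample_rate / 2, mag_sq.shape[-1], device=mono.device)
    band = (freqs >= lo_hz) & (freqs < hi_hz)
    return mag_sq[band].sum()
```

Because every helper tensor is created with `device=mono.device`, the same code path works whether the audio lives on the CPU or on a GPU.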

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 14:28:36 +02:00
Ethanfel f1c4654bab feat: add SelvaDatasetItemExtractor node
2026-04-09 14:24:58 +02:00
Ethanfel 2d06cb2f52 fix: pass device to hann_window in _check_hf_shelf to avoid GPU mismatch
2026-04-09 14:22:13 +02:00
Ethanfel 0731addea9 feat: add SelvaDatasetInspector node (codec artifacts, SNR, clipping)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 14:20:03 +02:00
Ethanfel 7eb9bd5745 feat: add SelvaDatasetLUFSNormalizer node (pyloudnorm BS.1770-4)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 14:17:44 +02:00
Ethanfel 057bfb813d feat: add SelvaDatasetResampler node (soxr VHQ)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 14:13:45 +02:00
Ethanfel 2c71d4c184 feat: add SelvaDatasetLoader node
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 14:09:43 +02:00
Ethanfel d25df10aa5 feat: add audio dataset pipeline skeleton
2026-04-09 14:05:31 +02:00