torchaudio was defaulting to the torchcodec backend which requires FFmpeg
shared libraries not present in the ComfyUI venv, silently skipping every
clip and producing an empty dataset.
Also add experiments/vocoder_finetune.json for the BJ vocoder LoRA run
(lr=3e-4, rank=128, 10k steps).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Bug 1: mono.unfold(0, 2048, 512) returns an empty tensor for clips shorter
than 2048 samples (~46ms). torch.quantile on an empty tensor crashes with
"quantile() input tensor must be non-empty". Guard: return 60.0 (assume
clean) for clips too short to frame — the pipeline has no minimum-length
filter so any short file in the dataset folder would crash the Inspector.
Bug 2: torch.linspace(...) in _check_hf_shelf created a CPU tensor, making
band_lo/band_hi CPU boolean masks. Indexing a GPU mag_sq tensor with CPU
masks crashes. Pass device=mono.device so freqs lands on the same device
as the audio.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>