Commit Graph

3 Commits

Author SHA1 Message Date
Ethanfel 8ccc2438e4 fix: remove FlashSR (audiosr incompatible with Python 3.12), add training loss CSV
- Drop SelvaFlashSR node — audiosr pins numpy<=1.23.5 which cannot build
  on Python 3.12 (pkgutil.ImpImporter removed); use Saganaki22/ComfyUI-AudioSR instead
- BigVGAN trainer now writes <output_stem>_training_log.csv alongside the
  checkpoint: step, total, fm, mel, stft, phase, l2sp columns, line-buffered
  so loss can be tailed live during training

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 17:18:34 +02:00
Ethanfel ba0499b77c fix: FlashSR device handling and remove unused tmp_out
Use device="auto" for audiosr.build_model — safer than passing a device
string that may not be accepted in all audiosr versions.
Remove unused tmp_out temp file that was created but never written to.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 16:32:02 +02:00
Ethanfel ce62bccc1f feat: add post-generation audio enhancement nodes
Three new nodes for post-generation quality improvement:

- SelvaHarmonicExciter: multi-band exciter (HPF → tanh saturation → mix)
  restores harmonic richness lost in BigVGAN HF reconstruction

- SelvaFlashSR: audio super-resolution via FlashSR basic model
  (haoheliu/versatile_audio_super_resolution, requires pip install audiosr)
  predicts missing HF content above vocoder reconstruction ceiling

- SelvaOutputNormalizer: BS.1770-4 LUFS normalization + true peak limiting
  for consistent loudness on generated outputs (pyloudnorm)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 16:27:39 +02:00