ComfyUI-SelVA

Ethanfel/ComfyUI-SelVA

Commit Graph

deprecated/lora-trainer

deprecated/prismaudio

experiment/crop-to-mask

feature/lora-timestep-sampling

feature/lora-training

main

f9d092158a fix(ti): lower default lr/batch, add lr_batch sweep group Ethanfel 2026-04-08 23:42:22 +02:00
92535deab2 fix(ti-scheduler): save comparison image after each completed experiment Ethanfel 2026-04-08 23:39:30 +02:00
0b24207ca5 feat(ti-trainer): generate baseline.wav once before training starts Ethanfel 2026-04-08 23:33:28 +02:00
e1a2f0ed7d feat: add inject_mode (suffix/prefix) to TI pipeline Ethanfel 2026-04-08 23:31:52 +02:00
f96265da23 feat(ti-trainer): add loss curve IMAGE output Ethanfel 2026-04-08 23:20:44 +02:00
c0d95ce356 feat: add ti_sweep_1 experiment file Ethanfel 2026-04-08 23:14:31 +02:00
e37bfe1b1c feat: add SelVA TI Scheduler for sweep-based textual inversion experiments Ethanfel 2026-04-08 23:13:04 +02:00
bb07bc8169 fix(ti-trainer): guard spectral metrics, drop unused imports Ethanfel 2026-04-08 23:10:19 +02:00
e36cdd7947 fix(ti-trainer): fix gradient flow and spectral metric shapes Ethanfel 2026-04-08 23:08:13 +02:00
e56ece9c1c feat: add SelVA Textual Inversion Trainer and Loader nodes Ethanfel 2026-04-08 23:01:44 +02:00
eed7eefeac feat: add SelVA HF Smoother and Spectral Matcher preprocessing nodes Ethanfel 2026-04-08 20:28:16 +02:00
107bb05f17 fix(vae-roundtrip): pass bigvgan path to encoder-only FeaturesUtils Ethanfel 2026-04-08 20:05:44 +02:00
10e6095e31 fix(vae-roundtrip): use model feature_utils for decode, add normalize/unnormalize, normalize output Ethanfel 2026-04-08 19:50:01 +02:00
528d33be39 fix: trim/pad latent to seq_cfg.latent_seq_len before decoding Ethanfel 2026-04-08 19:22:09 +02:00
8195c3114a feat: add SelVA VAE Roundtrip node Ethanfel 2026-04-08 19:15:20 +02:00
c8e6b91f67 feat: add alpha_scale_sweep to fix LoRA noise contamination Ethanfel 2026-04-08 17:55:05 +02:00
fdce9cbbf1 feat: evaluate adapters on all dataset clips, not just clip_001 Ethanfel 2026-04-08 17:42:55 +02:00
42ceb4b153 fix: preserve original audio extension when copying reference file Ethanfel 2026-04-08 17:31:26 +02:00
4505b89db1 feat: add reference audio to LoRA evaluator Ethanfel 2026-04-08 17:30:33 +02:00
dbfa7b23fe feat: add eval_r128_candidates.json Ethanfel 2026-04-08 17:28:28 +02:00
d2e1ea7b80 feat: add SelVA LoRA Evaluator node Ethanfel 2026-04-08 17:26:50 +02:00
9a47508d2d fix: lower RMS normalization target from -23/-20 to -27 dBFS Ethanfel 2026-04-08 17:19:20 +02:00
678c050f11 fix: make normalize(x1) assignment explicit in training loop Ethanfel 2026-04-08 15:43:42 +02:00
1be07a80d2 feat: add cosine LR decay schedule to trainer and scheduler Ethanfel 2026-04-08 13:25:01 +02:00
58e1985af2 feat: SelVA Skip Experiment node + save partial scalars on skip Ethanfel 2026-04-08 13:10:43 +02:00
264dc49d42 feat: skip_current.flag to cancel experiment and move to next Ethanfel 2026-04-08 13:09:01 +02:00
fec5c86f09 feat: add spectral_flatness and temporal_variance to eval metrics Ethanfel 2026-04-08 12:45:40 +02:00
2861327016 feat: spectral metrics per eval sample in experiment summary Ethanfel 2026-04-08 12:44:43 +02:00
c4687521ef feat: save spectrogram PNG alongside each eval sample Ethanfel 2026-04-08 12:42:34 +02:00
8717af2728 fix: prevent saturation from RMS normalization clipping peaks Ethanfel 2026-04-08 12:29:29 +02:00
78e9838a83 fix: replace peak normalization with RMS normalization at -20 dBFS Ethanfel 2026-04-08 12:06:48 +02:00
94610b8943 feat: r128_sweet_spot sweep — noise-free LR search + rank 256 Ethanfel 2026-04-08 10:46:08 +02:00
f5f7f2ae68 fix: eval sample seed 0 -> 42 Ethanfel 2026-04-08 10:32:43 +02:00
1663b39833 fix: bump eval sample to 25 ODE steps (was 8) Ethanfel 2026-04-08 10:32:27 +02:00
a7923d5fb7 feat: r64_overnight sweep — focused rank-64 ablation at 8000 steps Ethanfel 2026-04-08 01:32:23 +02:00
786a57c424 feat: sweep resume + 5 additional experiments (LR, target, extended) Ethanfel 2026-04-08 00:59:16 +02:00
f15e02b0b8 fix: eval samples use fixed clip/seed, save to samples/ subfolder Ethanfel 2026-04-08 00:54:37 +02:00
0682a536cb fix: point data_dir to features/ subdir where .npz and audio live Ethanfel 2026-04-08 00:45:32 +02:00
0000878e76 feat: thorough overnight sweep + dataset browser updates Ethanfel 2026-04-08 00:38:19 +02:00
675644189d feat: add SelVA Dataset Browser node Ethanfel 2026-04-07 14:55:27 +02:00
82fb7a0009 docs: note AudioX shows no perceptual quality gain on V2A vs SelVA Ethanfel 2026-04-07 09:12:00 +02:00
af4777d2d7 docs: add AudioX vs SelVA evaluation Ethanfel 2026-04-07 09:11:09 +02:00
ed8abf7a5b docs: add video format recommendations to dataset preparation section Ethanfel 2026-04-06 13:44:14 +02:00
21ed93d3ee docs: add audio dataset pipeline reference doc Ethanfel 2026-04-06 13:37:48 +02:00
f1e2bbd55b feat: add first experiment sweep file for Tier 1 ablation Ethanfel 2026-04-06 13:15:06 +02:00
3d9221c248 fix: three bugs in scheduler and trainer Ethanfel 2026-04-06 13:11:25 +02:00
2d200395af feat: add grad norm logging and richer experiment summary output Ethanfel 2026-04-06 13:06:39 +02:00
3ec380a27e feat: add SelVA LoRA Scheduler node for automated experiment sweeps Ethanfel 2026-04-06 13:03:21 +02:00
9bc2568543 docs: document LoRA dropout, LoRA+, and curriculum timestep sampling Ethanfel 2026-04-06 12:45:53 +02:00
eb63c1ead7 feat: add LoRA dropout, LoRA+ asymmetric LR, and curriculum timestep sampling Ethanfel 2026-04-06 12:43:18 +02:00
5baa070e61 docs: add observations section with fp32/batch/precision findings Ethanfel 2026-04-06 02:34:52 +02:00
95136b53a0 docs: add observations section with fp32/batch/precision findings (feature/lora-training) Ethanfel 2026-04-06 02:34:52 +02:00
8f31d00beb docs: add prompt guide and masking note to dataset preparation section Ethanfel 2026-04-06 01:43:28 +02:00
9fc739fe9e docs: add prompt guide and masking note to dataset preparation section Ethanfel 2026-04-06 01:43:28 +02:00
57fae4a8ce chore: default timestep_mode back to uniform Ethanfel 2026-04-06 01:21:08 +02:00
8e919c0459 fix: resolve relative and Unix-style output_dir paths to ComfyUI output folder Ethanfel 2026-04-06 01:13:59 +02:00
3ee1893e10 fix: resolve relative and Unix-style output_dir paths to ComfyUI output folder Ethanfel 2026-04-06 01:13:59 +02:00
c86258d48f fix: save adapter and loss curves on cancel, not only on normal completion Ethanfel 2026-04-06 01:06:44 +02:00
fec8eaac95 fix: save adapter and loss curves on cancel, not only on normal completion Ethanfel 2026-04-06 01:06:44 +02:00
d83632e754 fix: pad/trim clip and sync features to fixed seq_len at dataset load time Ethanfel 2026-04-06 00:51:45 +02:00
8338560600 fix: pad/trim clip and sync features to fixed seq_len at dataset load time Ethanfel 2026-04-06 00:51:45 +02:00
a5014e49eb feat: add logit-normal timestep sampling to reduce white noise artifacts Ethanfel 2026-04-06 00:35:42 +02:00
8ae0ba3c7d fix: increment adapter_final filename on resume to avoid overwriting previous final Ethanfel 2026-04-06 00:15:31 +02:00
2b2b438307 fix: set OUTPUT_NODE=True on SelVA Feature Extractor so it runs without connected outputs Ethanfel 2026-04-06 00:11:16 +02:00
39984f73c2 docs: add observed batching results to training guide Ethanfel 2026-04-06 00:05:16 +02:00
1f8cd6f930 docs: rewrite LORA_TRAINING.md with real-world findings Ethanfel 2026-04-06 00:00:36 +02:00
20f8138146 chore: show batch_size in training step log Ethanfel 2026-04-05 23:45:43 +02:00
09b3b94ddd feat: add batch_size parameter to training (default 4) Ethanfel 2026-04-05 23:36:12 +02:00
3f67de694c feat: save loss_raw.png and loss_smoothed.png to output_dir Ethanfel 2026-04-05 23:15:48 +02:00
423e174b88 debug: print lora_A norm after loading to confirm adapter applied Ethanfel 2026-04-05 23:05:23 +02:00
4806daa4ca chore: lower default warmup_steps from 500 to 100 Ethanfel 2026-04-05 22:51:27 +02:00
16b3eb11cc fix: pass max_size=800 to progress bar preview (was 85px wide) Ethanfel 2026-04-05 22:48:56 +02:00
004ea63f62 fix: fall back to soundfile for torchaudio.save when torchcodec unavailable Ethanfel 2026-04-05 22:44:04 +02:00
afb3242eca fix: disable inference_mode entirely for training via inference_mode(False) Ethanfel 2026-04-05 22:40:50 +02:00
849f31e2a6 fix: create LoRA params inside torch.enable_grad() to escape inference_mode Ethanfel 2026-04-05 22:36:28 +02:00
505d445eb3 fix: wrap training loop in torch.enable_grad() Ethanfel 2026-04-05 22:32:00 +02:00
8fade1b0e3 fix: initialize LoRA params on same device as wrapped linear Ethanfel 2026-04-05 22:17:29 +02:00
ad57432803 fix: pad/trim latent to exact latent_seq_len after VAE encoding Ethanfel 2026-04-05 22:12:20 +02:00
43f732f904 fix: transpose VAE latent from [B,C,T] to [B,T,C] before generator Ethanfel 2026-04-05 22:08:00 +02:00
6b9adf0816 fix: fall back to soundfile when torchcodec FFmpeg libs are missing Ethanfel 2026-04-05 22:03:57 +02:00
52434a053a fix: keep VAE in float32 for mel/stft; print full traceback on clip load failure Ethanfel 2026-04-05 21:57:20 +02:00
56c8d5d6b4 feat: save eval audio sample alongside each checkpoint Ethanfel 2026-04-05 21:47:02 +02:00
b430953602 feat: live loss curve preview during training Ethanfel 2026-04-05 17:11:38 +02:00
57cd3dd4b4 fix: use load_lora for resume and remove redundant inference_mode wrapper Ethanfel 2026-04-05 17:09:35 +02:00
f206a1b38c feat: add SelVA LoRA Trainer ComfyUI node Ethanfel 2026-04-05 17:07:38 +02:00
2f4641247a feat: add resume support to train_lora.py Ethanfel 2026-04-05 16:59:30 +02:00
8e9114b92c docs: add clip length and scalable dataset size recommendations Ethanfel 2026-04-05 16:34:50 +02:00
63b4391573 fix: named .npz files always start at _001 Ethanfel 2026-04-05 15:44:26 +02:00
89af5a468c docs: add LoRA training guide Ethanfel 2026-04-05 15:43:09 +02:00
c88e27742c fix: sanitize name field and remove double load_npz call Ethanfel 2026-04-05 15:30:25 +02:00
cbcd154c96 feat: add name field with auto-increment to SelvaFeatureExtractor Ethanfel 2026-04-05 15:16:51 +02:00
1eb82d8050 refactor: train_lora accepts .npz + audio pairs instead of raw video Ethanfel 2026-04-05 15:14:26 +02:00
cde280049b fix: correct LoRALinear dtype and remove unused import Ethanfel 2026-04-05 14:57:09 +02:00
437c62b28f feat: LoRA fine-tuning for SelVA generator Ethanfel 2026-04-05 14:38:46 +02:00
c9550ce693 experiment: add crop_rect option — rect bbox crop without squarification (experiment/crop-to-mask) Ethanfel 2026-04-05 13:04:46 +02:00
f3cabcad90 experiment: crop-to-mask mode on feature extractor Ethanfel 2026-04-05 12:52:03 +02:00
b519b042e2 docs: document mask inputs and normalize toggle in README (main) Ethanfel 2026-04-05 10:43:42 +02:00
f28759f1e3 feat: improve mask support with neutral fill, mask_strength, and per-path toggles Ethanfel 2026-04-05 10:43:01 +02:00
3dd6badfd9 fix: guarantee offload cleanup on exception with try/finally Ethanfel 2026-04-05 08:40:39 +02:00
8bb2fb7015 fix: extend OOM catch to decode/vocode, add (masked) to sync log line Ethanfel 2026-04-05 08:38:59 +02:00
