ComfyUI-SelVA

Ethanfel/ComfyUI-SelVA

Fork 0

Commit Graph

Author	SHA1	Message	Date
Ethanfel	e1a2f0ed7d	feat: add inject_mode (suffix/prefix) to TI pipeline Observation: n4_baseline loss barely moved (1.025→0.965 over 3000 steps), token_norm grew linearly without plateau — generator likely ignores last-K CLIP positions (EOS/padding zone) where suffix injects. Fix: add inject_mode parameter throughout the pipeline: - "suffix": replace last K positions (original behavior, model may ignore) - "prefix": replace positions 1:1+K right after BOS — highest attention weight in CLIP, much stronger gradient signal expected Changes: - selva_textual_inversion_trainer.py: _inject_tokens() helper centralises the torch.cat construction for both modes; used in training loop and eval; inject_mode stored in checkpoint files - selva_textual_inversion_loader.py: reads inject_mode from checkpoint, includes in TEXTUAL_INVERSION bundle - selva_sampler.py: uses _inject_tokens() via bundle's inject_mode field - selva_ti_scheduler.py: inject_mode in _PARAM_DEFAULTS, config, and _train_inner call - ti_sweep_1.json: updated with prefix_inject group (n4, n8, n4+warm); n4_baseline marked completed; suffix experiments retained for comparison Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-08 23:31:52 +02:00
Ethanfel	e56ece9c1c	feat: add SelVA Textual Inversion Trainer and Loader nodes Learns K CLIP token embeddings ([K, 1024]) with all model weights frozen, keeping generated latents on the decoder's natural manifold — avoids the quality degradation that affects LoRA on BJ's audio dataset. - selva_textual_inversion_trainer.py: trains learned_tokens via AdamW, injects into last K positions of 77-token CLIP embedding, checkpoints with eval audio + spectral metrics - selva_textual_inversion_loader.py: loads .pt bundle, returns TEXTUAL_INVERSION dict for sampler - selva_sampler.py: optional textual_inversion input; injects into both text_clip and neg_text_clip before preprocess_conditions - __init__.py: registers both new nodes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-08 23:01:44 +02:00

Author

SHA1

Message

Date

Ethanfel

e1a2f0ed7d

feat: add inject_mode (suffix/prefix) to TI pipeline

Observation: n4_baseline loss barely moved (1.025→0.965 over 3000 steps),
token_norm grew linearly without plateau — generator likely ignores last-K
CLIP positions (EOS/padding zone) where suffix injects.

Fix: add inject_mode parameter throughout the pipeline:
- "suffix": replace last K positions (original behavior, model may ignore)
- "prefix": replace positions 1:1+K right after BOS — highest attention
  weight in CLIP, much stronger gradient signal expected

Changes:
- selva_textual_inversion_trainer.py: _inject_tokens() helper centralises
  the torch.cat construction for both modes; used in training loop and eval;
  inject_mode stored in checkpoint files
- selva_textual_inversion_loader.py: reads inject_mode from checkpoint,
  includes in TEXTUAL_INVERSION bundle
- selva_sampler.py: uses _inject_tokens() via bundle's inject_mode field
- selva_ti_scheduler.py: inject_mode in _PARAM_DEFAULTS, config, and
  _train_inner call
- ti_sweep_1.json: updated with prefix_inject group (n4, n8, n4+warm);
  n4_baseline marked completed; suffix experiments retained for comparison

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-08 23:31:52 +02:00

Ethanfel

e56ece9c1c

feat: add SelVA Textual Inversion Trainer and Loader nodes

Learns K CLIP token embeddings ([K, 1024]) with all model weights frozen,
keeping generated latents on the decoder's natural manifold — avoids the
quality degradation that affects LoRA on BJ's audio dataset.

- selva_textual_inversion_trainer.py: trains learned_tokens via AdamW,
  injects into last K positions of 77-token CLIP embedding, checkpoints
  with eval audio + spectral metrics
- selva_textual_inversion_loader.py: loads .pt bundle, returns
  TEXTUAL_INVERSION dict for sampler
- selva_sampler.py: optional textual_inversion input; injects into both
  text_clip and neg_text_clip before preprocess_conditions
- __init__.py: registers both new nodes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-08 23:01:44 +02:00

2 Commits