ComfyUI-SelVA

Files

Ethanfel a315093743 feat: sync_strength control and temporal coverage diagnostic in sampler

Adds sync_strength (0.0–3.0, default 1.0) to PrismAudioSampler.
The scale is applied post-conditioner (after Sync_MLP) to the conditioning
tensor before it enters the DiT. Since CFG always uses zeros as the null
sync embedding, this cleanly scales the sync guidance signal:
  effective_sync_guidance = cfg_scale * (sync_strength * cond - 0)
Higher values tighten temporal audio-video alignment; 0.0 disables sync
guidance entirely (audio conditioned only by video + text features).
Not applied in T2A mode where sync is replaced by the learned empty_sync_feat.
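
The scaling described above can be illustrated with a small sketch. This is not the sampler's code: `scale_sync`, `null_sync`, and `input_guidance_gap` are hypothetical helpers, and plain nested lists stand in for the post-Sync_MLP conditioning tensor. It only shows the arithmetic of the formula in the commit message, under the stated assumption that the null sync embedding is all zeros.

```python
# Hypothetical sketch of the sync_strength arithmetic; names are illustrative,
# not taken from PrismAudioSampler.

def scale_sync(cond, sync_strength=1.0):
    """Scale post-conditioner sync features (list of frames, each a list of floats)."""
    if not 0.0 <= sync_strength <= 3.0:
        raise ValueError("sync_strength is expected to lie in [0.0, 3.0]")
    return [[sync_strength * v for v in frame] for frame in cond]


def null_sync(cond):
    """All-zero null sync embedding with the same shape as cond (assumed per the commit text)."""
    return [[0.0 for _ in frame] for frame in cond]


def input_guidance_gap(cond, cfg_scale, sync_strength):
    """cfg_scale * (sync_strength * cond - 0), computed element-wise.

    This is the gap between the conditional and null sync inputs, weighted by
    cfg_scale. It is not the DiT output, which depends on the model itself.
    """
    scaled = scale_sync(cond, sync_strength)
    null = null_sync(cond)
    return [
        [cfg_scale * (s - n) for s, n in zip(s_frame, n_frame)]
        for s_frame, n_frame in zip(scaled, null)
    ]


if __name__ == "__main__":
    cond = [[1.0, -2.0], [0.5, 0.0]]
    print(input_guidance_gap(cond, cfg_scale=4.0, sync_strength=0.5))
    print(input_guidance_gap(cond, cfg_scale=4.0, sync_strength=0.0))
```

With `sync_strength=0.0` the conditional and null sync inputs coincide, so the gap is zero everywhere, which matches the statement that 0.0 disables sync guidance.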

Also logs sync temporal coverage vs audio target duration, with a warning
when they differ by more than 0.5s (stale or mismatched features).
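
A coverage check of this kind might look roughly like the sketch below. The function name `check_sync_coverage`, its parameters, and the idea of deriving coverage from a frame count and a feature frame rate are assumptions for illustration; the commit message only states that coverage is compared with the audio target duration and that a warning is logged when the gap exceeds 0.5s.

```python
import logging

logger = logging.getLogger("sync_coverage_sketch")  # hypothetical logger name


def check_sync_coverage(num_sync_frames, sync_fps, audio_seconds, tolerance=0.5):
    """Compare the time span covered by sync features with the audio target duration.

    num_sync_frames and sync_fps are assumed inputs; the real sampler may derive
    coverage differently. Returns (coverage_seconds, mismatched).
    """
    if sync_fps <= 0:
        raise ValueError("sync_fps must be positive")

    coverage = num_sync_frames / sync_fps
    gap = abs(coverage - audio_seconds)
    logger.info("sync coverage %.2fs vs audio target %.2fs", coverage, audio_seconds)

    mismatched = gap > tolerance
    if mismatched:
        # A large gap suggests stale or mismatched pre-computed features.
        logger.warning(
            "sync coverage differs from audio target by %.2fs (> %.2fs)",
            gap,
            tolerance,
        )
    return coverage, mismatched


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    check_sync_coverage(num_sync_frames=200, sync_fps=25.0, audio_seconds=10.0)
```

The comparison is strict, so a gap of exactly the tolerance does not trigger the warning; whether the actual node treats the boundary the same way is not stated in the commit.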

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-03-28 16:23:41 +01:00

__init__.py

feat: LoRA trainer and loader nodes for PrismAudio DiT fine-tuning

2026-03-28 12:18:50 +01:00

feature_extractor.py

fix: feature extractor CUDA detection, cache correctness, and short-video crash

2026-03-28 16:00:05 +01:00

feature_loader.py

feat: PrismAudioFeatureLoader node for pre-computed .npz files

2026-03-27 18:04:32 +01:00

lora_loader.py

fix: guard model cleanup in try/finally and fix DiTWrapper comments