ComfyUI-SelVA

Author	SHA1	Message	Date
Ethanfel	056a7b973d	fix: enable VAE encoder in model loader — required for DITTO reference encoding need_vae_encoder=False was deleting the encoder to save a small amount of VRAM. DITTO now needs it to encode reference clips to latent space for style loss. The spectrogram VAE encoder is small enough that the overhead is negligible. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-09 18:15:27 +02:00
Ethanfel	bd53744e2d	feat: comprehensive node improvements. Model Loader: bf16 support check (auto-falls back to fp16 on unsupported GPUs); DESCRIPTION and OUTPUT_TOOLTIPS. Feature Extractor: store variant in features dict and .npz cache; progress bar (3 steps: CLIP encode, T5 encode, sync encode); expand cache hash to 32 hex chars; DESCRIPTION and OUTPUT_TOOLTIPS. Sampler: variant mismatch validation against extracted features; cancellation support via throw_exception_if_processing_interrupted(); OOM catch with actionable error message; normalize toggle (optional BOOLEAN, default true) for peak normalization; remove empty optional: {} block; DESCRIPTION and OUTPUT_TOOLTIPS. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-04 18:16:03 +02:00
Ethanfel	429810db5b	docs: improve tooltips on all three SelVA nodes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-04 18:10:05 +02:00
Ethanfel	83b1da9520	chore: remove all PrismAudio code from main branch. Delete prismaudio_core/, data_utils/, scripts/, docs/plans/; delete PrismAudio nodes (feature_extractor, feature_loader, model_loader, sampler, text_only); delete PrismAudio workflows (video_to_audio, text_to_audio); clean nodes/utils.py (rename PRISMAUDIO_CATEGORY → SELVA_CATEGORY, remove unused helpers); strip PrismAudio-only deps from requirements.txt. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-04 17:58:31 +02:00
Ethanfel	2c9d521565	fix: 44k generator HF paths use 44khz suffix (not 44k) Actual filenames in jnwnlee/SelVA: generator_*_44khz_sup_5.pth. download_utils.py had the wrong names so those MD5s are unverified — set to None to skip MD5 check for 44k generators. All other files verified/unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-04 16:46:20 +02:00
Ethanfel	28229d62ce	fix: MD5 validation on existing files — re-download if corrupt Previously _ensure() trusted any existing file. Files downloaded by the broken requests-based code (HTML error pages) would be silently reused. Now checks MD5 on every load; deletes and re-downloads on mismatch. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-04 16:42:38 +02:00
Ethanfel	92593189f0	fix: use huggingface_hub for downloads instead of raw requests download_utils.py used requests without auth — jnwnlee/SelVA returned an HTML error page which torch then failed to unpickle ('E' / opcode 69). huggingface_hub.hf_hub_download() handles HF_TOKEN auth automatically, validates downloads, and retries. Files are still copied to models/selva/. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-04 16:41:29 +02:00
Ethanfel	614a2e02aa	fix: weights_only=False for SelVA checkpoints (PyTorch 2.6 compat) PyTorch 2.6 changed the default to weights_only=True. SelVA checkpoints contain non-tensor types (numpy scalars etc.) that fail strict unpickling. All weights come from trusted sources (jnwnlee/selva HF repo). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-04 16:38:31 +02:00
Ethanfel	40388ba6de	fix: negative_prompt inline (multiline:false) + VAE filename v1-44.pth not v1-44k.pth. SelvaSampler: multiline:false puts negative_prompt inline above sliders. SelvaModelLoader: VAE filenames in download_utils are v1-16.pth/v1-44.pth, not v1-{mode}.pth (mode includes the 'k' suffix). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-04 16:35:17 +02:00
Ethanfel	9a985499e7	feat: auto-download SelVA weights on first use Uses selva_core/utils/download_utils.py (already has URLs + MD5s for all weights). Models download to models/selva/ on first load. Synchformer reuses models/prismaudio/synchformer_state_dict.pth if already present (no duplicate download for PrismAudio users), otherwise downloads to models/selva/. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-04 16:25:36 +02:00
Ethanfel	fe94438356	feat: SelvaModelLoader node — loads TextSynch + MMAudio + FeaturesUtils Resolves weights from models/selva/. Reuses synchformer_state_dict.pth from models/prismaudio/ (no duplicate download). Supports four variants: small_16k / small_44k / medium_44k / large_44k. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-04 15:21:03 +02:00

11 Commits
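
Commit 28229d62ce describes checking the MD5 of an existing file on every load and re-downloading on mismatch, and commit 2c9d521565 describes setting the MD5 to None to skip the check. The sketch below illustrates that behavior under stated assumptions. It is not the repository's code: `ensure_file`, `file_md5`, and the `download` callback are hypothetical names, and the extra verification after the download is an added assumption, not something the commit messages state.

```python
import hashlib
from pathlib import Path
from typing import Callable, Optional


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ensure_file(
    path: Path,
    expected_md5: Optional[str],
    download: Callable[[Path], None],  # hypothetical: writes the file to `path`
) -> Path:
    """Reuse `path` only if its checksum matches; otherwise delete and re-download."""
    if path.exists():
        # expected_md5=None means "no known checksum": trust the existing file.
        if expected_md5 is None or file_md5(path) == expected_md5:
            return path
        path.unlink()  # corrupt or stale file, e.g. a saved HTML error page

    path.parent.mkdir(parents=True, exist_ok=True)
    download(path)

    # Assumption: also verify the fresh download so a bad file is never kept.
    if expected_md5 is not None and file_md5(path) != expected_md5:
        path.unlink()
        raise RuntimeError(f"MD5 mismatch after download: {path}")
    return path
```

A caller would pass a function that fetches the file (for example a wrapper around `huggingface_hub.hf_hub_download` that copies the result to `path`) as `download`.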