ComfyUI-SelVA

Author	SHA1	Message	Date
Ethanfel	a6d584bd34	fix: treat empty python_env as auto-managed venv trigger. Empty string from clearing the node field caused subprocess to execute '', which raises PermissionError. Now any blank or 'python' value uses the auto-installed venv. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 20:21:16 +01:00
Ethanfel	829f398ed0	feat: verbose step-by-step logging in feature extraction. extract_features.py: 6 numbered steps with shapes, fps, frame counts; feature_extractor.py: stream subprocess output live (capture_output=False). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 20:19:38 +01:00
Ethanfel	f32456a142	feat: add fps input to PrismAudioFeatureExtractor. Exposes the video frame rate as an optional input (default 30). Correct FPS ensures accurate temporal frame sampling in VideoPrism and Synchformer feature extraction. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 20:08:10 +01:00
Ethanfel	c416045ace	fix: replace torchvision.io.write_video with PIL+ffmpeg. write_video requires the optional 'av' (PyAV) package. Use PIL to save frames as PNGs, then combine with ffmpeg, which is always present in ComfyUI Docker images. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 20:03:39 +01:00
Ethanfel	824550bed3	feat: verbose per-package progress during venv auto-install. Installs each package individually with [n/total] counters and pip progress bars, so failures pinpoint the exact failing package. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 20:00:04 +01:00
Ethanfel	8f2e204146	fix: show pip output, handle incomplete venv, fix TF version for Python 3.12. tensorflow-cpu==2.15.0 only supports Python <=3.11, so relax to >=2.16.0; capture_output=False so pip errors are visible in ComfyUI logs; clean up incomplete venv dir before retrying install. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 19:55:55 +01:00
Ethanfel	8e3ab999f0	fix: load VAE state dict with strict=False. vae.ckpt is a full training checkpoint containing discriminator, STFT loss modules, and EMA wrappers that are absent from the inference AudioAutoencoder. strict=False ignores these training-only keys while still loading all encoder/decoder/bottleneck weights correctly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 19:51:51 +01:00
Ethanfel	35d0615253	feat: auto-install pip venv for feature extraction on first use. PrismAudioFeatureExtractor now creates and populates a managed venv (_extract_env/) automatically when python_env is left as the default 'python'. Also adds scripts/install_extract_env.sh for manual/Docker setup without conda. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 19:27:27 +01:00
Ethanfel	618e7de64b	feat: PrismAudioTextOnly node with correct T5-Gemma encoding Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 18:09:11 +01:00
Ethanfel	3d62688e8c	feat: PrismAudioSampler node with correct metadata format and peak normalization Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 18:07:33 +01:00
Ethanfel	7c54ee8482	feat: PrismAudioFeatureExtractor node with subprocess bridge and conda env Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 18:06:10 +01:00
Ethanfel	3f35aa39f2	feat: PrismAudioFeatureLoader node for pre-computed .npz files Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 18:04:32 +01:00
Ethanfel	1043f4bacb	feat: PrismAudioModelLoader node with auto-download and adaptive VRAM Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 18:02:47 +01:00
Ethanfel	baa80de194	feat: project scaffolding with shared utils and node registration Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-27 16:59:21 +01:00
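
The python_env fallback described in a6d584bd34 and 35d0615253 could look roughly like the sketch below. This is not the repository's code: the function name resolve_python, the MANAGED_VENV constant, and the POSIX bin/python layout are assumptions made for illustration. Only the directory name _extract_env/ and the rule "blank or 'python' means use the managed venv" come from the commit messages.

```python
from pathlib import Path

# Assumed location of the managed venv; the log only names the directory "_extract_env/".
MANAGED_VENV = Path("_extract_env")


def resolve_python(python_env, venv_dir=MANAGED_VENV):
    """Return the interpreter path to pass to subprocess (hypothetical helper).

    A blank value (for example after clearing the node field) or the default
    'python' selects the auto-managed venv; anything else is used as given.
    """
    value = (python_env or "").strip()
    if value in ("", "python"):
        # POSIX venv layout assumed; a Windows venv would use Scripts/python.exe.
        return str(venv_dir / "bin" / "python")
    return value
```

Normalizing the value before the comparison means an empty string never reaches subprocess as the executable name, which is the failure the fix commit describes.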

64 Commits