ComfyUI-SelVA

Ethanfel/ComfyUI-SelVA

Fork 0

Commit Graph

Author	SHA1	Message	Date
Ethanfel	63bd999dfa	fix: switch to VideoPrism large (1024-dim) and fix Synchformer output shape prismaudio.json conditioner config requires: - video_features: dim=1024 → switch videoprism_public_v1_base → large (ViT-L) - sync_features: dim=768, length divisible by 8 → expand [num_seg,768] to [num_seg*8,768] (per-frame) so Sync_MLP can reshape by groups of 8 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 21:07:17 +01:00
Ethanfel	b1a2ee594e	fix: correct VideoPrism import (videoprism.models, not videoprism); add flax dep videoprism/__init__.py is empty — API lives in videoprism.models. Fix: from videoprism import models as vp (not import videoprism as vp). Also add flax to managed venv packages (required by videoprism Flax model). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 20:38:00 +01:00
Ethanfel	878025450a	feat: add data_utils package with FeaturesUtils implementation Creates data_utils/v2a_utils/feature_utils_288.py with FeaturesUtils: - T5-Gemma text encoding via transformers - VideoPrism video encoding via JAX videoprism package - Synchformer visual encoder loading from checkpoint Also fixes extract_features.py to add plugin root to sys.path so data_utils is importable in the subprocess venv. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 20:14:34 +01:00

Author

SHA1

Message

Date

Ethanfel

63bd999dfa

fix: switch to VideoPrism large (1024-dim) and fix Synchformer output shape

prismaudio.json conditioner config requires:
- video_features: dim=1024 → switch videoprism_public_v1_base → large (ViT-L)
- sync_features: dim=768, length divisible by 8 → expand [num_seg,768] to
  [num_seg*8,768] (per-frame) so Sync_MLP can reshape by groups of 8

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-03-27 21:07:17 +01:00

Ethanfel

b1a2ee594e

fix: correct VideoPrism import (videoprism.models, not videoprism); add flax dep

videoprism/__init__.py is empty — API lives in videoprism.models.
Fix: from videoprism import models as vp (not import videoprism as vp).
Also add flax to managed venv packages (required by videoprism Flax model).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-03-27 20:38:00 +01:00

Ethanfel

878025450a

feat: add data_utils package with FeaturesUtils implementation

Creates data_utils/v2a_utils/feature_utils_288.py with FeaturesUtils:
- T5-Gemma text encoding via transformers
- VideoPrism video encoding via JAX videoprism package
- Synchformer visual encoder loading from checkpoint

Also fixes extract_features.py to add plugin root to sys.path so
data_utils is importable in the subprocess venv.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-03-27 20:14:34 +01:00

3 Commits