-
f4a7292cde
feat: add optional MASK input to SelVA Feature Extractor
Ethanfel
2026-04-05 08:34:13 +02:00
-
bd53744e2d
feat: comprehensive node improvements
Ethanfel
2026-04-04 18:16:03 +02:00
-
429810db5b
docs: improve tooltips on all three SelVA nodes
Ethanfel
2026-04-04 18:10:05 +02:00
-
57f56c04e2
feat: update demo workflow with VHS_VideoCombine output
Ethanfel
2026-04-04 18:07:56 +02:00
-
ff26d0b87d
fix: bug sweep and improvements
Ethanfel
2026-04-04 18:04:35 +02:00
-
83b1da9520
chore: remove all PrismAudio code from main branch
Ethanfel
2026-04-04 17:58:31 +02:00
-
679a607a85
feat: wire prompt output from feature extractor to sampler in demo workflow
Ethanfel
2026-04-04 17:13:23 +02:00
-
d495939367
docs: rewrite README for SelVA
Ethanfel
2026-04-04 17:12:28 +02:00
-
982d66e078
chore: remove PrismAudio nodes from selva-integration branch
Ethanfel
2026-04-04 17:01:21 +02:00
-
b4124f58b3
fix: BigVGANv2._from_pretrained() compat with newer huggingface_hub
Ethanfel
2026-04-04 16:51:48 +02:00
-
2c9d521565
fix: 44k generator HF paths use 44khz suffix (not 44k)
Ethanfel
2026-04-04 16:46:20 +02:00
-
28229d62ce
fix: MD5 validation on existing files — re-download if corrupt
Ethanfel
2026-04-04 16:42:38 +02:00
-
92593189f0
fix: use huggingface_hub for downloads instead of raw requests
Ethanfel
2026-04-04 16:41:29 +02:00
-
614a2e02aa
fix: weights_only=False for SelVA checkpoints (PyTorch 2.6 compat)
Ethanfel
2026-04-04 16:38:31 +02:00
-
40388ba6de
fix: negative_prompt inline (multiline:false) + VAE filename v1-44.pth not v1-44k.pth
Ethanfel
2026-04-04 16:35:17 +02:00
-
789e09535d
fix: SelvaSampler — negative_prompt above settings
Ethanfel
2026-04-04 16:31:53 +02:00
-
4da4858e4a
fix: inline prune helpers when removed from both transformers locations
Ethanfel
2026-04-04 16:30:58 +02:00
-
ab8e1e5b7b
feat: SelvaFeatureExtractor outputs prompt as STRING
Ethanfel
2026-04-04 16:27:49 +02:00
-
e3a3384727
fix: SelvaSampler input order — prompt required, negative_prompt optional
Ethanfel
2026-04-04 16:27:07 +02:00
-
9a985499e7
feat: auto-download SelVA weights on first use
Ethanfel
2026-04-04 16:25:36 +02:00
-
27b4424e1a
feat: prompt entered once in SelvaFeatureExtractor, reused by SelvaSampler
Ethanfel
2026-04-04 16:22:59 +02:00
-
0e417f4078
fix: transformers compat — find_pruneable_heads_and_indices import
Ethanfel
2026-04-04 16:21:26 +02:00
-
6474e2816c
fix: two bugs in SelVA nodes
Ethanfel
2026-04-04 15:39:57 +02:00
-
c23d210ab2
feat: SelVA video-to-audio example workflow
Ethanfel
2026-04-04 15:31:53 +02:00
-
b59b657b6f
feat: SelvaSampler — flow matching ODE with CFG and negative prompts
Ethanfel
2026-04-04 15:31:18 +02:00
-
578b501d38
feat: SelvaFeatureExtractor — inline CLIP + TextSynchformer feature extraction
Ethanfel
2026-04-04 15:23:40 +02:00
-
fe94438356
feat: SelvaModelLoader node — loads TextSynch + MMAudio + FeaturesUtils
Ethanfel
2026-04-04 15:21:03 +02:00
-
6bc3fd6443
chore: vendor selva_core from jnwnlee/selva@d7d40a9
Ethanfel
2026-04-04 15:18:09 +02:00
-
0f60a9b2bf
docs: add SelVA integration implementation plan
deprecated/lora-trainer
Ethanfel
2026-04-04 15:11:26 +02:00
-
51f93f9688
docs: SelVA integration design doc
Ethanfel
2026-04-04 15:00:40 +02:00
-
a315093743
feat: sync_strength control and temporal coverage diagnostic in sampler
Ethanfel
2026-03-28 16:23:41 +01:00
-
e49f760b77
fix: feature extractor CUDA detection, cache correctness, and short-video crash
Ethanfel
2026-03-28 16:00:05 +01:00
-
4f40e15db3
fix: guard model cleanup in try/finally and fix DiTWrapper comments
Ethanfel
2026-03-28 15:49:04 +01:00
-
08d73773c5
feat: LoRA trainer and loader nodes for PrismAudio DiT fine-tuning
Ethanfel
2026-03-28 12:18:50 +01:00
-
762b19fd3a
fix: return fps from non-cache extraction path
deprecated/prismaudio
Ethanfel
2026-03-28 11:26:15 +01:00
-
807a2e51fb
docs: fix README references — PrismAudio not ThinkSound
Ethanfel
2026-03-28 11:16:31 +01:00
-
67be94c45c
chore: add updated V2A example workflow
Ethanfel
2026-03-28 11:13:06 +01:00
-
681d230b0c
chore: update T2A workflow to match V2A style and current defaults
Ethanfel
2026-03-28 11:11:20 +01:00
-
62a3c5d0dc
docs: rewrite README to reflect current node design
Ethanfel
2026-03-28 11:10:07 +01:00
-
30631c0cb4
fix: change fps output type from INT to FLOAT
Ethanfel
2026-03-28 11:05:35 +01:00
-
d0c9a72782
feat: add fps INT output to PrismAudioFeatureExtractor
Ethanfel
2026-03-28 11:05:03 +01:00
-
5b62be0447
chore: update default steps=100 and cfg_scale=7.0
Ethanfel
2026-03-28 11:03:48 +01:00
-
abd315092b
feat: auto-use video duration from features when duration=0
Ethanfel
2026-03-28 11:00:47 +01:00
-
972d379369
refactor: simplify feature extractor inputs
Ethanfel
2026-03-28 10:55:08 +01:00
-
8969d407f6
feat: accept VHS_VIDEOINFO to auto-set fps in feature extractor
Ethanfel
2026-03-28 10:52:51 +01:00
-
707ccb463e
perf: replace MP4 encode/decode with lossless .npy frame transfer
Ethanfel
2026-03-28 10:50:35 +01:00
-
c38df8c6fa
chore: remove debug options and diagnostic logging
Ethanfel
2026-03-28 10:47:00 +01:00
-
2f626d8a96
fix: use videoprism_lvt_public_v1_large with joint video-text forward
Ethanfel
2026-03-28 10:37:02 +01:00
-
1d8b9b59e0
debug: add DIT velocity diagnostic at t=1 to isolate DIT vs VAE quality issue
Ethanfel
2026-03-27 23:57:03 +01:00
-
8bf4a0c3fc
debug: log conditioner output stats and T2A text feature stats
Ethanfel
2026-03-27 22:39:44 +01:00
-
477fe0f08f
debug: add latent and audio stats logging to V2A sampler
Ethanfel
2026-03-27 22:28:08 +01:00
-
c0b7ccbcee
fix: substitute empty_clip_feat for video features when no video present
Ethanfel
2026-03-27 22:13:22 +01:00
-
45633788a4
debug: add latent and audio stats logging to T2A node
Ethanfel
2026-03-27 22:06:39 +01:00
-
11457fc27a
debug: fix VAE load_state_dict diagnostic — load into .model directly
Ethanfel
2026-03-27 21:56:06 +01:00
-
f2705b3063
debug: log weight load stats for diffusion and VAE checkpoints
Ethanfel
2026-03-27 21:53:25 +01:00
-
83a7f2787b
feat: add debug_zero_video/sync toggles and feature stats logging to sampler
Ethanfel
2026-03-27 21:40:34 +01:00
-
140cc5ee9a
feat: implement real Synchformer visual encoder (TimeSformer ViT-B/16)
Ethanfel
2026-03-27 21:28:20 +01:00
-
f99d2666e8
fix: interpolate sync_cond to match audio sequence length in transformer
Ethanfel
2026-03-27 21:21:39 +01:00
-
934a401633
perf: replace PIL+PNG frame files with direct ffmpeg stdin pipe
Ethanfel
2026-03-27 21:20:00 +01:00
-
b3ac9ab22f
feat: log MP4 conversion time before subprocess spawn
Ethanfel
2026-03-27 21:19:26 +01:00
-
ca87c41a2e
feat: add per-step timing to feature extraction logs
Ethanfel
2026-03-27 21:13:42 +01:00
-
63bd999dfa
fix: switch to VideoPrism large (1024-dim) and fix Synchformer output shape
Ethanfel
2026-03-27 21:07:17 +01:00
-
20fb766ad2
fix: cast tensors to float32 before numpy() in feature save
Ethanfel
2026-03-27 20:56:52 +01:00
-
93120eb6b9
feat: auto-resolve synchformer checkpoint from prismaudio models dir
Ethanfel
2026-03-27 20:49:56 +01:00
-
b1a2ee594e
fix: correct VideoPrism import (videoprism.models, not videoprism); add flax dep
Ethanfel
2026-03-27 20:38:00 +01:00
-
0f46e8359d
feat: switch managed venv to jax[cuda13] for GPU feature extraction
Ethanfel
2026-03-27 20:33:45 +01:00
-
06f8dbbab4
feat: add hf_token input and HF_TOKEN env forwarding to feature extractor
Ethanfel
2026-03-27 20:27:33 +01:00
-
a6d584bd34
fix: treat empty python_env as auto-managed venv trigger
Ethanfel
2026-03-27 20:21:16 +01:00
-
829f398ed0
feat: verbose step-by-step logging in feature extraction
Ethanfel
2026-03-27 20:19:38 +01:00
-
878025450a
feat: add data_utils package with FeaturesUtils implementation
Ethanfel
2026-03-27 20:14:34 +01:00
-
f32456a142
feat: add fps input to PrismAudioFeatureExtractor
Ethanfel
2026-03-27 20:08:10 +01:00
-
c416045ace
fix: replace torchvision.io.write_video with PIL+ffmpeg
Ethanfel
2026-03-27 20:03:39 +01:00
-
824550bed3
feat: verbose per-package progress during venv auto-install
Ethanfel
2026-03-27 20:00:04 +01:00
-
8f2e204146
fix: show pip output, handle incomplete venv, fix TF version for Python 3.12
Ethanfel
2026-03-27 19:55:55 +01:00
-
8e3ab999f0
fix: load VAE state dict with strict=False
Ethanfel
2026-03-27 19:51:51 +01:00
-
afc7d5b657
fix: add missing runtime dependencies to requirements.txt
Ethanfel
2026-03-27 19:48:33 +01:00
-
e372cdc488
fix: add plugin root to sys.path so prismaudio_core is importable
Ethanfel
2026-03-27 19:41:11 +01:00
-
7671d296fa
fix: remove spurious caption_cot input entry from video_to_audio workflow
Ethanfel
2026-03-27 19:39:05 +01:00
-
3894fcc9b4
feat: add demo workflows for text-to-audio and video-to-audio
Ethanfel
2026-03-27 19:32:24 +01:00
-
35d0615253
feat: auto-install pip venv for feature extraction on first use
Ethanfel
2026-03-27 19:27:27 +01:00
-
9b1cb71b2a
fix: remove MMDiTWrapper import and dead code paths from factory.py
Ethanfel
2026-03-27 19:12:40 +01:00
-
807f00417f
docs: README with installation and usage instructions
Ethanfel
2026-03-27 18:15:17 +01:00
-
618e7de64b
feat: PrismAudioTextOnly node with correct T5-Gemma encoding
Ethanfel
2026-03-27 18:09:11 +01:00
-
3d62688e8c
feat: PrismAudioSampler node with correct metadata format and peak normalization
Ethanfel
2026-03-27 18:07:33 +01:00
-
7c54ee8482
feat: PrismAudioFeatureExtractor node with subprocess bridge and conda env
Ethanfel
2026-03-27 18:06:10 +01:00
-
3f35aa39f2
feat: PrismAudioFeatureLoader node for pre-computed .npz files
Ethanfel
2026-03-27 18:04:32 +01:00
-
1043f4bacb
feat: PrismAudioModelLoader node with auto-download and adaptive VRAM
Ethanfel
2026-03-27 18:02:47 +01:00
-
8b634923dd
fix: remove unused tqdm import from sampling.py
Ethanfel
2026-03-27 18:01:29 +01:00
-
87bea21d49
feat: extract prismaudio_core inference with callback-enabled sampling
Ethanfel
2026-03-27 17:59:37 +01:00
-
30e85f0f99
fix: resolve critical bugs and quality issues in prismaudio_core/models
Ethanfel
2026-03-27 17:56:02 +01:00
-
6e1186d5bd
fix: clean up dead code paths and debug artifacts in prismaudio_core/models
Ethanfel
2026-03-27 17:49:57 +01:00
-
84c81e0e55
feat: extract prismaudio_core model modules (DiT, conditioners, VAE, diffusion)
Ethanfel
2026-03-27 17:31:22 +01:00
-
b60ff4111b
feat: extract prismaudio_core config and model factory
Ethanfel
2026-03-27 17:05:57 +01:00
-
baa80de194
feat: project scaffolding with shared utils and node registration
Ethanfel
2026-03-27 16:59:21 +01:00
-
c9364c4ec2
docs: initial design and implementation plan
Ethanfel
2026-03-27 16:57:15 +01:00