Commit Graph

8 Commits

Author SHA1 Message Date
Ethanfel abd315092b feat: auto-use video duration from features when duration=0
Setting duration to 0 in PrismAudioSampler now reads the duration
stored in the PRISMAUDIO_FEATURES dict (set by the feature extractor).
Default changed from 10.0 to 0.0 so V2A workflows are wired up
automatically.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-28 11:00:47 +01:00
Ethanfel c38df8c6fa chore: remove debug options and diagnostic logging
Remove debug_zero_video/debug_zero_sync inputs from PrismAudioSampler,
DIT velocity diagnostics, conditioner stats logging, and feature stats
prints from both sampler.py and text_only.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-28 10:47:00 +01:00
Ethanfel 1d8b9b59e0 debug: add DIT velocity diagnostic at t=1 to isolate DIT vs VAE quality issue
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 23:57:03 +01:00
Ethanfel 8bf4a0c3fc debug: log conditioner output stats and T2A text feature stats
Add per-key conditioning output stats (after Cond_MLP/Sync_MLP, after
_substitute_empty_features) to both sampler and text_only nodes. Also
add raw T5 text feature stats in T2A before conditioning.

This lets us directly compare:
- T2A vs V2A conditioning outputs to find which path differs
- T2A vs npz text feature ranges

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 22:39:44 +01:00
Ethanfel 477fe0f08f debug: add latent and audio stats logging to V2A sampler
Match the diagnostic output already in text_only.py to compare
V2A vs T2A latent distributions and diagnose conditioning issues.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 22:28:08 +01:00
Ethanfel c0b7ccbcee fix: substitute empty_clip_feat for video features when no video present
Zero features through bias-free Cond_MLP produce near-zero activations,
not the learned null signal the model was trained with. Use empty_clip_feat
(the learned null video embedding) just like empty_sync_feat for sync.
Also improve text_prompt tooltip to encourage detailed CoT descriptions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 22:13:22 +01:00
Ethanfel 83a7f2787b feat: add debug_zero_video/sync toggles and feature stats logging to sampler
Allows isolating which feature set causes quality issues:
- debug_zero_video: zero video_features → text+sync only
- debug_zero_sync: zero sync_features → text+video only
Also logs mean/std/shape for all three feature tensors on every run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 21:40:34 +01:00
Ethanfel 3d62688e8c feat: PrismAudioSampler node with correct metadata format and peak normalization
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 18:07:33 +01:00