ComfyUI-SelVA

Author	SHA1	Message	Date
Ethanfel	f99d2666e8	fix: interpolate sync_cond to match audio sequence length in transformer. Sync_MLP interpolates sync features based on video duration, but the audio latent length depends on the user-set audio duration. When the video and audio durations differ, the two sequence lengths diverge. Resample sync_cond to x's length before the gated addition so any video/audio duration combination works. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 21:21:39 +01:00
Ethanfel	9b1cb71b2a	fix: remove MMDiTWrapper import and dead code paths from factory.py. MMDiTWrapper was removed from diffusion.py during cleanup, but the import in factory.py was missed, causing an ImportError on every model load. Also stub the wavelet and diffusion_prior paths that reference deleted modules. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 19:12:40 +01:00
Ethanfel	8b634923dd	fix: remove unused tqdm import from sampling.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 18:01:29 +01:00
Ethanfel	87bea21d49	feat: extract prismaudio_core inference with callback-enabled sampling. Add inference subpackage with: sampling.py (sample_discrete_euler, modified from upstream to add a callback parameter for ComfyUI progress bars; uses enumerate for the step index); utils.py (set_audio_channels and prepare_audio for audio preprocessing). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-27 17:59:37 +01:00
Ethanfel	30e85f0f99	fix: resolve critical bugs and quality issues in prismaudio_core/models Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-27 17:56:02 +01:00
Ethanfel	6e1186d5bd	fix: clean up dead code paths and debug artifacts in prismaudio_core/models Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-27 17:49:57 +01:00
Ethanfel	84c81e0e55	feat: extract prismaudio_core model modules (DiT, conditioners, VAE, diffusion). Fetch and adapt inference-critical model modules from the upstream PrismAudio repo: dit.py (DiffusionTransformer with debug prints removed); diffusion.py (ConditionedDiffusionModelWrapper, DiTWrapper, MMDiTWrapper); conditioners.py (Cond_MLP, Sync_MLP, MultiConditioner with stubbed training imports); autoencoders.py (AudioAutoencoder, OobleckEncoder/Decoder); transformer.py (ContinuousTransformer, Attention with flash_attn fallback to SDPA); blocks.py, utils.py, bottleneck.py, pretransforms.py, local_attention.py, pqmf.py; adp.py (UNetCFG1d, UNet1d, NumberEmbedder); mmmodules/model/low_level.py (MLP, ChannelLastConv1d, ConvMLP). All internal imports rewritten from PrismAudio.* to prismaudio_core.*, training-only imports stubbed, and flash_attn made optional via a HAS_FLASH_ATTN flag. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-27 17:31:22 +01:00
Ethanfel	b60ff4111b	feat: extract prismaudio_core config and model factory Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-27 17:05:57 +01:00
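
Commit 87bea21d49 describes adding a callback parameter to an Euler sampling loop and using enumerate for the step index. The following is a minimal, dependency-free sketch of that idea, not the repository's code: it works on plain floats instead of tensors, and the names sample_euler_with_callback and velocity_fn, as well as the callback signature (step, total_steps), are assumptions made for illustration.

```python
from typing import Callable, Optional


def sample_euler_with_callback(
    velocity_fn: Callable[[float, float], float],
    x: float,
    steps: int,
    sigma_max: float = 1.0,
    callback: Optional[Callable[[int, int], None]] = None,
) -> float:
    """Integrate from t = sigma_max down to t = 0 with fixed Euler steps.

    velocity_fn(x, t) stands in for the model call (hypothetical name).
    callback(step, total_steps) is invoked after every step so a host UI
    can advance a progress bar (hypothetical signature).
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")

    # steps + 1 evenly spaced time points from sigma_max down to 0.
    ts = [sigma_max * (1 - i / steps) for i in range(steps + 1)]

    # enumerate supplies the step index that is reported to the callback.
    for i, (t_curr, t_next) in enumerate(zip(ts[:-1], ts[1:])):
        dt = t_next - t_curr  # negative, because time runs downwards
        x = x + dt * velocity_fn(x, t_curr)
        if callback is not None:
            callback(i + 1, steps)
    return x
```

A host application could pass something like `callback=lambda step, total: bar.update(1)`, where `bar` is whatever progress object it exposes; that wiring is likewise an assumption.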

8 Commits
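
Commit f99d2666e8 describes resampling sync_cond to the length of x before a gated addition. The sketch below illustrates that step on plain Python lists so it runs without any tensor library. It is an assumption-laden illustration rather than the repository's implementation: the commit does not state the interpolation mode, so linear interpolation with aligned endpoints is assumed, the gate is simplified to a scalar, and the function names resample_linear and add_sync_cond are hypothetical.

```python
from typing import List

Vector = List[float]


def resample_linear(seq: List[Vector], target_len: int) -> List[Vector]:
    """Linearly resample a [length][channels] sequence to target_len steps.

    Endpoints are aligned: the first and last output steps coincide with
    the first and last input steps (an assumed convention).
    """
    src_len = len(seq)
    if src_len == 0 or target_len < 1:
        raise ValueError("need a non-empty sequence and target_len >= 1")

    # Map output index i to a fractional position in the source sequence.
    scale = (src_len - 1) / (target_len - 1) if target_len > 1 else 0.0

    out: List[Vector] = []
    for i in range(target_len):
        pos = i * scale
        lo = min(int(pos), src_len - 1)
        hi = min(lo + 1, src_len - 1)
        w = pos - lo  # interpolation weight towards the next source step
        out.append([(1 - w) * a + w * b for a, b in zip(seq[lo], seq[hi])])
    return out


def add_sync_cond(x: List[Vector], sync_cond: List[Vector], gate: float) -> List[Vector]:
    """Resample sync_cond to len(x), then apply x + gate * sync_cond."""
    sync = resample_linear(sync_cond, len(x))
    return [
        [xv + gate * sv for xv, sv in zip(x_row, s_row)]
        for x_row, s_row in zip(x, sync)
    ]
```

Because the resampling happens inside the addition, the caller no longer needs the video-derived and audio-derived sequences to have the same length, which is the failure mode the commit message describes.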