Sync_MLP interpolates the sync features to a length derived from the video
duration, but the audio latent length is derived from the user-set audio
duration. When the two durations differ, the sequence lengths differ too.
Resample sync_cond to x's length before the gated addition so that any
combination of video and audio durations works.
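
A minimal sketch of the resampling step, assuming sync_cond and x are both
shaped (batch, seq, dim) and that linear interpolation is acceptable. The
helper names resample_to_length and gated_add, and the gate argument, are
hypothetical and not taken from the repository:

```python
import torch
import torch.nn.functional as F


def resample_to_length(sync_cond: torch.Tensor, target_len: int) -> torch.Tensor:
    """Linearly resample a (batch, seq, dim) tensor along seq to target_len."""
    if sync_cond.shape[1] == target_len:
        return sync_cond
    # F.interpolate expects (batch, channels, length), so move dim to the
    # channel axis, resample along length, then move it back.
    resampled = F.interpolate(
        sync_cond.transpose(1, 2),
        size=target_len,
        mode="linear",
        align_corners=False,
    )
    return resampled.transpose(1, 2)


def gated_add(x: torch.Tensor, sync_cond: torch.Tensor, gate: torch.Tensor) -> torch.Tensor:
    """Add gated sync features to x after matching sequence lengths (assumed form)."""
    sync_cond = resample_to_length(sync_cond, x.shape[1])
    return x + gate * sync_cond
```

With this in place, a 216-step sync_cond can be added to a 194-step x without
a shape mismatch. Whether linear interpolation is the right mode for sync
features is a design choice this sketch does not settle.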

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

MMDiTWrapper was removed from diffusion.py during cleanup, but its import
in factory.py was left behind, causing an ImportError on every model load.
Also stub out the wavelet and diffusion_prior paths, which reference
deleted modules.
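
One way such a stub could look, as a sketch only: the function name
create_model, the builders mapping, and REMOVED_MODEL_TYPES are hypothetical,
and the real factory.py may be organized differently. The idea is to fail
when a removed model type is requested rather than at import time:

```python
# Hypothetical names throughout; only "wavelet" and "diffusion_prior"
# come from the commit message.
REMOVED_MODEL_TYPES = {"wavelet", "diffusion_prior"}


def create_model(model_type: str, builders: dict):
    """Dispatch to a builder, failing only when a removed type is requested."""
    if model_type in REMOVED_MODEL_TYPES:
        # Stub: the backing module was deleted, so raise at call time
        # instead of importing it at module load.
        raise NotImplementedError(
            f"model_type '{model_type}' is no longer supported: its module was removed"
        )
    try:
        builder = builders[model_type]
    except KeyError:
        raise ValueError(f"unknown model_type '{model_type}'") from None
    return builder()
```

Because nothing from the deleted modules is imported at module level, loading
any other model type no longer touches the removed code.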

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>