ComfyUI-Tween

Author	SHA1	Message	Date
Ethanfel	76dff7e573	Fix FlashVSR quality: two-stage temporal padding, kv_ratio=3, float64 precision Root cause of remaining ghosting: our single-stage temporal padding (N+4 → floor to 8k+1) TRUNCATED frames when N+4 wasn't already 8k+1. For 50 frames: 50+4=54 → floor to 49, LOSING the last input frame. The pipeline then processed misaligned LQ→output frame mapping. Fix matches naxci1/ComfyUI-FlashVSR_Stable two-stage approach: 1. Pad to next_8n5(N) (next integer >= N of form 8k+5, minimum 21) 2. Add 4 → result is always 8(k+1)+1, a valid 8k+1 — NEVER truncates Also: - kv_ratio default 2.0→3.0 (matches naxci1, max quality KV cache) - local_range default 9→11 (more stable temporal consistency) - sinusoidal_embedding_1d, precompute_freqs_cis, rope_apply: float32→float64 (matches naxci1 reference precision for embeddings and RoPE) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 18:06:46 +01:00
Ethanfel	fa250897a2	Fix FlashVSR ghosting: streaming TCDecoder decode + Causal LQ projection Root cause: three critical differences from naxci1 reference implementation: 1. Batch decode after loop → streaming per-chunk TCDecoder decode with LQ conditioning inside the loop. The TCDecoder uses causal convolutions with temporal memory that must be built incrementally per-chunk. Batch decode breaks this design and loses LQ frame conditioning, causing ghosting. 2. Buffer_LQ4x_Proj → Causal_LQ4x_Proj for FlashVSR v1.1. The causal variant reads the OLD cache before writing the new one (truly causal), while Buffer writes cache BEFORE the conv call. Using the wrong variant misaligns temporal LQ conditioning features. 3. Temporal padding formula: changed from round-up to largest_8n1_leq(N+4) matching the naxci1 reference approach. Changes: - flashvsr_full.py: streaming TCDecoder decode per-chunk with LQ conditioning and per-chunk color correction (was: batch VAE decode after loop) - flashvsr_tiny.py: streaming TCDecoder decode per-chunk (was: batch decode) - inference.py: use Causal_LQ4x_Proj, build TCDecoder for ALL modes (including full), fix temporal padding to largest_8n1_leq(N+4), clear TCDecoder in clear_caches() - utils.py: add Causal_LQ4x_Proj class - nodes.py: update progress bar estimation for new padding formula Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 17:42:20 +01:00
Ethanfel	94d9818675	Fix FlashVSR quality: match naxci1 reference preprocessing - Remove front dummy frames (not used by reference implementation) - Use centered reflect padding instead of right/bottom replicate - Crop output from center matching padding offsets - Simplify temporal padding to 8k+1 alignment - Update progress bar estimation to match new formula Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 17:10:12 +01:00
Ethanfel	ea84ffef7c	Fix FlashVSR ghosting: restore 2 front dummy frames matching reference The pipeline's LQ conditioning indexing expects 2 front dummy frames (copies of first frame) as warmup. Our previous refactoring removed these, shifting all LQ conditioning by 2 frames and causing severe ghosting artifacts. Now matches the 1038lab reference preprocessing exactly: 1. _prepare_video: 2 tail copies + alignment + 2 front dummies + back padding 2. _restore_video_sequence: strip first 2 warmup frames + trim to original count 3. Crop pipeline output to padded_n before restoration Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 16:49:46 +01:00
Ethanfel	4cc6e9c705	Remove debug logging from FlashVSR SegmentUpscale Issue was a workflow wiring mistake, not a code bug. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 16:32:23 +01:00
Ethanfel	39d0f7af42	Add debug logging for FlashVSR SegmentUpscale output shapes Helps diagnose issue where segment 1+ runs but produces no image output. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 16:31:09 +01:00
Ethanfel	11e2acb9e0	Fix FlashVSR frame padding to match pipeline requirements The pipeline requires num_frames % 4 == 1. Our old _pad_video_5d used a wrong formula that produced non-conforming counts (e.g. 33 input → 35 padded → pipeline rounds to 37, wasting VRAM). New padding uses num_frames % 8 == 1 (also satisfies % 4 == 1), which ensures the streaming loop output exactly matches num_frames with zero waste. Optimal input counts: 25, 33, 41, 49, 57, 65, 73, 81, 89, 97, 105. Also removes incorrect 2-frame warmup stripping from _restore_video_sequence — the pipeline output doesn't have warmup artifacts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 16:20:02 +01:00
Ethanfel	8317a0603e	Reuse FlashVSR models from 1038lab node if already downloaded Check models/FlashVSR/ (1038lab convention) before models/flashvsr/ to avoid downloading ~7GB of checkpoints twice. Only create the directory when actually downloading. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 15:42:10 +01:00
Ethanfel	0fecfcee37	Add FlashVSR support: diffusion-based 4x video super-resolution (Wan 2.1-1.3B) Vendor minimal diffsynth subset for FlashVSR inference (full/tiny pipelines, v1 and v1.1 checkpoints auto-downloaded from HuggingFace). Includes segment-based processing with temporal overlap and crossfade blending for bounded RAM on long videos. Nodes: Load FlashVSR Model, FlashVSR Upscale, FlashVSR Segment Upscale. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 15:12:33 +01:00
Ethanfel	d642255e70	Add GIMM-VFI support (NeurIPS 2024) with single-pass arbitrary-timestep interpolation Integrates GIMM-VFI alongside existing BIM/EMA/SGM models. Key feature: generates all intermediate frames in one forward pass (no recursive 2x passes needed for 4x/8x). - Vendor gimm_vfi_arch/ from kijai/ComfyUI-GIMM-VFI with device fixes - Two variants: RAFT-based (~80MB) and FlowFormer-based (~123MB) - Auto-download checkpoints from HuggingFace (Kijai/GIMM-VFI_safetensors) - Three new nodes: Load GIMM-VFI Model, GIMM-VFI Interpolate, GIMM-VFI Segment Interpolate - single_pass toggle: True=arbitrary timestep (default), False=recursive like other models - ds_factor parameter for high-res input downscaling Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 13:11:45 +01:00
Ethanfel	769da2586e	Fix SGM-VFI auto-download: correct file extension .pth → .pkl The Google Drive folder contains .pkl files but the default model name used .pth, causing the post-download existence check to fail. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 23:45:13 +01:00
Ethanfel	d935462e24	Support relative paths and robust preview in TweenConcatVideos Relative output_directory values are now resolved against ComfyUI's output directory. Video preview is skipped with a warning when the output path is outside the output tree (frontend can't serve it). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 23:30:48 +01:00
Ethanfel	aa85f523f2	Add optional video preview to TweenConcatVideos Uses ComfyUI's ui/gifs return dict to show the concatenated video directly on the node, matching the VHS Video Combine pattern. Togglable via a preview boolean input (default True). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 23:16:01 +01:00
Ethanfel	fc4efb8b17	Rename BIMVFIConcatVideos to TweenConcatVideos The concat node is model-agnostic (just joins video segments via ffmpeg), so it shouldn't be under BIM-VFI. Now accepts any model type as the dependency input and lives under the video/Tween category. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 23:12:16 +01:00
Ethanfel	e37cc3dd2e	Rename project to ComfyUI-Tween Update logger names, install prefixes, README clone instructions, and error messages to reflect the new repo name. Model-specific node names and categories (BIM-VFI, EMA-VFI, SGM-VFI) are unchanged. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 23:08:54 +01:00
Ethanfel	42ebdd8b96	Add SGM-VFI (CVPR 2024) frame interpolation support SGM-VFI combines local flow estimation with sparse global matching (GMFlow) to handle large motion and occlusion-heavy scenes. Adds 3 new nodes: Load SGM-VFI Model, SGM-VFI Interpolate, SGM-VFI Segment Interpolate. Architecture files vendored from MCG-NJU/SGM-VFI with device-awareness fixes (no hardcoded .cuda()), relative imports, and debug code removed. README updated with model comparison table. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 23:02:48 +01:00
Ethanfel	1de086569c	Add EMA-VFI (CVPR 2023) frame interpolation support Integrate EMA-VFI alongside existing BIM-VFI with three new ComfyUI nodes: Load EMA-VFI Model, EMA-VFI Interpolate, and EMA-VFI Segment Interpolate. Architecture files vendored from MCG-NJU/EMA-VFI with device-awareness fixes (removed hardcoded .cuda() calls), warp cache management, and relative imports. InputPadder extended to support EMA-VFI's replicate center-symmetric padding. Auto-installs timm dependency on first load. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 22:30:06 +01:00
Ethanfel	0133f61d47	Add delete_segments option to Concat Videos node Cleans up individual segment files after successful concatenation, preventing leftover files from polluting subsequent runs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 21:11:55 +01:00
Ethanfel	98c558b1b0	Add BIM-VFI Concat Videos node for joining segment outputs Adds a new node that concatenates segment video files (produced by VHS Video Combine) into a single video using ffmpeg's concat demuxer with -c copy (no re-encoding). The model input acts as a sequencing signal to ensure all segments finish before concatenation begins. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 21:06:52 +01:00
Ethanfel	7cf7162143	Add BIM-VFI Segment Interpolate node for bounded peak RAM Processes numbered segments of the input batch so users can chain multiple instances with Save nodes between them, freeing each segment's output before the next starts. Model pass-through output forces sequential execution via ComfyUI's dependency graph. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 20:13:02 +01:00
Ethanfel	3e8148b7e2	Add chunk_size for long video support, fix cache clearing, add README - chunk_size input splits input into overlapping segments processed independently then stitched, bounding memory for 1000+ frame videos while producing identical results to processing all at once - Fix cache clearing logic: use counter instead of modulo so it triggers regardless of batch_size value - Replace inefficient torch.cat gather with direct tensor slicing - Add README with usage guide, VRAM recommendations, and full attribution to BiM-VFI (Seo, Oh, Kim — CVPR 2025, KAIST VIC Lab) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 19:08:42 +01:00
Ethanfel	993a3a72b1	Add batch processing support for faster frame interpolation Processes multiple frame pairs simultaneously instead of one-by-one. New batch_size input (1-64) lets users trade VRAM for speed. Refactored pyr_level logic into shared _get_pyr_level() helper. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 18:54:40 +01:00
Ethanfel	69a4aebfe7	Add auto_pyr_level toggle to select pyramid level by resolution When enabled (default), automatically picks the optimal pyr_level based on input height: <540p=3, 540p=5, 1080p=6, 4K=7. When disabled, uses the manual pyr_level value. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 18:51:29 +01:00
Ethanfel	4e6f9eb896	Respect user's pyr_level setting at all resolutions Previously the user's pyr_level was overridden for >=540p content. Now the setting is always used, with the tooltip recommending values per resolution instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 18:50:29 +01:00
Ethanfel	ffde07a89a	Add tooltips to all node inputs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 18:49:27 +01:00
Ethanfel	db64fc195a	Initial commit: ComfyUI BIM-VFI node for video frame interpolation Wraps BiM-VFI (CVPR 2025) as a ComfyUI custom node for long video frame interpolation with memory-safe sequential processing. - LoadBIMVFIModel: checkpoint loader with auto-download from Google Drive - BIMVFIInterpolate: 2x/4x/8x recursive interpolation with per-pair GPU processing, configurable VRAM management (all_on_gpu for high-VRAM setups), progress bar, and backwarp cache clearing - Vendored inference-only architecture from KAIST-VICLab/BiM-VFI - Auto-detection of CUDA version for cupy installation Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 18:26:49 +01:00

26 Commits
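
Note on commit 76dff7e573: the frame-count arithmetic it describes can be checked with a short sketch. This is not the repository's code. `next_8n5` and `largest_8n1_leq` are named in the commit messages, but their bodies here are reconstructed from the descriptions only ("next integer >= N of form 8k+5, minimum 21" and "floor to 8k+1"). The two wrapper function names are hypothetical.

```python
def largest_8n1_leq(n: int) -> int:
    """Largest integer <= n of the form 8k + 1 (assumed reading of 'floor to 8k+1')."""
    return ((n - 1) // 8) * 8 + 1


def next_8n5(n: int, minimum: int = 21) -> int:
    """Smallest integer >= n of the form 8k + 5, never below `minimum`."""
    if n <= minimum:
        return minimum  # 21 = 8*2 + 5, so the minimum itself has the right form
    remainder = (n - 5) % 8
    return n if remainder == 0 else n + (8 - remainder)


def single_stage_padded_length(n: int) -> int:
    """Old scheme (hypothetical name): add 4, then floor to 8k + 1. May drop frames."""
    return largest_8n1_leq(n + 4)


def two_stage_padded_length(n: int) -> int:
    """New scheme (hypothetical name): round up to 8k + 5, then add 4.

    (8k + 5) + 4 = 8(k + 1) + 1, so the result is always of the form 8k + 1
    and is never smaller than the input frame count.
    """
    return next_8n5(n) + 4


if __name__ == "__main__":
    for frames in (25, 33, 50, 97):
        print(frames, single_stage_padded_length(frames), two_stage_padded_length(frames))
```

For the 50-frame case cited in the commit, the single-stage formula yields 49 (one input frame lost), while the two-stage formula yields 57, so padding only ever adds frames.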