ComfyUI BIM-VFI + EMA-VFI + SGM-VFI + GIMM-VFI

ComfyUI custom nodes for video frame interpolation using BiM-VFI (CVPR 2025), EMA-VFI (CVPR 2023), SGM-VFI (CVPR 2024), and GIMM-VFI (NeurIPS 2024). Designed for long videos with thousands of frames — processes them without running out of VRAM.

Which model should I use?

| | BIM-VFI | EMA-VFI | SGM-VFI | GIMM-VFI |
| --- | --- | --- | --- | --- |
| Best for | General-purpose, non-uniform motion | Fast inference, light VRAM | Large motion, occlusion-heavy scenes | High multipliers (4x/8x) in a single pass |
| Quality | Highest overall | Good | Best on large motion | Good |
| Speed | Moderate | Fastest | Slowest | Fast for 4x/8x (single pass) |
| VRAM | ~2 GB/pair | ~1.5 GB/pair | ~3 GB/pair | ~2.5 GB/pair |
| Params | ~17M | ~14M (small) / ~65M (large) | ~15M + GMFlow | ~80M (RAFT) / ~123M (FlowFormer) |
| Arbitrary timestep | Yes | Yes (with _t checkpoint) | No (fixed 0.5) | Yes (native single-pass) |
| 4x/8x mode | Recursive 2x passes | Recursive 2x passes | Recursive 2x passes | Single forward pass (or recursive) |
| Paper | CVPR 2025 | CVPR 2023 | CVPR 2024 | NeurIPS 2024 |
| License | Research only | Apache 2.0 | Apache 2.0 | Apache 2.0 |

TL;DR: Start with BIM-VFI for best quality. Use EMA-VFI if you need speed or lower VRAM. Use SGM-VFI if your video has large camera motion or fast-moving objects that the others struggle with. Use GIMM-VFI when you want 4x or 8x interpolation without recursive passes — it generates all intermediate frames in a single forward pass per pair.

Nodes

BIM-VFI

Load BIM-VFI Model

Loads the BiM-VFI checkpoint. Auto-downloads from Google Drive on first use to ComfyUI/models/bim-vfi/.

| Input | Description |
| --- | --- |
| model_path | Checkpoint file from models/bim-vfi/ |
| auto_pyr_level | Auto-select pyramid level by resolution (<540p=3, 540p=5, 1080p=6, 4K=7) |
| pyr_level | Manual pyramid level (3-7), only used when auto is off |
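The auto_pyr_level mapping above can be sketched as a small lookup on frame height. The table gives only four reference points, so how in-between resolutions such as 720p are bucketed is an assumption here, and the function name is hypothetical rather than the node's actual code.

```python
def auto_pyr_level(height: int) -> int:
    """Pick a pyramid level from frame height.

    Assumption: each listed resolution is the lower bound of its bucket,
    so 720p falls in the 540p bucket. The real node may bucket differently.
    """
    if height < 540:
        return 3
    if height < 1080:
        return 5
    if height < 2160:
        return 6
    return 7
```

For example, a 1920x1080 input would get level 6 under these assumed thresholds.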

BIM-VFI Interpolate

Interpolates frames from an image batch.

| Input | Description |
| --- | --- |
| images | Input image batch |
| model | Model from the loader node |
| multiplier | 2x, 4x, or 8x frame rate (recursive 2x passes) |
| batch_size | Frame pairs processed simultaneously (higher = faster, more VRAM) |
| chunk_size | Process in segments of N input frames (0 = disabled). Bounds VRAM for very long videos. Result is identical to processing all at once |
| keep_device | Keep model on GPU between pairs (faster, ~200MB constant VRAM) |
| all_on_gpu | Keep all intermediate frames on GPU (fast, needs large VRAM) |
| clear_cache_after_n_frames | Clear CUDA cache every N pairs to prevent VRAM buildup |
| source_fps | Input frame rate. Required when target_fps > 0 |
| target_fps | Target output FPS. When > 0, overrides multiplier: auto-computes the optimal power-of-2 oversample, then selects frames at exact target timestamps. 0 = use multiplier |

| Output | Description |
| --- | --- |
| images | Interpolated frames at the target FPS (or at the multiplied rate when target_fps = 0) |
| oversampled | Full power-of-2 oversampled frames before target FPS selection. Same as images when target_fps = 0. Useful for inspecting the raw interpolation or feeding into another pipeline |
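One way chunk_size can give the same result as processing everything at once is for neighbouring chunks to share a boundary frame, so that no frame pair is skipped. The sketch below illustrates that idea only; the function name and the overlap strategy are assumptions, not the node's confirmed implementation.

```python
def chunk_ranges(num_frames: int, chunk_size: int) -> list[tuple[int, int]]:
    """Return (start, end) index ranges with end exclusive.

    Assumption: consecutive chunks overlap by one frame so every
    adjacent pair (i, i + 1) is interpolated exactly once.
    """
    if chunk_size <= 0 or chunk_size >= num_frames:
        return [(0, num_frames)]  # chunking disabled or not needed
    if chunk_size < 2:
        raise ValueError("chunk_size must be at least 2 to hold a frame pair")
    ranges = []
    start = 0
    while start < num_frames - 1:
        end = min(start + chunk_size, num_frames)
        ranges.append((start, end))
        start = end - 1  # reuse the last frame as the next chunk's first
    return ranges
```

With 10 input frames and chunk_size=4 this yields three chunks covering all 9 frame pairs; when stitching the outputs, the duplicated boundary frame would be kept only once.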

BIM-VFI Segment Interpolate

Same as Interpolate but processes a single segment of the input. Chain multiple instances with Save nodes between them to bound peak RAM. The model pass-through output forces sequential execution.

Tween Concat Videos

Concatenates segment video files into a single video using ffmpeg. Connect from any Segment Interpolate's model output to ensure it runs after all segments are saved. Works with all four models.
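For reference, joining segment files with ffmpeg is commonly done through its concat demuxer. The sketch below builds the list file contents and the command line without running anything. Whether the node uses this exact invocation is not stated here, so treat the function name, file names, and flags as an illustration.

```python
def build_concat_command(segments, list_file, output):
    """Build the concat list text and an ffmpeg command for stream-copy joining.

    Assumption: all segments share codec and parameters, so "-c copy" works.
    Paths containing single quotes would need extra escaping.
    """
    list_text = "".join(f"file '{path}'\n" for path in segments)
    command = [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0",
        "-i", str(list_file),
        "-c", "copy",
        str(output),
    ]
    return list_text, command
```

A caller would write list_text to list_file and then pass command to subprocess.run.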

EMA-VFI

Load EMA-VFI Model

Loads an EMA-VFI checkpoint. Auto-downloads from Google Drive on first use to ComfyUI/models/ema-vfi/. Variant (large/small) and timestep support are auto-detected from the filename.

| Input | Description |
| --- | --- |
| model_path | Checkpoint file from models/ema-vfi/ |
| tta | Test-time augmentation: flip input and average with unflipped result (~2x slower, slightly better quality) |
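The tta option can be pictured as running the model twice, once on flipped inputs, and averaging. The sketch below uses nested lists as toy frames and a horizontal flip. The flip axis, the helper names, and the interpolate callable are all assumptions for illustration, since the description only says the input is flipped.

```python
from typing import Callable, List

Frame = List[List[float]]  # toy stand-in for an image: rows of pixel values


def hflip(frame: Frame) -> Frame:
    """Mirror a frame left to right."""
    return [row[::-1] for row in frame]


def interpolate_with_tta(
    interpolate: Callable[[Frame, Frame], Frame],
    frame0: Frame,
    frame1: Frame,
) -> Frame:
    """Average the plain prediction with a flip, predict, unflip prediction."""
    plain = interpolate(frame0, frame1)
    flipped = hflip(interpolate(hflip(frame0), hflip(frame1)))
    return [
        [(a + b) / 2 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(plain, flipped)
    ]
```

The two model calls explain the roughly 2x slowdown; averaging cancels part of any directional bias in the model's output.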

Available checkpoints:

| Checkpoint | Variant | Params | Arbitrary timestep |
| --- | --- | --- | --- |
| ours_t.pkl | Large | ~65M | Yes |
| ours.pkl | Large | ~65M | No (fixed 0.5) |
| ours_small_t.pkl | Small | ~14M | Yes |
| ours_small.pkl | Small | ~14M | No (fixed 0.5) |

EMA-VFI Interpolate

Interpolates frames from an image batch. Same controls as BIM-VFI Interpolate (including target FPS mode).

EMA-VFI Segment Interpolate

Same as EMA-VFI Interpolate but processes a single segment. Same pattern as BIM-VFI Segment Interpolate.

SGM-VFI

Load SGM-VFI Model

Loads an SGM-VFI checkpoint. Auto-downloads from Google Drive on first use to ComfyUI/models/sgm-vfi/. Variant (base/small) is auto-detected from the filename (default is small).

| Input | Description |
| --- | --- |
| model_path | Checkpoint file from models/sgm-vfi/ |
| tta | Test-time augmentation: flip input and average with unflipped result (~2x slower, slightly better quality) |
| num_key_points | Sparsity of global matching (0.0 = global everywhere, 0.5 = default balance, higher = faster) |

Available checkpoints:

| Checkpoint | Variant | Params |
| --- | --- | --- |
| ours-1-2-points.pkl | Small | ~15M + GMFlow |

SGM-VFI Interpolate

Interpolates frames from an image batch. Same controls as BIM-VFI Interpolate (including target FPS mode).

SGM-VFI Segment Interpolate

Same as SGM-VFI Interpolate but processes a single segment. Same pattern as BIM-VFI Segment Interpolate.

GIMM-VFI

Load GIMM-VFI Model

Loads a GIMM-VFI checkpoint. Auto-downloads from HuggingFace on first use to ComfyUI/models/gimm-vfi/. The matching flow estimator (RAFT or FlowFormer) is auto-detected and downloaded alongside the main model.

| Input | Description |
| --- | --- |
| model_path | Checkpoint file from models/gimm-vfi/ |
| ds_factor | Downscale factor for internal processing (1.0 = full res, 0.5 = half). Lower = less VRAM, faster, less quality. Try 0.5 for 4K inputs |

Available checkpoints:

| Checkpoint | Variant | Params | Flow estimator (auto-downloaded) |
| --- | --- | --- | --- |
| gimmvfi_r_arb_lpips_fp32.safetensors | RAFT | ~80M | raft-things_fp32.safetensors |
| gimmvfi_f_arb_lpips_fp32.safetensors | FlowFormer | ~123M | flowformer_sintel_fp32.safetensors |

GIMM-VFI Interpolate

Interpolates frames from an image batch. Same controls as BIM-VFI Interpolate (including target FPS mode), plus:

| Input | Description |
| --- | --- |
| single_pass | When enabled (default), generates all intermediate frames per pair in one forward pass using GIMM-VFI's arbitrary-timestep capability. No recursive 2x passes needed for 4x or 8x. Disable to use the standard recursive approach (same as BIM/EMA/SGM) |
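To make the difference between the two modes concrete, the sketch below lists the timesteps a single pass would request for each frame pair and counts the 2x rounds the recursive approach needs. The function names are hypothetical, and evenly spaced timesteps are an assumption that follows from a uniform frame-rate multiplier.

```python
def _check_power_of_two(multiplier: int) -> None:
    if multiplier < 2 or multiplier & (multiplier - 1):
        raise ValueError("multiplier must be a power of two (2, 4, 8, ...)")


def single_pass_timesteps(multiplier: int) -> list[float]:
    """Timesteps strictly between the two input frames (0.0 and 1.0)."""
    _check_power_of_two(multiplier)
    return [i / multiplier for i in range(1, multiplier)]


def recursive_pass_count(multiplier: int) -> int:
    """Number of 2x rounds needed: 2x -> 1, 4x -> 2, 8x -> 3."""
    _check_power_of_two(multiplier)
    return multiplier.bit_length() - 1
```

For 4x, single-pass mode would ask for t = 0.25, 0.5, and 0.75 directly from the original pair, while the recursive approach runs two rounds at t = 0.5 and feeds generated frames back in as inputs.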

GIMM-VFI Segment Interpolate

Same as GIMM-VFI Interpolate but processes a single segment. Same pattern as BIM-VFI Segment Interpolate.

Output frame count (all models):

  • Multiplier mode: 2x = 2N-1, 4x = 4N-3, 8x = 8N-7
  • Target FPS mode: floor((N-1) / source_fps * target_fps) + 1 frames. Automatically oversamples to the nearest power-of-2 above the ratio, then selects frames at exact target timestamps. Downsampling (target < source) also works — frames are selected from the input with no model calls
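The two formulas above can be checked with a few lines of arithmetic. This is a sketch with hypothetical function names. It reads "nearest power-of-2 above the ratio" as the smallest power of two that is at least target_fps / source_fps, which is an assumption for the case where the ratio is itself a power of two.

```python
import math


def multiplier_frame_count(num_frames: int, multiplier: int) -> int:
    """Multiplier mode: every one of the N-1 gaps gains multiplier-1 frames."""
    return multiplier * (num_frames - 1) + 1


def target_fps_plan(num_frames: int, source_fps: float, target_fps: float) -> tuple[int, int]:
    """Return (output frame count, power-of-2 oversample factor)."""
    count = math.floor((num_frames - 1) / source_fps * target_fps) + 1
    ratio = target_fps / source_fps
    oversample = 1
    while oversample < ratio:  # stays at 1 when downsampling, so no model calls
        oversample *= 2
    return count, oversample
```

For instance, 25 frames at 24 fps retimed to 60 fps give 61 output frames selected from a 4x oversample, since the ratio 2.5 lies between 2 and 4.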

Installation

Install from the ComfyUI Registry or clone into your ComfyUI custom_nodes/ directory:

cd ComfyUI/custom_nodes
git clone https://github.com/Ethanfel/ComfyUI-Tween.git
pip install -r ComfyUI-Tween/requirements.txt

cupy (required for BIM-VFI, SGM-VFI, GIMM-VFI)

cupy is a GPU-accelerated array library used for optical flow warping. It is required by BIM-VFI, SGM-VFI, and GIMM-VFI. EMA-VFI does not need cupy and works without it.

cupy must match your PyTorch CUDA version. If it is missing or mismatched, the Load node will show an error in ComfyUI with your CUDA version and the exact install command.

How to install cupy

Step 1 — Find your CUDA version:

python -c "import torch; print(torch.version.cuda)"

This prints something like 12.4 or 11.8.

Step 2 — Install the matching cupy package:

| CUDA version | Install command |
| --- | --- |
| 12.x | pip install cupy-cuda12x |
| 11.x | pip install cupy-cuda11x |

Note: Make sure to run pip in the same Python environment as ComfyUI. If you use a venv or conda, activate it first.
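The two steps can also be folded into a small helper that turns the version string into the matching command. This is a sketch covering only the two CUDA major versions in the table; the function name is hypothetical. Note that torch.version.cuda is None on CPU-only PyTorch builds, which the helper reports as an error.

```python
from typing import Optional

# Package names taken from the table above.
CUPY_PACKAGES = {"12": "cupy-cuda12x", "11": "cupy-cuda11x"}


def cupy_install_command(cuda_version: Optional[str]) -> str:
    """Map a CUDA version string such as "12.4" to a pip command."""
    if cuda_version is None:
        raise ValueError("PyTorch reports no CUDA version (CPU-only build?)")
    major = cuda_version.split(".")[0]
    if major not in CUPY_PACKAGES:
        raise ValueError(f"No cupy package listed here for CUDA {cuda_version}")
    return f"pip install {CUPY_PACKAGES[major]}"
```

Inside the ComfyUI environment it could be called as cupy_install_command(torch.version.cuda) after importing torch.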

Troubleshooting

| Problem | Solution |
| --- | --- |
| ModuleNotFoundError: No module named 'cupy' | Install cupy using the steps above |
| cupy installed but ImportError at runtime | CUDA version mismatch. Uninstall the package you have (for example pip uninstall cupy-cuda12x) and install the one that matches your CUDA version |
| Install hangs or takes very long | cupy wheels are large (~800MB). Use a fast connection and be patient |
| Docker / no build tools | Use a prebuilt wheel such as pip install cupy-cuda12x (not plain cupy, which compiles from source) |

Can I skip cupy?

Yes — just use EMA-VFI, which does not require cupy. It is the fastest model and uses the least VRAM. The other three models (BIM-VFI, SGM-VFI, GIMM-VFI) will not load without cupy.
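A quick way to see which models are usable in a given environment is to check whether cupy is importable. The sketch below uses only the standard library, and the function name is hypothetical. It detects a missing package but not a CUDA version mismatch, which only shows up when cupy is actually imported and used.

```python
import importlib.util

# Models that need cupy, per the section above.
CUPY_MODELS = ("BIM-VFI", "SGM-VFI", "GIMM-VFI")


def usable_models(module_name: str = "cupy") -> list[str]:
    """List models that should load, based on whether the module is installed."""
    models = ["EMA-VFI"]  # works without cupy
    if importlib.util.find_spec(module_name) is not None:
        models.extend(CUPY_MODELS)
    return models
```

The module_name parameter exists only so the check can be exercised without cupy installed.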

Other dependencies

All other dependencies (gdown, timm, omegaconf, easydict, yacs, einops, huggingface_hub) are listed in pyproject.toml and requirements.txt, and are installed automatically by ComfyUI Manager or pip.

VRAM Guide

| VRAM | Recommended settings |
| --- | --- |
| 8 GB | batch_size=1, chunk_size=500 |
| 24 GB | batch_size=2-4, chunk_size=1000 |
| 48 GB+ | batch_size=4-16, all_on_gpu=true |
| 96 GB+ | batch_size=8-16, all_on_gpu=true, chunk_size=0 |

Acknowledgments

This project wraps the official BiM-VFI implementation by the KAIST VIC Lab, the official EMA-VFI implementation by MCG-NJU, the official SGM-VFI implementation by MCG-NJU, and the GIMM-VFI implementation by S-Lab (NTU). GIMM-VFI architecture files in gimm_vfi_arch/ are adapted from kijai/ComfyUI-GIMM-VFI with safetensors checkpoints from Kijai/GIMM-VFI_safetensors. Architecture files in bim_vfi_arch/, ema_vfi_arch/, sgm_vfi_arch/, and gimm_vfi_arch/ are vendored from their respective repositories with minimal modifications (relative imports, device-awareness fixes, inference-only paths).

BiM-VFI:

Wonyong Seo, Jihyong Oh, and Munchurl Kim. "BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions." IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025. [arXiv] [Project Page] [GitHub]

@inproceedings{seo2025bimvfi,
  title={BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions},
  author={Seo, Wonyong and Oh, Jihyong and Kim, Munchurl},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2025}
}

EMA-VFI:

Guozhen Zhang, Yuhan Zhu, Haonan Wang, Youxin Chen, Gangshan Wu, and Limin Wang. "Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation." IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023. [arXiv] [GitHub]

@inproceedings{zhang2023emavfi,
  title={Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation},
  author={Zhang, Guozhen and Zhu, Yuhan and Wang, Haonan and Chen, Youxin and Wu, Gangshan and Wang, Limin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2023}
}

SGM-VFI:

Guozhen Zhang, Yuhan Zhu, Evan Zheran Liu, Haonan Wang, Mingzhen Sun, Gangshan Wu, and Limin Wang. "Sparse Global Matching for Video Frame Interpolation with Large Motion." IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. [arXiv] [GitHub]

@inproceedings{zhang2024sgmvfi,
  title={Sparse Global Matching for Video Frame Interpolation with Large Motion},
  author={Zhang, Guozhen and Zhu, Yuhan and Liu, Evan Zheran and Wang, Haonan and Sun, Mingzhen and Wu, Gangshan and Wang, Limin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}

GIMM-VFI:

Zujin Guo, Wei Li, and Chen Change Loy. "Generalizable Implicit Motion Modeling for Video Frame Interpolation." Advances in Neural Information Processing Systems (NeurIPS), 2024. [arXiv] [GitHub]

@inproceedings{guo2024gimmvfi,
  title={Generalizable Implicit Motion Modeling for Video Frame Interpolation},
  author={Guo, Zujin and Li, Wei and Loy, Chen Change},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2024}
}

License

The BiM-VFI model weights and architecture code are provided by KAIST VIC Lab for research and education purposes only. Commercial use requires permission from the principal investigator (Prof. Munchurl Kim, mkimee@kaist.ac.kr). See the original repository for details.

The EMA-VFI model weights and architecture code are released under the Apache 2.0 License. See the original repository for details.

The SGM-VFI model weights and architecture code are released under the Apache 2.0 License. See the original repository for details.

The GIMM-VFI model weights and architecture code are released under the Apache 2.0 License. See the original repository for details. ComfyUI adaptation based on kijai/ComfyUI-GIMM-VFI.
