Ethanfel 1de086569c Add EMA-VFI (CVPR 2023) frame interpolation support
Integrate EMA-VFI alongside existing BIM-VFI with three new ComfyUI nodes:
Load EMA-VFI Model, EMA-VFI Interpolate, and EMA-VFI Segment Interpolate.

Architecture files vendored from MCG-NJU/EMA-VFI with device-awareness
fixes (removed hardcoded .cuda() calls), warp cache management, and
relative imports. InputPadder extended to support EMA-VFI's replicate
center-symmetric padding. Auto-installs timm dependency on first load.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 22:30:06 +01:00

ComfyUI BIM-VFI + EMA-VFI

ComfyUI custom nodes for video frame interpolation using BiM-VFI (CVPR 2025) and EMA-VFI (CVPR 2023). Designed for long videos with thousands of frames: input can be processed in chunks and segments so that VRAM and RAM use stay bounded.

Nodes

BIM-VFI

Load BIM-VFI Model

Loads the BiM-VFI checkpoint. Auto-downloads from Google Drive on first use to ComfyUI/models/bim-vfi/.

Input           Description
model_path      Checkpoint file from models/bim-vfi/
auto_pyr_level  Auto-select pyramid level by resolution (<540p=3, 540p=5, 1080p=6, 4K=7)
pyr_level       Manual pyramid level (3-7), only used when auto is off
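
The resolution breakpoints listed for auto_pyr_level can be read as a simple lookup on frame height. The sketch below is illustrative only: the function name auto_pyr_level_for is hypothetical, and how resolutions between the listed ones (for example 720p) are bucketed is an assumption, not something this README states.

```python
def auto_pyr_level_for(height: int) -> int:
    """Pick a pyramid level from the frame height in pixels.

    Breakpoints follow the README table. Treating each listed resolution
    as the lower bound of its bucket is an assumption of this sketch.
    """
    if height < 540:
        return 3
    if height < 1080:
        return 5
    if height < 2160:
        return 6
    return 7


# Heights for <540p, 540p, 1080p and 4K inputs.
print([auto_pyr_level_for(h) for h in (480, 540, 1080, 2160)])
```

When auto_pyr_level is off, the manual pyr_level value is used instead.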

BIM-VFI Interpolate

Interpolates frames from an image batch.

Input                       Description
images                      Input image batch
model                       Model from the loader node
multiplier                  2x, 4x, or 8x frame rate (recursive 2x passes)
batch_size                  Frame pairs processed simultaneously (higher = faster, more VRAM)
chunk_size                  Process in segments of N input frames (0 = disabled). Bounds VRAM for very long videos. Result is identical to processing all at once
keep_device                 Keep model on GPU between pairs (faster, ~200MB constant VRAM)
all_on_gpu                  Keep all intermediate frames on GPU (fast, needs large VRAM)
clear_cache_after_n_frames  Clear CUDA cache every N pairs to prevent VRAM buildup
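
One way chunked processing can give the same result as a single pass is to overlap neighbouring chunks by one frame and drop the duplicated boundary frame. The sketch below illustrates that idea only. It is not the node's actual implementation: interpolate_2x is a stand-in that averages numbers instead of running a model, and both function names are hypothetical.

```python
from typing import List, Sequence


def interpolate_2x(frames: Sequence[float]) -> List[float]:
    """Stand-in for one 2x pass: insert a midpoint between each pair."""
    if not frames:
        return []
    out = [frames[0]]
    for a, b in zip(frames, frames[1:]):
        out.append((a + b) / 2)  # a real model would predict this frame
        out.append(b)
    return out


def interpolate_chunked(frames: Sequence[float], chunk_size: int) -> List[float]:
    """Run interpolate_2x over chunks of `chunk_size` input frames."""
    if chunk_size <= 0 or chunk_size >= len(frames):
        return interpolate_2x(frames)  # 0 disables chunking
    if chunk_size < 2:
        raise ValueError("chunk_size must be 0 or at least 2")
    out: List[float] = []
    start = 0
    while start < len(frames) - 1:
        end = min(start + chunk_size, len(frames))
        part = interpolate_2x(frames[start:end])
        # Later chunks start on the previous chunk's last frame, so skip it.
        out.extend(part if start == 0 else part[1:])
        start = end - 1  # overlap by one frame
    return out


frames = [float(i * i) for i in range(10)]
print(interpolate_chunked(frames, 4) == interpolate_2x(frames))
```

Only one chunk's worth of frames is handed to the interpolation step at a time, which is what bounds memory in this sketch.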

BIM-VFI Segment Interpolate

Same as Interpolate but processes a single segment of the input. Chain multiple instances with Save nodes between them to bound peak RAM. The model pass-through output forces sequential execution.

BIM-VFI Concat Videos

Concatenates segment video files into a single video using ffmpeg. Connect from the last Segment Interpolate's model output to ensure it runs after all segments are saved.

EMA-VFI

Load EMA-VFI Model

Loads an EMA-VFI checkpoint. Auto-downloads from Google Drive on first use to ComfyUI/models/ema-vfi/. Variant (large/small) and timestep support are auto-detected from the filename.

Input       Description
model_path  Checkpoint file from models/ema-vfi/
tta         Test-time augmentation: flip input and average with unflipped result (~2x slower, slightly better quality)
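
The tta option follows the usual flip-and-average pattern. The sketch below shows the pattern on nested lists with a toy model. The names interpolate_with_tta and toy_model are hypothetical, and flipping both axes is an assumption: the README only says the input is flipped.

```python
from typing import Callable, List

Image = List[List[float]]  # rows of pixel values; stand-in for a tensor


def flip(img: Image) -> Image:
    """Flip an image vertically and horizontally (axis choice is assumed)."""
    return [row[::-1] for row in img[::-1]]


def average(a: Image, b: Image) -> Image:
    """Pixel-wise mean of two images of the same size."""
    return [[(x + y) / 2 for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def interpolate_with_tta(
    model: Callable[[Image, Image], Image], img0: Image, img1: Image
) -> Image:
    """Average the plain prediction with a flipped-input prediction."""
    plain = model(img0, img1)
    # Run on flipped inputs, then undo the flip on the output.
    flipped = flip(model(flip(img0), flip(img1)))
    return average(plain, flipped)


def toy_model(img0: Image, img1: Image) -> Image:
    """Toy stand-in: frame average plus a position-dependent error."""
    mid = average(img0, img1)
    mid[0][0] += 1.0  # deliberate error in the top-left pixel
    return mid


zeros = [[0.0, 0.0], [0.0, 0.0]]
print(interpolate_with_tta(toy_model, zeros, zeros))
```

The model runs twice per frame pair, which matches the roughly 2x slowdown noted above. In the toy case the position-dependent error is halved because the second pass places it at the mirrored location.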

Available checkpoints:

Checkpoint        Variant  Params  Arbitrary timestep
ours_t.pkl        Large    ~65M    Yes
ours.pkl          Large    ~65M    No (fixed 0.5)
ours_small_t.pkl  Small    ~14M    Yes
ours_small.pkl    Small    ~14M    No (fixed 0.5)
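
Filename-based detection for these four checkpoints could look like the sketch below. The rule used here ("small" marks the small variant, a trailing "_t" marks arbitrary-timestep support) is inferred from the names in the table; the function name describe_checkpoint is hypothetical and the loader's real logic may differ.

```python
from pathlib import Path
from typing import Dict, Union


def describe_checkpoint(filename: str) -> Dict[str, Union[str, bool]]:
    """Guess variant and timestep support from a checkpoint filename."""
    parts = Path(filename).stem.split("_")  # "ours_small_t" -> ["ours", "small", "t"]
    return {
        "variant": "small" if "small" in parts else "large",
        "arbitrary_timestep": parts[-1] == "t",
    }


for name in ("ours_t.pkl", "ours.pkl", "ours_small_t.pkl", "ours_small.pkl"):
    print(name, describe_checkpoint(name))
```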

EMA-VFI Interpolate

Interpolates frames from an image batch. Same controls as BIM-VFI Interpolate.

EMA-VFI Segment Interpolate

Same as EMA-VFI Interpolate but processes a single segment. Same pattern as BIM-VFI Segment Interpolate.

Output frame count for N input frames (both models): 2x = 2N-1, 4x = 4N-3, 8x = 8N-7
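
These counts follow from applying 2x passes recursively: each pass adds one frame between every neighbouring pair, turning N frames into 2N-1. A small sketch of the arithmetic (the function name output_frame_count is hypothetical):

```python
def output_frame_count(n_input: int, multiplier: int) -> int:
    """Frames produced from n_input frames at a 2x, 4x or 8x multiplier."""
    if multiplier not in (2, 4, 8):
        raise ValueError("multiplier must be 2, 4 or 8")
    if n_input < 1:
        raise ValueError("need at least one input frame")
    passes = multiplier.bit_length() - 1  # 2 -> 1, 4 -> 2, 8 -> 3
    count = n_input
    for _ in range(passes):
        count = 2 * count - 1  # one new frame per neighbouring pair
    return count


# The loop agrees with the closed form multiplier * (N - 1) + 1.
for m in (2, 4, 8):
    print(m, output_frame_count(100, m), m * (100 - 1) + 1)
```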

Installation

Clone into your ComfyUI custom_nodes/ directory:

cd ComfyUI/custom_nodes
git clone https://github.com/your-user/Comfyui-BIM-VFI.git

Dependencies (gdown, cupy, timm) are auto-installed on first load. The correct cupy variant is detected from your PyTorch CUDA version.

Warning: cupy is a large package (~800MB) and compilation/installation can take several minutes. The first ComfyUI startup after installing this node may appear to hang while cupy installs in the background. Check the console log for progress. If auto-install fails (e.g. missing build tools in Docker), install manually with:

pip install cupy-cuda12x  # replace 12 with your CUDA major version
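
If you are unsure which package name applies, the CUDA version that PyTorch was built against is available as torch.version.cuda (a string such as "12.1", or None on CPU-only builds). The helper below is a hypothetical sketch of the mapping, not the code in install.py. It takes the version string as an argument so that it runs without PyTorch installed, and it only covers the cupy-cuda11x and cupy-cuda12x wheel names.

```python
def cupy_package_for(cuda_version: str) -> str:
    """Map a CUDA version string such as "12.1" to a cupy wheel name."""
    major = cuda_version.split(".")[0]
    # Only the 11.x and 12.x wheel names are handled in this sketch.
    if major not in ("11", "12"):
        raise ValueError(f"no known cupy wheel name for CUDA {cuda_version!r}")
    return f"cupy-cuda{major}x"


print(cupy_package_for("12.1"))
print(cupy_package_for("11.8"))
```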

To run the install script yourself instead of waiting for the first load:

cd Comfyui-BIM-VFI
python install.py

Requirements

  • PyTorch with CUDA
  • cupy (matching your CUDA version, for BIM-VFI)
  • timm (for EMA-VFI)
  • gdown (for model auto-download)

VRAM Guide

VRAM    Recommended settings
8 GB    batch_size=1, chunk_size=500
24 GB   batch_size=2-4, chunk_size=1000
48 GB+  batch_size=4-16, all_on_gpu=true
96 GB+  batch_size=8-16, all_on_gpu=true, chunk_size=0
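
The guide can also be expressed as a lookup for scripted workflows. The sketch below is an illustration: the function name recommended_settings is hypothetical, it picks the lower end of each batch_size range as a conservative starting point, and it falls back to the 8 GB row for smaller cards, none of which the table itself prescribes.

```python
from typing import Dict, Union

Settings = Dict[str, Union[int, bool]]

# (minimum VRAM in GB, settings), highest tier first. Only keys that the
# README table lists for a tier are included.
_TIERS = [
    (96, {"batch_size": 8, "all_on_gpu": True, "chunk_size": 0}),
    (48, {"batch_size": 4, "all_on_gpu": True}),
    (24, {"batch_size": 2, "chunk_size": 1000}),
    (8, {"batch_size": 1, "chunk_size": 500}),
]


def recommended_settings(vram_gb: float) -> Settings:
    """Return starting settings for a GPU with `vram_gb` of memory."""
    for minimum, settings in _TIERS:
        if vram_gb >= minimum:
            return dict(settings)  # copy so callers can modify it
    # Below 8 GB: assume the most conservative row.
    return dict(_TIERS[-1][1])


print(recommended_settings(24))
```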

Acknowledgments

This project wraps the official BiM-VFI implementation by the KAIST VIC Lab and the official EMA-VFI implementation by MCG-NJU. Architecture files in bim_vfi_arch/ and ema_vfi_arch/ are vendored from their respective repositories with minimal modifications (relative imports, device-awareness fixes, inference-only paths).

BiM-VFI:

Wonyong Seo, Jihyong Oh, and Munchurl Kim. "BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions." IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025. [arXiv] [Project Page] [GitHub]

@inproceedings{seo2025bimvfi,
  title={BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions},
  author={Seo, Wonyong and Oh, Jihyong and Kim, Munchurl},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2025}
}

EMA-VFI:

Guozhen Zhang, Yuhan Zhu, Haonan Wang, Youxin Chen, Gangshan Wu, and Limin Wang. "Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation." IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023. [arXiv] [GitHub]

@inproceedings{zhang2023emavfi,
  title={Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation},
  author={Zhang, Guozhen and Zhu, Yuhan and Wang, Haonan and Chen, Youxin and Wu, Gangshan and Wang, Limin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2023}
}

License

The BiM-VFI model weights and architecture code are provided by KAIST VIC Lab for research and education purposes only. Commercial use requires permission from the principal investigator (Prof. Munchurl Kim, mkimee@kaist.ac.kr). See the original repository for details.

The EMA-VFI model weights and architecture code are released under the Apache 2.0 License. See the original repository for details.
