# ComfyUI-STAR
ComfyUI custom nodes for [STAR (Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution)](https://github.com/NJU-PCALab/STAR) — a diffusion-based video upscaling model (ICCV 2025).
## Features
- **Diffusion-based 4x video super-resolution** with temporal coherence
- **Two model variants**: `light_deg.pt` (light degradation) and `heavy_deg.pt` (heavy degradation)
- **Auto-download**: all models (UNet checkpoint, OpenCLIP text encoder, temporal VAE) download automatically on first use
- **VRAM offloading**: three modes to fit GPUs from 12GB to 40GB+
- **Long video support**: sliding-window chunking with 50% overlap
- **Color correction**: AdaIN and wavelet-based post-processing
## Installation
### ComfyUI Manager
Search for `ComfyUI-STAR` in ComfyUI Manager and install.
### Manual
```bash
cd ComfyUI/custom_nodes
git clone --recursive https://github.com/ethanfel/Comfyui-STAR.git
cd Comfyui-STAR
pip install -r requirements.txt
```
> The `--recursive` flag also clones the STAR submodule. If you omitted it, run `git submodule update --init` inside `Comfyui-STAR` afterwards.
## Nodes
### STAR Model Loader
Loads the STAR model components (UNet+ControlNet, OpenCLIP text encoder, temporal VAE).
| Input | Description |
|-------|-------------|
| **model_name** | `light_deg.pt` for mildly degraded video, `heavy_deg.pt` for heavily degraded video. Auto-downloaded from HuggingFace on first use. |
| **precision** | `fp16` (recommended), `bf16`, or `fp32`. |
| **offload** | `disabled` (~39GB VRAM), `model` (~16GB — swaps components to CPU when idle), `aggressive` (~12GB — model offload + single-frame VAE decode). |
### STAR Video Super-Resolution
Runs the STAR diffusion pipeline on an image batch.
| Input | Description |
|-------|-------------|
| **star_model** | Connect from STAR Model Loader. |
| **images** | Input video frames (IMAGE batch). |
| **upscale** | Upscale factor (1–8, default 4). |
| **steps** | Denoising steps (1–100, default 15). Ignored in `fast` mode. |
| **guide_scale** | Classifier-free guidance scale (1–20, default 7.5). |
| **prompt** | Text prompt. Leave empty for STAR's built-in quality prompt. |
| **solver_mode** | `fast` (optimized 15-step schedule) or `normal` (uniform schedule). |
| **max_chunk_len** | Max frames per chunk (4–128, default 32). Lower = less VRAM for long videos. |
| **seed** | Random seed for reproducibility. |
| **color_fix** | `adain` (match color stats), `wavelet` (preserve low-frequency color), or `none`. |
| **segment_size** | Process video in segments of this many frames to reduce RAM usage (0–256, default 0). 0 = process all at once. Recommended: 16–32 for long videos. Segments overlap by 25% with linear crossfade blending. |
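
The segment behaviour described for `segment_size` can be sketched roughly as follows. This is an illustration under stated assumptions, not the node's actual code: the helper names `segment_ranges`, `crossfade_weights`, and `blend` are hypothetical, and both the rounding of the 25% overlap and the exact shape of the linear ramp are guesses.

```python
def segment_ranges(num_frames, segment_size, overlap_ratio=0.25):
    """Return (start, end) frame ranges, with end exclusive.

    Hypothetical helper: how the real node rounds the overlap is an assumption.
    """
    if segment_size <= 0 or segment_size >= num_frames:
        return [(0, num_frames)]  # 0 = process everything at once
    overlap = int(segment_size * overlap_ratio)
    stride = max(1, segment_size - overlap)  # always advance by at least one frame
    ranges = []
    start = 0
    while True:
        end = min(start + segment_size, num_frames)
        ranges.append((start, end))
        if end == num_frames:
            break
        start += stride
    return ranges


def crossfade_weights(overlap):
    """Linear weights for the incoming segment across the overlapping frames."""
    return [(i + 1) / (overlap + 1) for i in range(overlap)]


def blend(prev_tail, next_head):
    """Linearly crossfade two equally long sequences of frame values."""
    weights = crossfade_weights(len(prev_tail))
    return [(1 - w) * a + w * b for a, b, w in zip(prev_tail, next_head, weights)]
```

With these assumptions, a 40-frame clip and `segment_size=16` gives three segments that share 4 frames each, so only one 16-frame segment needs to be held at full resolution at a time.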
## VRAM Requirements
| Offload Mode | Approximate VRAM | Notes |
|---|---|---|
| disabled | ~39 GB | Fastest — everything on GPU |
| model | ~16 GB | Components swap to CPU between stages |
| aggressive | ~12 GB | Model offload + frame-by-frame VAE decode |
Reducing `max_chunk_len` further lowers VRAM usage for long videos at the cost of slightly more processing time.
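
As a rough guide for choosing the `offload` setting, the table above can be turned into a small lookup. The helper `pick_offload_mode` is hypothetical and not part of these nodes; it treats the approximate figures as hard thresholds and assumes that less offloading means faster processing.

```python
# Approximate VRAM needs in GB, copied from the table above.
OFFLOAD_VRAM_GB = {"disabled": 39, "model": 16, "aggressive": 12}


def pick_offload_mode(available_gb):
    """Return the least-offloaded mode that fits, or None if none fits.

    Hypothetical helper: the thresholds are approximate, so treat the
    result as a starting point rather than a guarantee.
    """
    for mode in ("disabled", "model", "aggressive"):  # fastest first (assumed)
        if available_gb >= OFFLOAD_VRAM_GB[mode]:
            return mode
    return None
```

With PyTorch installed, the available figure could be taken from `torch.cuda.get_device_properties(0).total_memory / 1024**3`. A card's reported total is usually slightly below its nominal size, so a nominal 12 GB GPU may fall just under the threshold used here.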
## Model Weights
Models are stored in `ComfyUI/models/star/` and auto-downloaded on first use:
| Model | Use Case | Source |
|-------|----------|--------|
| `light_deg.pt` | Low-res video from the web, mild compression | [HuggingFace](https://huggingface.co/SherryX/STAR/resolve/main/I2VGen-XL-based/light_deg.pt) |
| `heavy_deg.pt` | Heavily compressed/degraded video | [HuggingFace](https://huggingface.co/SherryX/STAR/resolve/main/I2VGen-XL-based/heavy_deg.pt) |
The OpenCLIP text encoder and SVD temporal VAE are downloaded automatically by their respective libraries on first load.
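
For machines without network access at run time, the checkpoints could be fetched ahead of time. The sketch below only computes where a file comes from and where it would go, using the URLs and directory listed above. The helper `checkpoint_location` is hypothetical, and it is an assumption that the loader reuses a file that is already present instead of downloading it again.

```python
from pathlib import Path

# Base URL taken from the links in the table above.
HF_BASE = "https://huggingface.co/SherryX/STAR/resolve/main/I2VGen-XL-based"


def checkpoint_location(model_name, comfy_root="ComfyUI"):
    """Return (download URL, local target path) for a STAR checkpoint.

    Hypothetical helper; comfy_root is the path to your ComfyUI install.
    """
    if model_name not in ("light_deg.pt", "heavy_deg.pt"):
        raise ValueError(f"unknown STAR checkpoint: {model_name}")
    url = f"{HF_BASE}/{model_name}"
    target = Path(comfy_root) / "models" / "star" / model_name
    return url, target
```

To download, create the directory with `target.parent.mkdir(parents=True, exist_ok=True)` and then call `urllib.request.urlretrieve(url, target)` from the standard library.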
## Credits
- [STAR](https://github.com/NJU-PCALab/STAR) by Rui Xie, Yinhong Liu et al. (Nanjing University) — ICCV 2025
- Based on [I2VGen-XL](https://github.com/ali-vilab/VGen) and [VEnhancer](https://github.com/Vchitect/VEnhancer)
## License
This wrapper is MIT licensed. The STAR model weights follow their respective licenses (MIT for I2VGen-XL-based models).