Commit Graph

9 Commits

4c6c38f05a Fix attention dispatcher: use 4D tensors for SDPA, add math backend
SDPA with 3D xformers-BMK tensors cannot use Flash Attention and falls
back to efficient_attention/math kernels that miscompute on Ada Lovelace
GPUs (e.g. RTX 6000 Pro), producing brownish line artifacts.  Unsqueeze
to 4D (1, B*H, N, D) so Flash Attention is eligible.  Also add a naive
"math" backend (chunked bmm) as a guaranteed-correct diagnostic baseline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 01:05:51 +01:00
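The reshape this commit describes can be sketched as follows. `sdpa_bmk` and `math_attention` are hypothetical names for illustration, not the dispatcher's actual API:

```python
import torch
import torch.nn.functional as F

def sdpa_bmk(q, k, v, attn_bias=None, p=0.0):
    # Sketch of the fix: xformers-style BMK tensors are 3D (B*H, N, D), which
    # makes SDPA ineligible for Flash Attention. Unsqueeze a leading batch dim
    # to get 4D (1, B*H, N, D), then squeeze it back off the output.
    q, k, v = (t.unsqueeze(0) for t in (q, k, v))
    out = F.scaled_dot_product_attention(q, k, v, attn_mask=attn_bias, dropout_p=p)
    return out.squeeze(0)

def math_attention(q, k, v, chunk=4):
    # Naive chunked-bmm baseline (the diagnostic "math" backend): slow but
    # guaranteed-correct, useful for bisecting kernel miscomputation.
    scale = q.shape[-1] ** -0.5
    outs = []
    for i in range(0, q.shape[0], chunk):
        attn = torch.softmax(
            torch.bmm(q[i:i + chunk], k[i:i + chunk].transpose(1, 2)) * scale,
            dim=-1)
        outs.append(torch.bmm(attn, v[i:i + chunk]))
    return torch.cat(outs, dim=0)
```

With no mask, the two paths should agree to floating-point tolerance, which is exactly the comparison the diagnostic baseline exists to make.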
f991f5cb02 Load text encoder and VAE from ComfyUI model folders
Download OpenCLIP ViT-H-14 to models/text_encoders/ and SVD temporal
VAE to models/vae/svd-temporal-vae/ instead of hidden library caches,
so they're visible, reusable, and shared with other nodes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 00:37:14 +01:00
e272e1a57d Fix open_clip batch_first compatibility via auto-applied patch
Newer open_clip creates nn.MultiheadAttention with batch_first=True,
but STAR's embedder unconditionally permutes to [seq, batch, embed].
This causes a RuntimeError in the text encoder (attn_mask shape
mismatch). The patch detects batch_first at runtime and only permutes
when needed.

Patches in patches/ are auto-applied to the STAR submodule on startup
and skip gracefully if already applied.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 00:25:26 +01:00
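The runtime detection this patch describes can be sketched like this; `to_attn_layout` is an assumed helper name, and STAR's actual embedder code differs:

```python
import torch
import torch.nn as nn

def to_attn_layout(x, attn):
    # Permute [batch, seq, embed] to [seq, batch, embed] only when the
    # attention module was built with batch_first=False. Newer open_clip
    # builds nn.MultiheadAttention with batch_first=True, so an
    # unconditional permute produces mismatched attn_mask shapes.
    if getattr(attn, "batch_first", False):
        return x                   # module already expects batch-first input
    return x.permute(1, 0, 2)      # legacy [seq, batch, embed] layout

attn_new = nn.MultiheadAttention(embed_dim=8, num_heads=2, batch_first=True)
attn_old = nn.MultiheadAttention(embed_dim=8, num_heads=2, batch_first=False)
x = torch.zeros(2, 5, 8)           # [batch=2, seq=5, embed=8]
```

For `attn_new` the input passes through unchanged as (2, 5, 8); for `attn_old` it is permuted to (5, 2, 8).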
82d7f4997a Add configurable attention backend with SageAttention variant support
Replace the auto-detect xformers shim with a runtime dispatcher that
always intercepts xformers.ops.memory_efficient_attention. A new
dropdown on STARModelLoader (and --attention CLI arg) lets users
explicitly select: sdpa (default), xformers, sageattn, or specific
SageAttention kernels (fp16 triton/cuda, fp8 cuda). Only backends
that successfully import appear as options.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 00:12:26 +01:00
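The interception step can be sketched as below; `install_attention_dispatcher` is an assumed name, and the real dispatcher also handles backend selection and the CLI arg:

```python
import sys
import types

def install_attention_dispatcher(backend_fn):
    # Hedged sketch: make sure an xformers.ops module object exists in
    # sys.modules (a stub if xformers isn't installed), then point
    # memory_efficient_attention at the user-selected backend so every
    # call site in the STAR pipeline is intercepted.
    ops = sys.modules.get("xformers.ops")
    if ops is None:
        xformers_mod = types.ModuleType("xformers")
        ops = types.ModuleType("xformers.ops")
        xformers_mod.ops = ops
        sys.modules["xformers"] = xformers_mod
        sys.modules["xformers.ops"] = ops
    ops.memory_efficient_attention = backend_fn
```

Because the stub lives in `sys.modules`, later `import xformers.ops` statements inside the submodule resolve to it without xformers being installed.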
cf74b587ec Add SageAttention as preferred attention backend when available
Attention fallback chain: SageAttention (2-5x faster, INT8
quantized) > xformers > PyTorch native SDPA. SageAttention is
optional — install with `pip install sageattention` for a speed
boost.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 00:00:55 +01:00
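The preference order above can be probed without fully importing either package; `pick_attention_backend` is an assumed name for illustration:

```python
import importlib.util

def pick_attention_backend():
    # Hedged sketch of the fallback chain: prefer SageAttention, then
    # xformers, and fall back to PyTorch native SDPA otherwise.
    for backend, package in (("sageattn", "sageattention"),
                             ("xformers", "xformers")):
        if importlib.util.find_spec(package) is not None:
            return backend
    return "sdpa"   # always available with PyTorch itself
```

`find_spec` only checks that the package is importable, so the probe stays cheap even when heavy extensions are installed.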
5de26d8ead Add xformers compatibility shim using PyTorch native SDPA
Avoids requiring xformers installation by shimming
xformers.ops.memory_efficient_attention with
torch.nn.functional.scaled_dot_product_attention when
xformers is not available.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 23:58:55 +01:00
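A minimal sketch of such a shim, assuming 4D BMHK inputs (the layout the later dispatcher commits refine for the 3D case):

```python
import torch
import torch.nn.functional as F

def memory_efficient_attention_shim(query, key, value, attn_bias=None, p=0.0):
    # xformers passes (B, N, H, D) tensors while SDPA expects (B, H, N, D),
    # so transpose on the way in and back out. The dropout probability p
    # maps onto SDPA's dropout_p argument.
    q, k, v = (t.transpose(1, 2) for t in (query, key, value))
    out = F.scaled_dot_product_attention(q, k, v, attn_mask=attn_bias, dropout_p=p)
    return out.transpose(1, 2)
```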
5786ab6be7 Auto-initialize STAR submodule if missing on first load
Detects when the STAR submodule directory is empty (cloned without
--recursive) and runs git submodule update --init automatically.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 23:48:43 +01:00
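The detection and recovery can be sketched with the stdlib; the helper names here are hypothetical:

```python
import subprocess
from pathlib import Path

def submodule_needs_init(path):
    # A clone made without --recursive leaves the submodule directory
    # either missing or present-but-empty.
    sub = Path(path)
    return not sub.is_dir() or not any(sub.iterdir())

def ensure_star_submodule(repo_root, name="STAR"):
    # Run the same command the commit describes, from the repo root.
    if submodule_needs_init(Path(repo_root) / name):
        subprocess.run(["git", "submodule", "update", "--init"],
                       cwd=repo_root, check=True)
```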
f7021e95f4 Add segment-based processing for long videos to reduce RAM usage
Process videos in overlapping segments (25% overlap with linear crossfade
blending) so peak memory is bounded by one segment rather than the full
video. New segment_size parameter on the Super-Resolution node (default 0
= all at once, recommended 16-32 for long videos). Also update README
clone URL to GitHub mirror.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 23:28:01 +01:00
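The segmentation scheme can be sketched as follows. These are illustrative helpers, not the node's actual functions, assuming overlap is 25% of `segment_size`:

```python
def plan_segments(num_frames, segment_size):
    # segment_size == 0 processes everything at once; otherwise consecutive
    # segments share an overlap of ~25% of their length, bounding peak
    # memory by one segment instead of the whole video.
    if segment_size <= 0 or num_frames <= segment_size:
        return [(0, num_frames)]
    overlap = max(1, segment_size // 4)
    stride = segment_size - overlap
    segments, start = [], 0
    while start + segment_size < num_frames:
        segments.append((start, start + segment_size))
        start += stride
    segments.append((start, num_frames))
    return segments

def crossfade_weights(overlap):
    # Linear ramp applied to the incoming segment over the overlapped
    # frames; the outgoing segment gets (1 - w), so each blend sums to 1.
    return [(i + 1) / (overlap + 1) for i in range(overlap)]
```

For example, 40 frames at `segment_size=16` yields segments (0, 16), (12, 28), (24, 40), each pair sharing 4 crossfaded frames.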
5f9287cfac Initial release: ComfyUI nodes for STAR video super-resolution
Two-node package wrapping the STAR (ICCV 2025) diffusion-based video
upscaling pipeline:

- STAR Model Loader: loads UNet+ControlNet, OpenCLIP text encoder, and
  temporal VAE with auto-download from HuggingFace
- STAR Video Super-Resolution: runs the full diffusion pipeline with
  configurable upscale factor, guidance, solver mode, chunking, and
  color correction

Includes three VRAM offload modes (disabled/model/aggressive) to
support GPUs from 12GB to 40GB+.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 23:20:27 +01:00