Commit Graph

16 Commits

91e5bd8222 Clean up debug logging and fix precision setting for autocast
Remove all [STAR DEBUG] print statements added during quality
investigation. Fix autocast to actually use the selected precision
dtype (fp16/bf16) instead of always defaulting to fp16. fp32 now
properly disables autocast for full-precision inference.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 03:13:33 +01:00
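
The autocast handling this commit describes can be sketched as follows (function and argument names are illustrative, not the node's actual API):

```python
import contextlib
import torch

def autocast_ctx(device_type: str, precision: str):
    """Pick the autocast context for the selected precision: fp16/bf16
    map to the matching autocast dtype instead of always defaulting to
    float16, and fp32 disables autocast for full-precision inference."""
    dtypes = {"fp16": torch.float16, "bf16": torch.bfloat16}
    if precision == "fp32":
        return contextlib.nullcontext()
    return torch.autocast(device_type=device_type, dtype=dtypes[precision])

# bf16 autocast on CPU: matmuls run in bfloat16
with autocast_ctx("cpu", "bf16"):
    y = torch.ones(4, 4) @ torch.ones(4, 4)
```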
45e57f58a0 Print model load status to detect missing/unexpected weight keys
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 02:34:26 +01:00
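
A sketch of the kind of status reporting this commit adds, built on the return value of `load_state_dict(strict=False)` (helper name is illustrative):

```python
import torch

def load_with_report(model: torch.nn.Module, state_dict: dict):
    """Load weights non-strictly and print any missing/unexpected keys
    instead of failing silently."""
    result = model.load_state_dict(state_dict, strict=False)
    if result.missing_keys:
        print(f"[load] missing keys: {result.missing_keys}")
    if result.unexpected_keys:
        print(f"[load] unexpected keys: {result.unexpected_keys}")
    return result

model = torch.nn.Linear(2, 2)
sd = dict(model.state_dict())
del sd["bias"]                       # simulate a missing key
sd["extra.weight"] = torch.zeros(1)  # simulate an unexpected key
report = load_with_report(model, sd)
```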
0537d9d8a5 Expose denoise parameter (0.1–1.0) in node UI
Maps directly to total_noise_levels (denoise * 1000). Default 0.9 matches
the original STAR inference script.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 02:06:57 +01:00
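
The mapping is simple enough to state as code (a sketch; `num_timesteps` is the scheduler's 1000-step default):

```python
def denoise_to_noise_levels(denoise: float, num_timesteps: int = 1000) -> int:
    """Map the UI denoise slider (0.1-1.0) to SDEdit's total_noise_levels.
    The default of 0.9 yields 900, matching the original STAR script."""
    if not 0.1 <= denoise <= 1.0:
        raise ValueError("denoise must be in [0.1, 1.0]")
    return int(round(denoise * num_timesteps))
```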
8a440761d1 Fix noise level (900 not 1000) and prompt concatenation to match original STAR
The original STAR inference uses total_noise_levels=900, preserving input
structure during SDEdit. We had 1000 which starts from near-pure noise,
destroying the input. Also always append the quality prompt to user text
instead of using it only as a fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 02:03:34 +01:00
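
The prompt-handling change can be sketched like this (QUALITY_PROMPT stands in for STAR's actual quality string, which is not quoted here):

```python
QUALITY_PROMPT = "best quality, extremely detailed"  # placeholder wording

def build_prompt(user_text: str) -> str:
    """Always append the quality prompt to the user's text; previously it
    was used only as a fallback when the user text was empty."""
    user_text = user_text.strip()
    return f"{user_text}, {QUALITY_PROMPT}" if user_text else QUALITY_PROMPT
```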
2bf8db4f07 Use fp32 accumulation in SDPA and math attention to match xformers precision
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 01:47:10 +01:00
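
A naive version of the fp32-accumulation idea (a sketch, not the actual kernel):

```python
import torch

def attention_fp32_accum(q, k, v):
    """Upcast to float32 for the score matmul, softmax, and value matmul,
    then cast back to the input dtype, so half-precision inputs get
    full-precision accumulation like xformers' kernels."""
    scale = q.shape[-1] ** -0.5
    attn = torch.softmax((q.float() @ k.float().transpose(-2, -1)) * scale, dim=-1)
    return (attn @ v.float()).to(q.dtype)

q = k = v = torch.randn(2, 4, 8, dtype=torch.float16)
out = attention_fp32_accum(q, k, v)
```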
0508868978 Revert SDPA to 3D tensors — 4D unsqueeze caused quality degradation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 01:37:10 +01:00
f03c4853f1 Revert model loading to original HF-based paths
Reverts text encoder and VAE loading back to using HuggingFace preset
names / repo IDs (downloading to library cache) while keeping the
attention dispatcher improvements (4D SDPA, math backend).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 01:20:37 +01:00
4c6c38f05a Fix attention dispatcher: use 4D tensors for SDPA, add math backend
SDPA with 3D xformers-BMK tensors cannot use Flash Attention and falls
back to efficient_attention/math kernels that miscompute on Ada Lovelace
GPUs (e.g. RTX 6000 Pro), producing brownish line artifacts.  Unsqueeze
to 4D (1, B*H, N, D) so Flash Attention is eligible.  Also add a naive
"math" backend (chunked bmm) as a guaranteed-correct diagnostic baseline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 01:05:51 +01:00
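
Both pieces of this fix can be sketched as follows (illustrative, not the dispatcher's exact code):

```python
import torch
import torch.nn.functional as F

def sdpa_from_bmk(q, k, v):
    """Lift xformers-style 3D (B*H, N, D) tensors to 4D (1, B*H, N, D)
    before calling SDPA so the Flash Attention kernel is eligible, then
    drop the leading batch dim again."""
    out = F.scaled_dot_product_attention(q.unsqueeze(0), k.unsqueeze(0), v.unsqueeze(0))
    return out.squeeze(0)

def math_attention(q, k, v, chunk=2):
    """Naive chunked-bmm 'math' backend: slow but a guaranteed-correct
    diagnostic baseline."""
    scale = q.shape[-1] ** -0.5
    outs = []
    for i in range(0, q.shape[1], chunk):
        attn = torch.softmax(torch.bmm(q[:, i:i + chunk], k.transpose(1, 2)) * scale, dim=-1)
        outs.append(torch.bmm(attn, v))
    return torch.cat(outs, dim=1)
```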
f991f5cb02 Load text encoder and VAE from ComfyUI model folders
Download OpenCLIP ViT-H-14 to models/text_encoders/ and SVD temporal
VAE to models/vae/svd-temporal-vae/ instead of hidden library caches,
so they're visible, reusable, and shared with other nodes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 00:37:14 +01:00
e272e1a57d Fix open_clip batch_first compatibility via auto-applied patch
Newer open_clip creates nn.MultiheadAttention with batch_first=True,
but STAR's embedder unconditionally permutes to [seq, batch, embed].
This causes a RuntimeError in the text encoder (attn_mask shape
mismatch). The patch detects batch_first at runtime and only permutes
when needed.

Patches in patches/ are auto-applied to the STAR submodule on startup
and skip gracefully if already applied.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 00:25:26 +01:00
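
The runtime check the patch performs amounts to this (a sketch; the real patch lives inside STAR's embedder):

```python
import torch

def encode(attn: torch.nn.MultiheadAttention, x: torch.Tensor):
    """Permute to [seq, batch, embed] only when the layer was built with
    batch_first=False; newer open_clip builds batch_first=True layers,
    where the old unconditional permute broke attn_mask shapes."""
    if not attn.batch_first:
        x = x.permute(1, 0, 2)      # [batch, seq, embed] -> [seq, batch, embed]
    out, _ = attn(x, x, x, need_weights=False)
    if not attn.batch_first:
        out = out.permute(1, 0, 2)  # back to [batch, seq, embed]
    return out
```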
82d7f4997a Add configurable attention backend with SageAttention variant support
Replace the auto-detect xformers shim with a runtime dispatcher that
always intercepts xformers.ops.memory_efficient_attention. A new
dropdown on STARModelLoader (and --attention CLI arg) lets users
explicitly select: sdpa (default), xformers, sageattn, or specific
SageAttention kernels (fp16 triton/cuda, fp8 cuda). Only backends
that successfully import appear as options.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 00:12:26 +01:00
cf74b587ec Add SageAttention as preferred attention backend when available
Attention fallback chain: SageAttention (2-5x faster, INT8
quantized) > xformers > PyTorch native SDPA. SageAttention is
optional — install with `pip install sageattention` for a speed
boost.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 00:00:55 +01:00
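
The fallback chain reduces to a probe in preference order (sketch):

```python
def pick_attention_backend() -> str:
    """Probe optional backends in preference order and return the first
    importable one: SageAttention > xformers > native SDPA (always
    available with PyTorch 2.x)."""
    for name, module in (("sageattn", "sageattention"), ("xformers", "xformers.ops")):
        try:
            __import__(module)
            return name
        except ImportError:
            pass
    return "sdpa"
```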
5de26d8ead Add xformers compatibility shim using PyTorch native SDPA
Avoids requiring xformers installation by shimming
xformers.ops.memory_efficient_attention with
torch.nn.functional.scaled_dot_product_attention when
xformers is not available.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 23:58:55 +01:00
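
Such a shim can look like this (simplified sketch; the real xformers signature has more parameters, and 4D xformers inputs use a (B, M, H, K) layout that SDPA expects as (B, H, M, K)):

```python
import torch
import torch.nn.functional as F

def memory_efficient_attention(query, key, value, attn_bias=None, p=0.0):
    """Drop-in stand-in for xformers.ops.memory_efficient_attention built
    on native SDPA (simplified)."""
    if query.ndim == 4:  # xformers (B, M, H, K) -> SDPA (B, H, M, K)
        query, key, value = (t.transpose(1, 2) for t in (query, key, value))
    out = F.scaled_dot_product_attention(query, key, value,
                                         attn_mask=attn_bias, dropout_p=p)
    return out.transpose(1, 2) if out.ndim == 4 else out
```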
5786ab6be7 Auto-initialize STAR submodule if missing on first load
Detects when the STAR submodule directory is empty (cloned without
--recursive) and runs git submodule update --init automatically.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 23:48:43 +01:00
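
The detection logic is a simple emptiness check before shelling out to git (sketch; paths and helper names are illustrative):

```python
import os
import subprocess

def submodule_missing(path: str) -> bool:
    """True when the submodule dir is absent or empty, i.e. the repo was
    cloned without --recursive."""
    return not os.path.isdir(path) or not os.listdir(path)

def ensure_submodule(repo_root: str, name: str = "STAR") -> None:
    """Run `git submodule update --init` once if the submodule is missing."""
    if submodule_missing(os.path.join(repo_root, name)):
        subprocess.run(["git", "submodule", "update", "--init"],
                       cwd=repo_root, check=True)
```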
f7021e95f4 Add segment-based processing for long videos to reduce RAM usage
Process videos in overlapping segments (25% overlap with linear crossfade
blending) so peak memory is bounded by one segment rather than the full
video. New segment_size parameter on the Super-Resolution node (default 0
= all at once, recommended 16-32 for long videos). Also update README
clone URL to GitHub mirror.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 23:28:01 +01:00
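
Segment planning and the crossfade ramp can be sketched as follows (helper names are illustrative):

```python
def plan_segments(num_frames: int, segment_size: int):
    """Yield (start, end) frame ranges with roughly 25% overlap between
    consecutive segments; segment_size == 0 means process everything at
    once. Overlapping frames are later blended with a linear crossfade."""
    if segment_size <= 0 or segment_size >= num_frames:
        return [(0, num_frames)]
    overlap = max(1, segment_size // 4)
    stride = segment_size - overlap
    segments, start = [], 0
    while start + segment_size < num_frames:
        segments.append((start, start + segment_size))
        start += stride
    segments.append((start, num_frames))
    return segments

def crossfade_weights(overlap: int):
    """Linear ramp of weights for the newer segment over the overlap
    region; the older segment gets (1 - w) per frame."""
    return [(i + 1) / (overlap + 1) for i in range(overlap)]
```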
5f9287cfac Initial release: ComfyUI nodes for STAR video super-resolution
Two-node package wrapping the STAR (ICCV 2025) diffusion-based video
upscaling pipeline:

- STAR Model Loader: loads UNet+ControlNet, OpenCLIP text encoder, and
  temporal VAE with auto-download from HuggingFace
- STAR Video Super-Resolution: runs the full diffusion pipeline with
  configurable upscale factor, guidance, solver mode, chunking, and
  color correction

Includes three VRAM offload modes (disabled/model/aggressive) to
support GPUs from 12GB to 40GB+.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 23:20:27 +01:00