Commit Graph

8 Commits

Author SHA1 Message Date
2c50da025d Fix correction accumulation: store raw guidance in prev_eps
The paper stores corrected guidance, but in ComfyUI the corrections
compound through the sliding surface's lambda * prev_eps term
(amplified 4x per step at lambda=5). Over 20 steps this overwhelms
the actual guidance signal, causing total corruption at full K.

Storing raw (pre-correction) guidance keeps the surface tracking
the model's actual guidance evolution while applying fresh
corrections each step. This allows using full K=0.2 (matching
the paper) without accumulation-driven instability.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 18:59:39 +01:00
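
A scalar toy run may help illustrate the accumulation described in 2c50da025d. This is only a sketch under stated assumptions: the surface form s = (e - prev_eps) + lambda * prev_eps is inferred from the "4x per step at lambda=5" remark, and the function name, the fixed phi, the constant guidance value, and the sign with which the correction is applied are all hypothetical.

```python
import math

def surface_history(store_corrected, steps=5, lam=5.0, K=0.2, phi=0.1, guidance=0.05):
    """Return the sliding-surface value s at each step of a scalar toy run."""
    prev_eps = 0.0
    history = []
    for _ in range(steps):
        e = guidance  # raw guidance, held constant purely for illustration
        # Assumed surface: s = (e - prev) + lam * prev = e + (lam - 1) * prev
        s = (e - prev_eps) + lam * prev_eps
        history.append(s)
        # Assumed sign convention: the smoothed correction is subtracted
        corrected = e - K * math.tanh(s / phi)
        # The commit's change: remember the raw guidance, not the corrected one
        prev_eps = corrected if store_corrected else e
    return history

raw = surface_history(store_corrected=False)      # behaviour after this commit
fed_back = surface_history(store_corrected=True)  # behaviour before this commit
```

With raw storage the surface settles at lambda * guidance and depends only on the model's guidance. With corrected storage the previous correction enters the surface multiplied by (lambda - 1), so in this toy run the surface is driven mostly by its own past corrections.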
3c92369305 Use full K strength matching paper, increase blur to 5x5
The K_eff = K/cond_scale compensation was making the correction
too weak at high CFG (0.017 at cfg=12 vs paper's 0.2). The original
paper uses full K and relies on cfg_scale * K amplification for
stabilization. tanh smoothing and a 5x5 spatial blur handle the artifact
prevention that the paper does not need in DiffSynth.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 18:53:07 +01:00
8c88b3213c Add spatial smoothing to remove latent grid artifacts
The per-element correction creates a visible mesh pattern at the VAE's
8x8 patch boundaries. A 3x3 box blur in latent space (24x24 pixels)
smooths adjacent corrections while preserving the large-scale
correction structure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 18:36:07 +01:00
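
The box blur from 8c88b3213c can be sketched in plain Python on a 2D grid of per-element corrections. This is a minimal illustration, not the node's code: the function name is hypothetical, and averaging only over in-bounds neighbours at the edges is an assumption (the commit does not say how borders are handled).

```python
def box_blur(grid, radius=1):
    """Mean filter over a (2*radius+1) x (2*radius+1) window.

    radius=1 gives the 3x3 blur from this commit; radius=2 would give 5x5.
    Edge cells average over whichever neighbours fall inside the grid.
    """
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += grid[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out

# A +/-1 checkerboard stands in for the per-element mesh pattern.
mesh = [[1.0 if (x + y) % 2 == 0 else -1.0 for x in range(4)] for y in range(4)]
smoothed = box_blur(mesh)
```

An interior cell of the checkerboard sees five values of one sign and four of the other, so its magnitude drops from 1 to 1/9, while a constant (large-scale) field passes through unchanged.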
fa21c01afc Fix: compensate cfg amplification + smooth tanh switching
Two remaining noise sources fixed:

1. CFG amplification: the return value is uncond + cond_scale * u_sw * sigma,
   so the noise-space correction is cond_scale * K per element. At cfg=12
   with K=0.2, that's 2.4 — far too large. Fix: K_eff = K / cond_scale,
   making the effective correction just K regardless of cfg scale.

2. Hard switching: even clamp(s/phi) creates sharp transitions at the
   boundary. Replace with tanh(s/phi) for a fully smooth correction.
   phi = std(s) normalizes the sliding surface to its natural scale.

Net effect: the noise-space correction is now bounded by K=0.2 per element
regardless of cfg scale, and varies smoothly across spatial positions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 18:27:02 +01:00
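
The two fixes in fa21c01afc can be sketched together on a flat list of surface values. The function name and the small eps guard against a zero standard deviation are hypothetical, and the sign with which the result enters the guidance is not stated in the log, so the sketch returns K_eff * tanh(s / phi) as-is. Note that commit 3c92369305 in this log later drops the division by cond_scale.

```python
import math
import statistics

def smooth_switch(s_values, K=0.2, cond_scale=12.0, eps=1e-8):
    """Smooth switching term with cfg compensation, per element."""
    k_eff = K / cond_scale                   # fix 1: cancel the cond_scale amplification
    phi = statistics.pstdev(s_values) + eps  # fix 2: normalize s to its natural scale
    return [k_eff * math.tanh(s / phi) for s in s_values]

u_sw = smooth_switch([-1.0, 0.0, 1.0, 2.0])
```

Because tanh is bounded by 1, each element of the result is bounded by K / cond_scale, so multiplying by cond_scale afterwards gives a correction of at most K.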
a0d6bf39f7 Fix chattering: boundary layer SMC instead of hard sign()
The hard sign(s) creates a random +/-1 pattern in regions where the
guidance error is near zero. When amplified by cond_scale (e.g. 12),
this produces severe high-frequency noise artifacts, especially on
SDXL which has smaller guidance magnitudes than SD3/FLUX.

Replace sign(s) with sat(s/phi) — the standard boundary layer approach
in practical sliding mode control. phi adapts to the guidance error
std so the switching threshold is meaningful across different models.

- |s| >> phi: correction = +/-K (same as paper)
- |s| << phi: correction = K*s/phi (smooth, proportional)
- Near-zero guidance regions get near-zero correction (no noise)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 18:00:57 +01:00
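
The boundary-layer rule listed in a0d6bf39f7 amounts to clamping s / phi to [-1, 1] before scaling by K. A minimal scalar sketch follows; the function names are hypothetical, and phi is passed in directly rather than derived from the guidance error std.

```python
def sat(x):
    """Saturation: identity inside [-1, 1], clamped to +/-1 outside."""
    return max(-1.0, min(1.0, x))

def boundary_layer_correction(s, phi, K=0.2):
    """K * sat(s / phi): proportional near zero, +/-K far from the surface."""
    return K * sat(s / phi)
```

For |s| much larger than phi this reproduces the hard-switching magnitude K, while inside the layer the correction scales linearly with s and vanishes at s = 0.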
5f0bd6825d Fix: normalize to noise-prediction space before SMC
The previous fix (denoised space) still had the problem: K * cond_scale
produced a constant ±2.4 perturbation per element at cfg=12, destroying
the image at every step.

The paper's K=0.2 is calibrated for unit-variance noise predictions.
ComfyUI's cond/uncond are sigma-scaled (x - denoised ≈ sigma * epsilon).
Now we divide by sigma to recover epsilon-space, apply SMC there, then
multiply back by sigma. This gives natural dampening at late steps:
- sigma=14 (early): correction ±33.6 in latent space (image is noise anyway)
- sigma=0.01 (late): correction ±0.024 in latent space (negligible)

This matches the paper's behavior where the scheduler conversion
inherently dampens the noise-space correction at low sigma values.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 17:43:07 +01:00
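
The divide-by-sigma round trip from 5f0bd6825d, and the arithmetic behind its two bullet points, can be sketched as follows. The function names and the `correct` callback are hypothetical placeholders for whatever SMC step is applied in epsilon-space; plain lists stand in for tensors.

```python
def latent_correction_bound(sigma, K=0.2, cond_scale=12.0):
    """Largest per-element latent-space correction: cond_scale * K * sigma."""
    return cond_scale * K * sigma

def smc_in_eps_space(cond, uncond, sigma, correct):
    """Normalize guidance to epsilon-space, apply `correct`, scale back by sigma."""
    guidance_eps = [(c - u) / sigma for c, u in zip(cond, uncond)]
    corrected_eps = correct(guidance_eps)
    return [g * sigma for g in corrected_eps]

early = latent_correction_bound(14.0)  # 12 * 0.2 * 14
late = latent_correction_bound(0.01)   # 12 * 0.2 * 0.01
```

With the commit's values (K=0.2, cfg=12) the bound is 2.4 * sigma, which is why the same epsilon-space correction is large while the latent is still mostly noise and negligible at the final steps.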
612e7e973f Fix sigma-scaling bug causing noisy images
ComfyUI's args["cond"]/["uncond"] are (x - denoised), which are
sigma-scaled. At late denoising steps (sigma~0.01), the fixed K=0.2
correction was 200x the signal magnitude, destroying the image.

Fix: compute SMC in denoised space using args["cond_denoised"] and
args["uncond_denoised"], which have consistent magnitude across all
sigma values — matching the paper's noise-prediction space.

Also fixes first-step behavior to match the original paper (SMC
correction applied from step 0, not step 1).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 17:33:05 +01:00
a79c5163a1 Initial implementation of SMC-CFG Ctrl ComfyUI node
Implements the Sliding Mode Control CFG algorithm from the paper
"CFG-Ctrl: A Control-Theoretic Perspective on Classifier-Free Guidance" (CVPR 2026)
as a ComfyUI model patch node.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 17:10:07 +01:00