Fix: compensate cfg amplification + smooth tanh switching
Two remaining noise sources fixed:

1. CFG amplification: the return value is uncond + cond_scale * u_sw * sigma, so the noise-space correction reaches cond_scale * K per element. At cfg=12 with K=0.2, that is 2.4, which is far too large. Fix: K_eff = K / max(cond_scale, 1.0), so the effective correction is bounded by K for any cfg scale of 1 or more.

2. Hard switching: even clamp(s/phi) creates sharp transitions at the boundary. Replace it with tanh(s/phi) for a fully smooth correction. phi = std(s) normalizes the sliding surface to its natural scale.

Net effect: the noise-space correction is now bounded by K=0.2 per element regardless of cfg scale, and varies smoothly across spatial positions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
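The arithmetic in point 1 can be checked with a few lines of plain Python. This is a minimal sketch, not code from nodes.py: the helper name `effective_correction` and its `compensate` flag are hypothetical, and it assumes the switching term saturates at magnitude 1 and that the return value scales u_sw by cond_scale, as the message describes.

```python
def effective_correction(K: float, cond_scale: float, compensate: bool) -> float:
    """Upper bound on the per-element noise-space correction.

    Assumption: the switching function is bounded by 1 in magnitude, so
    |u_sw| <= K_eff, and the caller multiplies u_sw by cond_scale.
    """
    # Compensated gain mirrors the commit: K_eff = K / max(cond_scale, 1.0)
    K_eff = K / max(cond_scale, 1.0) if compensate else K
    return cond_scale * K_eff


before = effective_correction(K=0.2, cond_scale=12.0, compensate=False)
after = effective_correction(K=0.2, cond_scale=12.0, compensate=True)
low_cfg = effective_correction(K=0.2, cond_scale=0.5, compensate=True)
print(before, after, low_cfg)
```

Because of the max(cond_scale, 1.0) guard, a cfg scale below 1 is not boosted: the bound in that case is cond_scale * K, which is smaller than K.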
22
nodes.py
@@ -86,14 +86,20 @@ class SMCCFGCtrl:
 # Sliding surface: s_t = (e_t - e_{t-1}) + lambda * e_{t-1}
 s = (guidance_eps - prev_eps) + lam * prev_eps

-# Boundary layer SMC: replaces hard sign(s) with sat(s/phi).
-# Hard sign() creates random ±1 in regions where |s| is near zero,
-# which cond_scale amplifies into visible noise. The boundary layer
-# uses a linear transition near zero (standard chattering prevention
-# in practical SMC). phi adapts to the guidance magnitude so K stays
-# meaningful across models with different guidance scales.
-phi = guidance_eps.std().clamp(min=1e-6)
-u_sw = -K * (s / phi).clamp(-1.0, 1.0)
+# Compensate for CFG amplification: the return value multiplies
+# u_sw by cond_scale, so the effective noise-space correction is
+# cond_scale * K_eff. We want this to equal K (independent of cfg),
+# so K_eff = K / cond_scale. Without this, cfg=12 with K=0.2 gives
+# a correction of 2.4 per element — far too large.
+K_eff = K / max(cond_scale, 1.0)
+
+# Smooth switching via tanh(s/phi) instead of hard sign(s).
+# sign() quantizes every element to ±1, creating a salt-and-pepper
+# pattern that's visible as high-frequency noise. tanh provides
+# a smooth transition: proportional near zero, saturating at ±1.
+# phi normalizes s so the transition happens at the right scale.
+phi = s.std().clamp(min=1e-6)
+u_sw = -K_eff * torch.tanh(s / phi)

 # Corrected guidance error (in normalized noise space)
 guidance_eps = guidance_eps + u_sw
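To make the difference between the removed and added switching terms concrete, here is a small pure-Python sketch that evaluates both on a handful of sliding-surface values. It is an illustration under stated assumptions, not the node's code: `switching_terms` and the sample values in `s` are hypothetical, and `statistics.stdev` (sample standard deviation) stands in for the tensor `.std()` call in the diff.

```python
import math
import statistics


def switching_terms(s, K_eff, eps=1e-6):
    """Return phi plus the saturated (clamp) and smooth (tanh) corrections."""
    # phi = std(s), floored to avoid division by zero
    phi = max(statistics.stdev(s), eps)
    # Removed variant: linear inside the boundary layer, hard limit at +/-1
    sat = [-K_eff * max(-1.0, min(1.0, x / phi)) for x in s]
    # Added variant: tanh approaches +/-1 smoothly and never reaches it
    smooth = [-K_eff * math.tanh(x / phi) for x in s]
    return phi, sat, smooth


s = [-2.0, -0.5, 0.0, 0.5, 2.0]   # illustrative sliding-surface samples
K_eff = 0.2 / max(12.0, 1.0)      # K=0.2 at cfg=12, as in the commit message
phi, sat, smooth = switching_terms(s, K_eff)

for x, a, b in zip(s, sat, smooth):
    print(f"s={x:+.2f}  sat={a:+.5f}  tanh={b:+.5f}")
```

Both variants stay within K_eff in magnitude and oppose the sign of s. The clamp version reaches the limit exactly once |s| exceeds phi, while the tanh version only approaches it, which is the smoothness the commit is after.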