Full K corrupts at high CFG: the output correction is cfg*K, which
reaches 2.4 at cfg=12. K/cfg was too weak (output correction of 0.2 at
cfg=12). The paper only tested up to cfg=7.5, where output corrections
range 0.5-1.5.

K/sqrt(cfg) makes the output correction sqrt(cfg)*K, which grows
sub-linearly and gives 0.69 at cfg=12, within the paper's working range.

Also store the raw (pre-correction) guidance as prev_eps to prevent
corrections from accumulating through the sliding surface.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
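
The figures above can be reproduced with a short sketch. It assumes
K = 0.2, a value inferred from "cfg*K = 2.4 at cfg=12" rather than stated
in this message, and the function and scheme names are hypothetical
labels for illustration, not identifiers from the codebase.

```python
import math

# Assumption: K = 0.2, inferred from cfg * K = 2.4 at cfg = 12.
K = 0.2


def output_correction(cfg: float, scheme: str, k: float = K) -> float:
    """Return the effective output correction, cfg * gain, for a scaling scheme.

    Scheme names are hypothetical labels:
      "full"     -> gain = K            (output correction = cfg * K)
      "inv"      -> gain = K / cfg      (output correction = K)
      "inv_sqrt" -> gain = K / sqrt(cfg) (output correction = sqrt(cfg) * K)
    """
    if scheme == "full":
        gain = k
    elif scheme == "inv":
        gain = k / cfg
    elif scheme == "inv_sqrt":
        gain = k / math.sqrt(cfg)
    else:
        raise ValueError(f"unknown scheme: {scheme!r}")
    return cfg * gain


if __name__ == "__main__":
    # Compare the three schemes at the paper's maximum cfg and at cfg = 12.
    for cfg in (7.5, 12.0):
        for scheme in ("full", "inv", "inv_sqrt"):
            value = output_correction(cfg, scheme)
            print(f"cfg={cfg:<4} {scheme:<8} -> {value:.2f}")
```

Under that assumed K, only the K/sqrt(cfg) scheme lands inside the
0.5-1.5 range at cfg=12.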