Correct 4B 'partial' bias on identical values; harden verdict rule; note model-capability limits

The 4B over-uses 'partial' (mislabels identical ref/gen and clear opposites) and also mis-identifies fine-grained content (e.g. names a position 'doggy'/'cowgirl' when it is neither).

Deterministic fix: force verdict=match when normalized ref==gen.
Prompt: hardened to not default to 'partial' (opposites=mismatch).
Docs: the 4B is only reliable for coarse attributes; use the 30B for fine-grained recognition, and prefer grounded geometry axes over named-position labels.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 23:43:34 +02:00
parent 69c1d6deb4
commit e4dfaac63b
2 changed files with 39 additions and 9 deletions
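
The deterministic fix named in the commit message (force `verdict=match` when normalized `ref == gen`) could look roughly like the sketch below. The commit does not show the node's code, so this is illustrative only: the function names `normalize` and `harden_verdict` and the normalization rule (lowercase, strip punctuation, collapse whitespace) are assumptions. Only the `{"verdict", "ref", "gen"}` per-axis shape comes from the documented judge output.

```python
import re


def normalize(text: str) -> str:
    """Hypothetical normalization: lowercase, drop punctuation, collapse whitespace.

    The real node may normalize differently; this rule is an assumption.
    """
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def harden_verdict(axis: dict) -> dict:
    """Return a copy of one axis result with the verdict forced to 'match'
    when the reference and generated descriptions normalize to the same string.

    Descriptions that differ are left untouched: a wrong description cannot
    be repaired here, only a trivially wrong verdict.
    """
    fixed = dict(axis)
    ref = normalize(axis.get("ref", ""))
    gen = normalize(axis.get("gen", ""))
    # Skip empty descriptions so two missing values are not reported as a match.
    if ref and ref == gen:
        fixed["verdict"] = "match"
    return fixed


if __name__ == "__main__":
    biased = {"verdict": "partial", "ref": "Two subjects.", "gen": "two  subjects"}
    print(harden_verdict(biased)["verdict"])
```

The override runs after the judge returns, so it is independent of the prompt hardening: even if the model still leans toward `partial`, identical descriptions cannot surface as anything other than `match`.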
@@ -17,11 +17,22 @@ the agent needs three things:
 | `verdict` | `match` / `partial` / `mismatch` | which axes to fix first (mismatch → partial → match) |

 That's the whole signal: *target, current, distance*. The agent corrects by rewriting the
-prompt so `gen → ref` on the **mismatch** (then `partial`) axes. The judge returns
-`{"verdict", "ref", "gen"}` per axis. A discrete verdict is used because small VLMs give
-**unreliable 0–1 scores** (identical ref/gen often scored 0.6) but classify match/partial/
-mismatch reliably. `overall_score` and `mismatch_count` are computed **from the verdicts on
-our side** (mean ordinal), so they're monotonic and trustworthy as a stop signal.
+prompt so `gen → ref` on the axes that differ.
+
+**Model capability is the critical path.** Garbage descriptions in → garbage calibration
+out. The **4B is too weak for fine-grained NSFW recognition**: it mislabels the verdict
+(central-tendency bias toward `partial`) AND mis-identifies content — it will confidently
+call a position "doggy" or "cowgirl" when it is neither. It's only reliable for *coarse*
+attributes (subject count, nude/clothed, photoreal vs anime, broad scene). For anything
+fine-grained — named positions, limb arrangement, gaze, hair detail — **use the 30B**
+(`model_path=30b-a3b`, `precision=nf4`). The node corrects the trivially-wrong verdicts
+(identical `ref`==`gen` → `match`), but it cannot fix a wrong *description*; only a more
+capable model can.
+
+**Prefer grounded geometry over named labels.** A named position (`position_name`) forces
+the model to classify into a vocabulary it gets wrong; observable geometry
+(`body_orientation`, `limb_arrangement`, `contact_points`, who faces where) is more
+grounded and survives a weaker model better. Weight those axes over the named label.

 The axes must **span what the prompt can express** — you can only fix what the prompt can
 say, and each diff must map to a lever. The default set (configurable on the node) is