Commit Graph

4 Commits

Author SHA1 Message Date
Ethanfel 06992506d7 Drop named-position axis for grounded geometry (30B still mis-names positions)
Even the 30B mis-identifies named sex positions (doggy/cowgirl) from images, so
position_name is removed. The pose cluster is now purely observable geometry:
body_orientation enriched with facing direction (who faces whom), plus
limb_arrangement / contact_points / pose. The agent composes any named label from
these reliable primitives. 23 default axes. Docs/examples updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 23:49:23 +02:00
Ethanfel 53f1f9b9b4 Switch compare to discrete verdicts + granular pose axes + per-axis definitions
The 4B's 0-1 scores were unreliable (identical ref/gen scored ~0.6), so the
judge now returns verdict match/partial/mismatch per axis; overall_score and a
new mismatch_count are computed from verdicts on our side (reliable, monotonic).
Expanded the action/pose cluster into position_name, body_orientation,
limb_arrangement, penetration, contact_points, genital_visibility (+ breast_size)
so explicit poses carry detail. Each axis now ships a one-line definition in the
prompt so gender_mix/subject_count stop absorbing positional text. 24 axes total.
Example workflows use the node default (axes=''). Docs realigned; stop condition
is now mismatch_count==0.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 23:15:51 +02:00
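
The verdict aggregation described in 53f1f9b9b4 could look roughly like the sketch below. The commit only states that overall_score and mismatch_count are computed from the per-axis verdicts and that the loop stops at mismatch_count==0. The numeric weights (match=1.0, partial=0.5, mismatch=0.0), the plain mean, and the function name `summarize_verdicts` are assumptions for illustration, not the repository's actual code.

```python
from typing import Dict

# Assumed mapping from verdict to numeric weight; the commit does not state the values.
VERDICT_WEIGHTS = {"match": 1.0, "partial": 0.5, "mismatch": 0.0}


def summarize_verdicts(verdicts: Dict[str, str]) -> Dict[str, object]:
    """Aggregate per-axis verdicts into overall_score and mismatch_count.

    `verdicts` maps an axis name to "match", "partial" or "mismatch".
    """
    if not verdicts:
        raise ValueError("at least one axis verdict is required")
    unknown = {v for v in verdicts.values() if v not in VERDICT_WEIGHTS}
    if unknown:
        raise ValueError(f"unknown verdicts: {sorted(unknown)}")

    # Mean of the assumed weights: improving any axis can never lower the score.
    overall_score = sum(VERDICT_WEIGHTS[v] for v in verdicts.values()) / len(verdicts)
    # Only hard mismatches are counted here (an assumption about "partial").
    mismatch_count = sum(1 for v in verdicts.values() if v == "mismatch")

    return {
        "overall_score": overall_score,
        "mismatch_count": mismatch_count,
        "done": mismatch_count == 0,  # stop condition named in the commit
    }


example = summarize_verdicts(
    {"pose": "match", "body_orientation": "partial", "contact_points": "mismatch"}
)
```

Computing the score on the caller's side keeps it deterministic for a given set of verdicts, which is the property the commit message calls "reliable, monotonic".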
Ethanfel 959ec70065 Redesign judge output for calibration: per-axis {score, ref, gen}, drop local fix suggestions
The local VLM now only observes and scores; correction is left to the stronger
external agent. Each axis reports the target value (ref), the current value (gen)
and the closeness (score) — the target/current/distance an agent needs to
calibrate. Expanded to ~20 granular axes (identity/body/wardrobe/action/affect/
camera/render) so the action cluster stays discriminative for explicit content.
swap_eval now inverts ref/gen of the swapped pass; diff summary sorts worst-first;
default max_new_tokens 1024. Docs aligned.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 22:52:40 +02:00
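
As a rough illustration of two behaviours named in 959ec70065 (swap_eval inverting ref/gen for the swapped pass, and the diff summary sorting worst-first), a minimal sketch follows. The per-axis shape {score, ref, gen} comes from the commit message. The helper names `unswap` and `worst_first`, and the reading of "worst" as lowest score, are assumptions; how the two passes are then combined is not stated in the commit and is left out.

```python
from typing import Dict, List, Tuple

Axis = Dict[str, object]  # {"score": float, "ref": str, "gen": str}


def unswap(axes: Dict[str, Axis]) -> Dict[str, Axis]:
    """Undo the image swap: the judge saw gen as ref and ref as gen,
    so exchange the two fields back while keeping the score."""
    return {
        name: {"score": a["score"], "ref": a["gen"], "gen": a["ref"]}
        for name, a in axes.items()
    }


def worst_first(axes: Dict[str, Axis]) -> List[Tuple[str, Axis]]:
    """Order axes by ascending score so the largest gaps come first."""
    return sorted(axes.items(), key=lambda item: item[1]["score"])


# Hypothetical output of the swapped pass, for illustration only.
swapped_pass = {
    "lighting": {"score": 0.9, "ref": "soft", "gen": "soft"},
    "pose": {"score": 0.2, "ref": "sitting", "gen": "standing"},
}
restored = unswap(swapped_pass)
summary = [name for name, _ in worst_first(restored)]
```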
Ethanfel 95198a15b5 Initial commit: VLM-as-judge prompt calibration loop
Qwen3-VL image-similarity judge node, external-prompt receptor node,
agent_bridge CLI, example SDXL workflow, and methodology/agent-loop/
calibration-policy docs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 22:15:56 +02:00