Redesign judge output for calibration: per-axis {score, ref, gen}, drop local fix suggestions
The local VLM now only observes and scores; correction is left to the stronger external agent. Each axis reports the target value (ref), the current value (gen), and the closeness (score): the target/current/distance triple an agent needs to calibrate.

Axes expanded to ~20 granular ones (identity/body/wardrobe/action/affect/camera/render) so the action cluster stays discriminative for explicit content.

swap_eval now inverts ref/gen of the swapped pass; the diff summary sorts worst-first; default max_new_tokens is 1024. Docs aligned.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
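The `swap_eval` behaviour described in the commit message could be sketched roughly as below. This is an illustration, not the code from this commit: the function name `merge_swapped_passes`, the dict-per-axis shape, and the choice to keep the first pass's descriptions are all assumptions made for the example.

```python
def merge_swapped_passes(normal: dict, swapped: dict) -> dict:
    """Average per-axis scores of two judge passes (hypothetical helper).

    Each argument maps axis name -> {"score": float, "ref": str, "gen": str}.
    `swapped` comes from the pass where the two images were exchanged.
    """
    merged = {}
    for axis in normal.keys() | swapped.keys():
        a = normal.get(axis)
        b = swapped.get(axis)
        if b is not None:
            # In the swapped pass the model saw the images in reverse order,
            # so what it labelled "ref" actually describes the generated image.
            b = {"score": b["score"], "ref": b["gen"], "gen": b["ref"]}
        present = [p for p in (a, b) if p is not None]
        merged[axis] = {
            "score": sum(p["score"] for p in present) / len(present),
            # Assumption: prefer the descriptions from the first available pass.
            "ref": present[0]["ref"],
            "gen": present[0]["gen"],
        }
    return merged
```

Without the inversion step, the descriptions from the swapped pass would be attributed to the wrong image and a downstream agent would correct in the wrong direction.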
@@ -34,7 +34,7 @@ can act on it.
 | `generated_image` | IMAGE | — | the candidate to score |
 | `model_path` | STRING | `/media/p5/qwen3vl_4b_abliterated_comfy_convert/hf_bf16` | local dir, **HF repo id** (`org/name`), or alias (`30b-a3b` / `8b` / `4b`) |
 | `precision` | bf16 / fp16 / fp8 / nf4 | bf16 | `nf4` = 4-bit (run the 30B judge on 32 GB); `fp8` with the `hf_fp8` copy |
-| `axes` | STRING | cast, clothing, pose, scene, composition, expression, color_light | scored axes (match your Prompt-Builder knobs) |
+| `axes` | STRING | ~20 axes (identity, body, wardrobe, action, affect, camera, render) | scored axes; granular for explicit content. Edit to taste |
 | `max_new_tokens` | INT | 512 | |
 | `temperature` | FLOAT | 0.0 | 0 = greedy/repeatable |
 | `swap_eval` | BOOL | true | run twice with images swapped, average → cuts position bias |
@@ -51,8 +51,8 @@ default skip download entirely.
 | name | type | use |
 |---|---|---|
 | `overall_score` | FLOAT 0..1 | loop stop-condition / objective |
-| `axis_scores_json` | STRING (JSON) | per-axis `{score, diff}` for the controller |
-| `diff_analysis` | STRING | human/controller-readable summary + fix suggestions |
+| `axis_scores_json` | STRING (JSON) | per-axis `{score, ref, gen}` — target vs current, for the agent |
+| `diff_analysis` | STRING | readable summary, worst axes first (`score ref:[…] gen:[…]`) |
 | `raw` | STRING | raw model output (both passes if `swap_eval`) |

 ## Install
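For an external agent consuming the new outputs, a minimal sketch of reading `axis_scores_json` and ordering axes worst-first might look like this. It assumes the JSON is an object mapping each axis name to `{score, ref, gen}`; the helper names and the exact line layout are illustrative and only approximate the `score ref:[…] gen:[…]` shape mentioned in the outputs table.

```python
import json


def worst_first(axis_scores_json: str) -> list:
    """Return (axis, entry) pairs sorted by ascending score (lowest first)."""
    axes = json.loads(axis_scores_json)
    return sorted(axes.items(), key=lambda item: item[1]["score"])


def format_summary(axis_scores_json: str) -> str:
    """Render one line per axis; the layout here is an assumption."""
    lines = []
    for name, entry in worst_first(axis_scores_json):
        lines.append(
            f"{name}: {entry['score']:.2f} ref:[{entry['ref']}] gen:[{entry['gen']}]"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    payload = json.dumps({
        "pose": {"score": 0.9, "ref": "standing", "gen": "standing"},
        "wardrobe": {"score": 0.2, "ref": "red coat", "gen": "blue jacket"},
    })
    print(format_summary(payload))
```

Putting the lowest-scoring axes first lets the agent address the largest gaps between `ref` and `gen` before spending effort on axes that are already close.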