Add describe (first-pass) mode to the judge node
New mode on QwenVLImageJudge: 'describe' looks at the reference alone and returns a prompt-ready caption + per-axis target spec to seed the very first prompt (the generator has nothing to reproduce yet). 'compare' is the existing ref-vs-gen scoring.

generated_image is now optional (required only for compare). Shared generation is refactored into _generate_from_messages. The third output is renamed diff_analysis -> analysis (mode-agnostic). agent_bridge gains --mode (describe needs no receptor/prompt). Added workflow_describe_api.json. Docs updated with the first-pass bootstrap step. Fixed error-return arity to 5-tuple.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -38,6 +38,22 @@ grouped below.
Coarse axes blur the differences that matter for adult imagery; this set keeps the act /
interaction cluster granular so the agent gets actionable targets.

## Step 0: first pass (describe / bootstrap)

The very first iteration has no generated image yet, so the judge runs in **describe
mode**: it looks at the reference alone and returns a prompt-ready `caption` plus a
per-axis target spec. That seeds everything:

```bash
python agent_bridge.py --mode describe --workflow workflow/workflow_describe_api.json \
    --run-tag seed --analysis-dir <report_dir>
```
→ `latest.json` = `{"mode":"describe", "caption":"...", "axes":{axis: "value", ...}}`

The agent takes `caption` as the **initial prompt** and `axes` as the **initial
axis_state**, then enters the compare loop below. No reference description has to be
written by hand; the VL model provides the target to reproduce.

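The hand-off from describe mode to the compare loop can be sketched as below. This is a minimal illustration, not code from the repository: the helper name `load_seed` is hypothetical, and it assumes `latest.json` is written directly inside the directory passed as `--analysis-dir` and has the shape shown above.

```python
import json
from pathlib import Path


def load_seed(report_dir):
    """Return (initial_prompt, axis_state) from a describe-mode latest.json.

    Hypothetical helper: assumes latest.json sits directly in report_dir.
    """
    path = Path(report_dir) / "latest.json"
    result = json.loads(path.read_text(encoding="utf-8"))

    # Refuse compare-mode results so they cannot be mistaken for a seed.
    if result.get("mode") != "describe":
        raise ValueError(f"expected mode 'describe', got {result.get('mode')!r}")

    caption = result["caption"].strip()
    if not caption:
        raise ValueError("describe result has an empty caption")

    # Copy the axes so later per-axis edits do not mutate the parsed result.
    axis_state = dict(result["axes"])
    return caption, axis_state
```

Rejecting anything other than a describe result keeps a stale compare-mode `latest.json` from silently becoming the initial prompt.
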
## Per-iteration algorithm (greedy per-axis hill-climb)

```