describe emits one canonical reference; compare can anchor on it
Describe mode now produces a single coherent, internally consistent canonical scene description (paragraph + per-axis spec, written to canonical_reference in the report).

Compare gains an optional reference_description input: when set, it anchors on that fixed text and shows only the generated image (no swap), so the reference side never drifts or self-contradicts across iterations; only the generated image is re-described each turn.

agent_bridge gains --ref-desc / --ref-desc-file (reads the describe report's canonical_reference). Docs + example workflow updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -47,18 +47,21 @@ pose cluster is split into many axes so the agent gets specific, actionable targ
 ## Step 0 — first pass (describe / bootstrap)
 
 The very first iteration has no generated image yet, so the judge runs in **describe
-mode**: it looks at the reference alone and returns a prompt-ready `caption` plus a
-per-axis target spec. That seeds everything:
+mode**: it looks at the reference alone and emits **one canonical scene description** —
+a coherent, internally-consistent paragraph plus a per-axis target spec. That seeds
+everything *and* becomes the fixed reference for the whole loop:
 
 ```bash
 python agent_bridge.py --mode describe --workflow workflow/workflow_describe_api.json \
     --run-tag seed --analysis-dir <report_dir>
 ```
-→ `latest.json` = `{"mode":"describe", "caption":"...", "axes":{axis: "value", ...}}`
+→ `calib_seed.json` = `{"mode":"describe", "description":"…", "axes":{axis:value,…}, "canonical_reference":"…"}`
 
-The agent takes `caption` as the **initial prompt** and `axes` as the **initial
-axis_state**, then enters the compare loop below. No reference description has to be
-written by hand — the VL provides the target to reproduce.
+The agent takes `description` as the **initial prompt** and `axes` as the **initial
+axis_state**. Crucially, the compare loop then **anchors on this canonical reference**
+(via `--ref-desc-file`) instead of re-reading the reference image every iteration — so the
+`ref` side never drifts or contradicts itself across passes; only the generated image is
+re-described each turn.
 
 ## Per-iteration algorithm (greedy per-axis hill-climb)
 
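The seeding step described in the hunk above (take `description` as the initial prompt and `axes` as the initial axis_state, and keep `canonical_reference` as the fixed anchor) could be sketched roughly as follows. This is only an illustration under stated assumptions: the field names come from the report shape shown in the diff, while `parse_seed` and `load_seed` are hypothetical helper names and not part of `agent_bridge.py`.

```python
import json
from pathlib import Path


def parse_seed(report):
    """Split a describe-mode report into (prompt, axis_state, canonical_reference).

    Hypothetical helper: only the keys "mode", "description", "axes" and
    "canonical_reference" are taken from the documented report shape.
    """
    if report.get("mode") != "describe":
        raise ValueError(f"expected a describe report, got mode={report.get('mode')!r}")
    prompt = report["description"]            # initial prompt
    axis_state = dict(report["axes"])         # initial value per axis (copied)
    canonical_reference = report["canonical_reference"]  # fixed for the whole loop
    return prompt, axis_state, canonical_reference


def load_seed(report_path):
    """Read the describe report from disk (e.g. <report_dir>/calib_seed.json)."""
    report = json.loads(Path(report_path).read_text(encoding="utf-8"))
    return parse_seed(report)
```

Copying `axes` into a new dict keeps later per-axis edits from mutating the loaded report, so the seed file stays a stable record of the first pass.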
@@ -69,6 +72,7 @@ loop:
   prompt = render(state)                      # state = current value per axis
   report = run agent_bridge.py --prompt prompt --negative state.negative
                --seed state.seed --run-tag iter{i}
+               --ref-desc-file <report_dir>/calib_seed.json   # anchor on canonical ref
                --workflow wf.json --analysis-dir <report_dir>
   if report.mismatch_count == 0 and report.overall_score >= TARGET:
       stop("converged", state)                # TARGET e.g. 0.9 (mostly match)
 
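The loop body in the hunk above can be sketched in Python as two small pieces: building the `agent_bridge.py` command line and applying the convergence test. The flags, the `iter{i}` run tag, the `calib_seed.json` path and the `mismatch_count` / `overall_score` fields are taken from the pseudocode; the function names are hypothetical, and whether a compare run needs an additional `--mode` flag is not shown in the diff, so none is added here.

```python
TARGET = 0.9  # threshold quoted in the pseudocode ("e.g. 0.9 (mostly match)")


def build_compare_cmd(prompt, negative, seed, iteration, report_dir, workflow="wf.json"):
    """Assemble the per-iteration command as an argv list (hypothetical helper).

    "wf.json" is the placeholder workflow name used in the pseudocode.
    """
    return [
        "python", "agent_bridge.py",
        "--prompt", prompt,
        "--negative", negative,
        "--seed", str(seed),
        "--run-tag", f"iter{iteration}",
        # Anchor on the canonical reference written by the describe pass.
        "--ref-desc-file", f"{report_dir}/calib_seed.json",
        "--workflow", workflow,
        "--analysis-dir", str(report_dir),
    ]


def has_converged(report, target=TARGET):
    """Stop condition from the pseudocode: no mismatches and score at or above target."""
    return report["mismatch_count"] == 0 and report["overall_score"] >= target
```

Passing the arguments as a list (for example to `subprocess.run(cmd, check=True)`) avoids shell quoting problems when the prompt contains spaces or punctuation.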