Add chat mode: use the node as a general VLM, not just a judge
New mode='chat' with system_prompt + user_prompt inputs runs your own prompt over the image(s) and returns raw text in 'analysis' — reusing the same model dropdown, quant, auto-download and backend. Makes it a one-node abliterated VLM for captioning, tagging, Q&A, prompt-from-image, etc. agent_bridge gains --mode chat / --system-prompt / --user-prompt (no receptor needed). Writes a chat report (latest.json) for the agent. Docs updated. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
@@ -31,7 +31,7 @@ can act on it.
|
||||
| name | type | default | notes |
|
||||
|---|---|---|---|
|
||||
| `reference_image` | IMAGE | — | the target |
|
||||
| `mode` | compare / describe | compare | `describe` = first pass over the reference only → caption + target spec (seeds the prompt). `compare` = score ref vs generated |
|
||||
| `mode` | compare / describe / chat | compare | `compare` = score ref vs generated. `describe` = first pass over the reference → caption + target spec. `chat` = **general VLM**: your `system_prompt` + `user_prompt` over the image(s) → raw text |
|
||||
| `profile` | general / oral / penetration / handjob / solo | general | **analysis profile** — act-specialized axis set; the act-critical axes are distance/proximity-aware (e.g. `mouth_genital_distance`) so magnitude isn't hidden behind a coarse label |
|
||||
| `generated_image` | IMAGE (optional) | — | the candidate to score (required for `compare`, ignored for `describe`) |
|
||||
| `model_select` | dropdown (model name) | 4B local | **which judge** (transformers/safetensors, auto-downloaded): Qwen3-VL 4B/8B/30B-A3B, **Qwen3.5-9B**, **Qwen3.6-27B/35B-A3B** (newer, natively multimodal). Param size shown in the label |
|
||||
@@ -43,19 +43,27 @@ can act on it.
|
||||
| `swap_eval` | BOOL | true | run twice with images swapped, average → cuts position bias |
|
||||
| `keep_loaded` | BOOL | true | cache weights across loop iterations |
|
||||
| `auto_download` | BOOL | true | if `model_path` is a repo id/alias and not local, fetch it from HF into `models/prompt_generator/` |
|
||||
| `system_prompt` | STRING | "" | **chat mode**: your system prompt |
|
||||
| `user_prompt` | STRING | "Describe this image." | **chat mode**: your instruction over the image(s) |
|
||||
|
||||
**Auto-download:** set `model_path` to `30b-a3b` (alias) or any `org/name` repo id and leave
|
||||
`auto_download` on — the node snapshot-downloads it on first run (into ComfyUI's
|
||||
`models/prompt_generator/<name>`) and reuses the local copy afterward. Local paths and the
|
||||
default skip download entirely.
|
||||
|
||||
**General VLM (chat mode):** set `mode=chat` and the node becomes a plain vision-language
|
||||
node — feed an image (and optionally a second), write your own `system_prompt`/`user_prompt`,
|
||||
and read the model's text from the `analysis` output. Reuses the same model dropdown, quant,
|
||||
and auto-download as the judge, so it's a one-node abliterated VLM for captioning, tagging,
|
||||
Q&A, prompt-from-image, etc. (CLI: `agent_bridge.py --mode chat --user-prompt "..."`).
|
||||
|
||||
**Outputs**
|
||||
|
||||
| name | type | use |
|
||||
|---|---|---|
|
||||
| `overall_score` | FLOAT 0..1 | compare: mean verdict (computed here, not by the model). describe: `1.0` placeholder |
|
||||
| `axis_scores_json` | STRING (JSON) | compare: per-axis `{verdict, ref, gen}` (verdict = match/partial/mismatch). describe: `{axis: value}` |
|
||||
| `analysis` | STRING | compare: header (`overall, N mismatches`) + axes worst-first (`VERDICT ref:[…] gen:[…]`). describe: the `caption` |
|
||||
| `analysis` | STRING | compare: header (`overall, N mismatches`) + axes worst-first (`VERDICT ref:[…] gen:[…]`). describe: the `caption`. chat: the model's response |
|
||||
| `raw` | STRING | raw model output (both passes if `swap_eval`) |
|
||||
| `report_path` | STRING | path to the written `calib_<tag>.json` (carries `mismatch_count`) |
|
||||
|
||||
|
||||
Reference in New Issue
Block a user