Handle reasoning models (Qwen3.5/3.6): no-think + JSON-only + prose fallback
Qwen3.5/3.6 are reasoning models — they 'think out loud' in markdown and never
reach the JSON, then get cut off at the token limit -> '(no parseable judgement)'.
Fixes: apply_chat_template(enable_thinking=False) + strip <think> blocks; hardened
'output ONLY JSON, do not think out loud' instruction; default max_new_tokens
1024->2048 (max 8192); and a markdown fallback parser (_parse_markdown_verdicts /
_parse_axes) that extracts per-axis {verdict,ref,gen} from the prose the model
reliably emits. describe falls back to using the raw text as the caption.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
@@ -38,7 +38,7 @@ can act on it.
|
||||
| `precision` | bf16 / fp8 / nf4 | bf16 | **the quant** — applies to the selected model (VRAM table below) |
|
||||
| `model_path` | STRING | "" (empty) | **manual override** of the dropdown — local dir, HF repo id, or alias (`8b`/`30b-a3b`/`3.5-9b`/`3.6-27b`/`3.6-35b`). Empty = use `model_select` |
|
||||
| `axes` | STRING | "" (empty) | **override** the profile's axis set with a custom comma/newline list; empty = use `profile` |
|
||||
| `max_new_tokens` | INT | 1024 | |
|
||||
| `max_new_tokens` | INT | 2048 | raise it if a reasoning model (Qwen3.5/3.6) gets cut off before finishing |
|
||||
| `temperature` | FLOAT | 0.0 | 0 = greedy/repeatable |
|
||||
| `swap_eval` | BOOL | true | run twice with images swapped, average → cuts position bias |
|
||||
| `keep_loaded` | BOOL | true | cache weights across loop iterations |
|
||||
|
||||
Reference in New Issue
Block a user