Document seed-controlled Krea2 evals
@@ -26,6 +26,23 @@ Avoid letting two sections describe incompatible camera or framing intents.
 - Analysis, scoring, and generator notes belong in chat or `sxcp_eval_log`.
 - Keep one experiment variable per cycle when possible.
 - Lock seed, character, location, and camera when testing wording changes.
+- Treat the MCP seed as transport metadata. Preserve it for prompt-only A/B tests
+and do not write it into the visible prompt text.
+
+## Seed-Controlled A/B Tests
+
+Use one fixed seed when deciding whether prompt wording helped Krea2. A single
+image can justify a prompt-only retry when the mismatch is obvious, but a
+generator rule needs either repeated evidence or a generated prompt that is
+structurally wrong before rendering.
+
+When reviewing an eval payload, log:
+
+- emitted seed,
+- original generated prompt,
+- edited prompt,
+- image failure or improvement,
+- whether the change should stay prompt-only or become a generator patch.

 ## Camera And Composition

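The review fields and the evidence rule added in the hunk above could be sketched as follows. This is an illustration only, not code from the commit: `EvalReview`, `disposition`, and the `min_repeats` threshold of 2 are hypothetical names and values, and the guide itself only says "repeated evidence" without giving a number.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EvalReview:
    """One reviewed eval payload, mirroring the fields the guide asks to log."""
    seed: Optional[int]          # emitted seed, None when the payload had none
    original_prompt: str         # original generated prompt
    edited_prompt: str           # edited prompt sent back for the next test
    image_note: str              # image failure or improvement
    structurally_wrong: bool = False  # prompt was already wrong before rendering


def disposition(reviews: List[EvalReview], min_repeats: int = 2) -> str:
    """Decide whether one recurring issue stays prompt-only or becomes a generator patch.

    Assumption: "repeated evidence" means at least `min_repeats` seed-controlled
    reviews of the same issue. The threshold is a placeholder, not a project rule.
    """
    if any(r.structurally_wrong for r in reviews):
        return "generator-patch"
    controlled = [r for r in reviews if r.seed is not None]
    if len(controlled) >= min_repeats:
        return "generator-patch"
    return "prompt-only"
```

A single obvious mismatch therefore yields `"prompt-only"`, which matches the guide's point that one image can justify a retry but not a generator rule.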
Diff stats for the next file: +24 -11
@@ -1,15 +1,17 @@
 # SxCP Eval Loop

 This loop is for tuning the SxCP generator toward stronger Krea2 images.
-ComfyUI sends a generated prompt and image to Codex, Codex analyzes the result,
-then sends back exactly one edited prompt for the next A/B test. Confirmed
-findings become either generator changes or durable prompt rules in
+ComfyUI sends a generated prompt, image, and seed to Codex, Codex analyzes the
+result, then sends back exactly one edited prompt for the next A/B test.
+Confirmed findings become either generator changes or durable prompt rules in
 [`krea2-prompt-guide.md`](krea2-prompt-guide.md).

 ## Channels

-- `sxcp_eval_in`: ComfyUI to Codex. Contains the prompt text and image path.
-- `sxcp_eval_out`: Codex to ComfyUI. Prompt-only. Do not put analysis here.
+- `sxcp_eval_in`: ComfyUI to Codex. Contains the prompt text, image path, and
+seed.
+- `sxcp_eval_out`: Codex to ComfyUI. Prompt-only text plus the same seed through
+the MCP signal when supported. Do not put analysis here.
 - `sxcp_eval_log`: optional analysis/log channel.

 ## Manual Loop
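The channel contract in this hunk (prompt text on `sxcp_eval_out`, seed carried alongside it rather than inside it) might look like the sketch below. The helper `build_eval_out` and the dictionary keys `prompt` and `seed` are assumptions made for illustration. The real MCP signal format is not shown in this diff.

```python
import re

# Assumed heuristic for "seed written into the prompt": the word "seed"
# followed by a number, e.g. "seed: 1234" or "seed=1234".
_SEED_IN_TEXT = re.compile(r"\bseed\b\s*[:=#]?\s*\d+", re.IGNORECASE)


def build_eval_out(edited_prompt: str, in_payload: dict) -> dict:
    """Build a prompt-only outgoing message that preserves the incoming seed.

    The seed travels as a separate metadata field and never as prompt text.
    """
    prompt = edited_prompt.strip()
    if _SEED_IN_TEXT.search(prompt):
        raise ValueError("seed must not appear in the visible prompt text")
    message = {"prompt": prompt}
    seed = in_payload.get("seed")
    if seed is not None:
        message["seed"] = seed  # same seed as the incoming payload
    return message
```

When the incoming payload has no seed, the message simply omits the field, so the receiving side can tell a controlled cycle from an uncontrolled one.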
@@ -23,12 +25,14 @@ tools/sxcp_eval_loop.sh 3
 Every three minutes it prints a structured request asking Codex to:

 1. Pull `sxcp_eval_in`.
-2. Inspect the image.
-3. Compare it to the prompt and previous edit.
-4. Push one prompt-only edit to `sxcp_eval_out`.
-5. Classify the finding as prompt-only, prompt-guide rule, or generator fix.
-6. Change generator code/data only when the issue is systemic.
-7. Record the finding and update the Krea2 prompt guide when a rule is confirmed.
+2. Record the emitted seed.
+3. Inspect the image.
+4. Compare it to the prompt and previous edit.
+5. Push one prompt-only edit to `sxcp_eval_out`, preserving the same seed through
+the MCP signal when available.
+6. Classify the finding as prompt-only, prompt-guide rule, or generator fix.
+7. Change generator code/data only when the issue is systemic.
+8. Record the finding and update the Krea2 prompt guide when a rule is confirmed.

 Runtime logs are written under `.sxcp_eval/` and ignored by git.

@@ -60,8 +64,17 @@ The request is sent on stdin. The command also receives:
 - Crop/framing
 - Prompt noise/repetition
 - Model confusion tokens
+- Seed control/reproducibility
 - Overall Krea2 image usefulness

+## Seed Contract
+
+The seed is transport metadata, not prompt text. When the graph emits a seed, an
+A/B wording test should reuse that exact seed so the image difference mostly
+comes from wording, not sampling randomness. If a payload has no seed, mark that
+cycle as uncontrolled and avoid turning the result into a durable generator rule
+without another controlled run.
+
 ## Generator Fix Rule

 Only edit the generator when the image shows a repeatable, systemic prompt
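A minimal sketch of the Seed Contract check follows, assuming payloads are plain dictionaries with an optional `seed` key. The function names are hypothetical, and treating a seed mismatch between the A and B payloads as "not controlled" is an inference from the contract text rather than something the commit states.

```python
def cycle_label(payload: dict) -> str:
    """Label a single cycle by whether its payload carried a seed."""
    # Use an explicit None check so that a seed of 0 still counts as present.
    return "controlled" if payload.get("seed") is not None else "uncontrolled"


def is_controlled_pair(a_payload: dict, b_payload: dict) -> bool:
    """True only when both sides of an A/B wording test share the exact same seed."""
    seed_a = a_payload.get("seed")
    seed_b = b_payload.get("seed")
    return seed_a is not None and seed_a == seed_b
```

A result from a pair where `is_controlled_pair` is false would, under this reading, need another controlled run before it becomes a durable generator rule.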
Diff stats for the next file: +19 -15
@@ -135,14 +135,17 @@ style. Every cycle should turn visual evidence into one of:
 ## Protocol

 1. Pull the latest prompt/image from \`$in_channel\`.
-2. Compare the image against the prompt and previous edited prompt.
-3. Identify concrete Krea2 mismatches and likely generator path.
-4. Classify the next step: prompt-only edit, guide rule, or generator patch.
-5. Push only the next test prompt to \`$out_channel\`.
-6. Keep analysis in chat or \`$log_channel\`, not in \`$out_channel\`.
-7. Edit generator code/data only when the issue is systemic.
-8. Update \`$guide_file\` when a wording rule is confirmed.
-9. Run focused smoke tests after generator edits.
+2. Record the emitted seed. If it is missing, mark the image as uncontrolled.
+3. Compare the image against the prompt and previous edited prompt.
+4. Identify concrete Krea2 mismatches and likely generator path.
+5. Classify the next step: prompt-only edit, guide rule, or generator patch.
+6. Push only the next test prompt text to \`$out_channel\`. Preserve the same
+seed through the MCP signal when available; never write the seed into the
+prompt text.
+7. Keep analysis in chat or \`$log_channel\`, not in \`$out_channel\`.
+8. Edit generator code/data only when the issue is systemic.
+9. Update \`$guide_file\` when a wording rule is confirmed.
+10. Run focused smoke tests after generator edits.

 ## Cycles

@@ -175,15 +178,16 @@ Channels:

 Evaluation steps:
 1. Pull the latest payload from $in_channel.
-2. Inspect image_path and compare it to the prompt text.
-3. Score these Krea2 axes: identity, outfit continuity, pose/action, camera compliance, location coherence, crop/framing, prompt noise, model confusion tokens, and overall image usefulness.
-4. Identify the smallest concrete mismatch that should be tested next.
-5. Classify the finding:
-- prompt-only: push exactly one edited prompt to $out_channel and nothing else on that channel.
+2. Record payload.seed if present. Keep the same seed for prompt-only A/B tests.
+3. Inspect image_path and compare it to the prompt text.
+4. Score these Krea2 axes: identity, outfit continuity, pose/action, camera compliance, location coherence, crop/framing, prompt noise, model confusion tokens, seed control, and overall image usefulness.
+5. Identify the smallest concrete mismatch that should be tested next.
+6. Classify the finding:
+- prompt-only: push exactly one edited prompt to $out_channel and preserve payload.seed through the MCP signal when the tool supports it.
 - guide-rule: update $guide_file with the confirmed Krea2 wording rule.
 - generator-fix: edit the responsible generator path, add/adjust focused smoke coverage, run tests, and summarize the change.
-6. Keep a clear link between the image evidence, the prompt wording, and the generator path.
-7. Append the finding to the eval log with: original issue, changed wording/path, expected improvement, test result, guide update, generator update, and next hypothesis.
+7. Keep a clear link between the image evidence, seed, prompt wording, and generator path.
+8. Append the finding to the eval log with: seed, original issue, changed wording/path, expected improvement, test result, guide update, generator update, and next hypothesis.

 Current run:
 - run_id: $run_id
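The scoring axes from step 4 and the log fields from step 8 could be captured as one JSON line per finding, as sketched below. The key names, the `controlled` flag, and the choice of JSON are assumptions made for illustration. The diff does not show the actual log format used under `.sxcp_eval/`.

```python
import json

# Axis names taken from step 4 of the evaluation steps; the snake_case keys are assumed.
AXES = (
    "identity", "outfit_continuity", "pose_action", "camera_compliance",
    "location_coherence", "crop_framing", "prompt_noise",
    "model_confusion_tokens", "seed_control", "overall_usefulness",
)

# Free-text fields from step 8; the key names are assumed.
LOG_FIELDS = (
    "original_issue", "changed_wording_or_path", "expected_improvement",
    "test_result", "guide_update", "generator_update", "next_hypothesis",
)


def format_log_entry(seed, scores: dict, notes: dict) -> str:
    """Return one JSON line holding the seed, every axis score, and the notes."""
    missing = [axis for axis in AXES if axis not in scores]
    if missing:
        raise ValueError(f"missing axis scores: {missing}")
    entry = {
        "seed": seed,
        "controlled": seed is not None,  # a missing seed marks the cycle uncontrolled
        "scores": {axis: scores[axis] for axis in AXES},
    }
    for field in LOG_FIELDS:
        entry[field] = notes.get(field, "")  # unfilled notes stay empty, not invented
    return json.dumps(entry, sort_keys=True)
```

Keeping the seed next to the scores in each entry is one way to preserve the link between image evidence, seed, prompt wording, and generator path that step 7 asks for.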