# SxCP Eval Loop

This loop tunes the SxCP generator toward stronger Krea2 images.
ComfyUI sends a generated prompt, image, and seed to Codex. Codex analyzes the
result, then sends back exactly one edited prompt for the next A/B test.
Confirmed findings become either generator changes or durable prompt rules in
[`krea2-prompt-guide.md`](krea2-prompt-guide.md).

## Channels

- `sxcp_eval_in`: ComfyUI to Codex. Contains the prompt text, image path, and
  seed.
- `sxcp_eval_out`: Codex to ComfyUI. Contains the edited prompt text only; the
  same seed travels through the MCP signal when supported. Do not put analysis
  here.
- `sxcp_eval_log`: optional channel for analysis and logs.

## Manual Loop

Start the helper after sending a test prompt:

```bash
tools/sxcp_eval_loop.sh 3
```

With an interval argument of `3`, the helper prints a structured request every
three minutes, asking Codex to:

1. Pull `sxcp_eval_in`.
2. Record the emitted seed.
3. Inspect the image.
4. Compare it to the prompt and previous edit.
5. Push one prompt-only edit to `sxcp_eval_out`, preserving the same seed through
   the MCP signal when available.
6. Classify the finding as prompt-only, prompt-guide rule, or generator fix.
7. Change generator code/data only when the issue is systemic.
8. Record the finding and update the Krea2 prompt guide when a rule is confirmed.

Runtime logs are written under `.sxcp_eval/` and ignored by git.

Durable fixed-seed findings that justify a guide rule, generator patch, or pose
variant promotion are recorded in [`krea2-eval-log.json`](krea2-eval-log.json).
Use runtime logs for scratch notes; use the JSON log only for evidence that
should remain tied to a catalog variant. Image paths in that log point to
external ComfyUI artifacts that may be cleaned up, so the durable evidence is
the fixed seed, prompt summaries, observation, decision, and commit.

Record durable findings with the checked helper instead of hand-editing the log:

```bash
python tools/krea2_record_eval.py --print-template --variant-key pov_footjob_frontal_sole_stroke --seed 1234 > /tmp/krea2-entry.json
python tools/krea2_record_eval.py --entry-json /tmp/krea2-entry.json --dry-run
python tools/krea2_record_eval.py --entry-json /tmp/krea2-entry.json
```

Entry template:

```json
{
  "id": "variant-seed-short-finding",
  "date": "2026-06-29",
  "variant_key": "pov_example_variant",
  "seed": 1234,
  "source": "sxcp_eval_mcp",
  "result": "accepted",
  "decision": "generator_patch",
  "baseline_prompt_summary": "What the generated prompt did before the edit.",
  "candidate_prompt_summary": "What the edited prompt changed for the same seed.",
  "observation": "What the image comparison proved and why it matters for the generator or guide.",
  "baseline_image": "/absolute/path/to/baseline.png",
  "candidate_image": "/absolute/path/to/candidate.png",
  "commit": "pending"
}
```
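
Before running the recorder, it can help to sanity-check a hand-filled entry. The sketch below is a hypothetical pre-check, not the validation that `krea2_record_eval.py` performs: it only assumes that every key in the template above is required and that `seed` is an integer. The name `check_entry` is illustrative.

```python
# Keys copied from the entry template above. Treating all of them as
# required is an assumption of this sketch.
REQUIRED_KEYS = {
    "id", "date", "variant_key", "seed", "source", "result", "decision",
    "baseline_prompt_summary", "candidate_prompt_summary", "observation",
    "baseline_image", "candidate_image", "commit",
}


def check_entry(entry: dict) -> list:
    """Return a list of problems found in a candidate log entry."""
    missing = sorted(REQUIRED_KEYS - set(entry))
    problems = [f"missing key: {key}" for key in missing]
    if "seed" in entry:
        seed = entry["seed"]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(seed, bool) or not isinstance(seed, int):
            problems.append("seed must be an integer")
    return problems
```

An empty list means the entry has the template's shape; the `--dry-run` step remains the authoritative check.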

To see catalog coverage and the next variants that still need controlled
testing, run:

```bash
python tools/krea2_tuning_report.py
```

The report includes atlas references plus prompt cues and avoid cues for the
next fixed-seed test candidate. It also shows the latest durable evidence for
variants that already have fixed-seed results, including the evidence id, seed,
decision, candidate prompt summary, and observation. For each normal next-test
candidate, it prints a `krea2_record_eval.py --print-template` command; replace
`<fixed_seed>` with the seed from the run you are recording.

## Optional Command Hook

If you have a one-shot Codex command you want to run automatically, set:

```bash
SXCP_EVAL_CODEX_CMD="codex exec" tools/sxcp_eval_loop.sh 3
```

The request is sent on stdin. The command also receives:

- `SXCP_EVAL_IN_CHANNEL`
- `SXCP_EVAL_OUT_CHANNEL`
- `SXCP_EVAL_LOG_CHANNEL`
- `SXCP_EVAL_GUIDE_FILE`
- `SXCP_EVAL_REQUEST_FILE`
- `SXCP_EVAL_CYCLE_DIR`
- `SXCP_EVAL_CYCLE`

## Evaluation Axes

- Identity consistency
- Outfit continuity
- Pose/action accuracy
- Camera compliance
- Location coherence
- Crop/framing
- Prompt noise/repetition
- Model confusion tokens
- Seed control/reproducibility
- Overall Krea2 image usefulness

## POV Pose Atlas

Use `/media/unraid/davinci/Qwen_edit_lora/POV/dataset_v2` as the local
reference atlas for POV pose geometry. The top-level pose folders contain real
POV examples, and matching `_control` folders contain solo/control versions.
Ignore `bg` and `*_bg` folders for pose rules; they are background plates
without people. Treat the pose image folders as the primary source for body
geometry; captions are optional and are not present for every folder.

Suggested workflow:

1. Choose one pose family, for example `doggy`, `doggy_alt`, `cowgirl`, or
   `missionary`.
2. Sample 5-10 real pose images and their control images.
3. Write the repeated geometry as a compact prompt rule.
4. Run one fixed-seed Krea2 prompt using that rule.
5. Repeat on a second seed or character before changing generator defaults.
6. If the prompt itself is structurally contradictory before rendering, patch
   immediately and add a regression test.

For POV doggy, the atlas shows that visible viewer thighs, lower torso, or
pelvis can be correct. Do not treat them as automatic failures.

## Seed Contract

The seed is transport metadata, not prompt text. When the graph emits a seed, an
A/B wording test should reuse that exact seed so the image difference mostly
comes from wording, not sampling randomness. If a payload has no seed, mark that
cycle as uncontrolled and avoid turning the result into a durable generator rule
without another controlled run.
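
The contract can be sketched as a small check on the incoming payload. The payload shape (a dict with an optional integer `seed` key) and the function name `plan_ab_cycle` are assumptions for illustration; the real `sxcp_eval_in` format may differ.

```python
def plan_ab_cycle(payload: dict) -> dict:
    """Decide whether a cycle is controlled, based on seed presence.

    Assumes a hypothetical payload dict with an optional integer "seed".
    """
    seed = payload.get("seed")
    # bool is a subclass of int, so exclude it explicitly.
    controlled = isinstance(seed, int) and not isinstance(seed, bool)
    return {
        # Reuse the exact emitted seed for the edited prompt.
        "seed": seed if controlled else None,
        "controlled": controlled,
        # Uncontrolled cycles should not become durable rules on their own.
        "durable_rule_allowed": controlled,
    }
```

Keeping the seed outside the prompt string means the A and B runs differ only in wording.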

## Generator Fix Rule

Only edit the generator when the image shows a repeatable, systemic prompt
failure. Examples:

- Selfie wording overrides orbit camera.
- Clothing continuity loses the selected softcore outfit.
- POV wording makes the off-camera participant the visual subject.
- Location camera layout inserts foreground anchors in the wrong place.

For one-off model drift, send a cleaner prompt to `sxcp_eval_out` and keep the
generator unchanged. For repeated prompt behavior, update the generator and add
the rule to `docs/krea2-prompt-guide.md`.
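
One way to make "repeatable" concrete is to require the same failure on two distinct fixed-seed runs, following the atlas workflow's advice to repeat on a second seed or character. The sketch below is an illustration under that assumption, not project code; the threshold, the function name `classify_finding`, and the return labels are hypothetical.

```python
def classify_finding(confirmed_runs, contradictory_before_render=False):
    """Return "generator_fix" or "prompt_only" for one observed failure.

    confirmed_runs: iterable of (seed, character) pairs from runs that
    reproduced the failure. Seedless (uncontrolled) runs are not counted.
    """
    # A prompt that contradicts itself before rendering is patched at once.
    if contradictory_before_render:
        return "generator_fix"
    distinct = {
        (seed, character)
        for seed, character in confirmed_runs
        if seed is not None
    }
    # Two distinct controlled confirmations count as systemic here.
    return "generator_fix" if len(distinct) >= 2 else "prompt_only"
```

For example, `classify_finding([(1234, "a")])` stays `"prompt_only"` until a second seed or character reproduces the same failure.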