
# Krea2 A/B Methodology Memory

This file is the persistent memory for SxCP Krea2 prompt A/B methodology.
Update it whenever the testing method improves.

## Current Method

Version: `2026-07-01-top-view-shaft-anchor-calibration`

1. Pull or construct the baseline from an actual SxCP/CodexMCPTest source case.
2. Keep the sampler seed fixed across the baseline and candidate.
3. Keep subject, location family, camera family, and target pose fixed unless
   the experiment explicitly tests one of those axes.
4. Change one prompt variable at a time when possible, usually the visual
   hierarchy for the target contact or pose.
5. Keep `sxcp_eval_out` positive-only. Do not place negative-conditioning
   phrases in the visible prompt.
6. Use location-compatible anchors only. For coworking/office scenes, use chair
   edge, desk edge, laptop table, glass partitions, repeated desk rows, plants,
   and window depth instead of bedroom or bedding anchors.
7. Treat a manual prompt win as proof that Krea2 responds to the wording, not
   proof that the SxCP generator already emits it.
8. Mirror a prompt win into the generator as a provisional improvement when
   leaving a category if same-seed evidence shows it improves over baseline and
   the wording is generator-safe. Keep the route `candidate` until the broader
   generator-patch evidence matrix proves it.
9. When a subject-first batch preserves appearance but repeatedly misses the
   atlas body plane, record it as weak-case evidence and consider stronger
   control before adding more generator text.
10. Score spatial orientation against the atlas before accepting evidence,
    and treat a contradictory room/background read as a rejection even when
    contact or limb placement is clear. Use background cues to decide whether
    the viewer or partner is high, low, standing, seated, supine, or on a
    support before grading pose/contact quality.
11. For hard text-only pose families, set an exploration budget before calling
    the route weak or deciding it needs stronger control. Eight prompt probes
    are only an early signal. Use batched wording-axis probes and aim for about
    fifty positive-only tries across meaningful axes before concluding that
    prompt text cannot reliably express the pose.
12. Do not require a perfect atlas hit before carrying progress forward. After
    the exploration budget, a repeatable partial that beats the baseline failure
    mode can become an accepted provisional generator improvement while the
    remaining miss stays documented for later seed/source expansion.
13. After patching generator wording, render one prompt produced by the actual
    code path before closing the category. Manual prompt-axis wins are not
    enough; the generated route can still drop the key contact hierarchy or add
    limiting positive-channel wording.
14. Manual prompt-axis exploration can stay hand-written while testing a wide
    range of new wording. Once a manual atlas prompt shows a promising axis,
    run a generator-reproduction checkpoint before treating it as route
    evidence: make the actual SxCP generator emit the closest equivalent final
    Krea prompt, compare that generated prompt against the manual winner, and
    record any missing hierarchy, wording order, restored detail, or
    scene/camera constraint. This avoids building a strong atlas prompt that the
    generator cannot reproduce.
15. Treat the final prompt that Krea receives and the rendered generator images
    as the source of truth. Tests are guardrails for known failures, not proof
    that the prompt will render the intended pose. If a test passes while the
    rendered image or prompt audit shows drift, update the test and the method
    rather than trusting the test.
16. Atlas/catalog prompt wording must use direct visual prompt sentences. Do
    not send option-list wording such as `or`, `may`, `optionally`, or `either`
    in pose cues, and do not append meta instructions such as `keep the visible
    partner...` or generic camera-layout prose to atlas routes. Krea2 is not an
    instruction-following LLM in this loop; prompt text should describe the
    image, not explain a policy.
17. Do not promote generator edits from a few isolated renders unless the final
    generated prompt is structurally wrong before rendering. For pose wording
    changes, collect prompt/image evidence across multiple women, source cases,
    and seeds when feasible, then patch only the repeated generator-safe
    hierarchy. Keep early wins as prompt-guide or provisional evidence.
18. Treat atlas prompt restore as a constrained final-prompt operation, not a
    free re-add of removed generator axes. Restored detail must support the
    atlas hierarchy without adding a second body/camera cue, hidden garment, or
    ambiguous subject. For clothing restore, keep softcore-continuity ownership
    explicit: use woman-owned wording such as `the woman wears ...` for visible
    clothes, keep partial-removal state when useful, and describe hidden lower
    garments as out of frame instead of naming visible shorts/pants that the
    atlas pose should not show. Strip raw `POV foreground clothing cue` or
    `POV foreground body cue` text from strict atlas prompts because it can make
    Krea2 assign clothing or body ownership to the viewer/man.
19. Use same-subject atlas refine decks before broad generator edits whenever
    possible. A deck such as
    `/media/unraid/comfyui/output/CodexMCP-Atlas-Refine` keeps the visible woman
    constant across atlas variants, so prompt/cue changes can be scored against
    pose ownership, workspace continuity, clothing visibility, and anatomy
    behavior without confusing subject drift for prompt behavior.
20. After the hard-pose exploration budget is met, separate repeatable partials
    from exact atlas hits. For `pov_blowjob_top_down_vertical_shaft`, 51
    text-only prompt/seed outcomes preserved contact and coworking continuity
    but repeatedly collapsed toward a forward/downward kneeling oral frame
    instead of the flatter atlas top view. Keep `mouth directly below the
    viewer's torso` plus `floor-plane-priority` as a provisional partial, and
    mark the exact flatter atlas family as needing stronger control/image
    guidance before more synonym-only prompt probes.
21. Before adding more target-pose words, do conflict analysis on clauses that
    fight the pose. A working prompt can still be dragged away by scene,
    clothing, foreground-body, camera-layout, or background-depth clauses that
    imply the wrong geometry. For overhead/top-view oral in a coworking lounge,
    the generic lounge tail with windows, repeated desk rows, and soft depth
    fights the atlas angle; rewrite the scene as sparse floor-plan evidence
    such as carpet texture and carpet tile seams before adding more `vertical`
    synonyms. The refinement loop is: keep the winning action/pose hierarchy
    fixed, remove or compress conflicting clauses, render, then only add back
    the smallest visible scene detail that still supports the target camera.
22. Do not generalize conflict analysis into a blanket “remove details” rule.
    Compare the atlas frame first and ask what it does better. If the atlas
    shows a room, support surface, furniture interaction, wall, or body-contact
    prop, add or rewrite those details. If the atlas shows a flat floor/ground
    plane, translate the scene into floor-plane evidence. For top-view oral, the
    working coworking translation is a minimal floor-plan tail such as
    `carpet texture, carpet tile seams`; adding desk rows and window depth
    fights the camera axis, while one cropped caster or desk foot should be
    tested only after the sparse floor-plane read is stable.
23. For `pov_blowjob_top_down_vertical_shaft`, floor-plane evidence alone is
    not the final rule. The atlas-22-style calibration needs the shaft/contact
    line as the first visual anchor: the centered shaft runs from the lower
    foreground to mouth contact, and the woman's face, eyelids, hair crown,
    shoulders, upper chest, neckline or bare upper torso, and one hand stack
    around that same axis. Viewer abdomen, thighs, and feet are lower-edge
    frame evidence after the shaft axis is established; they should not be the
    prompt's primary anchor. Translate body-proportion control into positive
    visibility hierarchy, for example `the centered shaft and mouth contact
    form the main vertical axis from the lower foreground to the woman's face`
    plus `the woman's face, hair crown, shoulders, upper chest, and one hand
    stack around the shaft-contact axis`. Avoid final-prompt phrasing such as
    `hips and ass stay visually secondary` or `mostly hidden`; that is useful
    as a human scoring note, but it is negative-style hierarchy inside positive
    conditioning.
24. Use large image-only atlas folders as cue-expansion pools, not as automatic
    generator truth. For top-view oral, the canonical curated atlas references
    live under
    `/media/unraid/davinci/Qwen_edit_lora/POV/dataset_v2/blowjob_top_view`,
    while
    `/media/unraid/davinci/Qwen_edit_lora/POV/dataset_v2/1.original/blowjob_top_view_1024`
    is a supplemental raw pool with the same family and more images. The larger
    pool is useful for defining repeated micro-axes: camera pitch, support-plane
    type, viewer foreground amount, partner upper-body stack, hand placement,
    eye direction, clothing/neckline anchors, and floor/furniture evidence. Keep
    the live catalog `reference_images` curated to a small stable set, and when a
    curated reference exists, prefer that canonical path in sidecar
    `reference_images`. Use the supplemental raw pool before authoring sidecar
    `append_cues`, especially for extra axes not present in the curated set. A
    cue should either repeat across several atlas images or be tied to a specific
    nearest reference image. Do not invent cue wording from a single mental model
    when the atlas folders can show the allowed variation directly.
25. When a manual calibration render succeeds after several failed top-view
    oral probes, compare the exact sidecar text before writing generator or
    sidecar memory. The 2026-07-01 manual renders
    `/media/unraid/comfyui/output/sxcp_accumulator/bwave_2/img_00135_.png`
    through
    `/media/unraid/comfyui/output/sxcp_accumulator/bwave_2/img_00139_.png`
    improved verticality by ordering the prompt as straight-down POV, shaft
    visibility, partner stack directly below the shaft, mouth contact, sparse
    floor plane. This is a word-order and anchor finding, not just a background
    removal finding. Future top-view oral variants should test shaft-first
    cue order against any abdomen-first or room-depth wording before patching
    the generator.
26. Choose sidecar prompt source by what the experiment is testing. Use
    `append_cues` for small micro-position alternates that can safely sit after
    the baseline prompt. Use exact `text` replacement when the winning evidence
    depends on word order, removes a conflicting baseline clause, or translates
    the scene tail into a different camera-compatible surface. For
    `pov_blowjob_top_down_vertical_shaft`, shaft-first calibration must be an
    exact-text candidate because appending shaft-axis cues after abdomen-first
    and deep coworking-room wording does not faithfully simulate the manual
    prompt win.

## Promotion Gates

- One clean fixed-seed A/B can be recorded as evidence for that source case.
- A prompt-guide rule needs repeated evidence across distinct subjects,
  locations, or seeds, unless the generated prompt is structurally wrong before
  rendering.
- A catalog variant remains candidate until the rule repeats under controlled
  conditions.
- A provisional generator patch is allowed when leaving a category if the best
  tested wording improves over baseline on a fixed seed. It should preserve the
  selected subject, outfit, location, and camera semantics, and it must not patch
  in a scene workaround that only solved one render.
- A proven/default generator patch still needs the broader evidence matrix below,
  unless the generated prompt is structurally wrong before rendering.

## Generator Mirroring

After a manual A/B prompt win, do not assume the SxCP generator mirrors the
wording. Add a failing regression against the final formatter output first, then
patch the narrow route boundary that owns the wording. The regression should
assert the accepted hierarchy terms and reject the failure mode that caused the
bad render, such as scene-incompatible anchors or negative-conditioning text in
the positive prompt.

After the route patch, run a generated-route probe through `sxcp_eval_out` with
the same sampler seed when feasible. Use the actual formatter output, not a
hand-normalized prompt. If the generated route regresses compared with the
manual prompt-axis winner, record the failed generated-route image as the
baseline, tighten the route wording, and validate again before logging the
candidate as generated-route evidence.

Generator-reproduction checkpoint:

1. Keep the manual winner as the reference prompt and summarize its required
   visual hierarchy.
2. Build the closest equivalent prompt through the normal SxCP nodes and Krea2
   formatter.
3. Diff the final generated prompt against the manual reference for missing
   pose hierarchy, changed ordering, restored details, scene anchors, camera
   framing, and option/meta wording.
4. If the generated prompt cannot reproduce the manual hierarchy, fix the
   generator path or record the gap before adding more image evidence.
5. Only count generated-route evidence after the final generated prompt
   preserves the manual winner's essential atlas hierarchy.
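The diff in step 3 can be partly mechanized. The following is a minimal sketch, not the repository's tooling: `split_clauses` and `diff_prompts` are hypothetical helper names, and it assumes prompts can be compared as comma-, semicolon-, or period-separated clauses. It reports manual clauses missing from the generated prompt, clauses the generator added, and whether the shared clauses changed order.

```python
def split_clauses(prompt: str) -> list[str]:
    """Split a prompt into trimmed, lowercased clauses (assumed clause separators: , ; .)."""
    normalized = prompt.replace(";", ",").replace(".", ",")
    return [clause.strip().lower() for clause in normalized.split(",") if clause.strip()]


def diff_prompts(manual: str, generated: str) -> dict[str, object]:
    """Compare a manual reference prompt with the final generated prompt, clause by clause."""
    manual_clauses = split_clauses(manual)
    generated_clauses = split_clauses(generated)
    # Clauses the manual winner relied on that the generator dropped.
    missing = [c for c in manual_clauses if c not in generated_clauses]
    # Clauses the generator added or restored on its own.
    added = [c for c in generated_clauses if c not in manual_clauses]
    # Shared clauses, in each prompt's own order, reveal reordering.
    shared_manual = [c for c in manual_clauses if c in generated_clauses]
    shared_generated = [c for c in generated_clauses if c in manual_clauses]
    return {
        "missing": missing,
        "added": added,
        "order_changed": shared_manual != shared_generated,
    }


if __name__ == "__main__":
    report = diff_prompts(
        "straight-down camera, subject at desk edge, carpet tile seams",
        "subject at desk edge, straight-down camera, repeated desk rows",
    )
    print(report)
```

An exact clause match is deliberately strict: a reworded clause shows up as both missing and added, which is the prompt to compare the two by eye rather than treat them as equivalent.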

For location-specific wins, split the implementation:

- the action or role graph owns the pose/contact hierarchy;
- the final Krea formatter owns scene-compatible anchor expansion because it can
  see the selected scene, camera, and composition;
- existing route phrases that downstream tests rely on should be preserved
  inside the stronger wording when they do not conflict with the A/B evidence.

## Atlas Detail Restore Hygiene

When re-enabling removed atlas prompt details such as clothing, face expression,
body touch, or camera-presentation text, audit the final Krea prompt before
judging the rendered image. The restore node should not undo the reason the
atlas route was strict in the first place.

- Restore details must be subordinate to the pose sentence. If a restored clause
  adds another POV foreground, body-owner, camera-layout, or optional/policy
  instruction, remove or rewrite it before rendering.
- Do not append raw category-axis detail into positive conditioning when a
  structured route already has a safer representation. Clothing restore should
  flow through the pair clothing continuity path, then through Krea cleanup, not
  through raw `clothing_detail`.
- For POV clothing, separate viewer body cues from partner clothing. The normal
  non-atlas foreground-clothing cue can help generic POV prompts, but strict
  atlas prompts already own the foreground through the atlas pose. Strip
  `POV foreground clothing cue` and `POV foreground body cue` from strict atlas
  final prompts.
- Visible partner clothing needs explicit subject ownership. Prefer
  `the woman wears ...` over bare fragments such as `button-down shirt ...`,
  because bare clothing fragments can be assigned to the viewer/man.
- If the atlas crop hides a lower garment, keep the partial-removal semantics
  without making the garment visible. For side-profile oral body-line, the useful
  wording is `The woman's lower garments are pulled aside out of frame; the
  woman wears the button-down shirt tied at the waist and a fitted bralette from
  the same outfit`.
- For side-profile oral body-line specifically, restored clothing must not steal
  the camera-owner body plane. The adult male viewer's abdomen/navel/pelvis/near
  thigh remain the foreground ownership cues; clothing is a partner detail only.

## Same-Subject Atlas Refine Decks

Use same-subject generated reference folders as controlled prompt decks before
building seedable atlas cue variants. The current first deck is:

```text
/media/unraid/comfyui/output/CodexMCP-Atlas-Refine
```

Build a manifest before analysis:

```bash
python tools/krea2_atlas_refine_manifest.py --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine --subject-id atlas_refine_same_woman_001 --print-manifest
```

Print coverage before choosing the next pose to test:

```bash
python tools/krea2_atlas_refine_manifest.py --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine --subject-id atlas_refine_same_woman_001 --print-coverage-report
```

The coverage report classifies each atlas entry as `needs_prompt_cleanup`,
`baseline_only`, `needs_visual_score`, `ready_for_seed_selection`,
`ready_for_catalog_review`, `rejected_only`, or `unknown_variant`.
`needs_prompt_cleanup` means the prompt-noise audit found option/meta/negative
wording in the baseline prompt or sidecar variants; clean that text before
scoring, authoring more alternates, or selecting cue seeds. `baseline_only`
means the prompt/image pair is clean but has no sidecar prompt variants yet; run
MCP probes or add reviewed sidecar candidates before seed selection.
`ready_for_catalog_review` means at least one seedable append-cue variant exists
and can be exported with `--print-catalog-cue-draft`.

Print prompt-noise findings before scoring or promoting a deck:

```bash
python tools/krea2_atlas_refine_manifest.py --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine --subject-id atlas_refine_same_woman_001 --print-prompt-noise-report
```

The prompt-noise report is read-only. It flags three kinds of wording:
option-list words such as `or`, `may`, `optionally`, and `either`; meta/policy
fragments such as `keep the visible partner`, `context stays`, `camera layout`,
and leaked `POV foreground ... cue` text; and positive-channel
negative-conditioning phrases such as `no`, `without`, or `do not`. It also
flags exact repeated
direct phrases because duplicated pose clauses often make Krea2 weight the wrong
axis. Treat findings as prompt cleanup tasks before fixed-seed evidence is
promoted. Do not auto-rewrite them into new cues; rewrite the source prompt or
sidecar text manually, then rebuild the manifest and rerun the audit. Coverage
carries the same issue counts, so a noisy entry stays `needs_prompt_cleanup`
even if it also has unscored or seedable sidecar variants.
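For orientation, the checks described above can be sketched in a few lines of Python. This is an illustrative approximation under stated assumptions, not the implementation in `tools/krea2_atlas_refine_manifest.py`: the term lists contain only the examples named in this section, `audit_prompt` is a hypothetical name, and clauses are assumed to be separated by commas, semicolons, or periods.

```python
import re
from collections import Counter

# Example terms taken from this section; the real audit may use longer lists.
OPTION_WORDS = ("or", "may", "optionally", "either")
META_FRAGMENTS = ("keep the visible partner", "context stays", "camera layout")
NEGATIVE_WORDS = ("no", "without", "do not")


def _term_hits(text: str, terms: tuple[str, ...], kind: str) -> list[tuple[str, str]]:
    """Return (kind, term) for each term found as a whole word or phrase."""
    return [
        (kind, term)
        for term in terms
        if re.search(rf"\b{re.escape(term)}\b", text)
    ]


def audit_prompt(prompt: str) -> list[tuple[str, str]]:
    """Read-only audit: returns findings and never rewrites the prompt."""
    text = prompt.lower()
    findings = []
    findings += _term_hits(text, OPTION_WORDS, "option")
    findings += _term_hits(text, META_FRAGMENTS, "meta")
    findings += _term_hits(text, NEGATIVE_WORDS, "negative")
    # Leaked generator labels such as "POV foreground clothing cue".
    if re.search(r"pov foreground \w+ cue", text):
        findings.append(("meta", "pov foreground cue"))
    # Exact repeated clauses can over-weight one axis.
    clauses = [c.strip() for c in re.split(r"[,.;]", text) if c.strip()]
    for clause, count in Counter(clauses).items():
        if count > 1:
            findings.append(("repeat", clause))
    return findings


if __name__ == "__main__":
    print(audit_prompt("soft window light, desk edge, desk edge"))
    print(audit_prompt("seated on a chair or a desk"))
```

Whole-word matching matters here: `or` must not fire on `floor`, and `no` must not fire on `not` or `north`, which is why the sketch uses `\b` boundaries instead of substring checks.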

For a file-oriented cleanup queue, print a prompt-cleanup sheet:

```bash
python tools/krea2_atlas_refine_manifest.py --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine --subject-id atlas_refine_same_woman_001 --print-prompt-cleanup-sheet
```

The cleanup sheet groups prompt-noise issues by editable source text. Baseline
issues point at the `.txt` prompt file; sidecar variant issues point at the
same-stem `.json` sidecar, including the `prompt_variant_id` and append-cue
index when relevant. The sheet preserves `current_text`,
`current_text_sha256`, `source_prompt_sha256`, issue excerpts, and blank
`replacement_text` / `cleanup_notes` fields for manual review. Do not apply the
sheet mechanically; use it as a checklist, edit the prompt or sidecar source by
hand, rebuild the manifest, and confirm coverage no longer reports
`needs_prompt_cleanup`.

When using the sheet as an apply artifact, fill `replacement_text` manually and
validate it first:

```bash
python tools/krea2_atlas_refine_manifest.py --validate-prompt-cleanup-sheet --prompt-cleanup-sheet-json /tmp/sxcp-prompt-cleanup-sheet-filled.json
```

Validation rejects blank replacements, replacements that still contain
option/meta/negative prompt noise, unsupported contexts, missing target
metadata, stale `current_text_sha256`, stale baseline `source_prompt_sha256`,
and replacements identical to the original text. After validation, apply only
against the source folder that produced the sheet:

```bash
python tools/krea2_atlas_refine_manifest.py --apply-prompt-cleanup-sheet --prompt-cleanup-sheet-json /tmp/sxcp-prompt-cleanup-sheet-filled.json --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine
```

Apply is validation-first and drift-aware. It updates prompt files or the
targeted sidecar prompt-variant field, preserves unrelated sidecar metadata,
and allows a repeated apply when the target already equals the reviewed
replacement. Rebuild the manifest and rerun coverage after applying.
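The drift-aware part of validation and apply can be pictured as a hash comparison between the reviewed sheet row and the text currently on disk. The sketch below is an assumption-laden illustration: `check_cleanup_row` is a hypothetical helper, the hash is assumed to be SHA-256 over the UTF-8 text exactly as stored, and only three of the listed rejection reasons are modeled.

```python
import hashlib


def sha256_text(text: str) -> str:
    """Hash text the way this sketch assumes the sheet does (UTF-8, unmodified)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def check_cleanup_row(row: dict, text_on_disk: str) -> list[str]:
    """Return rejection reasons for one sheet row; an empty list means safe to apply."""
    replacement = (row.get("replacement_text") or "").strip()
    current = (row.get("current_text") or "").strip()
    if not replacement:
        return ["blank replacement_text"]
    if replacement == current:
        return ["replacement identical to current_text"]
    # Repeated apply: the target already equals the reviewed replacement.
    if text_on_disk.strip() == replacement:
        return []
    # Drift: the source changed after the sheet was printed.
    if sha256_text(text_on_disk) != row.get("current_text_sha256"):
        return ["stale current_text_sha256"]
    return []


if __name__ == "__main__":
    original = "desk edge, desk edge, soft window light"
    row = {
        "current_text": original,
        "current_text_sha256": sha256_text(original),
        "replacement_text": "desk edge, soft window light",
    }
    print(check_cleanup_row(row, original))
    print(check_cleanup_row(row, "edited by hand since the sheet was printed"))
```

Checking the hash before writing is what makes a stale sheet fail loudly instead of overwriting a prompt that someone edited in the meantime.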

For baseline-only entries, print sidecar scaffolds before hand-authoring cue
variants:

```bash
python tools/krea2_atlas_refine_manifest.py --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine --subject-id atlas_refine_same_woman_001 --print-sidecar-scaffold
```

The scaffold is read-only and does not write sidecar files. It covers only
known catalog entries with no existing prompt variants. For each one it
preserves the same-stem sidecar filename, baseline seed/cue/score slots, and
source prompt hash, and adds a blank `prompt_variant_template`. Fill the
template with user-authored `append_cues` or exact `text`; do not let the
scaffold invent cue wording.

Before authoring alternates for a baseline-only pose, print a baseline score
sheet and grade the existing image/prompt pair:

```bash
python tools/krea2_atlas_refine_manifest.py --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine --subject-id atlas_refine_same_woman_001 --print-baseline-score-sheet
```

The baseline score sheet is also read-only. It preserves prompt/image paths,
prompt hashes, seed metadata, cue axes, and score slots for each baseline entry.
`needs_visual_score` means no score fields are filled yet; `partially_scored`
means some baseline gates are filled but the preservation assessment is not
complete. Use the sheet to decide whether a baseline is already valid, whether
the first sidecar variants should be small same-scene frame changes, or whether
the pose needs stronger structural prompt/control work before seed alternates.

After manually filling baseline scores, convert the scored sheet into a
baseline score update draft:

```bash
python tools/krea2_atlas_refine_manifest.py --print-baseline-score-update-draft --baseline-score-sheet-json /tmp/sxcp-baseline-score-sheet-scored.json
```

The draft records only top-level baseline metadata: `seed_metadata`,
`cue_axes`, `score`, `score_state`, source prompt hash, and manual analysis
notes. It skips unknown or fully unscored entries and deliberately does not
carry `prompt_variants`; prompt alternates stay owned by the sidecar
promotion/seed-selection path.

Validate the baseline score draft before writing sidecars:

```bash
python tools/krea2_atlas_refine_manifest.py --validate-baseline-score-update-draft --baseline-score-update-draft-json /tmp/sxcp-baseline-score-update-draft.json
```

Validation rejects sidecar filename drift, missing prompt hashes, empty score
updates, forbidden negative-conditioning fields, and any accidental
`prompt_variants` contamination. Partial or rejected baseline scores are allowed
as evidence but reported as warnings, so partial progress can be preserved
without pretending it is seedable.

Only after validation, apply the baseline score draft:

```bash
python tools/krea2_atlas_refine_manifest.py --apply-baseline-score-update-draft --baseline-score-update-draft-json /tmp/sxcp-baseline-score-update-draft.json --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine
```

Baseline score apply is validation-first. It writes top-level sidecar
`seed_metadata`, `cue_axes`, `score`, `baseline_score_state`,
`baseline_source_prompt_sha256`, and `baseline_analysis_notes`, while preserving
any existing `prompt_variants` exactly. Rebuild the manifest after applying;
the baseline score sheet should then rescan the same entries as
`scored_pass`, `partially_scored`, or `scored_rejected` according to the saved
baseline evidence.

The manifest is the bridge between generated artifacts and the seed/cue system.
It records each prompt/image pair, validates the inferred `variant_key` against
the catalog, preserves prompt text and a prompt hash, records missing pairs, and
reserves seed slots for `sampler_seed`, `generator_seed`, `atlas_cue_seed`,
`micro_position_seed`, and `workspace_seed`. Fill those seed slots when the
source workflow exposes them; leave them null only when the historical artifact
does not contain that metadata. Cue-selection commands may target
`generator_seed`, `atlas_cue_seed`, `micro_position_seed`, or `workspace_seed`,
but never `sampler_seed`; that slot is reserved for the actual render sampler
seed returned by the job.

Optional same-stem JSON sidecars can enrich scanned entries without changing the
prompt/image filenames. For `pov_example_00001_.txt` and
`pov_example_00001_.png`, add `pov_example_00001_.json` with `seed_metadata`,
`cue_axes`, `score`, and `notes`. The manifest keeps explicit null slots for
missing fields so unfilled seed or score data is visible instead of silently
absent.
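As a rough illustration of the same-stem convention and the explicit null slots, the sketch below derives a sidecar path and normalizes a partially filled sidecar. The four top-level keys and five seed slot names come from this section; the function names and the exact nesting of seed slots under `seed_metadata` are assumptions.

```python
from pathlib import Path

# Seed slot names as listed in this section.
SEED_SLOTS = (
    "sampler_seed",
    "generator_seed",
    "atlas_cue_seed",
    "micro_position_seed",
    "workspace_seed",
)


def sidecar_path(prompt_path: Path) -> Path:
    """Same-stem sidecar: swap the prompt's extension for .json."""
    return prompt_path.with_suffix(".json")


def normalize_sidecar(raw: dict) -> dict:
    """Keep explicit None slots so unfilled data stays visible."""
    seeds = raw.get("seed_metadata") or {}
    return {
        "seed_metadata": {slot: seeds.get(slot) for slot in SEED_SLOTS},
        "cue_axes": raw.get("cue_axes"),
        "score": raw.get("score"),
        "notes": raw.get("notes"),
    }


def is_cue_selection_target(slot: str) -> bool:
    """Cue selection may target any seed slot except the render sampler seed."""
    return slot in SEED_SLOTS and slot != "sampler_seed"


if __name__ == "__main__":
    print(sidecar_path(Path("pov_example_00001_.txt")))
    print(normalize_sidecar({"seed_metadata": {"sampler_seed": 123}}))
```

Keeping `None` in the normalized output, rather than dropping the key, is the point of the design: a reviewer scanning the manifest can tell "never recorded" apart from "field does not exist".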

Image-only atlas reference pools sit upstream from sidecars. First make a
labeled contact sheet, cluster the references by visible micro-axis, then choose
the nearest atlas target for each proposed cue. Store those nearest targets in a
sidecar prompt variant's `reference_images` list so the reference provenance
travels through prompt batches, result sheets, promotion reports, sidecar
updates, catalog cue drafts, and later rescans. Prefer canonical curated
references such as `blowjob_top_view/22_blowjob_top_view.png` when a matching
curated frame exists; use `1.original/...` paths for supplemental raw-pool frames
that do not have a curated counterpart. This field is still provenance, not
proof: a generated prompt/image pair must show that the same cue preserves the
current subject, workspace, clothing ownership, and prompt-noise gates before the
cue is eligible for seed selection or catalog promotion. Do not promote an
image-only reference cue directly into the generator.

Before authoring top-view oral cue variants, print a read-only reference-pool
report:

```bash
python tools/krea2_atlas_refine_manifest.py --print-reference-pool-report --variant-key pov_blowjob_top_down_vertical_shaft --reference-pool-folder 1.original/blowjob_top_view_1024
```

The report compares the canonical catalog `atlas_folders` against supplemental
raw folders by image id. Use `catalog_reference_images` and matched canonical
paths as preferred `reference_images`; use `supplemental_extra_images` to mine
extra cue axes when a raw frame has no curated counterpart.

Then print a blank cue-review sheet from the same pool:

```bash
python tools/krea2_atlas_refine_manifest.py --print-reference-cue-review-sheet --variant-key pov_blowjob_top_down_vertical_shaft --reference-pool-folder 1.original/blowjob_top_view_1024
```

Fill `observed_positive_cues`, `cue_axes`, and `review_notes` only from visual
inspection. The sheet provides `reference_images_template` for canonical
catalog refs, but leaves it blank for raw-only supplemental extras so those
images start as cue-mining evidence rather than automatic sidecar references.

After filling the review sheet, print a candidate draft:

```bash
python tools/krea2_atlas_refine_manifest.py --print-reference-cue-candidate-draft --reference-cue-review-sheet-json /tmp/sxcp-reference-cue-review-filled.json
```

The candidate draft converts reviewed canonical rows into sidecar-ready
`prompt_variant` objects with `append_cues`, `reference_images`, `cue_axes`, and
seed slots. It skips blank rows, option/meta/negative prompt-noise cues, missing
variant ids, duplicate variant ids, and raw-only supplemental rows without a
canonical reference. Copy a candidate into a same-stem sidecar only after
choosing the baseline deck it should modify, then test it through the normal MCP
batch/result-sheet/promotion path.

To attach reviewed candidates to a same-subject baseline deck for testing,
print a sidecar authoring draft:

```bash
python tools/krea2_atlas_refine_manifest.py --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine --subject-id atlas_refine_same_woman_001 --print-reference-cue-sidecar-author-draft --reference-cue-candidate-draft-json /tmp/sxcp-reference-cue-candidate-draft.json --variant-key pov_blowjob_top_down_vertical_shaft
```

Validate and apply that draft only against the same folder:

```bash
python tools/krea2_atlas_refine_manifest.py --validate-reference-cue-sidecar-author-draft --reference-cue-sidecar-author-draft-json /tmp/sxcp-reference-cue-sidecar-author-draft.json
python tools/krea2_atlas_refine_manifest.py --apply-reference-cue-sidecar-author-draft --reference-cue-sidecar-author-draft-json /tmp/sxcp-reference-cue-sidecar-author-draft.json --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine
```

The author draft is pre-test only. It writes unscored prompt variants into the
same-stem sidecar after checking the baseline prompt hash for drift. Rebuild the
manifest after applying; coverage should move from `baseline_only` to
`needs_visual_score`, and the next step is a fixed-seed MCP batch, not catalog
promotion.

Sidecars can also define explicit `prompt_variants` for seedable cue probes.
Each variant must provide an `id` and exactly one of `text` or `append_cues`;
the batch builder does not invent cue wording. Variant ids must be unique within
the sidecar because they are the stable identity for cue-seed selection, upsert
apply, and evidence roundtrips. If an explicit `prompt_source.prompt_variant_id`
is present, it must match the enclosing variant `id`. Use `append_cues` for small
micro-position changes that should read like another frame from the same scene,
such as a hand moving higher, feet moving farther forward, a body angle tilting,
or a workspace surface becoming more prominent. Keep every cue positive-only and
send batches through the normal `sxcp_eval_out` path:

```bash
python tools/krea2_atlas_refine_manifest.py --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine --subject-id atlas_refine_same_woman_001 --print-batch --variant-key pov_footjob_frontal_sole_stroke --sampler-seed 123
```

Validate the printed batch with `tools/sxcp_prompt_batch.py` before using
`run-batch --run`. A sidecar prompt variant is candidate evidence only until
its returned image is scored against the atlas reference and the same-subject
baseline.
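The `prompt_variants` rules above (a unique `id`, exactly one of `text` or `append_cues`, and a matching `prompt_source.prompt_variant_id`) can be expressed as a small validator. This is a sketch with hypothetical function names, and the way `build_prompt` joins appended cues onto the baseline with commas is an assumption about formatting, not the batch builder's documented behavior.

```python
def validate_variants(variants: list[dict]) -> list[str]:
    """Return human-readable errors; an empty list means the variants are well-formed."""
    errors = []
    seen_ids = set()
    for index, variant in enumerate(variants):
        variant_id = variant.get("id")
        if not variant_id:
            errors.append(f"variant {index}: missing id")
        elif variant_id in seen_ids:
            errors.append(f"variant {index}: duplicate id {variant_id!r}")
        else:
            seen_ids.add(variant_id)
        # Exactly one prompt source is allowed per variant.
        if bool(variant.get("text")) == bool(variant.get("append_cues")):
            errors.append(f"variant {index}: needs exactly one of text or append_cues")
        source_id = (variant.get("prompt_source") or {}).get("prompt_variant_id")
        if source_id is not None and source_id != variant_id:
            errors.append(f"variant {index}: prompt_source.prompt_variant_id mismatch")
    return errors


def build_prompt(baseline: str, variant: dict) -> str:
    """Exact text replaces the baseline; append_cues are added after it (assumed comma join)."""
    if variant.get("text"):
        return variant["text"]
    return ", ".join([baseline.rstrip(" ,.")] + list(variant["append_cues"]))


if __name__ == "__main__":
    variants = [{"id": "hand_higher", "append_cues": ["one hand higher on the chair edge"]}]
    print(validate_variants(variants))
    print(build_prompt("woman seated at the desk edge.", variants[0]))
```

Note that the builder only concatenates user-authored text; nothing in the sketch generates cue wording, which mirrors the rule that the batch builder does not invent cues.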

After `run-batch --run` writes a result JSON, convert the exact batch/results
pair into a visual scoring sheet before editing sidecars or generator wording:

```bash
python tools/krea2_atlas_refine_manifest.py --print-result-sheet --batch-json /tmp/sxcp-batch.json --result-json /tmp/sxcp-results.json --notes "visual scoring pending"
```

The result sheet is not an automatic judge. It preserves the fixed sampler
seed, probe order, returned image paths, exact prompt text, cue-axis metadata,
and empty score slots for manual atlas comparison. Fill those slots only after
checking the generated images against the atlas reference for pose ownership,
workspace continuity, clothing visibility, subject identity, expression/eye
control, anatomy, and prompt noise.

After the result sheet is manually scored, print a promotion report before
editing a sidecar or generator route:

```bash
python tools/krea2_atlas_refine_manifest.py --print-promotion-report --result-sheet-json /tmp/sxcp-result-sheet-scored.json
```

The promotion report is intentionally conservative. A probe is only a
`seedable_candidate` when pose ownership, workspace continuity, clothing
visibility, subject identity, and prompt noise are all scored `pass`, and the
remaining visual axes at least show progress rather than failure. Missing scores
stay `needs_visual_score`; failed preservation gates stay `rejected`. The report
also scans candidate text with the prompt-noise audit and rejects noisy text even
if the manual `prompt_noise` score was filled as `pass`.
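
The gate above can be sketched as a small classifier. This is an illustration
only, not the tool's implementation: the score values (`pass`, `progress`,
`fail`), the axis key names other than `prompt_noise`, the `classify_probe`
function, and the order in which the checks run are all assumptions.

```python
# Hypothetical sketch of the promotion gate; key names and score values are
# assumptions for illustration, not the tool's actual schema.
PRESERVATION_AXES = (
    "pose_ownership",
    "workspace_continuity",
    "clothing_visibility",
    "subject_identity",
    "prompt_noise",
)
PROGRESS_AXES = ("expression_eye_control", "anatomy")


def classify_probe(scores, noisy_text=False):
    """Return seedable_candidate, needs_visual_score, or rejected."""
    axes = PRESERVATION_AXES + PROGRESS_AXES
    # The automatic prompt-noise audit overrides a manual `pass`.
    if noisy_text:
        return "rejected"
    # Any explicit failure rejects the probe.
    if any(scores.get(axis) == "fail" for axis in axes):
        return "rejected"
    # Unfilled slots keep the probe waiting for manual scoring.
    if any(not scores.get(axis) for axis in axes):
        return "needs_visual_score"
    # Preservation gates need a full pass; progress is not enough here.
    if any(scores[axis] != "pass" for axis in PRESERVATION_AXES):
        return "rejected"
    return "seedable_candidate"
```

In this sketch the non-preservation axes may be scored `progress` and still
promote, which mirrors the "progress rather than failure" wording above.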

For ready candidates, print a sidecar update draft instead of editing the
sidecar directly:

```bash
python tools/krea2_atlas_refine_manifest.py --print-sidecar-update-draft --promotion-report-json /tmp/sxcp-promotion-report.json
```

The draft uses only exact tested prompt text from `seedable_candidate` probes.
It preserves the original same-stem sidecar filename, cue axes, seed metadata,
visual score evidence, and returned image path. Stable `matrix_evidence` carried
by a result-sheet probe is preserved through the promotion report and sidecar
draft. Explicit unstable `matrix_evidence` rejects the probe before promotion,
so the sidecar draft cannot replace a matrix-proven variant with weaker
single-batch evidence. Review the draft before copying anything into a sidecar;
rejected or unscored candidates are skipped.

Validate the draft before applying any sidecar edit:

```bash
python tools/krea2_atlas_refine_manifest.py --validate-sidecar-update-draft --sidecar-update-draft-json /tmp/sxcp-sidecar-update-draft.json
```

The validation gate rejects drafts with missing image evidence, failed
preservation scores, missing cue-axis movement, duplicate prompt-variant ids,
or forbidden negative-conditioning fields. Treat a failing validation report as
a prompt/evidence issue to resolve before editing sidecars.

Only after validation passes, apply the draft to the atlas-refine folder:

```bash
python tools/krea2_atlas_refine_manifest.py --apply-sidecar-update-draft --sidecar-update-draft-json /tmp/sxcp-sidecar-update-draft.json --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine
```

Apply is validation-first and idempotent. It upserts `prompt_variants` by id
inside the target same-stem sidecar, preserves unrelated sidecar fields, rejects
ambiguous existing `prompt_variants` lists with missing or duplicate ids, and
does not touch rejected or unscored candidates.
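
A minimal sketch of what "upsert by id" and "idempotent" mean here, assuming a
sidecar is a plain dict with a `prompt_variants` list. The function name
`upsert_prompt_variants` is hypothetical, and whether the real tool replaces a
matching variant wholesale or merges fields is not stated above; this sketch
replaces it.

```python
def upsert_prompt_variants(sidecar, reviewed_variants):
    """Return a new sidecar dict with reviewed variants upserted by id."""
    existing = sidecar.get("prompt_variants", [])
    ids = [variant.get("id") for variant in existing]
    # Refuse ambiguous sidecars instead of guessing which entry to update.
    if any(not variant_id for variant_id in ids) or len(set(ids)) != len(ids):
        raise ValueError("existing prompt_variants have missing or duplicate ids")
    # Dicts keep insertion order, so existing variants stay in place.
    merged = {variant["id"]: variant for variant in existing}
    for variant in reviewed_variants:
        merged[variant["id"]] = variant
    # Unrelated sidecar fields are copied through untouched.
    return {**sidecar, "prompt_variants": list(merged.values())}
```

Applying the same draft twice yields the same sidecar, which is the property
the roundtrip check relies on.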

After applying, rebuild the manifest and batch from the same folder as a
roundtrip check. The applied sidecar should rescan with the exact tested prompt
text, cue axes, seed metadata, visual evidence, and score evidence; the next
`--print-batch` output should regenerate the same tested prompt variant by id.
For matrix-proven sidecar variants, stable `matrix_evidence` stays attached to
the normal batch probe and the result sheet built from that batch.

For generator-style single-variant selection, use the cue seed selector rather
than relying on sidecar order:

```bash
python tools/krea2_atlas_refine_manifest.py --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine --subject-id atlas_refine_same_woman_001 --print-seed-selection --variant-key pov_footjob_frontal_sole_stroke --selection-seed 202 --seed-slot atlas_cue_seed
```

The selector is deterministic for the same seed and only chooses sidecar
variants with seedable visual evidence. Eligible variants are sorted by
`prompt_variant_id` before the seed is applied, so reordering a sidecar JSON file
does not change what a cue seed means. Unscored or rejected variants are listed
as ineligible instead of entering the seed pool. If a selected sidecar variant
has stable `matrix_evidence`, the selector keeps that matrix proof attached to
the selected candidate. Variants with no matrix evidence still use single-image
promotion evidence, but variants with explicit unstable matrix evidence are
ineligible until retested or corrected. Downstream reuse does not trust
`stable: true` by itself: malformed stable matrix evidence, such as duplicated
declared sampler seeds, non-matching matrix jobs, or cue-seed metadata that no
longer matches the matrix selection seed, is treated as unstable and kept out of
the seed pool.
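
The order-independence property can be illustrated with a short sketch. The
modulo indexing, the `status` field, and the `select_variant` name are
assumptions; the text above only states that eligible variants are sorted by
`prompt_variant_id` before the seed is applied.

```python
def select_variant(variants, selection_seed, seed_slot="atlas_cue_seed"):
    """Pick one eligible variant deterministically for a cue seed."""
    # The render seed must never double as a cue-selection slot.
    if seed_slot == "sampler_seed":
        raise ValueError("sampler_seed is the render seed, not a cue seed slot")
    # Only variants with seedable evidence enter the pool (assumed field name).
    eligible = sorted(
        (v for v in variants if v.get("status") == "seedable_candidate"),
        key=lambda v: v["prompt_variant_id"],
    )
    if not eligible:
        return None
    # Assumption: the seed indexes the sorted pool by modulo.
    return eligible[selection_seed % len(eligible)]
```

Because the pool is sorted first, shuffling the input list leaves the result
for a given seed unchanged.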

To render the selected alternate frame through the normal MCP batch helper, use
the selected-batch exporter. It emits the baseline and selected candidate only:

```bash
python tools/krea2_atlas_refine_manifest.py --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine --subject-id atlas_refine_same_woman_001 --print-seed-selected-batch --variant-key pov_footjob_frontal_sole_stroke --selection-seed 202 --sampler-seed 101 --seed-slot atlas_cue_seed
```

Then validate/run it with `tools/sxcp_prompt_batch.py` as usual. The candidate
probe carries the selected cue seed in the requested seed slot plus the sidecar
evidence that justified the variant. Probe `seed_metadata.sampler_seed` is the
actual sampler seed for that render job; cue, micro-position, generator, and
workspace seed slots remain prompt-variant provenance. Do not use
`sampler_seed` as `--seed-slot`; the tooling rejects it so the cue seed cannot
overwrite the render seed. Matrix-proven variants also keep their full stable
`matrix_evidence` on the selected probe.

For repeatability checks across several sampler seeds and cue seeds, print a
seed matrix:

```bash
python tools/krea2_atlas_refine_manifest.py --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine --subject-id atlas_refine_same_woman_001 --print-seed-matrix --variant-key pov_footjob_frontal_sole_stroke --selection-seeds 202,203 --sampler-seeds 101,102 --seed-slot atlas_cue_seed
```

The matrix is read-only and sampler-major. Each job embeds a normal
seed-selected prompt batch with the baseline and selected candidate only, plus
the selected prompt variant id, exact candidate prompt text, cue seed, sampler
seed, cue axes, and evidence provenance. Use it to queue controlled repeats
where sampler seed changes image stochasticity and `atlas_cue_seed` changes the
selected alternate frame. Sampler seed lists and cue seed lists must contain
distinct values; duplicate seeds are rejected because they inflate apparent
repeatability without adding new evidence. A matrix with only one unique sampler
seed can still be inspected, but it cannot become stable sidecar evidence; stable
proof requires at least two unique sampler seeds before manual hard-pose
thresholds are considered.
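
As a rough sketch of the job layout, assuming "sampler-major" means the sampler
seed is the outer loop and the cue seed the inner loop. The `build_seed_matrix`
name and the `job_id` format are hypothetical.

```python
from itertools import product


def build_seed_matrix(sampler_seeds, selection_seeds):
    """Enumerate one job per sampler-seed / cue-seed pair, sampler-major."""
    for name, seeds in (
        ("sampler_seeds", sampler_seeds),
        ("selection_seeds", selection_seeds),
    ):
        # Duplicates would inflate repeatability without new evidence.
        if len(set(seeds)) != len(seeds):
            raise ValueError(f"duplicate values in {name}")
    return [
        {
            "job_id": f"sampler{sampler}_cue{cue}",  # hypothetical id format
            "sampler_seed": sampler,
            "selection_seed": cue,
        }
        for sampler, cue in product(sampler_seeds, selection_seeds)
    ]
```

With `--sampler-seeds 101,102` and `--selection-seeds 202,203` this layout
yields four jobs, with both cue seeds listed under sampler seed 101 before any
job for 102.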

After matrix jobs return images, create one matrix result sheet instead of
manually merging per-job notes:

```bash
python tools/krea2_atlas_refine_manifest.py --print-seed-matrix-result-sheet --seed-matrix-json /tmp/sxcp-seed-matrix.json --seed-matrix-results-json /tmp/sxcp-seed-matrix-results.json --notes "matrix scoring pending"
```

The matrix result sheet matches returned results by matrix job id, then reuses
the normal result-sheet format inside each job. It preserves sampler seed, cue
seed, selected prompt variant id, exact candidate prompt text, returned image
path, and empty score slots for manual atlas scoring. Missing or extra job ids
are errors because they break matrix comparability. Duplicate matrix job ids are
also rejected before result matching, since one returned image must not stand in
for two sampler/cue slots.
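
The matching rules can be sketched as follows. The record shapes (`job_id`,
`image`, `scores`) and the `match_matrix_results` name are assumptions for
illustration.

```python
from collections import Counter


def match_matrix_results(matrix_jobs, results):
    """Attach returned images to matrix jobs, failing on any id mismatch."""
    for label, rows in (("matrix", matrix_jobs), ("results", results)):
        counts = Counter(row["job_id"] for row in rows)
        duplicates = sorted(job_id for job_id, n in counts.items() if n > 1)
        if duplicates:
            raise ValueError(f"duplicate {label} job ids: {duplicates}")
    expected = {job["job_id"] for job in matrix_jobs}
    returned = {row["job_id"]: row for row in results}
    missing = sorted(expected - set(returned))
    extra = sorted(set(returned) - expected)
    if missing or extra:
        raise ValueError(f"missing job ids: {missing}; extra job ids: {extra}")
    # Keep matrix order and leave score slots empty for manual scoring.
    return [
        {**job, "image": returned[job["job_id"]]["image"], "scores": {}}
        for job in matrix_jobs
    ]
```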

After manually scoring the matrix result sheet, print the matrix promotion
report:

```bash
python tools/krea2_atlas_refine_manifest.py --print-seed-matrix-promotion-report --seed-matrix-result-sheet-json /tmp/sxcp-seed-matrix-result-sheet-scored.json
```

The promotion report applies the same preservation gates as a single result
sheet, then groups jobs by selected prompt variant and cue seed. A group is
stable only when every declared sampler seed in that group is present, at least
two unique sampler seeds are covered, and every covered job is a
`seedable_candidate`. Unstable groups keep their blockers:

- failed jobs keep their own blockers, such as subject identity or workspace
  continuity failures, attached to the group;
- omitted sampler seeds add `missing_sampler_coverage`;
- one-sampler groups add `insufficient_sampler_coverage`.

Promotion also enforces these identity rules before grouping stable evidence:

- The declared `sampler_seeds` and `selection_seeds` lists must not contain
  duplicates, because repeated declarations can inflate apparent coverage
  without adding a new render or cue seed.
- If `selection_seeds` is present, every job's `selection_seed` must be in that
  declared cue-seed set.
- If `sampler_seeds` is present, every job's `sampler_seed` must be in that
  declared render-seed set, and a selected-variant / cue-seed group may contain
  each sampler seed only once.
- Each matrix result-sheet job must have a unique id, so stale or hand-edited
  sheets cannot inflate stable evidence by duplicating a job.
- The selected prompt variant id recorded on the matrix job must match the
  scored candidate prompt variant id; a mismatch means the sheet no longer
  proves the selected alternate.
- Every job's `seed_slot` must match the matrix result sheet's `seed_slot`, so
  atlas-cue evidence cannot be mixed with workspace, generator, or
  micro-position evidence in one stable group.
- Jobs in the same selected-variant/cue-seed group must keep the same
  `variant_key`, `source_entry_id`, and `source_stem`, so evidence from another
  pose family or atlas artifact cannot be folded into a stable sidecar
  candidate.
- Those jobs must also keep the same exact candidate prompt text across sampler
  jobs; the promotion report records a prompt-text hash for the group, and
  prompt-text drift under the same variant id is rejected because it no longer
  proves one repeatable wording.

Treat stable groups as repeatability evidence for a sidecar/catalog cue; treat
unstable groups as wording, coverage, or control work before promotion.
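
The core stability rule can be sketched as a grouping pass. This is a
simplified illustration under assumed field names (`prompt_variant_id`,
`status`, `blockers`, `prompt_text`); SHA-256 as the prompt-text hash and the
`prompt_text_drift` and `not_seedable` blocker names are hypothetical, and the
other identity checks described above are left out.

```python
import hashlib
from collections import defaultdict


def summarize_groups(jobs, declared_sampler_seeds):
    """Group scored jobs by (variant id, cue seed) and report stability."""
    groups = defaultdict(list)
    for job in jobs:
        groups[(job["prompt_variant_id"], job["selection_seed"])].append(job)

    report = {}
    for key, members in groups.items():
        blockers = []
        covered = {job["sampler_seed"] for job in members}
        if set(declared_sampler_seeds) - covered:
            blockers.append("missing_sampler_coverage")
        if len(covered) < 2:
            blockers.append("insufficient_sampler_coverage")
        for job in members:
            if job["status"] != "seedable_candidate":
                # Failed jobs keep their own blockers attached to the group.
                blockers.extend(job.get("blockers", ["not_seedable"]))
        text_hashes = sorted(
            {
                hashlib.sha256(job["prompt_text"].encode("utf-8")).hexdigest()
                for job in members
            }
        )
        if len(text_hashes) > 1:
            blockers.append("prompt_text_drift")  # hypothetical blocker name
        report[key] = {
            "stable": not blockers,
            "blockers": blockers,
            "prompt_text_hashes": text_hashes,
        }
    return report
```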

For stable groups, print a matrix sidecar update draft before editing sidecars:

```bash
python tools/krea2_atlas_refine_manifest.py --print-matrix-sidecar-update-draft --seed-matrix-promotion-report-json /tmp/sxcp-seed-matrix-promotion-report.json
```

The draft emits only stable groups. It preserves the same-stem sidecar filename,
the exact selected prompt variant text, prompt source provenance, representative
single-image evidence for compatibility, and `matrix_evidence` containing all
passing sampler seeds, returned image paths, visual scores, cue seed, and job
ids. Stable groups fail closed if any listed job id is missing from the promotion
report, so a hand-edited report cannot write sidecar evidence for jobs it does
not carry. A stable group may not repeat a `job_id`, because one returned image
must not count as multiple matrix samples. Each referenced job must still belong
to the stable group's identity: same selected prompt variant, cue seed, seed
slot, pose `variant_key`, `source_entry_id`, `source_stem`, and exact candidate
prompt text. Stable groups also fail closed when their declared `sampler_seeds`
do not match the sampler seeds on their listed `job_ids`, so a draft cannot
inflate repeatability evidence by claiming an unreferenced render seed. Their
`job_count`, `promotion_ready_count`, and `blocked_count` must also match the
referenced jobs; emitted matrix evidence derives these counts from `job_ids`
instead of trusting editable group summaries. Unstable cue groups are listed as
skipped with their blockers and must not be copied into sidecars.
Even if a hand-edited promotion report marks a group stable, matrix sidecar draft
generation rejects groups whose referenced `job_ids` cover fewer than two unique
sampler seeds.

Validate a matrix draft with the matrix-specific gate before applying it:

```bash
python tools/krea2_atlas_refine_manifest.py --validate-matrix-sidecar-update-draft --matrix-sidecar-update-draft-json /tmp/sxcp-matrix-sidecar-update-draft.json
```

This gate rejects unstable matrix evidence, failed per-sampler visual scores,
missing or insufficient sampler coverage, forbidden negative-conditioning fields,
and mismatched cue seed metadata: the prompt variant's
`seed_metadata[matrix_evidence.seed_slot]` must match
`matrix_evidence.selection_seed`. It also rejects duplicated matrix evidence job
ids or sampler seeds, duplicated declared matrix sampler seeds, stable evidence
with fewer than two unique sampler seeds, and matrix evidence rows whose `turn`
is missing or not an integer. Finally, it checks the representative single-image
evidence used for compatibility with the normal seed selector: that evidence
must carry an integer `evidence.turn` and must match the `matrix_evidence.jobs`
row for its `evidence.seed`, including image path, turn, and visual score.
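
A few of these checks, sketched against an assumed draft shape. The
`seed_metadata`, `matrix_evidence.seed_slot`, `matrix_evidence.selection_seed`,
`matrix_evidence.jobs`, `evidence.seed`, and `evidence.turn` names come from
the text above; the per-row keys `job_id`, `sampler_seed`, `image`, and
`visual_score`, and the assumption that `evidence.seed` is a sampler seed, are
illustrative guesses.

```python
def _is_int(value):
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_matrix_variant(variant):
    """Return a list of validation errors for one matrix draft variant."""
    errors = []
    matrix = variant["matrix_evidence"]
    slot = matrix["seed_slot"]
    if variant.get("seed_metadata", {}).get(slot) != matrix["selection_seed"]:
        errors.append("cue seed metadata does not match selection_seed")
    if matrix.get("stable") is not True:
        errors.append("unstable matrix evidence")
    jobs = matrix["jobs"]
    job_ids = [job["job_id"] for job in jobs]
    seeds = [job["sampler_seed"] for job in jobs]
    if len(set(job_ids)) != len(job_ids):
        errors.append("duplicate matrix evidence job ids")
    if len(set(seeds)) != len(seeds):
        errors.append("duplicate matrix evidence sampler seeds")
    if len(set(seeds)) < 2:
        errors.append("fewer than two unique sampler seeds")
    for job in jobs:
        if not _is_int(job.get("turn")):
            errors.append(f"{job['job_id']}: turn is missing or not an integer")
    evidence = variant["evidence"]
    if not _is_int(evidence.get("turn")):
        errors.append("evidence.turn is missing or not an integer")
    # The representative evidence must mirror its matrix row exactly.
    row = next((j for j in jobs if j["sampler_seed"] == evidence.get("seed")), None)
    fields = ("image", "turn", "visual_score")
    if row is None or any(row.get(f) != evidence.get(f) for f in fields):
        errors.append("representative evidence does not match its matrix row")
    return errors
```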

Only after validation passes, apply the matrix draft to the atlas-refine folder:

```bash
python tools/krea2_atlas_refine_manifest.py --apply-matrix-sidecar-update-draft --matrix-sidecar-update-draft-json /tmp/sxcp-matrix-sidecar-update-draft.json --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine
```

Matrix apply is validation-first and idempotent. It upserts prompt variants by
id, preserves unrelated sidecar fields, and keeps the full `matrix_evidence` so
later rescans still know which cue seed and sampler seeds proved the alternate
frame repeatable.

When a selected batch returns images, convert it with `--print-result-sheet`
like any other batch. The result sheet preserves the seed-selection report at
the sheet level and the selected prompt variant id on the candidate probe, so
manual visual scoring remains tied to the exact cue seed and sidecar alternate.
If the selected probe carried stable `matrix_evidence`, the result sheet keeps
that matrix proof beside the new image path and empty score slots.

For the generator node path, `atlas_cue_seed` on `SxCP Krea2 Pose Variant` and
the family-specific `SxCP Krea2 POV ... Filter` nodes selects among explicit
catalog `prompt_variant_cues` for the selected atlas variant. This is not the
sampling seed and it does not invent prompt wording. Use `-1` when the broader
generator pose seed should continue choosing cue-set alternates. Use a fixed
`atlas_cue_seed` when testing the same catalog alternate across subjects,
locations, or sampler seeds. The selected index is stored in
`krea2_prompt_variant_indices`, preserved through row building, and shown in the
node summary as `cue_indices=variant:index`.

To bridge scored sidecar alternates back toward the generator catalog, preserve
`append_cues` provenance through the refine loop. A full prompt win can remain a
sidecar `text` candidate, but only an explicit append-cue delta should become a
reviewable catalog `prompt_variant_cues` candidate. Print that review draft with:

```bash
python tools/krea2_atlas_refine_manifest.py --folder /media/unraid/comfyui/output/CodexMCP-Atlas-Refine --subject-id atlas_refine_same_woman_001 --print-catalog-cue-draft --variant-key pov_footjob_frontal_sole_stroke
```

The catalog cue draft is read-only. It skips unscored, rejected, or
exact-text-only sidecar variants and emits only seedable append-cue candidates
with visual evidence, cue axes, seed metadata, and the exact tested prompt hash.
Stable matrix evidence is preserved on catalog candidates; a variant with
explicit unstable or malformed stable matrix evidence is skipped and listed with
an `unstable_matrix_evidence` blocker. Review the draft manually before editing
`categories/krea2_pov_pose_variants.json`; do not infer catalog cue wording from
a whole prompt diff.

Use the manifest entries as baseline frames, not as proven generator fixes. For
each variant, score the current image/prompt against:

- atlas pose and contact ownership;
- same-subject identity preservation;
- workspace lounge consistency and surface relationship;
- clothing visibility and subject ownership;
- face/eye/expression retention when the face is visible;
- anatomy/proportion sanity;
- prompt noise, duplicate cues, and ambiguous ownership.

Only add seedable cue alternates after the baseline frame is understood. Store
alternates by axis, such as contact depth, hand position, foot position, body
angle, camera height, workspace surface, clothing visibility, expression/eye
detail, and anatomy shape detail. A seed change should feel like selecting
another frame from the same scene rather than random prompt drift.

## MCP Command Memory

Use the checked helper instead of ad hoc Python snippets for bridge calls. The
approved command prefix is:

```bash
/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py
```

Common calls:

```bash
/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py list-tools
/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py call-tool comfy_pull --arguments-json '{"channel":"sxcp_eval_in"}'
/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py call-tool comfy_push --arguments-json '{"channel":"sxcp_eval_out","seed":5656565656,"text":"PROMPT_ONLY_POSITIVE_CONDITIONING"}'
```

For batched prompt-axis search, prepare a JSON batch and use the offline command
renderer before touching the bridge manually:

```bash
python tools/sxcp_prompt_batch.py validate --batch-json /tmp/sxcp-batch.json
python tools/sxcp_prompt_batch.py print-push-commands --batch-json /tmp/sxcp-batch.json
python tools/sxcp_prompt_batch.py print-result-template --batch-json /tmp/sxcp-batch.json
python tools/sxcp_prompt_batch.py run-batch --batch-json /tmp/sxcp-batch.json --result-json /tmp/sxcp-results.json --previous-turn 80 --run
python tools/sxcp_prompt_batch.py validate-results --batch-json /tmp/sxcp-batch.json --result-json /tmp/sxcp-results.json
python tools/sxcp_prompt_batch.py print-eval-entry-draft --batch-json /tmp/sxcp-batch.json --result-json /tmp/sxcp-results.json --variant-key pov_example_variant --baseline-image /absolute/baseline.png --candidate-id controlled_subject_first
```

Use `run-batch --run` for normal batch execution. It pushes one positive prompt,
polls `sxcp_eval_in` until the turn advances and an absolute PNG appears with
the fixed sampler seed, writes the filled result JSON, then sends the next
prompt. Omit `--run` for a dry-run command preview. Run `validate-results` after
the batch and before drafting evidence. It checks that every probe returned a
new ordered turn, an absolute PNG image path, and the same sampler seed as the
batch. This keeps batched prompt search as image-presence collection first and
bulk analysis second. The batch helper validates without stripping atlas
metadata such as cue axes, seed metadata, selection data, evidence, and stable
`matrix_evidence`.
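
The three `validate-results` checks can be sketched as below. This is not the
helper's code: the per-probe keys (`id`, `turn`, `image`, `seed`) and the
`check_results` name are assumptions about the result JSON shape.

```python
import posixpath


def check_results(results, sampler_seed, previous_turn):
    """Return errors for probes that break turn order, image, or seed rules."""
    errors = []
    last_turn = previous_turn
    for probe in results:
        turn = probe.get("turn")
        # Each probe must land on a strictly newer turn than the one before.
        if not isinstance(turn, int) or turn <= last_turn:
            errors.append(f"{probe['id']}: turn did not advance in order")
        else:
            last_turn = turn
        image = probe.get("image") or ""
        if not (posixpath.isabs(image) and image.lower().endswith(".png")):
            errors.append(f"{probe['id']}: image is not an absolute PNG path")
        if probe.get("seed") != sampler_seed:
            errors.append(f"{probe['id']}: sampler seed differs from the batch")
    return errors
```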

Before drafting evidence, compare atlas references and generated images for
spatial orientation, not only limb/contact similarity. First decide the
atlas's surface and camera-height relationship, then check whether the
generated background supports the same read. Use the background as a
camera-height witness: ceiling, upper walls, and high partition lines usually
support a low viewer looking upward; floor, carpet, table tops, platform edges,
or furniture behind the body can reveal a higher camera, seated support, or a
different surface. If the atlas target has the viewer flat on his back or the
partner mounted over him, do not accept a candidate only because contact is
clear; the room geometry must also support that flat/low read. Reject the
candidate before generator mirroring when the background says the bodies are on
a different surface or at a different height than the atlas.

`print-eval-entry-draft` rejects `geometry_only` candidates by default. Use
`--allow-geometry-only` only when the entry is explicitly labeled as
non-controlled prompt-axis evidence rather than subject/look-controlled A/B
evidence.

Keep `sxcp_eval_out` prompt-only and positive-only. Do not use
`sxcp_eval_negative_out` for Krea2 tuning.

## Generator-Patch Evidence Matrix

Do prompt and image exploration before editing production generator wording. A
normal pose-wording generator patch needs all of this evidence first:

- at least three distinct source cases with different visible subjects;
- at least two sampler seeds, unless the source prompt is structurally wrong
  before rendering;
- location-family coverage when the proposed wording changes scene anchors;
- one baseline and one candidate per source case, with subject, location family,
  camera family, and sampler seed fixed inside each pair;
- positive-only candidate prompts, with no negative-conditioning phrases in the
  positive prompt.

A generated-route probe that works before the full matrix is useful evidence.
If it is the best tested improvement when leaving the category, it can become a
`provisional_generator_patch` with final prompt regression coverage. It should
not become a proven `generator_patch` decision until the matrix repeats and the
final generated prompt is regression-tested.
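
A hedged sketch of a readiness check over collected baseline/candidate pairs.
The pair fields (`subject_id`, `sampler_seed`, `location_family`,
`positive_only`) and the `evidence_gaps` name are hypothetical, and the
two-family threshold for location coverage is an assumption, since the list
above does not give a number.

```python
def evidence_gaps(pairs, changes_scene_anchors=False, structurally_wrong=False):
    """List which evidence-matrix requirements are still unmet."""
    gaps = []
    if len({pair["subject_id"] for pair in pairs}) < 3:
        gaps.append("need at least three distinct visible subjects")
    # The sampler-seed rule is waived for structurally wrong source prompts.
    if not structurally_wrong:
        if len({pair["sampler_seed"] for pair in pairs}) < 2:
            gaps.append("need at least two sampler seeds")
    # Assumed threshold: two location families when scene anchors change.
    if changes_scene_anchors:
        if len({pair["location_family"] for pair in pairs}) < 2:
            gaps.append("need location-family coverage")
    if not all(pair["positive_only"] for pair in pairs):
        gaps.append("candidate prompts must be positive-only")
    return gaps
```

An empty list means the matrix requirements in this sketch are met; it does not
replace regression-testing the final generated prompt.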

## Hard-Pose Exploration Budget

Use this budget for atlas poses where early prompt-only results repeatedly miss
the core spatial read.

- Define the failure threshold before the run. The default threshold is about
  fifty positive-only prompt tries across distinct wording axes before declaring
  the pose text-insufficient or moving it to a stronger-control bucket.
- Run the search in batches, usually six to twelve prompts at a time. Send each
  prompt through `sxcp_eval_out`, wait for the image path, then analyze the
  batch together instead of overreacting to one render.
- Keep a short axis ledger for each batch: intended wording axis, seed, source
  subject, best image, repeated failure mode, and words that literalized or
  harmed the result.
- Treat a small failed batch as direction, not a conclusion. If a batch shows a
  repeated failure such as head height, camera height, viewer/partner elevation,
  or background-plane mismatch, the next batch should vary that axis directly.
- Stop early only for a strong positive result that is worth repeating on a
  second source or seed, or for a hard technical blocker. A weak but improving
  result should feed the next wording batch rather than ending the category.
- If the threshold run finds a repeatable partial that is materially better
  than baseline, accept the partial target explicitly and mirror only that
  generator-safe improvement. Keep the route candidate and mark the evidence as
  needing expansion when the full atlas target is still unsolved.
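
One way to keep the axis ledger and track the budget, sketched with
hypothetical names (`BatchLedgerEntry`, `budget_status`). The default threshold
of fifty follows the first bullet above; everything else about the record shape
is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class BatchLedgerEntry:
    """One ledger row per batch, mirroring the fields listed above."""
    wording_axis: str
    sampler_seed: int
    source_subject: str
    prompt_count: int
    best_image: str = ""
    repeated_failure: str = ""
    harmful_words: list = field(default_factory=list)


def budget_status(entries, threshold=50):
    """Summarize how much of the exploration budget has been spent."""
    used = sum(entry.prompt_count for entry in entries)
    return {
        "used": used,
        "remaining": max(threshold - used, 0),
        "exhausted": used >= threshold,
        # Repeated failure modes point at the axis the next batch should vary.
        "failure_modes": sorted(
            {e.repeated_failure for e in entries if e.repeated_failure}
        ),
    }
```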

## Current Fingering Test Pattern

The prior bedding-based fingering prompt is invalid as a general rule because
it solved a lower-foreground artifact by adding bedroom context to an office
scene. The corrected test pattern keeps the coworking location intact:

- baseline: generic POV fingering/manual-contact wording from the same source
  case;
- candidate: foreground hand first, open-thigh geometry second, visible woman
  face/torso third, office chair and coworking depth fourth;
- anchors: black office chair seat/arms, desk edge, laptop table corners, glass
  partitions, repeated desk rows, plants, tall-window depth;
- rejection trigger: any result that fixes contact by changing the scene family
  instead of improving the pose hierarchy.

## Improvement Log

- `2026-07-01`: Added large image-only atlas folders to the cue-expansion
  method after inspecting the canonical
  `/media/unraid/davinci/Qwen_edit_lora/POV/dataset_v2/blowjob_top_view`
  folder and the 27-image supplemental raw pool at
  `/media/unraid/davinci/Qwen_edit_lora/POV/dataset_v2/1.original/blowjob_top_view_1024`.
  The curated folder remains the preferred `reference_images` source when a
  matching frame exists; the supplemental pool defines extra allowed micro-axes
  before sidecar authoring. Cue wording still needs fixed-seed generated
  evidence before sidecar, catalog, or generator promotion.
- `2026-07-01`: Added atlas-22 image-to-prompt calibration for
  `pov_blowjob_top_down_vertical_shaft` after manual renders
  `/media/unraid/comfyui/output/sxcp_accumulator/bwave_2/img_00100_.png` and
  `/media/unraid/comfyui/output/sxcp_accumulator/bwave_2/img_00101_.png`
  produced the strongest verticality so far. The retained rule is not “remove
  background”; it is floor/support-plane scene translation plus positive
  upper-body-stack hierarchy. Viewer abdomen/thigh cues should remain lower-edge
  anchors, while face, hair crown, shoulders, upper chest or neckline, and hand
  carry the partner geometry. Keep phrases like `hips and ass stay visually
  secondary` as human scoring notes, not final positive prompt text.
- `2026-07-01`: Corrected the top-view oral anchor after manual sidecar renders
  `/media/unraid/comfyui/output/sxcp_accumulator/bwave_2/img_00135_.png`,
  `/media/unraid/comfyui/output/sxcp_accumulator/bwave_2/img_00136_.png`,
  `/media/unraid/comfyui/output/sxcp_accumulator/bwave_2/img_00137_.png`, and
  `/media/unraid/comfyui/output/sxcp_accumulator/bwave_2/img_00139_.png`.
  The strongest same-seed verticality came from making the centered shaft and
  mouth contact the primary axis, then stacking the woman's face, hair crown,
  shoulders, upper chest, and hand around it. Abdomen/thigh/feet wording belongs
  after that as lower-frame evidence. Sparse floor-plane wording remains useful
  because deep coworking-room tails fight the overhead angle.
- `2026-07-01`: Added exact-text reference cue candidates for order-sensitive
  atlas tests. The reference cue review path can now carry
  `prompt_variant_template.text` through candidate draft, sidecar authoring, and
  prompt-batch export as `prompt_source.kind = text`. Use this for
  shaft-first/top-view oral calibration and other cases where append-cues would
  leave the older conflicting baseline hierarchy in front of the tested wording.
- `2026-07-01`: Added explicit sidecar prompt-variant batches for
  same-subject atlas refine decks. `krea2_atlas_refine_manifest.py` now keeps
  sidecar `prompt_variants` and can print an `sxcp_prompt_batch`-compatible
  positive-only probe batch for one catalog `variant_key`. Cue text must come
  from the sidecar as exact `text` or `append_cues`; the batch builder may
  combine and preserve seed/cue metadata, but it must not create new pose
  wording by itself.
- `2026-07-01`: Added atlas result-sheet generation after prompt batches return
  images through the MCP loop. The sheet keeps batch/result order, sampler seed,
  prompt text, image paths, cue axes, and unfilled score slots together so
  visual analysis can be written against the exact generated artifacts before
  sidecar promotion or generator patches.
- `2026-07-01`: Added conservative promotion reports for scored atlas result
  sheets. Reports recover the sidecar prompt variant id, keep cue/seed metadata,
  and classify candidates as `seedable_candidate`, `needs_visual_score`, or
  `rejected` using preservation gates for pose ownership, workspace continuity,
  clothing visibility, subject identity, and prompt noise. The report does not
  auto-edit sidecars or generator wording.
- `2026-07-01`: Added sidecar update drafts for seedable atlas candidates. The
  draft emits reviewable `prompt_variants` grouped by original same-stem sidecar
  filename, uses only exact tested prompt text, and carries cue axes, seed
  metadata, image evidence, and visual scores forward. It deliberately skips
  rejected or unscored candidates and does not write sidecar files.
- `2026-07-01`: Added sidecar update draft validation before any sidecar edit.
  The validator rejects drafts with missing cue-axis movement, missing image
  evidence, failed preservation scores, duplicate prompt-variant ids, or
  forbidden negative-conditioning fields, keeping sidecar promotion tied to
  exact scored artifacts rather than hand-cleaned prompt text.
- `2026-07-01`: Added validation-first sidecar draft apply. The apply command
  writes reviewed `prompt_variants` into same-stem sidecar JSON files, upserts
  by variant id so repeated applies do not duplicate variants, and preserves
  unrelated sidecar metadata.
- `2026-07-01`: Added applied-sidecar roundtrip evidence preservation. Manifest
  scanning now keeps prompt-variant `evidence` from sidecars and generated
  prompt batches carry that evidence forward, so a promoted seedable cue remains
  tied to the exact tested prompt, image path, and visual score evidence after
  it is written to the sidecar.
- `2026-07-01`: Added deterministic atlas cue-seed selection for applied
  sidecar variants. `--print-seed-selection` chooses a stable prompt variant for
  a seed slot such as `atlas_cue_seed`, but only from variants whose evidence
  passes the seedable-candidate preservation gates. Unproven sidecar variants
  are reported as ineligible so seed selection does not silently use weak cues.
- `2026-07-01`: Cue-seed selection now sorts eligible candidates by
  `prompt_variant_id` before indexing by seed. Reordering sidecar JSON no longer
  changes which alternate frame a given cue seed selects.
- `2026-07-01`: Manifest ingestion now rejects duplicate sidecar
  `prompt_variants[].id` values before selection or batching, keeping cue-seed
  identity and sidecar upserts unambiguous.
- `2026-07-01`: Manifest ingestion now also rejects sidecar variants whose
  explicit `prompt_source.prompt_variant_id` does not match the enclosing
  variant `id`, so provenance cannot point at a different cue.
- `2026-07-01`: Sidecar update validation now enforces the same
  `prompt_source.prompt_variant_id` identity rule for normal and matrix drafts
  before apply writes durable sidecar state.
- `2026-07-01`: Sidecar apply now also rejects ambiguous existing
  `prompt_variants` lists before upsert, so apply cannot silently preserve or
  rewrite duplicate prompt-variant ids.
- `2026-07-01`: Added seed-selected prompt batch export. The exporter turns a
  deterministic cue-seed selection into an `sxcp_prompt_batch`-compatible JSON
  with baseline plus the selected candidate only, preserving exact prompt text,
  selected seed-slot metadata, and evidence provenance for MCP evaluation.
- `2026-07-01`: Prompt batches now stamp the actual render sampler seed into
  every probe's `seed_metadata.sampler_seed`, including matrix jobs with sampler
  overrides, while preserving the other seed slots as cue/provenance metadata.
- `2026-07-01`: Cue-selection seed slots now explicitly reject `sampler_seed`
  in seed selection and matrix sidecar validation. The sampler seed remains the
  render seed; cue, generator, micro-position, and workspace slots carry prompt
  alternate provenance.
- `2026-07-01`: Added seed-matrix export for atlas alternates. The matrix builds
  normal seed-selected batches for every sampler-seed / cue-seed pair, keeping
  sampler stochasticity and `atlas_cue_seed` selection separate while preserving
  exact selected prompt text and visual evidence in each job.
- `2026-07-01`: Added seed-matrix result sheets. Completed matrix jobs can now
  be converted into one scoring artifact that preserves each job id, sampler
  seed, cue seed, selected variant, exact prompt text, returned image path, and
  empty score slots, while rejecting duplicate, missing, or extra matrix result
  ids.
- `2026-07-01`: Added seed-matrix promotion reports. Scored matrix jobs are
  aggregated with the same preservation gates as single-result promotion and
  grouped by selected prompt variant plus cue seed, marking groups stable only
  when every sampler seed passes. Promotion now rejects duplicate or missing
  result-sheet job ids, duplicate declared sampler or cue seeds, jobs outside
  declared sampler/cue seed sets, duplicate sampler jobs inside one cue group,
  selected/candidate prompt-variant id mismatches, job-level seed-slot drift, and
  variant/source-entry/stem drift before grouping stable evidence.
- `2026-07-01`: Added matrix sidecar update drafts for stable cue groups.
  `--print-matrix-sidecar-update-draft` skips unstable groups and emits reviewed
  sidecar prompt-variant updates with representative evidence plus full
  `matrix_evidence` across passing sampler seeds. Stable groups now fail closed
  if any referenced promotion-report job id is missing.
- `2026-07-01`: Added matrix sidecar validation and apply. Stable matrix drafts
  now have `--validate-matrix-sidecar-update-draft` and
  `--apply-matrix-sidecar-update-draft`, preserving full sampler/cue evidence
  through idempotent sidecar upserts instead of relying on manual copy-paste.
- `2026-07-01`: Preserved seed-selection metadata through selected-batch result
  sheets. After MCP returns images for a selected batch, the result sheet keeps
  the sheet-level selection report and candidate-level selected prompt variant
  id, preventing later visual scores from losing the cue seed that produced the
  frame.
- `2026-07-01`: Propagated stable matrix evidence through seed selection,
  selected-batch export, and selected-batch result sheets, so a matrix-proven
  sidecar alternate keeps its cue seed, sampler seeds, job ids, image paths, and
  score evidence attached during later single-frame retests.
- `2026-07-01`: Added a seed-selection gate for explicit unstable matrix
  evidence. Legacy single-image variants remain selectable, but a variant that
  carries `matrix_evidence` must have `stable: true` before it can enter the cue
  seed pool.
- `2026-07-01`: Extended the same matrix-evidence gate to catalog cue drafts and
  coverage. Stable matrix proof is preserved on generator-catalog cue candidates;
  explicit unstable matrix evidence blocks catalog-review readiness.
- `2026-07-01`: Preserved stable matrix evidence through normal prompt-batch
  exports and their result sheets, so regular all-variant retests keep the same
  repeatability proof as seed-selected retests.
- `2026-07-01`: Made `tools/sxcp_prompt_batch.py` metadata-preserving when it
  loads batches. Validation and runner paths keep atlas cue axes, seed metadata,
  selection data, evidence, and stable matrix evidence available to downstream
  scoring tools.
- `2026-07-01`: Preserved stable matrix evidence through normal promotion
  reports and sidecar update drafts, preventing regular single-batch retests from
  overwriting a matrix-proven sidecar variant with a metadata-poorer copy.
- `2026-07-01`: Blocked explicit unstable matrix evidence in normal promotion
  reports. A result-sheet probe carrying `matrix_evidence.stable: false` is
  rejected with `unstable_matrix_evidence` and skipped by sidecar update drafts.
- `2026-07-01`: Connected catalog atlas cue seeds to the generator node path.
  `atlas_cue_seed` on Krea2 pose/filter nodes now records deterministic
  `krea2_prompt_variant_indices` for explicit catalog `prompt_variant_cues`,
  and prompt row assembly preserves those indices instead of overwriting them
  with the broader pose seed. This makes catalog cue alternates reproducible as
  same-scene frame changes while keeping sampler seed and cue seed separate.
- `2026-07-01`: Preserved append-cue provenance through atlas-refine promotion
  and added a read-only catalog cue draft. Batch probes, result sheets,
  promotion reports, and applied sidecars now keep `prompt_source`, so
  `--print-catalog-cue-draft` can propose catalog `prompt_variant_cues` only
  from scored seedable append-cue deltas instead of inventing alternates from a
  full prompt.
- `2026-07-01`: Added an atlas-refine coverage report for live decks. The report
  counts baseline-only entries, unscored sidecar variants, seedable candidates,
  rejected variants, and catalog-cue-ready append-cue candidates, making the
  next MCP/scoring action explicit before changing generator/catalog wording.
- `2026-07-01`: Added read-only sidecar scaffolds for baseline-only atlas
  entries. `--print-sidecar-scaffold` emits same-stem sidecar filenames,
  baseline metadata slots, source prompt hashes, and a blank prompt-variant
  template so user-authored cue variants can be added without inventing wording
  or writing files automatically.
- `2026-07-01`: Added a read-only baseline score sheet for same-subject atlas
  decks. `--print-baseline-score-sheet` exports every baseline prompt/image
  pair with score slots and score state, separating fully unscored baselines
  from partially scored ones before sidecar variants or catalog cue alternates
  are promoted.
- `2026-07-01`: Added validation-first baseline score sidecar updates.
  Manually scored baseline sheets can now produce a baseline score update draft,
  validate it, and apply top-level `score`, seed, cue-axis, prompt-hash, and
  analysis-note metadata back into same-stem sidecars without carrying or
  modifying `prompt_variants`. Partial baseline progress is preserved as
  warning-level evidence instead of being promoted as a seedable alternate.
- `2026-07-01`: Added a read-only atlas prompt-noise report. The report scans
  baseline prompts and sidecar prompt-variant text/cues for option-list words,
  meta/policy instructions, leaked POV foreground cue labels, and
  positive-channel negative-conditioning phrases before those prompts become
  fixed-seed evidence.
- `2026-07-01`: Integrated prompt-noise findings into atlas coverage. A known
  entry with noisy baseline or sidecar prompt text now reports
  `needs_prompt_cleanup` before `baseline_only`, `needs_visual_score`, or
  seed-selection states, so noisy prompts cannot silently advance as repeatable
  seed/cue evidence.
- `2026-07-01`: Added a manual prompt-cleanup sheet for atlas prompt-noise
  findings. `--print-prompt-cleanup-sheet` groups issues by editable source
  text, points baseline issues to prompt files and sidecar issues to
  same-stem JSON prompt variants, and leaves `replacement_text` blank so cleanup
  remains human-reviewed rather than generated by the tooling.
- `2026-07-01`: Added validation-first prompt-cleanup apply. Filled cleanup
  sheets can now be validated for nonblank/noise-free manual replacements and
  applied to prompt files or targeted sidecar variant text/cues while preserving
  unrelated sidecar metadata and rejecting drift.
- `2026-07-01`: Added `current_text_sha256` to prompt-cleanup sheet items and
  validation. Manual cleanup artifacts now prove their editable source text was
  not altered inside the sheet before replacement text is applied.
- `2026-07-01`: Added `source_prompt_sha256` to prompt-cleanup sheet items and
  validation. Manual cleanup artifacts now stay tied to the exact atlas baseline
  prompt identity used by batch, sidecar, and promotion evidence.
- `2026-07-01`: Seed matrices now reject duplicate sampler or cue seeds before
  jobs are emitted, so stable matrix evidence cannot be inflated by repeated
  copies of the same generated condition.
- `2026-07-01`: Seed-matrix promotion now requires each stable cue group to cover
  every declared sampler seed. Edited or incomplete matrix result sheets report
  `missing_sampler_coverage` instead of promoting partial evidence.
- `2026-07-01`: Stable matrix evidence now requires at least two unique sampler
  seeds. One-sampler matrices remain inspectable but report
  `insufficient_sampler_coverage`, and sidecar validation/draft generation reject
  hand-edited stable evidence below that repeatability floor.
- `2026-07-01`: Matrix sidecar drafts now verify stable groups' declared
  `sampler_seeds` against their referenced `job_ids`, so hand-edited promotion
  reports cannot write sidecar evidence that claims unreferenced render seeds.
- `2026-07-01`: Matrix sidecar drafts now reject stable groups whose
  `job_count`, `promotion_ready_count`, or `blocked_count` drift from their
  referenced jobs, and emitted matrix evidence uses job-derived counts.
- `2026-07-01`: Matrix sidecar drafts now reject stable groups with duplicated
  `job_ids`, so one returned matrix image cannot be counted as multiple
  repeatability samples.
- `2026-07-01`: Matrix sidecar drafts now reject stable groups whose referenced
  jobs drift from the group's selected prompt variant, cue seed, seed slot, pose
  variant, or source sidecar identity.
- `2026-07-01`: Matrix sidecar validation now rejects duplicated
  `matrix_evidence.jobs` ids and duplicated per-job sampler seeds, so a manually
  edited sidecar draft cannot count one evidence row twice.
- `2026-07-01`: Matrix sidecar validation now rejects duplicated declared
  `matrix_evidence.sampler_seeds`, keeping declared render-seed coverage aligned
  with the unique matrix evidence rows.
- `2026-07-01`: Matrix sidecar validation now rejects cue seed drift between
  `seed_metadata[matrix_evidence.seed_slot]` and
  `matrix_evidence.selection_seed`.
- `2026-07-01`: Matrix sidecar validation now requires representative
  single-image `evidence` to match the `matrix_evidence.jobs` row for its
  sampler seed, keeping normal seed-selector compatibility evidence tied to the
  matrix proof.
- `2026-07-01`: Downstream seed selection and catalog cue drafts now treat
  malformed stable `matrix_evidence` as unstable, so `stable: true` alone cannot
  reintroduce hand-edited matrix proof after sidecar rescan.
- `2026-07-01`: Downstream matrix evidence reuse now also requires
  `seed_metadata[matrix_evidence.seed_slot]` to match
  `matrix_evidence.selection_seed`, so cue-seed metadata drift cannot enter seed
  selection after sidecar rescan.
- `2026-07-01`: Matrix sidecar validation and downstream stable-evidence reuse
  now reject matrix evidence job rows whose `turn` is missing or not an integer.
- `2026-07-01`: Matrix sidecar validation now rejects representative
  single-image `evidence.turn` values that are missing or not integers before
  comparing them to matrix evidence rows.
- `2026-07-01`: Extended prompt-noise detection to exact repeated direct
  phrases. Duplicate pose clauses now surface as `duplicate_phrase` cleanup
  issues before they can be scored, promoted, or used for seed selection.
- `2026-07-01`: Added a promotion-time prompt-noise gate. Result-sheet
  candidates carrying option/meta/negative/duplicate prompt noise are rejected
  with `prompt_noise_issue` even when manual visual scores mark prompt noise as
  pass.
- `2026-07-01`: Added same-subject atlas refine deck ingestion after
  `/media/unraid/comfyui/output/CodexMCP-Atlas-Refine` was prepared with one
  prompt/image pair per atlas variant for a stable subject. Future seed/cue
  tuning should first build a manifest with
  `tools/krea2_atlas_refine_manifest.py`, confirm every prompt/image pair maps
  to a catalog `variant_key`, and use the manifest's seed slots to distinguish
  sampler, generator, atlas-cue, micro-position, and workspace-surface changes.
  This makes cue seeds behave like alternate frames from the same scene rather
  than uncontrolled prompt drift.
- `2026-07-01`: Added explicit `--print-manifest` support to the atlas-refine
  CLI. The default no-mode output still prints the manifest, but scripts and
  notes can now request that artifact by name like the other report modes.
- `2026-07-01`: Added atlas detail-restore hygiene after side-profile oral
  clothing restore preserved the shirt/bralette but emitted ownerless wording
  and earlier leaked `POV foreground clothing cue` into strict atlas prompts.
  Future atlas restores must audit the final Krea prompt, strip raw foreground
  clothing/body cue clauses, keep restored clothing explicitly partner-owned,
  and preserve partial-removal semantics without making hidden lower garments
  visible. For side-profile oral body-line, use `the woman wears ...` for
  visible clothes and keep lower garments `pulled aside out of frame` so the
  adult male viewer's abdomen/navel/pelvis/near thigh remain the only
  foreground body-owner cues.
- `2026-06-30`: Added side-camera/result-label separation after ballsucking
  seed `5757575757` produced attractive low side-camera oral views while still
  collapsing the requested contact object onto the shaft/glans. Future scoring
  should record that as side-view oral evidence and keep target-contact evidence
  separate.
- `2026-06-30`: Added generated-route validation discipline after footjob turn
  `183` kept large foreground soles but hid the shaft/contact that manual probes
  had preserved. Future provisional generator patches should render the exact
  final Krea prompt once after the code change; if shared route wording adds
  limiting positive-channel language, clean it before sending the validation
  prompt.
- `2026-06-30`: Added a hard-pose exploration budget after ballsucking wording
  tests produced only eight early probes before the first weak-case note. Future
  hard text-only poses should use batched wording-axis search and aim for about
  fifty positive-only tries before concluding the pose needs stronger control.
- `2026-06-30`: Added partial-acceptance discipline after ballsucking produced
  repeatable tongue/lips-on-testicles results that beat the shaft/glans
  baseline but did not fully solve mouth-wrapped contact. Future hard-pose exits
  should preserve repeatable progress as a provisional generator patch while
  keeping the remaining miss in the expansion queue.
- `2026-06-30`: Added ballsucking target-object refinement after sampler seed
  `9797979797` repeated the `scrotal skin is the nearest mouth surface` branch
  on turns `288` and `293`. Score target-object ownership separately from the
  side-low camera family: a route can preserve face/thigh geometry while still
  drifting to shaft/base contact. Avoid promoting balls-first center-object
  wording when it creates multi-subject or body-layout artifacts.
- `2026-06-30`: Added ballsucking generated-route validation after sampler seed
  `9898989898` repeated the patched scrotal-skin route on turns `296` and
  `297`. Validation can accept a provisional target-object improvement while
  still keeping the pose queued when the remaining miss is full mouth-wrapped
  testicle contact.
- `2026-06-30`: Added ballsucking fresh weak-case evidence after sampler seed
  `5959595959` tested lip-oval, sideways mouth pocket, and chin-pelvis upward
  seal wording across three women. The batch preserved low-pelvis/cheek-thigh
  geometry in places, but every branch returned to shaft/glans collapse or
  generic oral contact. Do not retry those axes as generator defaults; the next
  search should change the target-object control strategy rather than adding
  more mouth-shape synonyms.
- `2026-06-30`: Added ballsucking occlusion weak-case evidence after sampler
  seed `6060606060` tested foreground occlusion, under-scrotum tongue shelf,
  and hand-guided scrotum wording across three women. The generated route
  remained the best partial while those axes became shaft-centered or
  hand/shaft-dominant. Do not retry occlusion or hand-support synonyms as
  generator defaults; the next useful move is a different target-object strategy
  or stronger control.
- `2026-06-30`: Added ballsucking mouth-axis mixed-case evidence after sampler
  seed `6161616161` tested exact mouth-sucking, single-testicle, hanging balls
  below shaft, side-mouth wrap, and chin-pelvis lower-mouth wording across
  three women. The generated-route controls stayed the best repeated partials
  on two subjects, side-mouth and chin-pelvis variants produced isolated useful
  partials, and the rest drifted back to shaft/glans contact. Record isolated
  partials as axis hints, but do not patch generator wording unless a branch
  repeats across subjects or beats the generated-route controls.
- `2026-06-30`: Added ballsucking pelvis-valley weak-case evidence after
  sampler seed `7171717171` tested flat pelvis-valley, thigh tunnel,
  pubic-hair mouth-line, low-cushion chin-anchor, and pelvis-edge target-first
  wording across three women. The flat pelvis-valley branch repeated a strong
  body-plane correction on three subjects, matching the atlas viewer-flat
  thigh-wall read better, but it stayed shaft-centered. Score body-plane
  orientation and target-object contact separately; do not patch a route when
  it improves orientation while regressing the target.
- `2026-06-30`: Stopped the ballsucking text-only loop after sampler seed
  `7272727272` combined `flat-valley scrotal-skin` target wording with the
  prior side-low route across three women. The hybrid repeated the body-plane
  hint on turns `368`, `374`, and `380`, but the target stayed shaft-centered,
  while side-low flat-valley variants only gave look hints. Preserve the
  current side-low scrotal-skin partial, do not patch the hybrid axes, and move
  future full-target work toward stronger pose/control evidence rather than
  more positive-prompt synonyms.
- `2026-06-30`: Promoted blowjob side-profile POV after sampler seed
  `5858585858` produced a three-woman generated-route repeat on turns `298`,
  `301`, and `304`. When the current generated route repeats across multiple
  subjects on a fresh seed and alternate branches do not beat it cleanly, mark
  the route proven instead of continuing to queue it. Keep attractive
  side-camera-style self-body crop results as a separate look branch when they
  risk drifting toward external side framing.
- `2026-06-29`: Added the multisource/generator-safe method after an overfit
  single-character coworking test produced a visually usable but invalid
  bedding foreground. Future A/B runs must test at least two source cases before
  promoting wording that is meant to become a durable guide or generator rule.
- `2026-06-29`: Added generator mirroring discipline after the accepted
  fingering wording proved Krea2 behavior but not generator output. Future
  mirroring changes need a red-green regression test at the final Krea formatter
  output, not just a guide entry.
- `2026-06-29`: Tightened generator-patch promotion after the fingering
  generated-route probe looked good but had too little image coverage. Future
  pose-wording generator edits need a broader seed, subject, and location matrix
  before production route code changes.
- `2026-06-29`: Added semantic-axis discipline after source 52 fingering tests.
  If a candidate succeeds by changing ownership, viewpoint, location family, or
  role semantics, record it as a weak-case or prompt note unless that semantic
  change is the intended generator behavior. Do not count it as direct evidence
  for the original route even when the image is visually cleaner.
- `2026-06-29`: Added provisional generator-patch discipline after the user
  clarified that leaving a category should still carry forward same-seed progress
  over baseline. Future category exits should patch the generator with the best
  generator-safe improvement, record it as `provisional_generator_patch`, and
  keep the catalog route as `candidate` until repeated evidence proves it.
- `2026-06-29`: Applied the category-exit rule to spread/open-thigh presentation
  after two source subjects improved on the same sampler seed. For setup poses
  that are not structurally broken before rendering, prefer at least two source
  subjects before mirroring a provisional generator patch, and keep the
  observation explicit about remaining weak points such as insufficient V-frame
  width or outfit closure.
- `2026-06-29`: Applied the same category-exit rule to blowjob top-view after
  two source subjects improved on sampler seed `4242424242`. When the baseline
  is already usable, record the improvement narrowly: name the axis that got
  better, keep the route `candidate`, and avoid overstating the finding as
  proven until another seed repeats it.
- `2026-06-29`: Corrected blowjob top-view criteria after atlas review and a
  same-seed source-`46` probe showed that vertical shaft alignment alone can
  still render as frontal/eye-height oral. Future top-view evidence must show
  steep overhead camera geometry: viewer abdomen at the lower edge, camera
  looking down from above the viewer chest/abdomen, and the woman's hair crown,
  shoulders, and hands visible from above.
- `2026-06-29`: Refined blowjob top-view prompt-axis search after the user
  rejected horizontally biased probes. Run several prompt-only probes before
  editing the generator, wait for `sxcp_eval_in` to advance to the new turn, and
  compare each image against the atlas verticality criteria. The useful axis is
  `nadir-angle` or `bird's-eye` plus standing male POV, nearby floor plane
  dominating the image, the woman directly below between the viewer's feet, and
  top-down office anchors. Avoid `plumb-line` and `map` in generator prompts
  because Krea2 can literalize them as drawn graphics.
- `2026-06-29`: For quick wording-axis search, prefer a batched prompt-probe
  loop before analysis-heavy iteration. Prepare several positive-only alternate
  prompts that isolate likely wording axes, send them one at a time through
  `sxcp_eval_out` with the same sampler seed, pull only until each new
  `sxcp_eval_in` turn and image path exist, then inspect the returned images as
  a batch. Use the bulk comparison to pick the best axis, identify literalized
  or harmful words, and only then update the generator, guide, catalog, or eval
  log.
- `2026-06-29`: Preserve prompt-order controls when testing anything beyond
  rough pose-axis discovery. Prompts that start with pose geometry and omit or
  move the subject/look block can reduce female-look adherence, so treat those
  runs as geometry-only probes. Durable A/B prompts should keep the original
  subject/look description first, then the pose hierarchy, then location and
  style/background anchors, unless the test is explicitly about prompt-order
  sensitivity.
- `2026-06-29`: Added result-validation discipline to the batched prompt helper.
  After sending a batch, fill the result template from `sxcp_eval_in`, run
  `validate-results`, and only then draft evidence. The validation step proves
  each probe returned an ordered turn, an absolute PNG artifact, and the fixed
  sampler seed before bulk analysis or log-entry drafting.
- `2026-06-29`: Added `run-batch` automation to the batched prompt helper. It
  removes manual push/pull copy-paste from normal A/B runs while keeping the same
  gates: positive-only prompts, fixed sampler seed, turn advancement, absolute
  PNG image path, and `validate-results` before evidence drafting.
- `2026-06-29`: Split missionary subcases after turns `77`-`84`. Turns `76` and
  `80` are valid angled/cushion missionary results, not failures. The flatter
  atlas examples need a different positive axis: woman flat across an elevated
  table/platform, viewer standing or braced at the foot edge, and viewer feet,
  shins, or side-dropping legs placed below the support edge. Patch this only
  into the raised-edge/edge-supported route; keep generic missionary available
  for angled valid views.
- `2026-06-29`: Folded-missionary tuning on seed `8989898989` used two
  subject-first batches before code changes. Turns `85`-`88` showed that
  compact knee-block and vertical-thigh-column wording can produce the folded
  high-leg geometry, but the shaft/contact disappears when knees and feet lead
  the hierarchy. Turns `89`-`92` then tested contact-first variants; turn `89`
  was accepted because it placed the viewer lower abdomen and large centered
  shaft/contact before the compact folded-knee block. This confirms the
  method: use the first batch to identify the failed axis, run a targeted
  second batch, then mirror only the accepted generator-safe hierarchy as a
  provisional patch.
- `2026-06-29`: Frontal cowgirl on seed `8989898989` used a
  baseline-plus-variants batch instead of comparing against a previous category.
  Turn `93`
  was a valid generic cowgirl baseline, so turn `95`'s wide horizontal thigh
  bridge improvement became a prompt-guide rule rather than a generator patch.
  When the baseline already hits the pose, record the useful atlas refinement
  and leave the generator unchanged unless repeated evidence shows a systemic
  weakness.
- `2026-06-29`: Cowgirl-alt on seed `8989898989` exposed a spatial-orientation
  blind spot. Turns `97`-`100` had readable contact and squat-like knees, but
  the background still read as a platform/high-camera setup. After rechecking
  the atlas, turns `101`-`104` tested flat-supine viewer wording with ceiling
  and upper-room cues; turn `104` was accepted. Future pose analysis must
  compare atlas and generated room geometry before accepting an image.
- `2026-06-29`: Reverse cowgirl on seed `8989898989` showed that a correct
  semantic label such as `facing away` can be ignored when the visual hierarchy
  still resembles frontal cowgirl. Future back-facing straddle tests should
  score facing direction before contact quality and should name the back, hips,
  and ass as the nearest largest shapes before viewer-leg and contact details.
  Treat over-shoulder glances as secondary refinements only after the
  back-facing straddle is already locked.
- `2026-06-29`: Reverse-cowgirl-alt on seed `8989898989` confirmed that atlas
  sibling folders can need separate generator routes even when the baseline is
  already valid. Normal reverse cowgirl is close back/hip dominant; reverse-alt
  is upright seated with vertical back/shoulders and viewer hands or thighs
  forming the lower frame. Keep those prompt hierarchies separate instead of
  merging all back-facing woman-on-top evidence into one route.
- `2026-06-29`: Added non-target-viewpoint discipline after blowjob side-profile
  oral produced an attractive side-camera result on seed `5656565656`. If a
  render is visually useful but reads as a different camera family, record it as
  a weak case for a future route and do not mirror it into the current POV
  generator path.
- `2026-06-29`: Added MCP command memory after repeated context loss around the
  bridge workflow. Future A/B calls should use the checked helper command
  `/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py ...`, with
  `comfy_push` to `sxcp_eval_out` for prompt-only positive conditioning and
  `comfy_pull` from `sxcp_eval_in` for returned prompt/image/seed payloads.
- `2026-06-29`: Added side-profile oral ownership discipline after source `46`
  improved with explicit adult-male foreground ownership while source `47`
  rejected a related `body-axis` cue by transferring the body surface to the
  woman. Future side-profile tests should name the foreground owner repeatedly
  and verify that the woman's body stays lateral before considering any
  generator mirroring.
- `2026-06-30`: Promoted the side-profile oral lateral-edge body-line axis
  after sampler seed `9753197531` repeated it across two visible women. Pure
  male-body-axis wording can expose the male as a photographed subject or let
  Krea2 transfer the central body surface away from the intended first-person
  view. Future generator patches should combine adult-male foreground ownership
  with explicit lateral entry from the left edge, mouth at the male abdomen
  line, and hand under the lips; keep the route provisional until another
  seed/source expansion repeats it.
- `2026-06-30`: Added side-profile oral generated-route contact validation
  after turn `206` kept the male body-line geometry but let the mouth float
  above the shaft while the hand became the contact anchor. Turn `207` improved
  after adding lips-touching and mouth-to-shaft-contact priority. Future
  generated-route validation for oral side-profile should score both viewpoint
  ownership and which body part actually anchors the contact.
- `2026-06-30`: Added the side-profile oral lower-right torso anchor after
  sampler seed `9595959595` repeated it on turns `279` and `283` across two
  visible women. The useful wording makes the adult male viewer's own torso
  start at the lower edge and run diagonally into the lower-right foreground,
  with navel, abdomen hair, pelvis, and near thigh marking the camera owner's
  body. Prefer this over generic body-axis wording, which can expose the male
  as a photographed side subject or transfer the axis onto the woman.
- `2026-06-30`: Added side-profile oral generated-route validation after
  sampler seed `9696969696` repeated the patched route on turns `284` and
  `285`. Count generated-route validation separately from prompt-axis search:
  it proves the formatter can carry the new wording, while promotion still
  requires broader source/seed evidence.
- `2026-06-30`: Promoted normal frontal cowgirl from guide-only to provisional
  generator patch after seed `2828282828` repeated the wide-thigh bridge axis
  across two visible women. When the baseline is already valid, a generator
  patch is still appropriate if a later seed repeats a narrow atlas refinement
  that improves geometry without harming subject/look, contact, or setting.
  Generated-route turn `216` validated the patched formatter route with viewer
  hands on outer thighs, wide foreground thigh bridge, upright torso, centered
  contact, and coworking depth. Keep the route `candidate` until another
  source/seed repeats the refinement.
- `2026-06-29`: Applied the category-exit rule to blowjob laying frontal after
  source `46` and source `50` improved on sampler seed `6767676767`. When
  baselines are already strong, preserve the exact improved axis (wide V-frame
  and low-horizontal torso hierarchy), note residual high-hip posture, and keep
  the generator patch provisional until another seed repeats it.
- `2026-06-29`: Applied the category-exit rule to blowjob sitting upright after
  source `46` and source `50` improved on sampler seed `7878787878`. When a
  baseline preserves the seated pose but floats the face above the contact
  point, prefer low-mouth seated hierarchy over generic `mouth aligned` wording:
  face lowered to the exact center contact point, open mouth covering the
  centered tip, and hands directly at the base. Record outfit looseness/drift as
  residual risk and keep the generator patch provisional until another seed
  repeats it.
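
The stable matrix evidence rules recorded in the `2026-07-01` entries above can
be read as one reuse gate. The sketch below is a minimal illustration of that
gate and is not the code in `tools/`. The field names `matrix_evidence`,
`stable`, `sampler_seeds`, `jobs`, `turn`, `seed_slot`, `selection_seed`, and
`seed_metadata` and the reason codes `unstable_matrix_evidence`,
`insufficient_sampler_coverage`, and `missing_sampler_coverage` come from this
log. The function name, the per-job keys `job_id` and `sampler_seed`, and every
other reason code are assumptions made for the example.

```python
from typing import Any

# Repeatability floor from the log: at least two unique sampler seeds.
MIN_UNIQUE_SAMPLER_SEEDS = 2


def matrix_evidence_issues(variant: dict[str, Any]) -> list[str]:
    """Return reason codes; an empty list means the evidence may be reused.

    Hypothetical helper: the real tooling may structure these checks
    differently and may use different per-job keys.
    """
    evidence = variant.get("matrix_evidence")
    if evidence is None:
        # Legacy single-image variants carry no matrix evidence and stay selectable.
        return []

    issues: list[str] = []
    if evidence.get("stable") is not True:
        issues.append("unstable_matrix_evidence")

    declared = list(evidence.get("sampler_seeds", []))
    jobs = list(evidence.get("jobs", []))
    job_ids = [job.get("job_id") for job in jobs]  # assumed key
    job_seeds = [job.get("sampler_seed") for job in jobs]  # assumed key

    # One generated condition must never count twice.
    if len(set(declared)) != len(declared):
        issues.append("duplicate_declared_sampler_seed")  # assumed code
    if len(set(job_ids)) != len(job_ids):
        issues.append("duplicate_job_id")  # assumed code
    if len(set(job_seeds)) != len(job_seeds):
        issues.append("duplicate_job_sampler_seed")  # assumed code

    if len(set(declared)) < MIN_UNIQUE_SAMPLER_SEEDS:
        issues.append("insufficient_sampler_coverage")
    if set(declared) - set(job_seeds):
        # A declared sampler seed has no evidence row behind it.
        issues.append("missing_sampler_coverage")

    # `turn` must be present and a real integer (bool is excluded on purpose).
    for job in jobs:
        turn = job.get("turn")
        if isinstance(turn, bool) or not isinstance(turn, int):
            issues.append("invalid_turn")  # assumed code
            break

    # The cue seed in seed_metadata must match the seed the matrix was run with.
    seed_slot = evidence.get("seed_slot")
    seed_metadata = variant.get("seed_metadata", {})
    if seed_slot is None or seed_metadata.get(seed_slot) != evidence.get("selection_seed"):
        issues.append("cue_seed_drift")  # assumed code

    return issues


# Usage: a one-sampler matrix stays inspectable but is not reusable as stable proof.
one_sampler_variant = {
    "seed_metadata": {"atlas_cue_seed": 7},
    "matrix_evidence": {
        "stable": True,
        "seed_slot": "atlas_cue_seed",
        "selection_seed": 7,
        "sampler_seeds": [11],
        "jobs": [{"job_id": "a", "sampler_seed": 11, "turn": 1}],
    },
}
print(matrix_evidence_issues(one_sampler_variant))
# → ['insufficient_sampler_coverage']
```

Returning a list of reason codes rather than a boolean mirrors how the log
describes the gates: a hand-edited `stable: true` flag alone never passes, and
each rejection names the specific drift that blocked reuse.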