# Krea2 A/B Methodology Memory

This file is the persistent memory for SxCP Krea2 prompt A/B methodology. Update it whenever the testing method improves.

## Current Method

Version: `2026-06-30-generated-route-validation-positive-channel-cleanup`

1. Pull or construct the baseline from an actual SxCP/CodexMCPTest source case.
2. Keep the sampler seed fixed across the baseline and candidate.
3. Keep subject, location family, camera family, and target pose fixed unless the experiment explicitly tests one of those axes.
4. Change one prompt variable at a time when possible, usually the visual hierarchy for the target contact or pose.
5. Keep `sxcp_eval_out` positive-only. Do not place negative-conditioning phrases in the visible prompt.
6. Use location-compatible anchors only. For coworking/office scenes, use chair edge, desk edge, laptop table, glass partitions, repeated desk rows, plants, and window depth instead of bedroom or bedding anchors.
7. Treat a manual prompt win as proof that Krea2 responds to the wording, not proof that the SxCP generator already emits it.
8. Mirror a prompt win into the generator as a provisional improvement when leaving a category if same-seed evidence shows it improves over baseline and the wording is generator-safe. Keep the route `candidate` until the broader generator-patch evidence matrix proves it.
9. When a subject-first batch preserves appearance but repeatedly misses the atlas body plane, record it as weak-case evidence and consider stronger control before adding more generator text.
10. Score spatial orientation against the atlas before accepting evidence, and treat a contradictory room/background read as a rejection even when contact or limb placement is clear. Use background cues to decide whether the viewer or partner is high, low, standing, seated, supine, or on a support before grading pose/contact quality.
11. For hard text-only pose families, set an exploration budget before calling the route weak or deciding it needs stronger control. Eight prompt probes are only an early signal. Use batched wording-axis probes and aim for about fifty positive-only tries across meaningful axes before concluding that prompt text cannot reliably express the pose.
12. Do not require a perfect atlas hit before carrying progress forward. After the exploration budget, a repeatable partial that beats the baseline failure mode can become an accepted provisional generator improvement while the remaining miss stays documented for later seed/source expansion.
13. After patching generator wording, render one prompt produced by the actual code path before closing the category. Manual prompt-axis wins are not enough; the generated route can still drop the key contact hierarchy or add limiting positive-channel wording.
14. Treat the final prompt that Krea receives and the rendered generator images as the source of truth. Tests are guardrails for known failures, not proof that the prompt will render the intended pose. If a test passes while the rendered image or prompt audit shows drift, update the test and the method rather than trusting the test.
15. Atlas/catalog prompt wording must use direct visual prompt sentences. Do not send option-list wording such as `or`, `may`, `optionally`, or `either` in pose cues, and do not append meta instructions such as `keep the visible partner...` or generic camera-layout prose to atlas routes. Krea2 is not an instruction-following LLM in this loop; prompt text should describe the image, not explain a policy.
16. Do not promote generator edits from a few isolated renders unless the final generated prompt is structurally wrong before rendering. For pose wording changes, collect prompt/image evidence across multiple women, source cases, and seeds when feasible, then patch only the repeated generator-safe hierarchy.
Keep early wins as prompt-guide or provisional evidence.

## Promotion Gates

- One clean fixed-seed A/B can be recorded as evidence for that source case.
- A prompt-guide rule needs repeated evidence across distinct subjects, locations, or seeds, unless the generated prompt is structurally wrong before rendering.
- A catalog variant remains candidate until the rule repeats under controlled conditions.
- A provisional generator patch is allowed when leaving a category if the best tested wording improves over baseline on a fixed seed. It should preserve the selected subject, outfit, location, and camera semantics, and it must not patch in a scene workaround that only solved one render.
- A proven/default generator patch still needs the broader evidence matrix below, unless the generated prompt is structurally wrong before rendering.

## Generator Mirroring

After a manual A/B prompt win, do not assume the SxCP generator mirrors the wording. Add a failing regression against the final formatter output first, then patch the narrow route boundary that owns the wording. The regression should assert the accepted hierarchy terms and reject the failure mode that caused the bad render, such as scene-incompatible anchors or negative-conditioning text in the positive prompt.

After the route patch, run a generated-route probe through `sxcp_eval_out` with the same sampler seed when feasible. Use the actual formatter output, not a hand-normalized prompt. If the generated route regresses compared with the manual prompt-axis winner, record the failed generated-route image as the baseline, tighten the route wording, and validate again before logging the candidate as generated-route evidence.

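
The regression described in this section can be sketched as a small prompt audit. This is a minimal illustration, not the project's actual test: the function name `audit_prompt`, its parameters, and the `OPTION_WORDS` tuple are hypothetical, and the real SxCP formatter and test layout are not shown in this file. It checks that a final formatter prompt contains the accepted hierarchy terms, rejects scene-incompatible anchors, and flags the option-list words named in the method rules.

```python
import re

# Option-list words that the method says must not appear in pose cues.
# The tuple name and this helper are illustrative assumptions.
OPTION_WORDS = ("or", "may", "optionally", "either")


def audit_prompt(prompt, required_terms, forbidden_terms):
    """Return a list of problems found in a final formatter prompt."""
    lowered = prompt.lower()
    problems = []
    # Accepted hierarchy terms must survive into the final prompt.
    for term in required_terms:
        if term.lower() not in lowered:
            problems.append(f"missing required term: {term}")
    # Failure-mode terms (for example bedroom anchors in an office scene).
    for term in forbidden_terms:
        if term.lower() in lowered:
            problems.append(f"forbidden term present: {term}")
    # Whole-word match so that "floor" does not trigger "or".
    for word in OPTION_WORDS:
        if re.search(rf"\b{re.escape(word)}\b", lowered):
            problems.append(f"option-list word present: {word}")
    return problems


if __name__ == "__main__":
    office_required = ["chair edge", "desk edge"]
    office_forbidden = ["bedding", "bedroom"]
    prompt = "seated figure at the chair edge beside the desk edge, glass partitions behind"
    print(audit_prompt(prompt, office_required, office_forbidden))
```

A check like this only guards known failures; per the method, the rendered image and the final prompt remain the source of truth.
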
For location-specific wins, split the implementation:

- the action or role graph owns the pose/contact hierarchy;
- the final Krea formatter owns scene-compatible anchor expansion because it can see the selected scene, camera, and composition;
- existing route phrases that downstream tests rely on should be preserved inside the stronger wording when they do not conflict with the A/B evidence.

## MCP Command Memory

Use the checked helper instead of ad hoc Python snippets for bridge calls. The approved command prefix is:

```bash
/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py
```

Common calls:

```bash
/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py list-tools
/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py call-tool comfy_pull --arguments-json '{"channel":"sxcp_eval_in"}'
/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py call-tool comfy_push --arguments-json '{"channel":"sxcp_eval_out","seed":5656565656,"text":"PROMPT_ONLY_POSITIVE_CONDITIONING"}'
```

For batched prompt-axis search, prepare a JSON batch and use the offline command renderer before touching the bridge manually:

```bash
python tools/sxcp_prompt_batch.py validate --batch-json /tmp/sxcp-batch.json
python tools/sxcp_prompt_batch.py print-push-commands --batch-json /tmp/sxcp-batch.json
python tools/sxcp_prompt_batch.py print-result-template --batch-json /tmp/sxcp-batch.json
python tools/sxcp_prompt_batch.py run-batch --batch-json /tmp/sxcp-batch.json --result-json /tmp/sxcp-results.json --previous-turn 80 --run
python tools/sxcp_prompt_batch.py validate-results --batch-json /tmp/sxcp-batch.json --result-json /tmp/sxcp-results.json
python tools/sxcp_prompt_batch.py print-eval-entry-draft --batch-json /tmp/sxcp-batch.json --result-json /tmp/sxcp-results.json --variant-key pov_example_variant --baseline-image /absolute/baseline.png --candidate-id controlled_subject_first
```

Use `run-batch --run` for normal batch execution.
It pushes one positive prompt, polls `sxcp_eval_in` until the turn advances and an absolute PNG appears with the fixed sampler seed, writes the filled result JSON, then sends the next prompt. Omit `--run` for a dry-run command preview.

Run `validate-results` after the batch and before drafting evidence. It checks that every probe returned a new ordered turn, an absolute PNG image path, and the same sampler seed as the batch. This keeps batched prompt search as image-presence collection first and bulk analysis second.

Before drafting evidence, compare atlas references and generated images for spatial orientation, not only limb/contact similarity. First decide the atlas's surface and camera-height relationship, then check whether the generated background supports the same read. Use the background as a camera-height witness: ceiling, upper walls, and high partition lines usually support a low viewer looking upward; floor, carpet, table tops, platform edges, or furniture behind the body can reveal a higher camera, seated support, or a different surface. If the atlas target has the viewer flat on his back or the partner mounted over him, do not accept a candidate only because contact is clear; the room geometry must also support that flat/low read. Reject the candidate before generator mirroring when the background says the bodies are on a different surface or at a different height than the atlas.

`print-eval-entry-draft` rejects `geometry_only` candidates by default. Use `--allow-geometry-only` only when the entry is explicitly labeled as non-controlled prompt-axis evidence rather than subject/look-controlled A/B evidence.

Keep `sxcp_eval_out` prompt-only and positive-only. Do not use `sxcp_eval_negative_out` for Krea2 tuning.

## Generator-Patch Evidence Matrix

Do prompt and image exploration before editing production generator wording.
A normal pose-wording generator patch needs all of this evidence first:

- at least three distinct source cases with different visible subjects;
- at least two sampler seeds, unless the source prompt is structurally wrong before rendering;
- location-family coverage when the proposed wording changes scene anchors;
- one baseline and one candidate per source case, with subject, location family, camera family, and sampler seed fixed inside each pair;
- positive-only candidate prompts, with no negative-conditioning phrases in the positive prompt.

A generated-route probe that works before the full matrix is useful evidence. If it is the best tested improvement when leaving the category, it can become a `provisional_generator_patch` with final prompt regression coverage. It should not become a proven `generator_patch` decision until the matrix repeats and the final generated prompt is regression-tested.

## Hard-Pose Exploration Budget

Use this budget for atlas poses where early prompt-only results repeatedly miss the core spatial read.

- Define the failure threshold before the run. The default threshold is about fifty positive-only prompt tries across distinct wording axes before declaring the pose text-insufficient or moving it to a stronger-control bucket.
- Run the search in batches, usually six to twelve prompts at a time. Send each prompt through `sxcp_eval_out`, wait for the image path, then analyze the batch together instead of overreacting to one render.
- Keep a short axis ledger for each batch: intended wording axis, seed, source subject, best image, repeated failure mode, and words that literalized or harmed the result.
- Treat a small failed batch as direction, not a conclusion. If a batch shows a repeated failure such as head height, camera height, viewer/partner elevation, or background-plane mismatch, the next batch should vary that axis directly.
- Stop early only for a strong positive result that is worth repeating on a second source or seed, or for a hard technical blocker. A weak but improving result should feed the next wording batch rather than ending the category.
- If the threshold run finds a repeatable partial that is materially better than baseline, accept the partial target explicitly and mirror only that generator-safe improvement. Keep the route candidate and mark the evidence as needing expansion when the full atlas target is still unsolved.

## Current Fingering Test Pattern

The prior bedding-based fingering prompt is invalid as a general rule because it solved a lower-foreground artifact by adding bedroom context to an office scene. The corrected test pattern keeps the coworking location intact:

- baseline: generic POV fingering/manual-contact wording from the same source case;
- candidate: foreground hand first, open-thigh geometry second, visible woman face/torso third, office chair and coworking depth fourth;
- anchors: black office chair seat/arms, desk edge, laptop table corners, glass partitions, repeated desk rows, plants, tall-window depth;
- rejection trigger: any result that fixes contact by changing the scene family instead of improving the pose hierarchy.

## Improvement Log

- `2026-06-30`: Added side-camera/result-label separation after ballsucking seed `5757575757` produced attractive low side-camera oral views while still collapsing the requested contact object onto the shaft/glans. Future scoring should record that as side-view oral evidence and keep target-contact evidence separate.
- `2026-06-30`: Added generated-route validation discipline after footjob turn `183` kept large foreground soles but hid the shaft/contact that manual probes had preserved. Future provisional generator patches should render the exact final Krea prompt once after the code change; if shared route wording adds limiting positive-channel language, clean it before sending the validation prompt.
- `2026-06-30`: Added a hard-pose exploration budget after ballsucking wording tests produced only eight early probes before the first weak-case note. Future hard text-only poses should use batched wording-axis search and aim for about fifty positive-only tries before concluding the pose needs stronger control.
- `2026-06-30`: Added partial-acceptance discipline after ballsucking produced repeatable tongue/lips-on-testicles results that beat the shaft/glans baseline but did not fully solve mouth-wrapped contact. Future hard-pose exits should preserve repeatable progress as a provisional generator patch while keeping the remaining miss in the expansion queue.
- `2026-06-30`: Added ballsucking target-object refinement after sampler seed `9797979797` repeated the `scrotal skin is the nearest mouth surface` branch on turns `288` and `293`. Score target-object ownership separately from the side-low camera family: a route can preserve face/thigh geometry while still drifting to shaft/base contact. Avoid promoting balls-first center-object wording when it creates multi-subject or body-layout artifacts.
- `2026-06-30`: Added ballsucking generated-route validation after sampler seed `9898989898` repeated the patched scrotal-skin route on turns `296` and `297`. Validation can accept a provisional target-object improvement while still keeping the pose queued when the remaining miss is full mouth-wrapped testicle contact.
- `2026-06-30`: Added ballsucking fresh weak-case evidence after sampler seed `5959595959` tested lip-oval, sideways mouth pocket, and chin-pelvis upward seal wording across three women. The batch preserved low-pelvis/cheek-thigh geometry in places, but every branch returned to shaft/glans collapse or generic oral contact. Do not retry those axes as generator defaults; the next search should change the target-object control strategy rather than adding more mouth-shape synonyms.
- `2026-06-30`: Added ballsucking occlusion weak-case evidence after sampler seed `6060606060` tested foreground occlusion, under-scrotum tongue shelf, and hand-guided scrotum wording across three women. The generated route remained the best partial while those axes became shaft-centered or hand/shaft-dominant. Do not retry occlusion or hand-support synonyms as generator defaults; the next useful move is a different target-object strategy or stronger control.
- `2026-06-30`: Added ballsucking mouth-axis mixed-case evidence after sampler seed `6161616161` tested exact mouth-sucking, single-testicle, hanging balls below shaft, side-mouth wrap, and chin-pelvis lower-mouth wording across three women. The generated-route controls stayed the best repeated partials on two subjects, side-mouth and chin-pelvis variants produced isolated useful partials, and the rest drifted back to shaft/glans contact. Record isolated partials as axis hints, but do not patch generator wording unless a branch repeats across subjects or beats the generated-route controls.
- `2026-06-30`: Added ballsucking pelvis-valley weak-case evidence after sampler seed `7171717171` tested flat pelvis-valley, thigh tunnel, pubic-hair mouth-line, low-cushion chin-anchor, and pelvis-edge target-first wording across three women. The flat pelvis-valley branch repeated a strong body-plane correction on three subjects, matching the atlas viewer-flat thigh-wall read better, but it stayed shaft-centered. Score body-plane orientation and target-object contact separately; do not patch a route when it improves orientation while regressing the target.
- `2026-06-30`: Stopped the ballsucking text-only loop after sampler seed `7272727272` combined `flat-valley scrotal-skin` target wording with the prior side-low route across three women. The hybrid repeated the body-plane hint on turns `368`, `374`, and `380`, but the target stayed shaft-centered, while side-low flat-valley variants only gave look hints.
Preserve the current side-low scrotal-skin partial, do not patch the hybrid axes, and move future full-target work toward stronger pose/control evidence rather than more positive-prompt synonyms.
- `2026-06-30`: Promoted blowjob side-profile POV after sampler seed `5858585858` produced a three-woman generated-route repeat on turns `298`, `301`, and `304`. When the current generated route repeats across multiple subjects on a fresh seed and alternate branches do not beat it cleanly, mark the route proven instead of continuing to queue it. Keep attractive side-camera-style self-body crop results as a separate look branch when they risk drifting toward external side framing.
- `2026-06-29`: Added the multisource/generator-safe method after an overfit single-character coworking test produced a visually usable but invalid bedding foreground. Future A/B runs must test at least two source cases before promoting wording that is meant to become a durable guide or generator rule.
- `2026-06-29`: Added generator mirroring discipline after the accepted fingering wording proved Krea2 behavior but not generator output. Future mirroring changes need a red-green regression at final Krea formatter output, not just a guide entry.
- `2026-06-29`: Tightened generator-patch promotion after the fingering generated-route probe looked good but had too little image coverage. Future pose-wording generator edits need a broader seed, subject, and location matrix before production route code changes.
- `2026-06-29`: Added semantic-axis discipline after source 52 fingering tests. If a candidate succeeds by changing ownership, viewpoint, location family, or role semantics, record it as a weak-case or prompt note unless that semantic change is the intended generator behavior. Do not count it as direct evidence for the original route even when the image is visually cleaner.
- `2026-06-29`: Added provisional generator-patch discipline after the user clarified that leaving a category should still carry forward same-seed progress over baseline. Future category exits should patch the generator with the best generator-safe improvement, record it as `provisional_generator_patch`, and keep the catalog route as `candidate` until repeated evidence proves it.
- `2026-06-29`: Applied the category-exit rule to spread/open-thigh presentation after two source subjects improved on the same sampler seed. For setup poses that are not structurally broken before rendering, prefer at least two source subjects before mirroring a provisional generator patch, and keep the observation explicit about remaining weak points such as insufficient V-frame width or outfit closure.
- `2026-06-29`: Applied the same category-exit rule to blowjob top-view after two source subjects improved on sampler seed `4242424242`. When the baseline is already usable, record the improvement narrowly: name the axis that got better, keep the route candidate, and avoid overstating the finding as proven until another seed repeats it.
- `2026-06-29`: Corrected blowjob top-view criteria after atlas review and a same-seed source-`46` probe showed that vertical shaft alignment alone can still render as frontal/eye-height oral. Future top-view evidence must show steep overhead camera geometry: viewer abdomen at the lower edge, camera looking down from above the viewer chest/abdomen, and the woman's hair crown, shoulders, and hands visible from above.
- `2026-06-29`: Refined blowjob top-view prompt-axis search after the user rejected horizontally biased probes. Run several prompt-only probes before editing the generator, wait for `sxcp_eval_in` to advance to the new turn, and compare each image against the atlas verticality criteria.
The useful axis is `nadir-angle` or `bird's-eye` plus standing male POV, nearby floor plane dominating the image, the woman directly below between the viewer's feet, and top-down office anchors. Avoid `plumb-line` and `map` in generator prompts because Krea2 can literalize them as drawn graphics.
- `2026-06-29`: For quick wording-axis search, prefer a batched prompt-probe loop before analysis-heavy iteration. Prepare several positive-only alternate prompts that isolate likely wording axes, send them one at a time through `sxcp_eval_out` with the same sampler seed, pull only until each new `sxcp_eval_in` turn and image path exists, then inspect the returned images as a batch. Use the bulk comparison to pick the best axis, identify literalized or harmful words, and only then update the generator, guide, catalog, or eval log.
- `2026-06-29`: Preserve prompt-order controls when testing anything beyond rough pose-axis discovery. Prompts that start with pose geometry and omit or move the subject/look block can reduce female-look adherence, so treat those runs as geometry-only probes. Durable A/B prompts should keep the original subject/look description first, then the pose hierarchy, then location and style/background anchors, unless the test is explicitly about prompt-order sensitivity.
- `2026-06-29`: Added result-validation discipline to the batched prompt helper. After sending a batch, fill the result template from `sxcp_eval_in`, run `validate-results`, and only then draft evidence. The validation step proves each probe returned an ordered turn, an absolute PNG artifact, and the fixed sampler seed before bulk analysis or log-entry drafting.
- `2026-06-29`: Added `run-batch` automation to the batched prompt helper. It removes manual push/pull copy-paste from normal A/B runs while keeping the same gates: positive-only prompts, fixed sampler seed, turn advancement, absolute PNG image path, and `validate-results` before evidence drafting.
- `2026-06-29`: Split missionary subcases after turns `77`-`84`. Turns `76` and `80` are valid angled/cushion missionary results, not failures. The flatter atlas examples need a different positive axis: woman flat across an elevated table/platform, viewer standing or braced at the foot edge, and viewer feet, shins, or side-dropping legs placed below the support edge. Patch this only into the raised-edge/edge-supported route; keep generic missionary available for angled valid views.
- `2026-06-29`: Folded-missionary tuning on seed `8989898989` used two subject-first batches before code changes. Turns `85`-`88` showed that compact knee-block and vertical-thigh-column wording can produce the folded high-leg geometry, but the shaft/contact disappears when knees and feet lead the hierarchy. Turns `89`-`92` then tested contact-first variants; turn `89` was accepted because it placed the viewer lower abdomen and large centered shaft/contact before the compact folded-knee block. This confirms the method: use the first batch to identify the failed axis, run a targeted second batch, then mirror only the accepted generator-safe hierarchy as a provisional patch.
- `2026-06-29`: Frontal cowgirl on seed `8989898989` used a baseline-plus-variants batch instead of comparing against a previous category. Turn `93` was a valid generic cowgirl baseline, so turn `95`'s wide horizontal thigh bridge improvement became a prompt-guide rule rather than a generator patch. When the baseline already hits the pose, record the useful atlas refinement and leave the generator unchanged unless repeated evidence shows a systemic weakness.
- `2026-06-29`: Cowgirl-alt on seed `8989898989` exposed a spatial-orientation blind spot. Turns `97`-`100` had readable contact and squat-like knees, but the background still read as a platform/high-camera setup. After rechecking the atlas, turns `101`-`104` tested flat-supine viewer wording with ceiling and upper-room cues; turn `104` was accepted.
Future pose analysis must compare atlas and generated room geometry before accepting an image.
- `2026-06-29`: Reverse cowgirl on seed `8989898989` showed that a correct semantic label such as `facing away` can be ignored when the visual hierarchy still resembles frontal cowgirl. Future back-facing straddle tests should score facing direction before contact quality and should name the back, hips, and ass as the nearest largest shapes before viewer-leg and contact details. Treat over-shoulder glances as secondary refinements only after the back-facing straddle is already locked.
- `2026-06-29`: Reverse-cowgirl-alt on seed `8989898989` confirmed that atlas sibling folders can need separate generator routes even when the baseline is already valid. Normal reverse cowgirl is close back/hip dominant; reverse-alt is upright seated with vertical back/shoulders and viewer hands or thighs forming the lower frame. Keep those prompt hierarchies separate instead of merging all back-facing woman-on-top evidence into one route.
- `2026-06-29`: Added non-target-viewpoint discipline after blowjob side-profile oral produced an attractive side-camera result on seed `5656565656`. If a render is visually useful but reads as a different camera family, record it as a weak case for a future route and do not mirror it into the current POV generator path.
- `2026-06-29`: Added MCP command memory after repeated context loss around the bridge workflow. Future A/B calls should use the checked helper command `/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py ...`, with `comfy_push` to `sxcp_eval_out` for prompt-only positive conditioning and `comfy_pull` from `sxcp_eval_in` for returned prompt/image/seed payloads.
- `2026-06-29`: Added side-profile oral ownership discipline after source `46` improved with explicit adult-male foreground ownership while source `47` rejected a related `body-axis` cue by transferring the body surface to the woman.
Future side-profile tests should name the foreground owner repeatedly and verify that the woman's body stays lateral before considering any generator mirroring.
- `2026-06-30`: Promoted the side-profile oral lateral-edge body-line axis after sampler seed `9753197531` repeated it across two visible women. Pure male-body-axis wording can expose the male as a photographed subject or let Krea2 transfer the central body surface away from the intended first-person view. Future generator patches should combine adult-male foreground ownership with explicit lateral entry from the left edge, mouth at the male abdomen line, and hand under the lips; keep the route provisional until another seed/source expansion repeats it.
- `2026-06-30`: Added side-profile oral generated-route contact validation after turn `206` kept the male body-line geometry but let the mouth float above the shaft while the hand became the contact anchor. Turn `207` improved after adding lips-touching and mouth-to-shaft-contact priority. Future generated-route validation for oral side-profile should score both viewpoint ownership and which body part actually anchors the contact.
- `2026-06-30`: Added the side-profile oral lower-right torso anchor after sampler seed `9595959595` repeated it on turns `279` and `283` across two visible women. The useful wording makes the adult male viewer's own torso start at the lower edge and run diagonally into the lower-right foreground, with navel, abdomen hair, pelvis, and near thigh marking the camera owner's body. Prefer this over generic body-axis wording, which can expose the male as a photographed side subject or transfer the axis onto the woman.
- `2026-06-30`: Added side-profile oral generated-route validation after sampler seed `9696969696` repeated the patched route on turns `284` and `285`.
Count generated-route validation separately from prompt-axis search: it proves the formatter can carry the new wording, while promotion still requires broader source/seed evidence.
- `2026-06-30`: Promoted normal frontal cowgirl from guide-only to provisional generator patch after seed `2828282828` repeated the wide-thigh bridge axis across two visible women. When the baseline is already valid, a generator patch is still appropriate if a later seed repeats a narrow atlas refinement that improves geometry without harming subject/look, contact, or setting. Generated-route turn `216` validated the patched formatter route with viewer hands on outer thighs, wide foreground thigh bridge, upright torso, centered contact, and coworking depth. Keep the route candidate until another source/seed repeats the refinement.
- `2026-06-29`: Applied the category-exit rule to blowjob laying frontal after source `46` and source `50` improved on sampler seed `6767676767`. When baselines are already strong, preserve the exact improved axis: wide V-frame and low-horizontal torso hierarchy, while noting residual high-hip posture and keeping the generator patch provisional until another seed repeats it.
- `2026-06-29`: Applied the category-exit rule to blowjob sitting upright after source `46` and source `50` improved on sampler seed `7878787878`. When a baseline preserves the seated pose but floats the face above the contact point, prefer low-mouth seated hierarchy over generic `mouth aligned` wording: face lowered to the exact center contact point, open mouth covering the centered tip, and hands directly at the base. Record outfit looseness/drift as residual risk and keep the generator patch provisional until another seed repeats it.
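
The countable gates in the Generator-Patch Evidence Matrix section can be checked mechanically before a route is promoted. The sketch below is an illustration under stated assumptions: the `Render` and `Pair` records and the `matrix_gaps` function are hypothetical names, not part of the SxCP tools, and the sketch covers only the source-case, subject, seed, and fixed-pair rules. Location-family coverage and the positive-only rule are left to manual review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Render:
    # Axes that must stay fixed inside one baseline/candidate pair.
    subject: str
    location_family: str
    camera_family: str
    seed: int


@dataclass(frozen=True)
class Pair:
    source_case: str
    baseline: Render
    candidate: Render


def matrix_gaps(pairs, structurally_wrong=False):
    """Return the unmet evidence-matrix gates as a list of messages."""
    gaps = []
    # Each pair must hold subject, location, camera, and seed constant.
    for pair in pairs:
        if pair.baseline != pair.candidate:
            gaps.append(f"{pair.source_case}: pair is not fixed")
    sources = {pair.source_case for pair in pairs}
    subjects = {pair.baseline.subject for pair in pairs}
    seeds = {pair.baseline.seed for pair in pairs}
    if len(sources) < 3 or len(subjects) < 3:
        gaps.append("need at least three source cases with different subjects")
    # The seed gate is waived when the source prompt is structurally wrong.
    if len(seeds) < 2 and not structurally_wrong:
        gaps.append("need at least two sampler seeds")
    return gaps


if __name__ == "__main__":
    r = Render("subject-a", "coworking", "pov", 4242424242)
    print(matrix_gaps([Pair("source-46", r, r)]))
```

An empty list means the counted gates are met; it does not replace image review against the atlas.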