Add Krea2 POV routing and eval tooling
@@ -0,0 +1,461 @@
|
||||
# Krea2 A/B Methodology Memory
|
||||
|
||||
This file is the persistent memory for SxCP Krea2 prompt A/B methodology.
|
||||
Update it whenever the testing method improves.
|
||||
|
||||
## Current Method
|
||||
|
||||
Version: `2026-06-30-generated-route-validation-positive-channel-cleanup`
|
||||
|
||||
1. Pull or construct the baseline from an actual SxCP/CodexMCPTest source case.
|
||||
2. Keep the sampler seed fixed across the baseline and candidate.
|
||||
3. Keep subject, location family, camera family, and target pose fixed unless
|
||||
the experiment explicitly tests one of those axes.
|
||||
4. Change one prompt variable at a time when possible, usually the visual
|
||||
hierarchy for the target contact or pose.
|
||||
5. Keep `sxcp_eval_out` positive-only. Do not place negative-conditioning
|
||||
phrases in the visible prompt.
|
||||
6. Use location-compatible anchors only. For coworking/office scenes, use chair
|
||||
edge, desk edge, laptop table, glass partitions, repeated desk rows, plants,
|
||||
and window depth instead of bedroom or bedding anchors.
|
||||
7. Treat a manual prompt win as proof that Krea2 responds to the wording, not
|
||||
proof that the SxCP generator already emits it.
|
||||
8. When leaving a category, mirror a prompt win into the generator as a
provisional improvement if same-seed evidence shows it beats the baseline and
the wording is generator-safe. Keep the route `candidate` until the broader
generator-patch evidence matrix proves it.
|
||||
9. When a subject-first batch preserves appearance but repeatedly misses the
|
||||
atlas body plane, record it as weak-case evidence and consider stronger
|
||||
control before adding more generator text.
|
||||
10. Score spatial orientation against the atlas before accepting evidence,
|
||||
and treat a contradictory room/background read as a rejection even when
|
||||
contact or limb placement is clear. Use background cues to decide whether
|
||||
the viewer or partner is high, low, standing, seated, supine, or on a
|
||||
support before grading pose/contact quality.
|
||||
11. For hard text-only pose families, set an exploration budget before calling
|
||||
the route weak or deciding it needs stronger control. Eight prompt probes
|
||||
are only an early signal. Use batched wording-axis probes and aim for about
|
||||
fifty positive-only tries across meaningful axes before concluding that
|
||||
prompt text cannot reliably express the pose.
|
||||
12. Do not require a perfect atlas hit before carrying progress forward. After
|
||||
the exploration budget, a repeatable partial that beats the baseline failure
|
||||
mode can become an accepted provisional generator improvement while the
|
||||
remaining miss stays documented for later seed/source expansion.
|
||||
13. After patching generator wording, render one prompt produced by the actual
|
||||
code path before closing the category. Manual prompt-axis wins are not
|
||||
enough; the generated route can still drop the key contact hierarchy or add
|
||||
limiting positive-channel wording.
|
||||
|
||||
## Promotion Gates
|
||||
|
||||
- One clean fixed-seed A/B can be recorded as evidence for that source case.
|
||||
- A prompt-guide rule needs repeated evidence across distinct subjects,
|
||||
locations, or seeds, unless the generated prompt is structurally wrong before
|
||||
rendering.
|
||||
- A catalog variant remains candidate until the rule repeats under controlled
|
||||
conditions.
|
||||
- A provisional generator patch is allowed when leaving a category if the best
|
||||
tested wording improves over baseline on a fixed seed. It should preserve the
|
||||
selected subject, outfit, location, and camera semantics, and it must not patch
|
||||
in a scene workaround that only solved one render.
|
||||
- A proven/default generator patch still needs the broader evidence matrix below,
|
||||
unless the generated prompt is structurally wrong before rendering.
|
||||
|
||||
## Generator Mirroring
|
||||
|
||||
After a manual A/B prompt win, do not assume the SxCP generator mirrors the
|
||||
wording. Add a failing regression against the final formatter output first, then
|
||||
patch the narrow route boundary that owns the wording. The regression should
|
||||
assert the accepted hierarchy terms and reject the failure mode that caused the
|
||||
bad render, such as scene-incompatible anchors or negative-conditioning text in
|
||||
the positive prompt.
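
A minimal sketch of the kind of check such a regression could make against the final formatter output. The term lists, the negative-conditioning word pattern, and the function name `prompt_problems` are illustrative assumptions, not the repository's actual test code; the real accepted wording comes from the A/B evidence.

```python
import re

# Illustrative terms only; replace with the hierarchy terms the A/B run accepted.
REQUIRED_IN_ORDER = ["foreground hand", "office chair"]
# Anchors that do not belong in a coworking/office scene.
SCENE_INCOMPATIBLE = ["bedding", "bedroom"]
# Assumed word list for spotting negative-conditioning phrasing.
NEGATIVE_CONDITIONING = re.compile(r"\b(no|not|without|avoid)\b", re.IGNORECASE)


def prompt_problems(prompt: str) -> list[str]:
    """Return the reasons a final formatter prompt should fail the regression."""
    lowered = prompt.lower()
    problems = []
    position = -1
    for term in REQUIRED_IN_ORDER:
        # Search only after the previous term so the hierarchy order is enforced.
        found = lowered.find(term, position + 1)
        if found == -1:
            problems.append(f"missing or out-of-order hierarchy term: {term}")
        else:
            position = found
    for anchor in SCENE_INCOMPATIBLE:
        if anchor in lowered:
            problems.append(f"scene-incompatible anchor: {anchor}")
    if NEGATIVE_CONDITIONING.search(prompt):
        problems.append("negative-conditioning wording in the positive prompt")
    return problems
```

A regression test would then assert that `prompt_problems(...)` returns an empty list for the prompt produced by the real formatter entry point, which is not named here.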
|
||||
|
||||
After the route patch, run a generated-route probe through `sxcp_eval_out` with
|
||||
the same sampler seed when feasible. Use the actual formatter output, not a
|
||||
hand-normalized prompt. If the generated route regresses compared with the
|
||||
manual prompt-axis winner, record the failed generated-route image as the
|
||||
baseline, tighten the route wording, and validate again before logging the
|
||||
candidate as generated-route evidence.
|
||||
|
||||
For location-specific wins, split the implementation:
|
||||
|
||||
- the action or role graph owns the pose/contact hierarchy;
|
||||
- the final Krea formatter owns scene-compatible anchor expansion because it can
|
||||
see the selected scene, camera, and composition;
|
||||
- existing route phrases that downstream tests rely on should be preserved
|
||||
inside the stronger wording when they do not conflict with the A/B evidence.
|
||||
|
||||
## MCP Command Memory
|
||||
|
||||
Use the checked helper instead of ad hoc Python snippets for bridge calls. The
|
||||
approved command prefix is:
|
||||
|
||||
```bash
/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py
```
|
||||
|
||||
Common calls:
|
||||
|
||||
```bash
/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py list-tools
/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py call-tool comfy_pull --arguments-json '{"channel":"sxcp_eval_in"}'
/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py call-tool comfy_push --arguments-json '{"channel":"sxcp_eval_out","seed":5656565656,"text":"PROMPT_ONLY_POSITIVE_CONDITIONING"}'
```
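
When a prompt contains quotes, building the `--arguments-json` payload by hand is error-prone. The sketch below assembles the same calls as argument lists so that no shell quoting is involved. The function names `build_push_argv` and `build_pull_argv` are hypothetical; only the interpreter path, helper path, subcommands, and JSON keys are taken from the commands above.

```python
import json

# Interpreter and helper path copied from the approved command prefix.
HELPER_PREFIX = ["/media/p5/miniforge3/bin/python", "tools/sxcp_mcp_client.py"]


def build_push_argv(prompt: str, seed: int, channel: str = "sxcp_eval_out") -> list[str]:
    """Return the argv for a comfy_push call (hypothetical wrapper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    arguments = {"channel": channel, "seed": seed, "text": prompt}
    return HELPER_PREFIX + ["call-tool", "comfy_push", "--arguments-json", json.dumps(arguments)]


def build_pull_argv(channel: str = "sxcp_eval_in") -> list[str]:
    """Return the argv for a comfy_pull call (hypothetical wrapper)."""
    arguments = {"channel": channel}
    return HELPER_PREFIX + ["call-tool", "comfy_pull", "--arguments-json", json.dumps(arguments)]


if __name__ == "__main__":
    # Print only; pass the list to subprocess.run(argv, check=True) to execute it.
    print(build_push_argv("PROMPT_ONLY_POSITIVE_CONDITIONING", 5656565656))
```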
|
||||
|
||||
For batched prompt-axis search, prepare a JSON batch and use the offline command
|
||||
renderer before touching the bridge manually:
|
||||
|
||||
```bash
python tools/sxcp_prompt_batch.py validate --batch-json /tmp/sxcp-batch.json
python tools/sxcp_prompt_batch.py print-push-commands --batch-json /tmp/sxcp-batch.json
python tools/sxcp_prompt_batch.py print-result-template --batch-json /tmp/sxcp-batch.json
python tools/sxcp_prompt_batch.py run-batch --batch-json /tmp/sxcp-batch.json --result-json /tmp/sxcp-results.json --previous-turn 80 --run
python tools/sxcp_prompt_batch.py validate-results --batch-json /tmp/sxcp-batch.json --result-json /tmp/sxcp-results.json
python tools/sxcp_prompt_batch.py print-eval-entry-draft --batch-json /tmp/sxcp-batch.json --result-json /tmp/sxcp-results.json --variant-key pov_example_variant --baseline-image /absolute/baseline.png --candidate-id controlled_subject_first
```
|
||||
|
||||
Use `run-batch --run` for normal batch execution. It pushes one positive prompt,
|
||||
polls `sxcp_eval_in` until the turn advances and an absolute PNG appears with
|
||||
the fixed sampler seed, writes the filled result JSON, then sends the next
|
||||
prompt. Omit `--run` for a dry-run command preview. Run `validate-results` after
|
||||
the batch and before drafting evidence. It checks that every probe returned a
|
||||
new ordered turn, an absolute PNG image path, and the same sampler seed as the
|
||||
batch. This keeps batched prompt search as image-presence collection first and
|
||||
bulk analysis second.
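
The three checks described above can be sketched as follows. This is not the implementation of `validate-results`; the result field names (`turn`, `image`, `seed`) and the function name `check_results` are assumptions made for illustration, and the real result JSON schema may differ.

```python
from pathlib import PurePosixPath


def check_results(batch_seed: int, previous_turn: int, results: list[dict]) -> list[str]:
    """Return a list of problems; an empty list means the batch passed.

    Each result is assumed to look like {"turn": int, "image": str, "seed": int}.
    """
    problems = []
    last_turn = previous_turn
    for index, result in enumerate(results):
        # Turns must be new and strictly increasing across the batch.
        turn = result.get("turn")
        if not isinstance(turn, int) or turn <= last_turn:
            problems.append(f"probe {index}: turn {turn!r} does not advance past {last_turn}")
        else:
            last_turn = turn
        # The image must be an absolute path ending in .png.
        image = str(result.get("image") or "")
        path = PurePosixPath(image)
        if not path.is_absolute() or path.suffix.lower() != ".png":
            problems.append(f"probe {index}: image {image!r} is not an absolute PNG path")
        # Every probe must report the batch's fixed sampler seed.
        if result.get("seed") != batch_seed:
            problems.append(f"probe {index}: seed {result.get('seed')!r} differs from batch seed {batch_seed}")
    return problems
```

Collecting every problem instead of stopping at the first one lets a single pass show whether a batch failed on ordering, artifacts, or seed drift.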
|
||||
|
||||
Before drafting evidence, compare atlas references and generated images for
|
||||
spatial orientation, not only limb/contact similarity. First decide the
|
||||
atlas's surface and camera-height relationship, then check whether the
|
||||
generated background supports the same read. Use the background as a
|
||||
camera-height witness: ceiling, upper walls, and high partition lines usually
|
||||
support a low viewer looking upward; floor, carpet, table tops, platform edges,
|
||||
or furniture behind the body can reveal a higher camera, seated support, or a
|
||||
different surface. If the atlas target has the viewer flat on his back or the
|
||||
partner mounted over him, do not accept a candidate only because contact is
|
||||
clear; the room geometry must also support that flat/low read. Reject the
|
||||
candidate before generator mirroring when the background says the bodies are on
|
||||
a different surface or at a different height than the atlas.
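
As a rough illustration of the camera-height witness idea, the sketch below turns a reviewer's list of observed background cues into a coarse read and compares it with the atlas read. The cue sets come from the paragraph above; the majority-vote rule, the label strings, and the function names are assumptions, and the real judgment remains a visual review.

```python
# Cues that usually support a low viewer looking upward.
LOW_VIEWER_CUES = {"ceiling", "upper walls", "high partition lines"}
# Cues that can reveal a higher camera, seated support, or a different surface.
HIGH_CAMERA_CUES = {"floor", "carpet", "table tops", "platform edges", "furniture behind the body"}


def background_read(observed: set[str]) -> str:
    """Classify the background by simple majority of the observed cues (assumed rule)."""
    low = len(observed & LOW_VIEWER_CUES)
    high = len(observed & HIGH_CAMERA_CUES)
    if low > high:
        return "low_viewer"
    if high > low:
        return "higher_camera_or_other_surface"
    return "ambiguous"


def accept_orientation(atlas_read: str, observed: set[str]) -> bool:
    """Accept only when the background read matches the atlas read exactly."""
    return background_read(observed) == atlas_read
```

An `ambiguous` read never matches an atlas read here, so a candidate with mixed cues is held back for a closer look rather than accepted on contact clarity alone.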
|
||||
|
||||
`print-eval-entry-draft` rejects `geometry_only` candidates by default. Use
|
||||
`--allow-geometry-only` only when the entry is explicitly labeled as
|
||||
non-controlled prompt-axis evidence rather than subject/look-controlled A/B
|
||||
evidence.
|
||||
|
||||
Keep `sxcp_eval_out` prompt-only and positive-only. Do not use
|
||||
`sxcp_eval_negative_out` for Krea2 tuning.
|
||||
|
||||
## Generator-Patch Evidence Matrix
|
||||
|
||||
Do prompt and image exploration before editing production generator wording. A
|
||||
normal pose-wording generator patch needs all of this evidence first:
|
||||
|
||||
- at least three distinct source cases with different visible subjects;
|
||||
- at least two sampler seeds, unless the source prompt is structurally wrong
|
||||
before rendering;
|
||||
- location-family coverage when the proposed wording changes scene anchors;
|
||||
- one baseline and one candidate per source case, with subject, location family,
|
||||
camera family, and sampler seed fixed inside each pair;
|
||||
- positive-only candidate prompts, with no negative-conditioning phrases in the
|
||||
positive prompt.
|
||||
|
||||
A generated-route probe that works before the full matrix is useful evidence.
|
||||
If it is the best tested improvement when leaving the category, it can become a
|
||||
`provisional_generator_patch` with final prompt regression coverage. It should
|
||||
not become a proven `generator_patch` decision until the matrix repeats and the
|
||||
final generated prompt is regression-tested.
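
The gap between a provisional and a proven patch can be sketched as a small decision function. The dataclass fields and function names are hypothetical. The location-family requirement is left out because the matrix does not give a count for it, and the structural exception is applied to the seed requirement only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairEvidence:
    """One fixed-seed baseline/candidate pair (field names are illustrative)."""
    source_case: str
    subject: str
    seed: int
    candidate_beats_baseline: bool
    positive_only: bool


def matrix_gaps(pairs: list[PairEvidence], structurally_wrong: bool = False) -> list[str]:
    """List which generator-patch matrix requirements are still unmet."""
    wins = [p for p in pairs if p.candidate_beats_baseline and p.positive_only]
    gaps = []
    if len({p.source_case for p in wins}) < 3 or len({p.subject for p in wins}) < 3:
        gaps.append("need three distinct source cases with different visible subjects")
    if not structurally_wrong and len({p.seed for p in wins}) < 2:
        gaps.append("need at least two sampler seeds")
    if any(not p.positive_only for p in pairs):
        gaps.append("candidate prompts must be positive-only")
    return gaps


def decision(pairs: list[PairEvidence], structurally_wrong: bool = False,
             regression_tested: bool = False) -> str:
    """Map the evidence to a decision label used in this document."""
    if regression_tested and not matrix_gaps(pairs, structurally_wrong):
        return "generator_patch"
    if regression_tested and any(p.candidate_beats_baseline and p.positive_only for p in pairs):
        return "provisional_generator_patch"
    return "candidate"
```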
|
||||
|
||||
## Hard-Pose Exploration Budget
|
||||
|
||||
Use this budget for atlas poses where early prompt-only results repeatedly miss
|
||||
the core spatial read.
|
||||
|
||||
- Define the failure threshold before the run. The default threshold is about
|
||||
fifty positive-only prompt tries across distinct wording axes before declaring
|
||||
the pose text-insufficient or moving it to a stronger-control bucket.
|
||||
- Run the search in batches, usually six to twelve prompts at a time. Send each
|
||||
prompt through `sxcp_eval_out`, wait for the image path, then analyze the
|
||||
batch together instead of overreacting to one render.
|
||||
- Keep a short axis ledger for each batch: intended wording axis, seed, source
|
||||
subject, best image, repeated failure mode, and words that literalized or
|
||||
harmed the result.
|
||||
- Treat a small failed batch as direction, not a conclusion. If a batch shows a
|
||||
repeated failure such as head height, camera height, viewer/partner elevation,
|
||||
or background-plane mismatch, the next batch should vary that axis directly.
|
||||
- Stop early only for a strong positive result that is worth repeating on a
|
||||
second source or seed, or for a hard technical blocker. A weak but improving
|
||||
result should feed the next wording batch rather than ending the category.
|
||||
- If the threshold run finds a repeatable partial that is materially better
|
||||
than baseline, accept the partial target explicitly and mirror only that
|
||||
generator-safe improvement. Keep the route candidate and mark the evidence as
|
||||
needing expansion when the full atlas target is still unsolved.
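
One way to keep the axis ledger and track the budget is sketched below. The field names follow the ledger items listed above, but the class, the function, and the rule that more than one axis must be covered before concluding are assumptions for illustration.

```python
from dataclasses import dataclass, field

DEFAULT_BUDGET = 50  # "about fifty" positive-only tries, per the default threshold


@dataclass
class BatchLedger:
    """Axis ledger for one batch (field names are illustrative)."""
    axis: str
    seed: int
    source_subject: str
    prompts_sent: int
    best_image: str = ""
    repeated_failure: str = ""
    harmful_words: list[str] = field(default_factory=list)


def budget_status(ledgers: list[BatchLedger], budget: int = DEFAULT_BUDGET) -> dict:
    """Summarize how much of the exploration budget has been used."""
    tries = sum(entry.prompts_sent for entry in ledgers)
    axes = sorted({entry.axis for entry in ledgers})
    return {
        "tries": tries,
        "remaining": max(budget - tries, 0),
        "axes": axes,
        # Assumed rule: conclude only after the budget is spent across several axes.
        "may_conclude": tries >= budget and len(axes) > 1,
    }
```

The summary only says when a conclusion is allowed; whether the pose is text-insufficient or has a repeatable partial is still decided from the images.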
|
||||
|
||||
## Current Fingering Test Pattern
|
||||
|
||||
The prior bedding-based fingering prompt is invalid as a general rule because
|
||||
it solved a lower-foreground artifact by adding bedroom context to an office
|
||||
scene. The corrected test pattern keeps the coworking location intact:
|
||||
|
||||
- baseline: generic POV fingering/manual-contact wording from the same source
|
||||
case;
|
||||
- candidate: foreground hand first, open-thigh geometry second, visible woman
|
||||
face/torso third, office chair and coworking depth fourth;
|
||||
- anchors: black office chair seat/arms, desk edge, laptop table corners, glass
|
||||
partitions, repeated desk rows, plants, tall-window depth;
|
||||
- rejection trigger: any result that fixes contact by changing the scene family
|
||||
instead of improving the pose hierarchy.
|
||||
|
||||
## Improvement Log
|
||||
|
||||
- `2026-06-30`: Added side-camera/result-label separation after ballsucking
|
||||
seed `5757575757` produced attractive low side-camera oral views while still
|
||||
collapsing the requested contact object onto the shaft/glans. Future scoring
|
||||
should record that as side-view oral evidence and keep target-contact evidence
|
||||
separate.
|
||||
- `2026-06-30`: Added generated-route validation discipline after footjob turn
|
||||
`183` kept large foreground soles but hid the shaft/contact that manual probes
|
||||
had preserved. Future provisional generator patches should render the exact
|
||||
final Krea prompt once after the code change; if shared route wording adds
|
||||
limiting positive-channel language, clean it before sending the validation
|
||||
prompt.
|
||||
- `2026-06-30`: Added a hard-pose exploration budget after ballsucking wording
|
||||
tests produced only eight early probes before the first weak-case note. Future
|
||||
hard text-only poses should use batched wording-axis search and aim for about
|
||||
fifty positive-only tries before concluding the pose needs stronger control.
|
||||
- `2026-06-30`: Added partial-acceptance discipline after ballsucking produced
|
||||
repeatable tongue/lips-on-testicles results that beat the shaft/glans
|
||||
baseline but did not fully solve mouth-wrapped contact. Future hard-pose exits
|
||||
should preserve repeatable progress as a provisional generator patch while
|
||||
keeping the remaining miss in the expansion queue.
|
||||
- `2026-06-30`: Added ballsucking target-object refinement after sampler seed
|
||||
`9797979797` repeated the `scrotal skin is the nearest mouth surface` branch
|
||||
on turns `288` and `293`. Score target-object ownership separately from the
|
||||
side-low camera family: a route can preserve face/thigh geometry while still
|
||||
drifting to shaft/base contact. Avoid promoting balls-first center-object
|
||||
wording when it creates multi-subject or body-layout artifacts.
|
||||
- `2026-06-30`: Added ballsucking generated-route validation after sampler seed
|
||||
`9898989898` repeated the patched scrotal-skin route on turns `296` and
|
||||
`297`. Validation can accept a provisional target-object improvement while
|
||||
still keeping the pose queued when the remaining miss is full mouth-wrapped
|
||||
testicle contact.
|
||||
- `2026-06-30`: Added ballsucking fresh weak-case evidence after sampler seed
|
||||
`5959595959` tested lip-oval, sideways mouth pocket, and chin-pelvis upward
|
||||
seal wording across three women. The batch preserved low-pelvis/cheek-thigh
|
||||
geometry in places, but every branch returned to shaft/glans collapse or
|
||||
generic oral contact. Do not retry those axes as generator defaults; the next
|
||||
search should change the target-object control strategy rather than adding
|
||||
more mouth-shape synonyms.
|
||||
- `2026-06-30`: Added ballsucking occlusion weak-case evidence after sampler
|
||||
seed `6060606060` tested foreground occlusion, under-scrotum tongue shelf,
|
||||
and hand-guided scrotum wording across three women. The generated route
|
||||
remained the best partial while those axes became shaft-centered or
|
||||
hand/shaft-dominant. Do not retry occlusion or hand-support synonyms as
|
||||
generator defaults; the next useful move is a different target-object strategy
|
||||
or stronger control.
|
||||
- `2026-06-30`: Added ballsucking mouth-axis mixed-case evidence after sampler
|
||||
seed `6161616161` tested exact mouth-sucking, single-testicle, hanging balls
|
||||
below shaft, side-mouth wrap, and chin-pelvis lower-mouth wording across
|
||||
three women. The generated-route controls stayed the best repeated partials
|
||||
on two subjects, side-mouth and chin-pelvis variants produced isolated useful
|
||||
partials, and the rest drifted back to shaft/glans contact. Record isolated
|
||||
partials as axis hints, but do not patch generator wording unless a branch
|
||||
repeats across subjects or beats the generated-route controls.
|
||||
- `2026-06-30`: Added ballsucking pelvis-valley weak-case evidence after
|
||||
sampler seed `7171717171` tested flat pelvis-valley, thigh tunnel,
|
||||
pubic-hair mouth-line, low-cushion chin-anchor, and pelvis-edge target-first
|
||||
wording across three women. The flat pelvis-valley branch repeated a strong
|
||||
body-plane correction on three subjects, matching the atlas viewer-flat
|
||||
thigh-wall read better, but it stayed shaft-centered. Score body-plane
|
||||
orientation and target-object contact separately; do not patch a route when
|
||||
it improves orientation while regressing the target.
|
||||
- `2026-06-30`: Stopped the ballsucking text-only loop after sampler seed
|
||||
`7272727272` combined `flat-valley scrotal-skin` target wording with the
|
||||
prior side-low route across three women. The hybrid repeated the body-plane
|
||||
hint on turns `368`, `374`, and `380`, but the target stayed shaft-centered,
|
||||
while side-low flat-valley variants only gave look hints. Preserve the
|
||||
current side-low scrotal-skin partial, do not patch the hybrid axes, and move
|
||||
future full-target work toward stronger pose/control evidence rather than
|
||||
more positive-prompt synonyms.
|
||||
- `2026-06-30`: Promoted blowjob side-profile POV after sampler seed
|
||||
`5858585858` produced a three-woman generated-route repeat on turns `298`,
|
||||
`301`, and `304`. When the current generated route repeats across multiple
|
||||
subjects on a fresh seed and alternate branches do not beat it cleanly, mark
|
||||
the route proven instead of continuing to queue it. Keep attractive
|
||||
side-camera-style self-body crop results as a separate look branch when they
|
||||
risk drifting toward external side framing.
|
||||
- `2026-06-29`: Added the multisource/generator-safe method after an overfit
|
||||
single-character coworking test produced a visually usable but invalid
|
||||
bedding foreground. Future A/B runs must test at least two source cases before
|
||||
promoting wording that is meant to become a durable guide or generator rule.
|
||||
- `2026-06-29`: Added generator mirroring discipline after the accepted
|
||||
fingering wording proved Krea2 behavior but not generator output. Future
|
||||
mirroring changes need a red-green regression at final Krea formatter output,
|
||||
not just a guide entry.
|
||||
- `2026-06-29`: Tightened generator-patch promotion after the fingering
|
||||
generated-route probe looked good but had too little image coverage. Future
|
||||
pose-wording generator edits need a broader seed, subject, and location matrix
|
||||
before production route code changes.
|
||||
- `2026-06-29`: Added semantic-axis discipline after source 52 fingering tests.
|
||||
If a candidate succeeds by changing ownership, viewpoint, location family, or
|
||||
role semantics, record it as a weak-case or prompt note unless that semantic
|
||||
change is the intended generator behavior. Do not count it as direct evidence
|
||||
for the original route even when the image is visually cleaner.
|
||||
- `2026-06-29`: Added provisional generator-patch discipline after the user
|
||||
clarified that leaving a category should still carry forward same-seed progress
|
||||
over baseline. Future category exits should patch the generator with the best
|
||||
generator-safe improvement, record it as `provisional_generator_patch`, and
|
||||
keep the catalog route as `candidate` until repeated evidence proves it.
|
||||
- `2026-06-29`: Applied the category-exit rule to spread/open-thigh presentation
|
||||
after two source subjects improved on the same sampler seed. For setup poses
|
||||
that are not structurally broken before rendering, prefer at least two source
|
||||
subjects before mirroring a provisional generator patch, and keep the
|
||||
observation explicit about remaining weak points such as insufficient V-frame
|
||||
width or outfit closure.
|
||||
- `2026-06-29`: Applied the same category-exit rule to blowjob top-view after
|
||||
two source subjects improved on sampler seed `4242424242`. When the baseline is already usable,
|
||||
record the improvement narrowly: name the axis that got better, keep the route
|
||||
candidate, and avoid overstating the finding as proven until another seed
|
||||
repeats it.
|
||||
- `2026-06-29`: Corrected blowjob top-view criteria after atlas review and a
|
||||
same-seed source-`46` probe showed that vertical shaft alignment alone can
|
||||
still render as frontal/eye-height oral. Future top-view evidence must show
|
||||
steep overhead camera geometry: viewer abdomen at the lower edge, camera
|
||||
looking down from above the viewer chest/abdomen, and the woman's hair crown,
|
||||
shoulders, and hands visible from above.
|
||||
- `2026-06-29`: Refined blowjob top-view prompt-axis search after the user
|
||||
rejected horizontally biased probes. Run several prompt-only probes before
|
||||
editing the generator, wait for `sxcp_eval_in` to advance to the new turn, and
|
||||
compare each image against the atlas verticality criteria. The useful axis is
|
||||
`nadir-angle` or `bird's-eye` plus standing male POV, nearby floor plane
|
||||
dominating the image, one woman directly below between the viewer's feet, and
|
||||
top-down office anchors. Avoid `plumb-line` and `map` in generator prompts
|
||||
because Krea2 can literalize them as drawn graphics.
|
||||
- `2026-06-29`: For quick wording-axis search, prefer a batched prompt-probe
|
||||
loop before analysis-heavy iteration. Prepare several positive-only alternate
|
||||
prompts that isolate likely wording axes, send them one at a time through
|
||||
`sxcp_eval_out` with the same sampler seed, pull only until each new
`sxcp_eval_in` turn and image path exist, then inspect the returned images as
|
||||
a batch. Use the bulk comparison to pick the best axis, identify literalized
|
||||
or harmful words, and only then update the generator, guide, catalog, or eval
|
||||
log.
|
||||
- `2026-06-29`: Preserve prompt-order controls when testing anything beyond
|
||||
rough pose-axis discovery. Prompts that start with pose geometry and omit or
|
||||
move the subject/look block can reduce female-look adherence, so treat those
|
||||
runs as geometry-only probes. Durable A/B prompts should keep the original
|
||||
subject/look description first, then the pose hierarchy, then location and
|
||||
style/background anchors, unless the test is explicitly about prompt-order
|
||||
sensitivity.
|
||||
- `2026-06-29`: Added result-validation discipline to the batched prompt helper.
|
||||
After sending a batch, fill the result template from `sxcp_eval_in`, run
|
||||
`validate-results`, and only then draft evidence. The validation step proves
|
||||
each probe returned an ordered turn, an absolute PNG artifact, and the fixed
|
||||
sampler seed before bulk analysis or log-entry drafting.
|
||||
- `2026-06-29`: Added `run-batch` automation to the batched prompt helper. It
|
||||
removes manual push/pull copy-paste from normal A/B runs while keeping the same
|
||||
gates: positive-only prompts, fixed sampler seed, turn advancement, absolute
|
||||
PNG image path, and `validate-results` before evidence drafting.
|
||||
- `2026-06-29`: Split missionary subcases after turns `77`-`84`. Turns `76` and
|
||||
`80` are valid angled/cushion missionary results, not failures. The flatter
|
||||
atlas examples need a different positive axis: woman flat across an elevated
|
||||
table/platform, viewer standing or braced at the foot edge, and viewer feet,
|
||||
shins, or side-dropping legs placed below the support edge. Patch this only
|
||||
into the raised-edge/edge-supported route; keep generic missionary available
|
||||
for angled valid views.
|
||||
- `2026-06-29`: Folded-missionary tuning on seed `8989898989` used two
|
||||
subject-first batches before code changes. Turns `85`-`88` showed that
|
||||
compact knee-block and vertical-thigh-column wording can produce the folded
|
||||
high-leg geometry, but the shaft/contact disappears when knees and feet lead
|
||||
the hierarchy. Turns `89`-`92` then tested contact-first variants; turn `89`
|
||||
was accepted because it placed the viewer lower abdomen and large centered
|
||||
shaft/contact before the compact folded-knee block. This confirms the
|
||||
method: use the first batch to identify the failed axis, run a targeted
|
||||
second batch, then mirror only the accepted generator-safe hierarchy as a
|
||||
provisional patch.
|
||||
- `2026-06-29`: Frontal cowgirl on seed `8989898989` used a
baseline-plus-variants batch instead of comparing against a previous category. Turn `93`
|
||||
was a valid generic cowgirl baseline, so turn `95`'s wide horizontal thigh
|
||||
bridge improvement became a prompt-guide rule rather than a generator patch.
|
||||
When the baseline already hits the pose, record the useful atlas refinement
|
||||
and leave the generator unchanged unless repeated evidence shows a systemic
|
||||
weakness.
|
||||
- `2026-06-29`: Cowgirl-alt on seed `8989898989` exposed a spatial-orientation
|
||||
blind spot. Turns `97`-`100` had readable contact and squat-like knees, but
|
||||
the background still read as a platform/high-camera setup. After rechecking
|
||||
the atlas, turns `101`-`104` tested flat-supine viewer wording with ceiling
|
||||
and upper-room cues; turn `104` was accepted. Future pose analysis must
|
||||
compare atlas and generated room geometry before accepting an image.
|
||||
- `2026-06-29`: Reverse cowgirl on seed `8989898989` showed that a correct
|
||||
semantic label such as `facing away` can be ignored when the visual hierarchy
|
||||
still resembles frontal cowgirl. Future back-facing straddle tests should
|
||||
score facing direction before contact quality and should name the back, hips,
|
||||
and ass as the nearest largest shapes before viewer-leg and contact details.
|
||||
Treat over-shoulder glances as secondary refinements only after the
|
||||
back-facing straddle is already locked.
|
||||
- `2026-06-29`: Reverse-cowgirl-alt on seed `8989898989` confirmed that atlas
|
||||
sibling folders can need separate generator routes even when the baseline is
|
||||
already valid. Normal reverse cowgirl is close back/hip dominant; reverse-alt
|
||||
is upright seated with vertical back/shoulders and viewer hands or thighs
|
||||
forming the lower frame. Keep those prompt hierarchies separate instead of
|
||||
merging all back-facing woman-on-top evidence into one route.
|
||||
- `2026-06-29`: Added non-target-viewpoint discipline after blowjob side-profile
|
||||
oral produced an attractive side-camera result on seed `5656565656`. If a
|
||||
render is visually useful but reads as a different camera family, record it as
|
||||
a weak case for a future route and do not mirror it into the current POV
|
||||
generator path.
|
||||
- `2026-06-29`: Added MCP command memory after repeated context loss around the
|
||||
bridge workflow. Future A/B calls should use the checked helper command
|
||||
`/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py ...`, with
|
||||
`comfy_push` to `sxcp_eval_out` for prompt-only positive conditioning and
|
||||
`comfy_pull` from `sxcp_eval_in` for returned prompt/image/seed payloads.
|
||||
- `2026-06-29`: Added side-profile oral ownership discipline after source `46`
|
||||
improved with explicit adult-male foreground ownership while source `47`
|
||||
rejected a related `body-axis` cue by transferring the body surface to the
|
||||
woman. Future side-profile tests should name the foreground owner repeatedly
|
||||
and verify that the woman's body stays lateral before considering any
|
||||
generator mirroring.
|
||||
- `2026-06-30`: Promoted the side-profile oral lateral-edge body-line axis
|
||||
after sampler seed `9753197531` repeated it across two visible women. Pure
|
||||
male-body-axis wording can expose the male as a photographed subject or let
|
||||
Krea2 transfer the central body surface away from the intended first-person
|
||||
view. Future generator patches should combine adult-male foreground ownership
|
||||
with explicit lateral entry from the left edge, mouth at the male abdomen
|
||||
line, and hand under the lips; keep the route provisional until another
|
||||
seed/source expansion repeats it.
|
||||
- `2026-06-30`: Added side-profile oral generated-route contact validation
|
||||
after turn `206` kept the male body-line geometry but let the mouth float
|
||||
above the shaft while the hand became the contact anchor. Turn `207` improved
|
||||
after adding lips-touching and mouth-to-shaft-contact priority. Future
|
||||
generated-route validation for oral side-profile should score both viewpoint
|
||||
ownership and which body part actually anchors the contact.
|
||||
- `2026-06-30`: Added the side-profile oral lower-right torso anchor after
|
||||
sampler seed `9595959595` repeated it on turns `279` and `283` across two
|
||||
visible women. The useful wording makes the adult male viewer's own torso
|
||||
start at the lower edge and run diagonally into the lower-right foreground,
|
||||
with navel, abdomen hair, pelvis, and near thigh marking the camera owner's
|
||||
body. Prefer this over generic body-axis wording, which can expose the male
|
||||
as a photographed side subject or transfer the axis onto the woman.
|
||||
- `2026-06-30`: Added side-profile oral generated-route validation after
|
||||
sampler seed `9696969696` repeated the patched route on turns `284` and
|
||||
`285`. Count generated-route validation separately from prompt-axis search:
|
||||
it proves the formatter can carry the new wording, while promotion still
|
||||
requires broader source/seed evidence.
|
||||
- `2026-06-30`: Promoted normal frontal cowgirl from guide-only to provisional
|
||||
generator patch after seed `2828282828` repeated the wide-thigh bridge axis
|
||||
across two visible women. When the baseline is already valid, a generator
|
||||
patch is still appropriate if a later seed repeats a narrow atlas refinement
|
||||
that improves geometry without harming subject/look, contact, or setting.
|
||||
Generated-route turn `216` validated the patched formatter route with viewer
|
||||
hands on outer thighs, wide foreground thigh bridge, upright torso, centered
|
||||
contact, and coworking depth. Keep the route candidate until another
|
||||
source/seed repeats the refinement.
|
||||
- `2026-06-29`: Applied the category-exit rule to blowjob laying frontal after
|
||||
source `46` and source `50` improved on sampler seed `6767676767`. When
|
||||
baselines are already strong, preserve the exact improved axis: wide V-frame and low-horizontal torso hierarchy, while noting residual high-hip posture and
|
||||
keeping the generator patch provisional until another seed repeats it.
|
||||
- `2026-06-29`: Applied the category-exit rule to blowjob sitting upright after
source `46` and source `50` improved on sampler seed `7878787878`. When a
baseline preserves the seated pose but floats the face above the contact
point, prefer low-mouth seated hierarchy over generic `mouth aligned` wording:
face lowered to the exact center contact point, open mouth covering the
centered tip, and hands directly at the base. Record outfit looseness/drift as
residual risk and keep the generator patch provisional until another seed
repeats it.
+117
-13
@@ -5,6 +5,9 @@ ComfyUI sends a generated prompt, image, and seed to Codex, Codex analyzes the
result, then sends back exactly one edited prompt for the next A/B test.
Confirmed findings become either generator changes or durable prompt rules in
[`krea2-prompt-guide.md`](krea2-prompt-guide.md).
The active A/B testing method is recorded in
[`krea2-ab-methodology.md`](krea2-ab-methodology.md); update that memory when
the method improves.

## Channels
@@ -14,6 +17,76 @@ Confirmed findings become either generator changes or durable prompt rules in
the MCP signal when supported. Do not put analysis here.
- `sxcp_eval_log`: optional analysis/log channel.

## MCP Helper Command

Use the checked helper for bridge calls instead of ad hoc Python snippets. The
approved command prefix is:

```bash
/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py
```

Common calls:

```bash
/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py list-tools
/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py call-tool comfy_pull --arguments-json '{"channel":"sxcp_eval_in"}'
/media/p5/miniforge3/bin/python tools/sxcp_mcp_client.py call-tool comfy_push --arguments-json '{"channel":"sxcp_eval_out","seed":5656565656,"text":"PROMPT_ONLY_POSITIVE_CONDITIONING"}'
```
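
Hand-quoting the `--arguments-json` payload gets fragile once the prompt text contains quotes or apostrophes. The sketch below shows one way to compose the `comfy_push` command line from Python data. `build_push_command` is a hypothetical wrapper for illustration and is not part of `tools/sxcp_mcp_client.py`; only the command prefix, subcommand, and flag are taken from the calls above.

```python
import json
import shlex

# Command prefix copied from the section above.
HELPER_PREFIX = ["/media/p5/miniforge3/bin/python", "tools/sxcp_mcp_client.py"]


def build_push_command(seed: int, text: str, channel: str = "sxcp_eval_out") -> str:
    """Return a shell-safe comfy_push command line (hypothetical wrapper)."""
    arguments = {"channel": channel, "seed": seed, "text": text}
    argv = HELPER_PREFIX + [
        "call-tool",
        "comfy_push",
        "--arguments-json",
        # Compact JSON, matching the style of the documented calls.
        json.dumps(arguments, separators=(",", ":")),
    ]
    # shlex.join quotes each argument so the JSON survives the shell intact.
    return shlex.join(argv)


if __name__ == "__main__":
    print(build_push_command(5656565656, "PROMPT_ONLY_POSITIVE_CONDITIONING"))
```

Serializing with `json.dumps` and quoting with `shlex.join` keeps the seed as transport metadata in the payload rather than in the prompt text.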

## Batch Prompt Helper

For prompt-axis batches, prepare a local JSON file and use the offline helper to
render the approved MCP push/pull commands and an image-presence checklist:

```bash
python tools/sxcp_prompt_batch.py validate --batch-json /tmp/sxcp-batch.json
python tools/sxcp_prompt_batch.py print-push-commands --batch-json /tmp/sxcp-batch.json
python tools/sxcp_prompt_batch.py print-result-template --batch-json /tmp/sxcp-batch.json
python tools/sxcp_prompt_batch.py run-batch --batch-json /tmp/sxcp-batch.json --result-json /tmp/sxcp-results.json --previous-turn 80 --run
python tools/sxcp_prompt_batch.py validate-results --batch-json /tmp/sxcp-batch.json --result-json /tmp/sxcp-results.json
python tools/sxcp_prompt_batch.py print-eval-entry-draft --batch-json /tmp/sxcp-batch.json --result-json /tmp/sxcp-results.json --variant-key pov_example_variant --baseline-image /absolute/baseline.png --candidate-id controlled_subject_first
```

Batch files use the fixed sampler seed and one positive prompt per probe:

```json
{
  "seed": 8989898989,
  "channel_out": "sxcp_eval_out",
  "channel_in": "sxcp_eval_in",
  "probes": [
    {
      "id": "controlled_subject_first",
      "prompt_order": "subject_first",
      "text": "SUBJECT_LOOK_FIRST. POSE_HIERARCHY. LOCATION_ANCHORS."
    },
    {
      "id": "rough_geometry_axis",
      "prompt_order": "geometry_only",
      "text": "POSE_AXIS_ONLY_FOR_DISCOVERY."
    }
  ]
}
```

`geometry_only` probes are for rough pose-axis discovery and are not durable
subject/look-controlled A/B evidence. The helper rejects
`sxcp_eval_negative_out`; keep batch prompts positive-only.
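
As a rough illustration of the structural checks a batch file like the one above could go through, here is a minimal sketch. It assumes only the fields shown in the example JSON; `validate_batch` and its error messages are hypothetical and do not claim to match what `tools/sxcp_prompt_batch.py validate` actually does.

```python
from typing import Any

# Channel name the section above says the helper rejects.
REJECTED_CHANNELS = {"sxcp_eval_negative_out"}


def validate_batch(batch: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the batch looks usable."""
    errors: list[str] = []

    seed = batch.get("seed")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(seed, int) or isinstance(seed, bool):
        errors.append("seed must be an integer sampler seed")

    for key in ("channel_out", "channel_in"):
        channel = batch.get(key)
        if not isinstance(channel, str) or not channel:
            errors.append(f"{key} must be a non-empty string")
        elif channel in REJECTED_CHANNELS:
            errors.append(f"{key} must not be {channel}")

    probes = batch.get("probes")
    if not isinstance(probes, list) or not probes:
        errors.append("probes must be a non-empty list")
        return errors

    seen_ids: set[str] = set()
    for index, probe in enumerate(probes):
        probe_id = probe.get("id") if isinstance(probe, dict) else None
        if not isinstance(probe_id, str) or not probe_id:
            errors.append(f"probe {index} needs a string id")
            continue
        if probe_id in seen_ids:
            errors.append(f"duplicate probe id: {probe_id}")
        seen_ids.add(probe_id)
        # Each probe carries exactly one positive prompt plus its ordering tag.
        for field in ("prompt_order", "text"):
            value = probe.get(field)
            if not isinstance(value, str) or not value.strip():
                errors.append(f"probe {probe_id} needs a non-empty {field}")
    return errors
```

Returning a list of messages instead of raising on the first problem lets one pass report every issue in a batch file.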

Use `run-batch --run` to push one positive prompt, poll `sxcp_eval_in` until a
new turn and absolute PNG image path appear with the fixed sampler seed, write
the filled result JSON, then send the next probe. Omit `--run` for a dry-run
command preview. After a live run, run `validate-results`; it requires the
result probe ids to match the batch order, each turn to advance in batch order,
every image path to be an absolute PNG artifact, and every returned seed to
match the fixed sampler seed. Then use `print-eval-entry-draft` to create a
valid `krea2-eval-log.json` entry draft. Replace the generated summaries and
observation with the real visual comparison before recording it with
`tools/krea2_record_eval.py`. By default the draft command rejects
`geometry_only` candidates; pass `--allow-geometry-only` only when deliberately
recording non-controlled prompt-axis evidence.
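
The four `validate-results` requirements listed above can be sketched as follows. The result JSON shape used here (a `probes` list whose rows carry `id`, `turn`, `image_path`, and `seed`) is an assumption made for illustration, and `validate_results` is a hypothetical function, not the helper's real implementation.

```python
from pathlib import PurePosixPath
from typing import Any


def validate_results(
    batch: dict[str, Any], results: dict[str, Any], previous_turn: int
) -> list[str]:
    """Check result rows against the batch; return a list of problems."""
    errors: list[str] = []

    # 1. Result probe ids must match the batch order.
    batch_ids = [probe["id"] for probe in batch["probes"]]
    rows = results.get("probes", [])
    result_ids = [row.get("id") for row in rows]
    if result_ids != batch_ids:
        errors.append(f"probe ids {result_ids} do not match batch order {batch_ids}")

    last_turn = previous_turn
    for row in rows:
        label = row.get("id", "<missing id>")

        # 2. Each turn must advance past the previous one.
        turn = row.get("turn")
        if not isinstance(turn, int) or turn <= last_turn:
            errors.append(f"{label}: turn {turn!r} does not advance past {last_turn}")
        else:
            last_turn = turn

        # 3. Every image path must be an absolute PNG path.
        image_path = PurePosixPath(str(row.get("image_path") or ""))
        if not image_path.is_absolute() or image_path.suffix.lower() != ".png":
            errors.append(f"{label}: image path must be an absolute PNG path")

        # 4. Every returned seed must equal the fixed sampler seed.
        if row.get("seed") != batch["seed"]:
            errors.append(f"{label}: seed {row.get('seed')!r} differs from batch seed")
    return errors
```

This sketch only inspects the path string; it does not check that the PNG file still exists on disk.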

## Manual Loop

Start the helper after sending a test prompt:
@@ -30,23 +103,30 @@ Every three minutes it prints a structured request asking Codex to:
4. Compare it to the prompt and previous edit.
5. Push one prompt-only edit to `sxcp_eval_out`, preserving the same seed through
   the MCP signal when available.
-6. Classify the finding as prompt-only, prompt-guide rule, or generator fix.
-7. Change generator code/data only when the issue is systemic.
-8. Record the finding and update the Krea2 prompt guide when a rule is confirmed.
6. Classify the finding as prompt-only, prompt-guide rule, provisional generator
   improvement, or proven generator fix.
7. When leaving a category after same-seed progress over baseline, mirror the
   best generator-safe wording into the responsible generator path as
   `provisional_generator_patch`.
8. Promote a generator change to proven only when the issue is systemic,
   repeated, or structurally wrong before rendering.
9. Record the finding and update the Krea2 prompt guide when a rule is confirmed.

Runtime logs are written under `.sxcp_eval/` and ignored by git.

Durable fixed-seed findings that justify a guide rule, generator patch, or pose
variant promotion are recorded in [`krea2-eval-log.json`](krea2-eval-log.json).
Method changes belong in [`krea2-ab-methodology.md`](krea2-ab-methodology.md).
Use runtime logs for scratch notes; use the JSON log only for evidence that
should remain tied to a catalog variant. Image paths in that log point at
external ComfyUI artifacts and may be cleaned; the durable evidence is the fixed
-seed, prompt summaries, observation, decision, and commit.
sampler seed, optional generator seed, prompt summaries, observation, decision,
and commit.

Record durable findings with the checked helper instead of hand-editing the log:

```bash
python tools/krea2_record_eval.py --print-template --variant-key pov_footjob_frontal_sole_stroke --seed 1234 > /tmp/krea2-entry.json
python tools/krea2_record_eval.py --print-template --variant-key pov_footjob_frontal_sole_stroke --seed 1234 --generator-seed 5678 > /tmp/krea2-entry.json
python tools/krea2_record_eval.py --entry-json /tmp/krea2-entry.json --dry-run
python tools/krea2_record_eval.py --entry-json /tmp/krea2-entry.json
```
@@ -59,6 +139,7 @@ Entry template:
  "date": "2026-06-29",
  "variant_key": "pov_example_variant",
  "seed": 1234,
  "generator_seed": 5678,
  "source": "sxcp_eval_mcp",
  "result": "accepted",
  "decision": "generator_patch",
@@ -141,22 +222,45 @@ pelvis can be correct. Do not treat them as automatic failures.
## Seed Contract

-The seed is transport metadata, not prompt text. When the graph emits a seed, an
-A/B wording test should reuse that exact seed so the image difference mostly
-comes from wording, not sampling randomness. If a payload has no seed, mark that
The sampler seed is transport metadata, not prompt text. When the graph emits a
sampler seed, an A/B wording test should reuse that exact seed so the image
difference mostly comes from wording, not sampling randomness. If the SxCP
generator/control seed differs from the sampler seed, record it as
`generator_seed` in the eval entry. If a payload has no sampler seed, mark that
cycle as uncontrolled and avoid turning the result into a durable generator rule
without another controlled run.
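
A small sketch of how the seed contract might translate into eval-entry fields. The payload keys (`seed`, `generator_seed`) and the `controlled` flag are assumptions chosen for illustration; only the entry field names `seed` and `generator_seed` come from the entry template.

```python
from typing import Any


def seed_fields(payload: dict[str, Any]) -> dict[str, Any]:
    """Derive seed-related entry fields from an incoming payload (hypothetical)."""
    sampler_seed = payload.get("seed")
    if sampler_seed is None:
        # No sampler seed: the cycle is uncontrolled and should not
        # become a durable generator rule without another controlled run.
        return {"controlled": False}

    fields: dict[str, Any] = {"controlled": True, "seed": sampler_seed}
    generator_seed = payload.get("generator_seed")
    # Record the generator/control seed only when it differs from the sampler seed.
    if generator_seed is not None and generator_seed != sampler_seed:
        fields["generator_seed"] = generator_seed
    return fields
```

With this shape, a payload carrying `seed` 1234 and `generator_seed` 5678 yields both fields, while a payload without a sampler seed is flagged as uncontrolled.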

## Positive-Only Conditioning

`sxcp_eval_out` is positive conditioning only. Never send negative-conditioning
phrases such as `no shaft`, `no hands`, `without clothing`, or `avoid X` inside
the positive prompt; distilled Krea2 can reinforce or hallucinate the unwanted
object from that wording.
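
A prompt can be linted for this kind of wording before it is pushed. The sketch below is a simple heuristic that flags `no ...`, `without ...`, and `avoid ...` phrases; the pattern list and the `find_negative_phrases` name are assumptions for illustration, and the check will miss other negations and may flag harmless text.

```python
import re

# Heuristic patterns for the negative-conditioning phrasings named above.
NEGATIVE_PATTERNS = [
    re.compile(r"\bno\s+\w+", re.IGNORECASE),
    re.compile(r"\bwithout\s+\w+", re.IGNORECASE),
    re.compile(r"\bavoid\s+\w+", re.IGNORECASE),
]


def find_negative_phrases(prompt: str) -> list[str]:
    """Return negative-conditioning phrases found in a positive prompt."""
    hits: list[str] = []
    for pattern in NEGATIVE_PATTERNS:
        hits.extend(match.group(0) for match in pattern.finditer(prompt))
    return hits


if __name__ == "__main__":
    sample = "desk edge and window depth, no glare, avoid clutter"
    print(find_negative_phrases(sample))
```

A non-empty result is a cue to rewrite the prompt so it states the wanted content positively.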

This loop has no active negative-output channel. A same-positive, same-seed
probe on seed `424242` compared empty negative conditioning against strong
negative text targeting visible prompt attributes, and the rendered image stayed
visually unchanged. Do not rely on negative conditioning for Krea2 pose tuning;
keep prompt fixes positive-only.

## Generator Fix Rule

-Only edit the generator when the image shows a repeatable, systemic prompt
-failure. Examples:
Use two levels of generator change:

- `provisional_generator_patch`: apply the best generator-safe wording when
  leaving a category after fixed-seed progress over baseline. Keep the catalog
  variant as `candidate`.
- `generator_patch`: promote as a proven/default generator rule when the issue
  is repeated, systemic, or structurally wrong before rendering.

Examples of proven generator fixes:

- Selfie wording overrides orbit camera.
- Clothing continuity loses the selected softcore outfit.
- POV wording makes the off-camera participant the visual subject.
- Location camera layout inserts foreground anchors in the wrong place.

-For one-off model drift, send a cleaner prompt to `sxcp_eval_out` and keep the
-generator unchanged. For repeated prompt behavior, update the generator and add
-the rule to `docs/krea2-prompt-guide.md`.
For one-off model drift inside an active category, send a cleaner prompt to
`sxcp_eval_out` and keep collecting evidence. When exiting a category, carry
forward same-seed improvements over baseline as provisional generator changes
and add the rule or weak case to `docs/krea2-prompt-guide.md`.
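
The two-level rule can be read as a small decision function. This is a simplified sketch: the boolean inputs and the `prompt_only` fallback label are assumptions for illustration, while `provisional_generator_patch` and `generator_patch` are the decision names used in this section.

```python
def classify_generator_change(
    *,
    same_seed_improvement: bool,
    generator_safe: bool,
    leaving_category: bool,
    repeated_or_systemic: bool,
) -> str:
    """Map evidence flags to a decision label (simplified, hypothetical)."""
    # Repeated, systemic, or structurally wrong issues justify a proven patch.
    if repeated_or_systemic:
        return "generator_patch"
    # On category exit, carry forward same-seed wins as provisional patches.
    if leaving_category and same_seed_improvement and generator_safe:
        return "provisional_generator_patch"
    # Otherwise keep iterating on the prompt and collecting evidence.
    return "prompt_only"
```

A one-off drift inside an active category falls through to `prompt_only`, which matches the guidance to send a cleaner prompt and keep collecting evidence.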