# Krea2 Prompt Guide

This document records prompt rules discovered from actual SxCP generator
outputs tested in Krea2. It is not a generic prompt cookbook. Add a rule only
when an A/B image comparison shows that the wording improves or breaks Krea2
behavior.

## Core Rule

Krea2 responds best when the prompt gives one clear visual hierarchy:

1. subject/cast descriptor,
2. action or pose,
3. clothing state,
4. location,
5. camera/layout,
6. expression,
7. composition/crop,
8. style.

Avoid letting two sections describe incompatible camera or framing intents.
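
The ordering above can be sketched as a small assembly helper. This is a minimal illustration, not the generator's actual code: the `SECTION_ORDER` keys and the `assemble_prompt` function are hypothetical names chosen for this example, and the comma separator is an assumption.

```python
# Hypothetical section keys mirroring the hierarchy above; the real generator
# may name or structure its sections differently.
SECTION_ORDER = [
    "subject",
    "action",
    "clothing",
    "location",
    "camera",
    "expression",
    "composition",
    "style",
]


def assemble_prompt(sections: dict[str, str]) -> str:
    """Join non-empty sections in hierarchy order, separated by commas."""
    unknown = set(sections) - set(SECTION_ORDER)
    if unknown:
        raise KeyError(f"unknown prompt sections: {sorted(unknown)}")
    parts = [
        sections[key].strip()
        for key in SECTION_ORDER
        if sections.get(key, "").strip()
    ]
    return ", ".join(parts)
```

Because the output order comes from `SECTION_ORDER` rather than from the input dict, a caller cannot accidentally emit style or camera text ahead of the subject.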

## Prompt Output Contract

- `sxcp_eval_out` must contain only the prompt being tested.
- Analysis, scoring, and generator notes belong in chat or `sxcp_eval_log`.
- Keep one experiment variable per cycle when possible.
- Lock seed, character, location, and camera when testing wording changes.
- Treat the MCP seed as transport metadata. Preserve it for prompt-only A/B tests
  and do not write it into the visible prompt text.
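
One way to keep these fields separate is sketched below. Only `sxcp_eval_out` and `sxcp_eval_log` come from this guide; the `build_eval_payload` function, the `seed` key, and the regex used to detect a leaked seed are assumptions made for illustration.

```python
import re


def build_eval_payload(prompt: str, seed: int, notes: list[str]) -> dict:
    """Keep the prompt, the seed, and the analysis notes in separate fields."""
    # Assumption: the word "seed" followed by digits counts as a leaked seed.
    if re.search(r"\bseed\b\s*[:=]?\s*\d+", prompt, flags=re.IGNORECASE):
        raise ValueError("seed must not appear in the visible prompt text")
    return {
        "sxcp_eval_out": prompt.strip(),    # prompt only, nothing else
        "sxcp_eval_log": "\n".join(notes),  # analysis, scoring, generator notes
        "seed": seed,                       # transport metadata (hypothetical key)
    }
```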

## Seed-Controlled A/B Tests

Use one fixed seed when deciding whether prompt wording helped Krea2. A single
image can justify a prompt-only retry when the mismatch is obvious, but a
generator rule needs stronger support: either repeated evidence across renders
or a generated prompt that is already structurally wrong before it is rendered.

When a workflow batches soft/hard prompts through an index switch, sidecar text
files may not be the exact prompt used for each rendered image. If the sidecar
and image disagree, inspect the PNG workflow metadata and the final text encode
input before patching the generator.

When reviewing an eval payload, log:

- emitted seed,
- original generated prompt,
- edited prompt,
- image failure or improvement,
- whether the change should stay prompt-only or become a generator patch.
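
The review fields above could be captured as one structured record per cycle. The sketch below is an assumption about shape only: `EvalReview`, its field names, and the two disposition labels are hypothetical, and the guide does not prescribe a log format.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical labels for the final decision listed above.
DISPOSITIONS = ("prompt-only", "generator-patch")


@dataclass
class EvalReview:
    seed: int              # emitted seed
    original_prompt: str   # prompt as generated
    edited_prompt: str     # prompt after manual edits
    outcome: str           # image failure or improvement, in plain words
    disposition: str       # one of DISPOSITIONS

    def __post_init__(self) -> None:
        if self.disposition not in DISPOSITIONS:
            raise ValueError(f"disposition must be one of {DISPOSITIONS}")


def to_log_line(review: EvalReview) -> str:
    """Serialize a review as one JSON line with stable key order."""
    return json.dumps(asdict(review), sort_keys=True)
```

Writing one JSON line per review keeps the original and edited prompts side by side, which makes it easier to see later whether a fix stayed prompt-only or was promoted to a generator patch.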

## Camera And Composition

### Orbit / Multiangle Camera

When Krea2 receives an orbit or multiangle camera, avoid selfie-specific wording
unless the intended camera is actually a handheld or mirror selfie.

Works better:

- `lifestyle portrait frame`
- `creator portrait frame`
- `outfit-check pose`
- `wide environmental coworking camera layout`
- `camera placed several meters away`
- `full seated body from head to knees`
- `room depth surrounding the subject`

Conflicting wording:

- `selfie frame`
- `phone selfie`
- `holding the phone`
- `creator-shot phone photo`
- `handheld camera realism`

Observed result: selfie words pulled a back-right elevated wide shot into an
arm-length selfie. Removing selfie terms made the image follow the rear-quarter
view much better.
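
A simple pre-render check can flag this conflict. The sketch below uses the conflicting phrases listed above; the `find_selfie_conflicts` function and the camera mode labels are hypothetical and would need to match whatever the workflow actually calls its camera modes.

```python
# Conflicting phrases taken from the list above.
SELFIE_TERMS = (
    "selfie frame",
    "phone selfie",
    "holding the phone",
    "creator-shot phone photo",
    "handheld camera realism",
)

# Hypothetical labels for camera modes that should not carry selfie wording.
CONTROLLED_CAMERAS = {"orbit", "multiangle"}


def find_selfie_conflicts(prompt: str, camera_mode: str) -> list[str]:
    """Return selfie phrases found in a prompt that uses a controlled camera."""
    if camera_mode not in CONTROLLED_CAMERAS:
        return []  # handheld or mirror selfies may legitimately use these terms
    lowered = prompt.lower()
    return [term for term in SELFIE_TERMS if term in lowered]
```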

### Wide Shots

Krea2 tends to keep attractive subjects large in frame. To get a real wide or
environmental frame, be explicit about distance and visible environment.

Useful phrasing:

- `camera placed several meters away across the desk aisle`
- `full seated body from head to knees remains visible`
- `nearby desk edge, laptop corner, repeated desk rows, and tall-window depth clearly readable`
- `wide environmental room framing`

Avoid relying on `wide shot` alone.
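
As a rough heuristic, a prompt that asks for a wide frame can be checked for both a distance cue and an environment cue. The cue lists below are assumptions drawn from the phrasing in this section, and `wide_shot_is_underspecified` is a hypothetical helper rather than part of the generator.

```python
# Assumed cue substrings, based on the useful phrasing listed above.
DISTANCE_CUES = ("meters away",)
ENVIRONMENT_CUES = ("environmental", "room depth", "desk rows", "clearly readable")


def wide_shot_is_underspecified(prompt: str) -> bool:
    """True when a prompt asks for a wide frame without distance and environment cues."""
    lowered = prompt.lower()
    if "wide" not in lowered:
        return False  # not a wide-frame prompt, nothing to check
    has_distance = any(cue in lowered for cue in DISTANCE_CUES)
    has_environment = any(cue in lowered for cue in ENVIRONMENT_CUES)
    return not (has_distance and has_environment)
```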

## Location Layout

Location-aware camera text works when it describes the room around the subject
without stealing the foreground from the subject.

For the coworking lounge:

- Keep `warm desks`, `laptop tables`, `glass partition seams`, `repeated desk rows`,
  `plants`, and `tall windows`.
- Mention foreground anchors only when the camera should actually see them.
- In POV, keep location anchors beside or behind the bodies, not in the lower
  foreground.

## Clothing Continuity

When a softcore outfit is reused in a later branch, name what happens to actual
outfit pieces instead of using generic fabric language.

Works better:

- `denim shorts are pulled aside or removed below the hips`
- `button-down shirt tied at the waist and fitted bralette remain visible from the same outfit`

Avoid generic fallback wording:

- `fabric slipping off`
- `partly exposed`
- `outfit pushed aside where needed`

Use generic wording only when no source outfit exists.

## POV Outercourse

### Boobjob / Titjob

The atlas examples are frontal and upright: the visible partner faces the viewer,
kneels between the viewer's thighs, and compresses the shaft between the breasts.
Forward-bent wording can still place the body correctly, but it weakens the
breast contact.

Works better:

- `POV boobjob position`
- `woman kneels upright between his legs facing him`
- `penis rises vertically in the lower foreground`
- `squeezed between her pressed-together breasts`
- `woman's own fingers and nails cup her breasts from the outside`
- `glans emerging above the cleavage directly below her mouth`

Avoid vague or conflicting wording:

- `torso bent forward over his pelvis`
- `both hands push her breasts` without naming whose hands
- `only foreground hands` when the intended hands are the woman's hands

## POV

In POV prompts, the visible subject should still be established first. The POV
participant is the camera viewpoint, not a normal visible cast member.

Works better:

- visible subject descriptor first,
- then POV action,
- then foreground hands/body/clothing cues.

For POV clothing, describe only visible body/clothing fragments:

- `foreground hands, hips, thighs, or lowered waistband`
- `foreground hands, forearms, sleeves, or torso edge`

Avoid:

- full third-person `Man A wears...` phrasing for the POV participant,
- making `the viewer` the first subject before the visible character is
  established.

For POV climax wording, the fluid target must be derived from the pose, and the
pose takes precedence over expression tokens. Rear-entry, doggy, bent-over,
face-down, and on-all-fours poses should target the ass, thighs, and lower back
even if the expression detail mentions face, lips, mouth, or tongue.

Evidence:

- Dataset seed `52` generated an internally contradictory prompt: on-all-fours
  rear-view positioning paired with a face/chest ejaculation target.
- Corrected seed `52` and follow-up seed `5202` both rendered the rear-view
  target consistently when the wording used `across her ass, thighs, and lower
  back` and kept the clothing state tied to the lower garment.

### POV Doggy / Rear-Entry

For doggy-style POV, visible viewer thighs, lower torso, or pelvis can be
correct. Real POV references often show them. The goal is not to remove the
viewer's body but to make the body cues read as a standing or crouched
first-person viewpoint instead of a vague seated pose.

To push the reference closer to a standing or crouched man looking down, use a
top-down rear-entry structure:

- `top-down standing POV doggy position from behind`
- `camera looks down over the viewer's extended hands onto the woman's raised hips`
- `woman is on all fours with chest low, forearms folded, cheek turned sideways`
- `rear-entry penetration visible between raised hips`
- `face and mouth remain far ahead, clearly separated from the penis`

Do not use visible shoes or lower legs as the standing cue. Seed `65` showed
that adding shoes/lower legs made Krea2 drift into oral contact and lose the
rear-entry geometry.

Do not over-prompt `viewer torso and thighs outside frame`; seeds `65` and
`6602` showed Krea2 still draws lower-body POV cues, and real references support
that. Prefer framing them as plausible foreground body cues rather than trying
to suppress them.

## Style

Style should describe rendering, not camera mechanics.

Use style presets to choose between:

- natural photo,
- creator/social-media photo,
- documentary/direct-flash photo,
- cinematic realism,
- illustration/comic.

If a controlled camera is active, avoid style suffixes that imply a conflicting
camera such as `phone photo` or `handheld selfie`.

## Guide Update Format

When adding a new rule, include:

- observed prompt,
- observed image failure,
- edited prompt wording,
- image improvement or regression,
- generator path if known,
- final rule.