
# Krea2 Prompt Guide

This document records prompt rules discovered from actual SxCP generator
outputs tested in Krea2. It is not a generic prompt cookbook. Add a rule only
when an A/B image comparison shows that the wording improves or breaks Krea2
behavior.

## Core Rule

Krea2 responds best when the prompt gives one clear visual hierarchy:

1. subject/cast descriptor,
2. action or pose,
3. clothing state,
4. location,
5. camera/layout,
6. expression,
7. composition/crop,
8. style.

Avoid letting two sections describe incompatible camera or framing intents.
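
The ordering above can be sketched as a small assembly helper. This is a minimal illustration, not the generator's actual code: the section keys in `SECTION_ORDER` and the function `assemble_prompt` are hypothetical names chosen for this example.

```python
# Hypothetical section keys; the order mirrors the numbered hierarchy above.
SECTION_ORDER = [
    "subject",
    "action",
    "clothing",
    "location",
    "camera",
    "expression",
    "composition",
    "style",
]


def assemble_prompt(sections: dict) -> str:
    """Join the non-empty sections in the fixed hierarchy order."""
    unknown = set(sections) - set(SECTION_ORDER)
    if unknown:
        raise KeyError(f"unknown prompt sections: {sorted(unknown)}")
    parts = []
    for name in SECTION_ORDER:
        text = sections.get(name, "").strip()
        if text:
            parts.append(text)
    return ", ".join(parts)


# Sections can be supplied in any order; the output order is always the same.
prompt = assemble_prompt({
    "style": "natural photo",
    "subject": "woman in her thirties",
    "camera": "camera placed several meters away",
    "location": "coworking lounge with tall windows",
})
print(prompt)
```

Keeping the order in one list means a camera or framing phrase can only enter through its own slot, which makes conflicting intents easier to spot.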

## Prompt Output Contract

- `sxcp_eval_out` must contain only the prompt being tested.
- Analysis, scoring, and generator notes belong in chat or `sxcp_eval_log`.
- Keep one experiment variable per cycle when possible.
- Lock seed, character, location, and camera when testing wording changes.
- Treat the MCP seed as transport metadata. Preserve it for prompt-only A/B tests
  and do not write it into the visible prompt text.

## Seed-Controlled A/B Tests

Use one fixed seed when deciding whether prompt wording helped Krea2. A single
image can justify a prompt-only retry when the mismatch is obvious. A generator
rule needs more: either repeated evidence across renders, or a generated prompt
that is already structurally wrong before it is rendered.

When reviewing an eval payload, log:

- emitted seed,
- original generated prompt,
- edited prompt,
- image failure or improvement,
- whether the change should stay prompt-only or become a generator patch.
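
One way to keep these fields together is a small record type. The sketch below is an assumption about shape only: `EvalLogEntry`, its field names, and the two disposition strings are hypothetical, and the sample values are placeholders rather than real eval data.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical labels for the last bullet above.
DISPOSITIONS = ("prompt-only", "generator-patch")


@dataclass
class EvalLogEntry:
    seed: int               # emitted seed, kept out of the prompt text
    original_prompt: str    # prompt as generated
    edited_prompt: str      # prompt after the wording change
    observation: str        # image failure or improvement
    disposition: str        # one of DISPOSITIONS

    def __post_init__(self) -> None:
        if self.disposition not in DISPOSITIONS:
            raise ValueError(f"disposition must be one of {DISPOSITIONS}")


entry = EvalLogEntry(
    seed=12345,  # placeholder value
    original_prompt="selfie frame, wide environmental coworking camera layout",
    edited_prompt="lifestyle portrait frame, wide environmental coworking camera layout",
    observation="image followed the rear-quarter view after the edit",
    disposition="prompt-only",
)
log_line = json.dumps(asdict(entry))
print(log_line)
```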

## Camera And Composition

### Orbit / Multiangle Camera

When Krea2 receives an orbit or multiangle camera, avoid selfie-specific wording
unless the intended camera is actually a handheld or mirror selfie.

Works better:

- `lifestyle portrait frame`
- `creator portrait frame`
- `outfit-check pose`
- `wide environmental coworking camera layout`
- `camera placed several meters away`
- `full seated body from head to knees`
- `room depth surrounding the subject`

Conflicting wording:

- `selfie frame`
- `phone selfie`
- `holding the phone`
- `creator-shot phone photo`
- `handheld camera realism`

Observed result: selfie words pulled a back-right elevated wide shot into an
arm's-length selfie. With the selfie terms removed, the image followed the
rear-quarter view much more closely.
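
The conflicting wording listed above lends itself to a simple lint pass before a prompt is sent. This is a sketch under assumptions: the camera mode labels in `CONTROLLED_CAMERAS` and the function `find_camera_conflicts` are hypothetical, and plain substring matching is only a first approximation.

```python
# Terms copied from the "Conflicting wording" list above.
SELFIE_TERMS = (
    "selfie frame",
    "phone selfie",
    "holding the phone",
    "creator-shot phone photo",
    "handheld camera realism",
)

# Hypothetical labels for camera modes where selfie wording conflicts.
CONTROLLED_CAMERAS = {"orbit", "multiangle"}


def find_camera_conflicts(prompt: str, camera_mode: str) -> list:
    """Return selfie terms found in the prompt when a controlled camera is active."""
    if camera_mode not in CONTROLLED_CAMERAS:
        return []  # handheld or mirror selfie cameras may keep selfie wording
    lowered = prompt.lower()
    return [term for term in SELFIE_TERMS if term in lowered]


text = "lifestyle portrait frame, phone selfie, room depth surrounding the subject"
print(find_camera_conflicts(text, "orbit"))
print(find_camera_conflicts(text, "handheld"))
```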

### Wide Shots

Krea2 tends to keep attractive subjects large in frame. To get a real wide or
environmental frame, be explicit about distance and visible environment.

Useful phrasing:

- `camera placed several meters away across the desk aisle`
- `full seated body from head to knees remains visible`
- `nearby desk edge, laptop corner, repeated desk rows, and tall-window depth clearly readable`
- `wide environmental room framing`

Avoid relying on `wide shot` alone.

## Location Layout

Location-aware camera text works when it describes the room around the subject
without stealing the foreground from the subject.

For the coworking lounge location:

- Keep `warm desks`, `laptop tables`, `glass partition seams`, `repeated desk rows`,
  `plants`, and `tall windows`.
- Mention foreground anchors only when the camera should actually see them.
- In POV, keep location anchors beside or behind the bodies, not in the lower
  foreground.

## Clothing Continuity

When a softcore outfit is reused in a later branch, name what happens to actual
outfit pieces instead of using generic fabric language.

Works better:

- `denim shorts are pulled aside or removed below the hips`
- `button-down shirt tied at the waist and fitted bralette remain visible from the same outfit`

Avoid generic fallback wording:

- `fabric slipping off`
- `partly exposed`
- `outfit pushed aside where needed`

Use generic wording only when no source outfit exists.
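
The fallback rule can be sketched as follows. The function `clothing_phrase` and its parameters are hypothetical names for illustration; the generator may structure outfit data differently.

```python
def clothing_phrase(changed: dict, kept: list, generic: str = "partly exposed") -> str:
    """Describe named outfit pieces; use generic wording only with no source outfit.

    changed maps an outfit piece to what happens to it;
    kept lists pieces from the same outfit that stay visible.
    """
    if not changed and not kept:
        return generic  # no source outfit exists
    parts = [f"{piece} {state}" for piece, state in changed.items()]
    if kept:
        verb = "remains" if len(kept) == 1 else "remain"
        parts.append(f"{' and '.join(kept)} {verb} visible from the same outfit")
    return ", ".join(parts)


phrase = clothing_phrase(
    changed={"denim shorts": "are pulled aside"},
    kept=["button-down shirt tied at the waist", "fitted bralette"],
)
print(phrase)
print(clothing_phrase({}, []))
```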

## POV

In POV prompts, the visible subject should still be established first. The POV
participant is the camera viewpoint, not a normal visible cast member.

Works better:

- visible subject descriptor first,
- then POV action,
- then foreground hands/body/clothing cues.

For POV clothing, describe only visible body/clothing fragments:

- `foreground hands, hips, thighs, or lowered waistband`
- `foreground hands, forearms, sleeves, or torso edge`

Avoid:

- full third-person `Man A wears...` phrasing for the POV participant,
- making `the viewer` the first subject before the visible character is
  established.
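
A minimal sketch of the POV ordering, assuming a hypothetical `build_pov_prompt` helper and a simple check for a leading `the viewer`; the subject and action strings are placeholders.

```python
def build_pov_prompt(subject: str, pov_action: str, foreground_cues: str) -> str:
    """Order POV text as: visible subject, POV action, foreground fragments."""
    parts = [subject, pov_action, foreground_cues]
    return ", ".join(p.strip() for p in parts if p.strip())


def starts_with_viewer(prompt: str) -> bool:
    """Flag prompts that open with the POV participant instead of the subject."""
    return prompt.strip().lower().startswith("the viewer")


pov_prompt = build_pov_prompt(
    "woman with short dark hair seated at a laptop table",
    "POV view from across the table",
    "foreground hands, forearms, sleeves, or torso edge",
)
print(pov_prompt)
print(starts_with_viewer(pov_prompt))
```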

## Style

Style should describe rendering, not camera mechanics.

Use style presets to choose between:

- natural photo,
- creator/social-media photo,
- documentary/direct-flash photo,
- cinematic realism,
- illustration/comic.

If a controlled camera is active, avoid style suffixes that imply a conflicting
camera such as `phone photo` or `handheld selfie`.

## Guide Update Format

When adding a new rule, include:

- observed prompt,
- observed image failure,
- edited prompt wording,
- image improvement or regression,
- generator path if known,
- final rule.
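
A rule entry with these fields could be rendered from a small template. This is an illustrative sketch: `RULE_FIELDS`, `render_rule`, and the key names are hypothetical, and the sample values paraphrase the orbit camera observation recorded earlier in this guide.

```python
# (key, label) pairs in the order listed above; key names are hypothetical.
RULE_FIELDS = [
    ("observed_prompt", "Observed prompt"),
    ("image_failure", "Observed image failure"),
    ("edited_wording", "Edited prompt wording"),
    ("result", "Image improvement or regression"),
    ("generator_path", "Generator path"),
    ("final_rule", "Final rule"),
]


def render_rule(entry: dict) -> str:
    """Render a rule as a markdown bullet list, requiring every field but the path."""
    lines = []
    for key, label in RULE_FIELDS:
        value = entry.get(key, "").strip()
        if not value:
            if key != "generator_path":
                raise ValueError(f"missing required field: {key}")
            value = "unknown"  # the generator path is optional ("if known")
        lines.append(f"- {label}: {value}")
    return "\n".join(lines)


rule_text = render_rule({
    "observed_prompt": "orbit camera prompt containing `selfie frame`",
    "image_failure": "wide shot collapsed into an arm-length selfie",
    "edited_wording": "`lifestyle portrait frame`",
    "result": "image followed the rear-quarter view",
    "final_rule": "avoid selfie wording with orbit or multiangle cameras",
})
print(rule_text)
```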