Add Krea2 evaluation loop
@@ -2,3 +2,4 @@ __pycache__/
 *.py[cod]
 .pytest_cache/
 .ruff_cache/
+.sxcp_eval/
@@ -0,0 +1,150 @@
# Krea2 Prompt Guide

This document records prompt rules discovered from actual SxCP generator
outputs tested in Krea2. It is not a generic prompt cookbook. Add a rule only
when an A/B image comparison shows that the wording improves or breaks Krea2
behavior.

## Core Rule

Krea2 responds best when the prompt gives one clear visual hierarchy:

1. subject/cast descriptor,
2. action or pose,
3. clothing state,
4. location,
5. camera/layout,
6. expression,
7. composition/crop,
8. style.

Avoid letting two sections describe incompatible camera or framing intents.
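
The ordering above can be sketched as a small shell helper. This is only an illustration, assuming the generator hands over one text fragment per section; `build_prompt` and the sample fragments are hypothetical and are not part of the SxCP generator.

```bash
#!/usr/bin/env bash
# Hypothetical helper: joins prompt sections in the hierarchy order listed above.
# The fragments passed in below are illustrative placeholders, not generator output.
build_prompt() {
  local prompt="" part
  # Arguments are expected in order: subject, action, clothing, location,
  # camera, expression, composition, style. Empty sections are skipped.
  for part in "$@"; do
    [ -n "$part" ] || continue
    if [ -n "$prompt" ]; then
      prompt="$prompt, $part"
    else
      prompt="$part"
    fi
  done
  printf '%s\n' "$prompt"
}

build_prompt \
  "adult woman with short dark hair" \
  "seated at a desk, typing" \
  "button-down shirt" \
  "coworking lounge with tall windows" \
  "camera placed several meters away" \
  "" \
  "full seated body from head to knees" \
  "natural photo"
```

Keeping one fragment per section makes it easier to spot when two sections carry competing camera or framing wording.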

## Prompt Output Contract

- `sxcp_eval_out` must contain only the prompt being tested.
- Analysis, scoring, and generator notes belong in chat or `sxcp_eval_log`.
- Keep one experiment variable per cycle when possible.
- Lock seed, character, location, and camera when testing wording changes.

## Camera And Composition

### Orbit / Multiangle Camera

When Krea2 receives an orbit or multiangle camera, avoid selfie-specific wording
unless the intended camera is actually a handheld or mirror selfie.

Works better:

- `lifestyle portrait frame`
- `creator portrait frame`
- `outfit-check pose`
- `wide environmental coworking camera layout`
- `camera placed several meters away`
- `full seated body from head to knees`
- `room depth surrounding the subject`

Conflicting wording:

- `selfie frame`
- `phone selfie`
- `holding the phone`
- `creator-shot phone photo`
- `handheld camera realism`

Observed result: selfie words pulled a back-right elevated wide shot into an
arm-length selfie. Removing selfie terms made the image follow the rear-quarter
view much better.
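
A prompt can be checked for the conflicting terms before it is sent. The sketch below is an assumption about how such a check might look, not an existing tool: `has_selfie_terms` is a hypothetical name, and the term list is copied from the "Conflicting wording" list above.

```bash
#!/usr/bin/env bash
# Hypothetical lint: flags selfie wording in a prompt meant for an orbit camera.
# The term list mirrors "Conflicting wording" above; extend it as rules are confirmed.
has_selfie_terms() {
  local prompt="$1" term
  local terms=(
    "selfie frame"
    "phone selfie"
    "holding the phone"
    "creator-shot phone photo"
    "handheld camera realism"
  )
  for term in "${terms[@]}"; do
    # Fixed-string, case-insensitive match; prints the first offending term.
    if printf '%s\n' "$prompt" | grep -qiF -- "$term"; then
      printf '%s\n' "$term"
      return 0
    fi
  done
  return 1
}

prompt="lifestyle portrait frame, phone selfie, camera placed several meters away"
if term="$(has_selfie_terms "$prompt")"; then
  echo "conflict: '$term' found in an orbit-camera prompt" >&2
fi
```

The check only reports wording; whether the term actually breaks the image still has to be confirmed by an A/B comparison.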

### Wide Shots

Krea2 tends to keep attractive subjects large in frame. To get a real wide or
environmental frame, be explicit about distance and visible environment.

Useful phrasing:

- `camera placed several meters away across the desk aisle`
- `full seated body from head to knees remains visible`
- `nearby desk edge, laptop corner, repeated desk rows, and tall-window depth clearly readable`
- `wide environmental room framing`

Avoid relying on `wide shot` alone.

## Location Layout

Location-aware camera text works when it describes the room around the subject
without stealing the foreground from the subject.

For the coworking lounge:

- Keep `warm desks`, `laptop tables`, `glass partition seams`, `repeated desk rows`,
  `plants`, and `tall windows`.
- Mention foreground anchors only when the camera should actually see them.
- In POV, keep location anchors beside or behind the bodies, not in the lower
  foreground.

## Clothing Continuity

When a softcore outfit is reused in a later branch, name what happens to actual
outfit pieces instead of using generic fabric language.

Works better:

- `denim shorts are pulled aside or removed below the hips`
- `button-down shirt tied at the waist and fitted bralette remain visible from the same outfit`

Avoid generic fallback wording:

- `fabric slipping off`
- `partly exposed`
- `outfit pushed aside where needed`

Use generic wording only when no source outfit exists.

## POV

In POV prompts, the visible subject should still be established first. The POV
participant is the camera viewpoint, not a normal visible cast member.

Works better:

- visible subject descriptor first,
- then POV action,
- then foreground hands/body/clothing cues.

For POV clothing, describe only visible body/clothing fragments:

- `foreground hands, hips, thighs, or lowered waistband`
- `foreground hands, forearms, sleeves, or torso edge`

Avoid:

- full third-person `Man A wears...` phrasing for the POV participant,
- making `the viewer` the first subject before the visible character is
  established.

## Style

Style should describe rendering, not camera mechanics.

Use style presets to choose between:

- natural photo,
- creator/social-media photo,
- documentary/direct-flash photo,
- cinematic realism,
- illustration/comic.

If a controlled camera is active, avoid style suffixes that imply a conflicting
camera such as `phone photo` or `handheld selfie`.

## Guide Update Format

When adding a new rule, include:

- observed prompt,
- observed image failure,
- edited prompt wording,
- image improvement or regression,
- generator path if known,
- final rule.
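
One way to keep new entries consistent is to start from a skeleton that lists the six fields above. This is a minimal sketch under assumptions: `new_rule_entry` is a hypothetical helper, and the `###` heading and field labels are an illustrative layout, not a format the guide prescribes.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints a markdown skeleton for a new guide rule.
# The heading level and field labels are illustrative layout choices.
new_rule_entry() {
  local title="$1"
  cat <<EOF

### $title

- Observed prompt:
- Observed image failure:
- Edited prompt wording:
- Image improvement or regression:
- Generator path (if known):
- Final rule:
EOF
}

new_rule_entry "Orbit camera vs selfie wording"
```

Redirecting the output with `>> docs/krea2-prompt-guide.md` would append the skeleton to the guide, ready to be filled in once the A/B result is confirmed.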
@@ -0,0 +1,77 @@
# SxCP Eval Loop

This loop is for tuning the SxCP generator toward stronger Krea2 images.
ComfyUI sends a generated prompt and image to Codex; Codex analyzes the result
and sends back exactly one edited prompt for the next A/B test. Confirmed
findings become either generator changes or durable prompt rules in
[`krea2-prompt-guide.md`](krea2-prompt-guide.md).

## Channels

- `sxcp_eval_in`: ComfyUI to Codex. Contains the prompt text and image path.
- `sxcp_eval_out`: Codex to ComfyUI. Prompt-only. Do not put analysis here.
- `sxcp_eval_log`: optional analysis/log channel.

## Manual Loop

Start the helper after sending a test prompt:

```bash
tools/sxcp_eval_loop.sh 3
```

With the argument `3`, the helper prints a structured request every three
minutes, asking Codex to:

1. Pull `sxcp_eval_in`.
2. Inspect the image.
3. Compare it to the prompt and previous edit.
4. Push one prompt-only edit to `sxcp_eval_out`.
5. Classify the finding as prompt-only, prompt-guide rule, or generator fix.
6. Change generator code/data only when the issue is systemic.
7. Record the finding and update the Krea2 prompt guide when a rule is confirmed.

Runtime logs are written under `.sxcp_eval/`, which is ignored by git.
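
The helper script creates one directory per run, named with a UTC timestamp, and writes an `events.tsv` file there with the columns `cycle`, `utc_time`, `request_file`, and `status`. The sketch below shows one way to read the newest run's table; `latest_events` is a hypothetical name and not part of the repository.

```bash
#!/usr/bin/env bash
# Sketch: print the events table of the most recent run.
# Assumes the log layout written by tools/sxcp_eval_loop.sh (<log_root>/<run_id>/events.tsv).
latest_events() {
  local log_root="${1:-.sxcp_eval}" run_dir
  # Run IDs are UTC timestamps, so the lexically last directory is the newest.
  run_dir="$(find "$log_root" -mindepth 1 -maxdepth 1 -type d 2>/dev/null | sort | tail -n 1)"
  [ -n "$run_dir" ] && [ -f "$run_dir/events.tsv" ] || return 1
  # Skip the header row and print "cycle status request_file".
  awk -F '\t' 'NR > 1 { print $1, $4, $3 }' "$run_dir/events.tsv"
}

latest_events "${SXCP_EVAL_LOG_DIR:-.sxcp_eval}" || echo "no eval runs found" >&2
```

Each row's status is written as `pending` when the request is issued; the script itself does not update it afterwards.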

## Optional Command Hook

If you have a one-shot Codex command you want to run automatically, set:

```bash
SXCP_EVAL_CODEX_CMD="codex exec" tools/sxcp_eval_loop.sh 3
```

The request is sent to the command on stdin. The command also receives these
environment variables:

- `SXCP_EVAL_IN_CHANNEL`
- `SXCP_EVAL_OUT_CHANNEL`
- `SXCP_EVAL_LOG_CHANNEL`
- `SXCP_EVAL_GUIDE_FILE`
- `SXCP_EVAL_REQUEST_FILE`
- `SXCP_EVAL_CYCLE_DIR`
- `SXCP_EVAL_CYCLE`
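
To illustrate what a hook sees, here is a minimal stand-in that only archives the request and echoes the variables it was given. It is a sketch under assumptions: it does not call Codex, `handle_request` and the `hook_stdin.md` file name are hypothetical, and the fallback values are there only so the function can be tried outside the loop.

```bash
#!/usr/bin/env bash
# Hypothetical hook for SXCP_EVAL_CODEX_CMD: archives the request it receives.
# It does not call Codex; replace the body with the real one-shot command.
set -euo pipefail

handle_request() {
  local cycle_dir="${SXCP_EVAL_CYCLE_DIR:-.}"
  local cycle="${SXCP_EVAL_CYCLE:-0}"
  # The loop sends the request text on stdin; keep a copy next to request.md.
  cat > "$cycle_dir/hook_stdin.md"
  printf 'cycle %s: in=%s out=%s guide=%s\n' \
    "$cycle" \
    "${SXCP_EVAL_IN_CHANNEL:-sxcp_eval_in}" \
    "${SXCP_EVAL_OUT_CHANNEL:-sxcp_eval_out}" \
    "${SXCP_EVAL_GUIDE_FILE:-docs/krea2-prompt-guide.md}"
}

# Only consume stdin when the loop invoked us (it always sets this variable).
if [ -n "${SXCP_EVAL_REQUEST_FILE:-}" ]; then
  handle_request
fi
```

Saved as an executable file (say `./my_hook.sh`, a made-up name), it could be tried for a single cycle with `SXCP_EVAL_CODEX_CMD="./my_hook.sh" tools/sxcp_eval_loop.sh 3 --once`. Because the loop script runs with `set -e`, a hook that exits non-zero also stops the loop.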

## Evaluation Axes

- Identity consistency
- Outfit continuity
- Pose/action accuracy
- Camera compliance
- Location coherence
- Crop/framing
- Prompt noise/repetition
- Model confusion tokens
- Overall Krea2 image usefulness

## Generator Fix Rule

Only edit the generator when the image shows a repeatable, systemic prompt
failure. Examples:

- Selfie wording overrides orbit camera.
- Clothing continuity loses the selected softcore outfit.
- POV wording makes the off-camera participant the visual subject.
- Location camera layout inserts foreground anchors in the wrong place.

For one-off model drift, send a cleaner prompt to `sxcp_eval_out` and keep the
generator unchanged. For repeated prompt behavior, update the generator and add
the rule to `docs/krea2-prompt-guide.md`.
Executable
@@ -0,0 +1,228 @@
#!/usr/bin/env bash
set -euo pipefail

usage() {
  cat <<'EOF'
Usage:
  tools/sxcp_eval_loop.sh [minutes] [options]

Loop protocol for Krea2 prompt-generator tuning. Start it right after sending a
prompt to sxcp_eval_out. Every N minutes it writes a structured evaluation
request, prints it, and optionally pipes it to a command. Each cycle should
produce either a prompt-only A/B edit, a generator fix, or a prompt-guide rule.

Options:
  -m, --minutes N    Wait N minutes between evaluation requests. Default: 5.
  -i, --in CHANNEL   Graph-to-agent channel. Default: sxcp_eval_in.
  -o, --out CHANNEL  Agent-to-graph prompt-only channel. Default: sxcp_eval_out.
  -l, --log CHANNEL  Analysis/log channel name. Default: sxcp_eval_log.
  -g, --guide FILE   Durable Krea2 prompt guide. Default: docs/krea2-prompt-guide.md.
  -d, --dir DIR      Runtime log directory. Default: .sxcp_eval.
  --once             Run one wait/check cycle and exit.
  -h, --help         Show this help.

Optional automation:
  SXCP_EVAL_CODEX_CMD  If set, the request is piped to this command.
                       Example: SXCP_EVAL_CODEX_CMD="codex exec"

The command receives the request on stdin and these environment variables:
  SXCP_EVAL_IN_CHANNEL, SXCP_EVAL_OUT_CHANNEL, SXCP_EVAL_LOG_CHANNEL,
  SXCP_EVAL_GUIDE_FILE, SXCP_EVAL_REQUEST_FILE, SXCP_EVAL_CYCLE_DIR,
  SXCP_EVAL_CYCLE.
EOF
}

die() {
  echo "sxcp_eval_loop: $*" >&2
  exit 1
}

is_positive_number() {
  # Accept only digits with at most one decimal point, e.g. 3, 0.5, or 2.25.
  case "${1:-}" in
    ''|.|*[!0-9.]*|*.*.*) return 1 ;;
  esac
  # Require at least one nonzero digit so 0, 0.0, and 00 are rejected.
  case "$1" in
    *[1-9]*) return 0 ;;
    *) return 1 ;;
  esac
}

minutes="${SXCP_EVAL_MINUTES:-}"
in_channel="${SXCP_EVAL_IN_CHANNEL:-sxcp_eval_in}"
out_channel="${SXCP_EVAL_OUT_CHANNEL:-sxcp_eval_out}"
log_channel="${SXCP_EVAL_LOG_CHANNEL:-sxcp_eval_log}"
guide_file="${SXCP_EVAL_GUIDE_FILE:-docs/krea2-prompt-guide.md}"
log_root="${SXCP_EVAL_LOG_DIR:-.sxcp_eval}"
run_once=0

if [ "${1:-}" != "" ] && [ "${1#-}" = "$1" ]; then
  minutes="$1"
  shift
fi

while [ "$#" -gt 0 ]; do
  case "$1" in
    -m|--minutes)
      [ "$#" -ge 2 ] || die "$1 requires a value"
      minutes="$2"
      shift 2
      ;;
    -i|--in)
      [ "$#" -ge 2 ] || die "$1 requires a value"
      in_channel="$2"
      shift 2
      ;;
    -o|--out)
      [ "$#" -ge 2 ] || die "$1 requires a value"
      out_channel="$2"
      shift 2
      ;;
    -l|--log)
      [ "$#" -ge 2 ] || die "$1 requires a value"
      log_channel="$2"
      shift 2
      ;;
    -g|--guide)
      [ "$#" -ge 2 ] || die "$1 requires a value"
      guide_file="$2"
      shift 2
      ;;
    -d|--dir)
      [ "$#" -ge 2 ] || die "$1 requires a value"
      log_root="$2"
      shift 2
      ;;
    --once)
      run_once=1
      shift
      ;;
    -h|--help)
      usage
      exit 0
      ;;
    *)
      die "unknown argument: $1"
      ;;
  esac
done

minutes="${minutes:-5}"
is_positive_number "$minutes" || die "minutes must be a positive number"

mkdir -p "$log_root"
run_id="$(date -u +%Y%m%dT%H%M%SZ)"
run_dir="$log_root/$run_id"
mkdir -p "$run_dir"
events_file="$run_dir/events.tsv"
summary_file="$run_dir/summary.md"

cat > "$summary_file" <<EOF
# SxCP Eval Loop $run_id

- Interval: ${minutes} minute(s)
- Input channel: \`$in_channel\`
- Prompt output channel: \`$out_channel\`
- Log channel: \`$log_channel\`
- Krea2 prompt guide: \`$guide_file\`

## Goal

Tune the SxCP generator so its default Krea2 prompts produce the strongest
possible images for the selected scene, camera, subject, outfit, action, and
style. Every cycle should turn visual evidence into one of:

- a prompt-only A/B edit,
- a durable rule for \`$guide_file\`,
- a generator code/data change with focused test coverage.

## Protocol

1. Pull the latest prompt/image from \`$in_channel\`.
2. Compare the image against the prompt and previous edited prompt.
3. Identify concrete Krea2 mismatches and likely generator path.
4. Classify the next step: prompt-only edit, guide rule, or generator patch.
5. Push only the next test prompt to \`$out_channel\`.
6. Keep analysis in chat or \`$log_channel\`, not in \`$out_channel\`.
7. Edit generator code/data only when the issue is systemic.
8. Update \`$guide_file\` when a wording rule is confirmed.
9. Run focused smoke tests after generator edits.

## Cycles

EOF

printf 'cycle\tutc_time\trequest_file\tstatus\n' > "$events_file"

cycle=0
while :; do
  cycle=$((cycle + 1))
  echo "sxcp_eval_loop: cycle $cycle waiting ${minutes} minute(s) before requesting evaluation..."
  # The "m" suffix and fractional values need a sleep that accepts them
  # (e.g. GNU coreutils); POSIX sleep only takes whole seconds.
  sleep "${minutes}m"

  stamp="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  cycle_dir="$run_dir/cycle_$(printf '%03d' "$cycle")"
  mkdir -p "$cycle_dir"
  request_file="$cycle_dir/request.md"

  cat > "$request_file" <<EOF
Please run SxCP eval cycle $cycle now.

Primary goal:
- Tune the generator for better Krea2 images, not just one isolated image.
- Maintain/update the durable Krea2 prompt guide at: $guide_file

Channels:
- Pull latest graph output from: $in_channel
- Push prompt-only replacement to: $out_channel
- Put analysis/log text in chat or: $log_channel

Evaluation steps:
1. Pull the latest payload from $in_channel.
2. Inspect image_path and compare it to the prompt text.
3. Score these Krea2 axes: identity, outfit continuity, pose/action, camera compliance, location coherence, crop/framing, prompt noise, model confusion tokens, and overall image usefulness.
4. Identify the smallest concrete mismatch that should be tested next.
5. Classify the finding:
   - prompt-only: push exactly one edited prompt to $out_channel and nothing else on that channel.
   - guide-rule: update $guide_file with the confirmed Krea2 wording rule.
   - generator-fix: edit the responsible generator path, add/adjust focused smoke coverage, run tests, and summarize the change.
6. Keep a clear link between the image evidence, the prompt wording, and the generator path.
7. Append the finding to the eval log with: original issue, changed wording/path, expected improvement, test result, guide update, generator update, and next hypothesis.

Current run:
- run_id: $run_id
- cycle: $cycle
- generated_at_utc: $stamp
- request_file: $request_file
- guide_file: $guide_file
EOF

  {
    echo
    echo "### Cycle $cycle - $stamp"
    echo
    echo "- Request: \`$request_file\`"
    echo "- Status: pending evaluation"
  } >> "$summary_file"
  printf '%s\t%s\t%s\t%s\n' "$cycle" "$stamp" "$request_file" "pending" >> "$events_file"

  echo
  echo "================ SxCP Eval Request ================"
  cat "$request_file"
  echo "==================================================="
  echo

  if [ "${SXCP_EVAL_CODEX_CMD:-}" != "" ]; then
    echo "sxcp_eval_loop: piping request to SXCP_EVAL_CODEX_CMD"
    SXCP_EVAL_IN_CHANNEL="$in_channel" \
    SXCP_EVAL_OUT_CHANNEL="$out_channel" \
    SXCP_EVAL_LOG_CHANNEL="$log_channel" \
    SXCP_EVAL_GUIDE_FILE="$guide_file" \
    SXCP_EVAL_REQUEST_FILE="$request_file" \
    SXCP_EVAL_CYCLE_DIR="$cycle_dir" \
    SXCP_EVAL_CYCLE="$cycle" \
      sh -c "$SXCP_EVAL_CODEX_CMD" < "$request_file"
  fi

  if [ "$run_once" -eq 1 ]; then
    break
  fi
done

echo "sxcp_eval_loop: log written to $run_dir"