# Prompt Architecture Improvement Plan
This is a working research note for organizing the prompt builder around the
routing map in `docs/prompt-pool-routing-map.md`.
## Current Branch Additions
The current branch adds two major surfaces:
- `SxCP Krea2 Resolution Selector` in `__init__.py`, with README notes.
- Expanded hardcore interaction/manual/action pools in
`categories/sexual_poses.json`,
`categories/expression_composition_pools.json`, `prompt_builder.py`, and
`krea_formatter.py`.
The map audit currently sees:
- 15 sexual pose subcategories.
- 94 sexual pose item templates.
- 23 expression pools.
- 24 composition pools.
- A new Krea2 resolution node with width/height/API aspect outputs.
## Architectural Finding
The project has a good functional map, but ownership is still mixed inside large
files:
- `prompt_builder.py` owns selection, character resolution, role graph logic,
camera adaptation, pair assembly, and some final string cleanup.
- `krea_formatter.py` owns metadata parsing, cast naturalization, sexual action
rewriting, POV rewriting, clothing cleanup, camera preservation, fallback
parsing, and final prose assembly.
- `sdxl_formatter.py` owns tag assembly and style/quality presets.
- `caption_naturalizer.py` owns training-caption prose.
- Category JSON files own scalable pool content, but Python still owns several
compatibility and role-graph decisions.
The biggest maintainability risk is not the number of pools. The risk is that
selection, semantic rewriting, and final text hygiene are too interleaved. When a
prompt has wrong text, it is easy to patch the wrong layer.
## First Refactor Boundary
Generic text hygiene now has one home:
- `prompt_hygiene.py`
It should only handle route-agnostic cleanup:
- whitespace and punctuation normalization;
- empty field-label removal;
- repeated trigger prefix cleanup;
- duplicate comma-list item removal;
- adjacent duplicate sentence cleanup;
- simple dangling connector cleanup.
It must not make semantic decisions such as sexual action positioning, POV
geometry, clothing state, or model-specific tag weighting. Those stay in the
route-specific owner. It also preserves ordinary words such as `composition`
inside normal sentences; empty field-label cleanup is limited to standalone
labels.
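The route-agnostic scope described above can be sketched as a single function. This is a minimal illustration, not the contents of `prompt_hygiene.py`: the name `clean_prompt_text`, the label inventory, and the connector list are all assumptions, and the sketch only handles comma-list text.

```python
import re

# Assumed inventories for illustration; the project keeps its own lists.
FIELD_LABELS = ("scene", "composition", "expression", "camera")
DANGLING_CONNECTORS = ("and", "with", "while")

# Matches a standalone label such as "Composition:" with nothing after it.
# The word "composition" inside a normal phrase has no colon, so it survives.
_EMPTY_LABEL = re.compile(
    rf"\b(?:{'|'.join(FIELD_LABELS)}):\s*(?=[,.;]|$)", re.IGNORECASE
)
_DANGLING = re.compile(rf"\s+(?:{'|'.join(DANGLING_CONNECTORS)})$", re.IGNORECASE)


def clean_prompt_text(text):
    """Route-agnostic cleanup for comma-list prompts (hypothetical helper)."""
    # Whitespace normalization.
    text = re.sub(r"\s+", " ", text).strip()
    # Empty field-label removal.
    text = _EMPTY_LABEL.sub("", text)
    seen = set()
    kept = []
    for part in text.split(","):
        # Simple dangling connector cleanup at the end of each item.
        item = _DANGLING.sub("", part.strip())
        key = item.lower()
        # Drop empty items and case-insensitive duplicates; a repeated
        # trigger prefix collapses here as well.
        if item and key not in seen:
            seen.add(key)
            kept.append(item)
    return ", ".join(kept)
```

Nothing in the sketch inspects what an item means, which is the property that keeps this layer safe to call from every route.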
Shared hardcore phrase cleanup now has one home:
- `hardcore_text_cleanup.py`
It owns environment-anchor normalization used by both prompt generation and
Krea formatting, including malformed surface joins and bed/sheet/couch anchors
that should become model-neutral body-support language. It must stay
route-neutral: no Krea prose, no SDXL tags, and no category selection logic.
Current integration points:
- `prompt_builder.build_prompt`
- `prompt_builder.build_insta_of_pair`
- `krea_formatter.format_krea2_prompt`
- `sdxl_formatter.format_sdxl_prompt`
- `caption_naturalizer.naturalize_caption`
## Target Organization
### Generation Layer
Owner: `prompt_builder.py` plus `categories/*.json`.
Keep here:
- category/subcategory/item selection;
- seed axis routing;
- character slot/profile resolution;
- scene/expression/composition pool selection;
- role graph creation from structured category axes;
- metadata row construction.
Move or isolate later:
- role graph generation for hardcore interaction categories into a dedicated
module, for example `hardcore_role_graphs.py`;
- category-library loading and inheritance helpers into `category_library.py`.
Already isolated:
- camera-scene prose and coworking composition adaptation live in
`scene_camera_adapters.py`; `prompt_builder.py` still owns camera config
parsing and row mutation.
- shared hardcore environment-anchor cleanup lives in
`hardcore_text_cleanup.py` and normalizes malformed pool joins before metadata
reaches formatter routes.
### Pair / Adapter Layer
Owner today: `build_insta_of_pair`.
Keep here:
- soft/hard row creation;
- continuity policy;
- softcore cast policy;
- pair-level camera routing;
- pair metadata shape.
Improve later:
- make a single pair metadata sanitizer that normalizes `softcore_row`,
`hardcore_row`, pair prompts, negatives, captions, and camera fields;
- split pair assembly into small functions by phase:
`build_soft_row`, `build_hard_row`, `resolve_pair_camera`,
`resolve_pair_clothing`, `assemble_pair_metadata`.
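A rough sketch of the phase split, using the function names listed above. The signatures, the `PairConfig` dataclass, the `build_pair` wrapper, and all field values are assumptions made for illustration; the real `build_insta_of_pair` carries far more state.

```python
from dataclasses import dataclass


@dataclass
class PairConfig:
    """Hypothetical inputs; the real node passes a richer config."""
    seed: int
    same_room: bool = True
    camera: str = "eye-level medium shot"


def build_soft_row(cfg):
    return {"kind": "softcore", "seed": cfg.seed, "scene_text": "sunlit loft"}


def build_hard_row(cfg, soft_row):
    # Continuity policy: reuse the soft scene when both rows share a room.
    scene = soft_row["scene_text"] if cfg.same_room else "hotel room"
    return {"kind": "hardcore", "seed": cfg.seed, "scene_text": scene}


def resolve_pair_camera(cfg, soft_row, hard_row):
    # Pair-level camera routing: both rows receive the same camera here.
    for row in (soft_row, hard_row):
        row["camera"] = cfg.camera


def resolve_pair_clothing(soft_row, hard_row):
    # Placeholder clothing policy; only fills fields that are still missing.
    soft_row.setdefault("clothing", "linen shirt")
    hard_row.setdefault("clothing", "")


def assemble_pair_metadata(soft_row, hard_row):
    return {"softcore_row": soft_row, "hardcore_row": hard_row}


def build_pair(cfg):
    """Each phase is a small function that can be tested on its own."""
    soft = build_soft_row(cfg)
    hard = build_hard_row(cfg, soft)
    resolve_pair_camera(cfg, soft, hard)
    resolve_pair_clothing(soft, hard)
    return assemble_pair_metadata(soft, hard)
```

With this shape, a continuity bug can be traced to one phase instead of to the whole pair builder.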
### Krea2 Formatter Path
Owner: `krea_formatter.py`.
Keep here:
- Krea prose style;
- hardcore action sentence rewriting;
- POV sentence rewriting;
- clothing naturalization;
- camera-scene preservation;
- fallback text parsing.
Already isolated:
- `krea_cast.py` owns cast descriptor parsing, cast prose, label joining, and
natural label replacement for formatter routes.
Improve later:
- split semantic blocks into modules:
`krea_actions.py`, `krea_pov.py`, `krea_clothing.py`;
- add route-level smoke fixtures for representative metadata rows;
- make `_hardcore_action_sentence` dispatch by action family instead of long
conditional chains.
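Dispatching by action family could look roughly like the table below. The handler names, the family keys, the row fields, and the `action_sentence` entry point are hypothetical; the sketch assumes each metadata row carries an `action_family` value.

```python
def _kiss_sentence(row):
    return f"{row['subject']} share a kiss {row['detail']}."


def _embrace_sentence(row):
    return f"{row['subject']} hold each other {row['detail']}."


def _fallback_sentence(row):
    # Used when the family is missing or has no registered handler.
    return f"{row['subject']} pose {row['detail']}."


# One entry per action family replaces one branch of a conditional chain.
ACTION_FAMILY_HANDLERS = {
    "kiss": _kiss_sentence,
    "embrace": _embrace_sentence,
}


def action_sentence(row):
    handler = ACTION_FAMILY_HANDLERS.get(row.get("action_family"), _fallback_sentence)
    return handler(row)
```

Adding a family then means registering one handler, and each handler can get its own smoke fixture.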
### SDXL Formatter Path
Owner: `sdxl_formatter.py`.
Keep here:
- trigger behavior;
- style and quality presets;
- tag ordering;
- weighted explicit tags;
- negative-prompt assembly.
Improve later:
- move presets into data dictionaries or JSON so adding styles does not require
editing formatter logic;
- add formatter profiles for Pony, SDXL photo, and flat vector;
- make fallback cleanup use the shared field-label inventory.
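One possible shape for data-driven presets and formatter profiles is sketched below. The profile keys, tag values, and the `format_tags` function are illustrative assumptions, not the current contents of `sdxl_formatter.py`.

```python
# Hypothetical profile table; this could equally be loaded from a JSON file.
FORMATTER_PROFILES = {
    "sdxl_photo": {
        "quality": ["photorealistic", "sharp focus"],
        "negative": ["blurry", "lowres"],
    },
    "pony": {
        "quality": ["score_9", "score_8_up"],
        "negative": ["score_4"],
    },
    "flat_vector": {
        "quality": ["flat vector illustration", "clean lines"],
        "negative": ["photo", "3d render"],
    },
}


def format_tags(content_tags, profile, trigger=""):
    """Return (positive, negative) tag strings for the chosen profile."""
    preset = FORMATTER_PROFILES[profile]
    # Tag ordering: trigger first, then quality tags, then content tags.
    ordered = ([trigger] if trigger else []) + preset["quality"] + list(content_tags)
    # dict.fromkeys drops exact duplicates while preserving order.
    positive = ", ".join(dict.fromkeys(ordered))
    negative = ", ".join(preset["negative"])
    return positive, negative
```

Under this layout, adding a style is a data change and the formatter logic stays untouched.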
### Naturalizer Path
Owner: `caption_naturalizer.py`.
Keep here:
- natural sentence caption assembly;
- training-caption trigger behavior;
- style-tail policy.
Improve later:
- share more metadata readers with Krea without sharing Krea prose;
- add a `caption_profile` option for concise/dense LoRA caption styles.
### Category JSON Path
Owner: `categories/*.json`.
Keep here:
- scalable prompt pool content;
- named scene/expression/composition pools;
- item templates and axes;
- direct category-specific wording.
Improve later:
- introduce optional `family` and `action_type` fields on item templates so
Python filters do less keyword guessing;
- add `formatter_hint` fields only where needed, not globally;
- keep `tools/prompt_map_audit.py` passing; it now checks referenced
expression/composition/scene pools and item-template axes.
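To illustrate how optional `family` and `action_type` fields could reduce keyword guessing, here is a small sketch. The template contents, the keyword table, and both function names are made up for the example; legacy templates without the new fields fall back to the old keyword match.

```python
# Hypothetical legacy keyword table, kept only as a fallback.
KEYWORD_FALLBACK = {
    "kiss": ("kiss",),
    "embrace": ("embrace", "hug"),
}

# Illustrative templates; only some carry the optional metadata fields.
TEMPLATES = [
    {"id": "t1", "text": "a slow kiss by the window", "family": "kiss", "action_type": "foreplay"},
    {"id": "t2", "text": "a tight hug in the doorway"},
    {"id": "t3", "text": "leaning on the railing", "family": "pose"},
]


def template_family(template):
    # Prefer explicit metadata when the template declares it.
    if "family" in template:
        return template["family"]
    # Otherwise guess from the wording, as the keyword filters do today.
    text = template["text"].lower()
    for family, keywords in KEYWORD_FALLBACK.items():
        if any(keyword in text for keyword in keywords):
            return family
    return "unknown"


def filter_by_family(templates, family):
    return [t["id"] for t in templates if template_family(t) == family]
```

Because the fields are optional, pools can migrate one subcategory at a time without breaking existing filters.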
### Node / UI Path
Owner: `__init__.py`, `loop_nodes.py`, `web/*.js`.
Keep here:
- ComfyUI node input/output declarations;
- widget behavior;
- button actions;
- dynamic input slots.
Improve later:
- split large node classes into files by family;
- keep node display names, return names, and docs in sync through the audit
helper;
- add small endpoint tests for profile/accumulator/index-switch routes.
## Path-Specific Improvements
### Prompt Builder
Near-term:
- Final row hygiene is already done through `prompt_hygiene.py`.
- Add a metadata smoke checker for representative rows through
`tools/prompt_smoke.py`.
- Normalize every row with one function before JSON serialization.
Medium-term:
- Extract category loading and role graph logic.
- Convert keyword-heavy interaction filtering to template metadata.
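The near-term item about normalizing every row with one function before JSON serialization could be sketched as follows. `normalize_row` and `serialize_row` are hypothetical names, and the rules shown (drop `None`, collapse whitespace, drop empty strings) are assumptions about what normalization should cover.

```python
import json


def normalize_row(row):
    """Return a cleaned copy of a metadata row (hypothetical rules)."""
    clean = {}
    for key, value in row.items():
        if value is None:
            continue
        if isinstance(value, str):
            # Collapse internal whitespace and trim the ends.
            value = " ".join(value.split())
            if not value:
                continue
        clean[key] = value
    return clean


def serialize_row(row):
    # sort_keys gives stable output, which keeps smoke fixtures diffable.
    return json.dumps(normalize_row(row), sort_keys=True, ensure_ascii=False)
```

Routing every row through one such function gives the formatters a single, predictable input shape.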
### Insta/OF Pair
Near-term:
- Normalize pair metadata with one helper.
- Confirm pair prompts, captions, and soft/hard rows carry the same sanitized
scene/camera/clothing fields.
- Keep same-room pair continuity synchronized in both assembled prompt text and
`hardcore_row.scene_text`; `tools/prompt_smoke.py` covers this drift case.
Medium-term:
- Make pair camera and clothing phases explicit subfunctions.
- Add smoke fixtures for same-cast, POV man, explicit nude, and different-camera
modes.
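A same-room drift check in the spirit of the continuity item above might look like this. The pair layout (`same_room`, `hard_prompt`) and the function name are assumptions; the actual checks in `tools/prompt_smoke.py` may be structured differently.

```python
def check_pair_continuity(pair):
    """Return a list of drift problems; an empty list means the pair is in sync."""
    problems = []
    soft = pair["softcore_row"]
    hard = pair["hardcore_row"]
    # Same-room pairs must agree on the scene in both rows.
    if pair.get("same_room") and soft["scene_text"] != hard["scene_text"]:
        problems.append("scene_text differs between soft and hard rows")
    # The assembled prompt must still contain the row's scene text.
    if hard["scene_text"] not in pair["hard_prompt"]:
        problems.append("hard prompt drifted from hardcore_row.scene_text")
    return problems
```

Returning a list instead of raising lets a smoke run report every drift case in one pass.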
### Krea2
Near-term:
- Final prose hygiene is already done through `prompt_hygiene.py`.
- Add smoke coverage through `tools/prompt_smoke.py` for metadata-driven Krea2
formatting across built-in rows, hardcore rows, same-cast pairs, and POV
pairs.
- Cover camera-scene preservation through `tools/prompt_smoke.py` for single
rows, split soft/hard pair cameras, and POV camera-scene routing.
- Cover config-node routing through `tools/prompt_smoke.py` for category, cast,
generation profile, seed lock, camera, location theme, and composition config.
- Cover close foreplay and POV penetration Krea routes so raw labels, invalid
surface grammar, normal third-person camera text, and composition punctuation
drift are caught.
Medium-term:
- Dispatch action rewriting by action family.
- Split Krea semantic helpers into smaller modules.
### SDXL
Near-term:
- Final tag hygiene is already done through `prompt_hygiene.py`.
- Add smoke tests for trigger preservation and duplicate tag removal through
`tools/prompt_smoke.py`.
Medium-term:
- Make style/quality presets data-driven.
### Naturalizer
Near-term:
- Final prose hygiene is already done through `prompt_hygiene.py`.
- Verify training captions keep trigger exactly once through
`tools/prompt_smoke.py`.
Medium-term:
- Add caption profiles for training and browsing use cases.
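The trigger-exactly-once check mentioned in the near-term list can be expressed with a whole-token match. The helper names below are hypothetical, and the trigger string in the usage note is only an example.

```python
import re


def count_trigger(caption, trigger):
    """Count whole-token occurrences of the trigger in a caption."""
    # Lookarounds stop the trigger from matching inside a longer word.
    pattern = rf"(?<!\w){re.escape(trigger)}(?!\w)"
    return len(re.findall(pattern, caption))


def trigger_appears_once(caption, trigger):
    return count_trigger(caption, trigger) == 1
```

For example, `trigger_appears_once("sxcp_style, a woman reading by a window", "sxcp_style")` holds, while a caption that repeats the trigger or drops it fails the check.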
### Camera / Scene
Near-term:
- Keep Qwen/orbit as the camera source.
- Keep scene-camera adapters scoped by location family.
- Use the memory note in
`/home/ethanfel/.codex/memories/scene-camera-system.md` when editing POV.
- Keep `scene_camera_adapters.py` as the owner for location-aware camera prose;
add new location families there one at a time.
Medium-term:
- Build new adapters one location family at a time.
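Scoping adapters by location family could be done with a small registry, sketched below. The adapter function, the `location_family` field, and the wording it produces are illustrative assumptions rather than the current API of `scene_camera_adapters.py`.

```python
def _coworking_adapter(camera, scene_text):
    # Location-aware camera prose for one family only.
    return f"{camera}, framed across the shared desks of {scene_text}"


# New location families are registered here one at a time.
LOCATION_ADAPTERS = {
    "coworking": _coworking_adapter,
}


def adapt_camera_prose(row):
    adapter = LOCATION_ADAPTERS.get(row.get("location_family"))
    if adapter is None:
        # Unknown families keep the camera text unchanged.
        return row["camera"]
    return adapter(row["camera"], row["scene_text"])
```

Rows from families without an adapter pass through untouched, so a new adapter cannot change output for existing locations.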
## Invariants To Preserve
- Metadata is the preferred formatter input.
- Prompt Builder should output structured rows even if raw prompt text is rough.
- Krea should fix prose and semantic action readability, not category selection.
- SDXL should produce tag-style output and preserve model triggers as requested.
- Naturalizer should output training-friendly captions without changing the
selected content.
- Generic cleanup belongs in `prompt_hygiene.py`; semantic cleanup belongs in
the owning route.
## Recommended Next Passes
1. Split Krea action/POV/clothing helpers into separate modules, using
`krea_cast.py` as the pattern for stable import aliases and smoke coverage.
2. Split `__init__.py` node classes by family after behavior is covered by smoke
checks.
3. Add metadata fields such as `action_family` / `position_family` to reduce
keyword guessing in hardcore filters and formatter dispatch.