Add prompt hygiene architecture pass
@@ -0,0 +1,301 @@
# Prompt Architecture Improvement Plan

This is a working research note for organizing the prompt builder around the
routing map in `docs/prompt-pool-routing-map.md`.

## Current Branch Additions

The current branch adds two major surfaces:

- `SxCP Krea2 Resolution Selector` in `__init__.py`, with README notes.
- Expanded hardcore interaction/manual/action pools in
  `categories/sexual_poses.json`,
  `categories/expression_composition_pools.json`, `prompt_builder.py`, and
  `krea_formatter.py`.

The map audit currently sees:

- 15 sexual pose subcategories.
- 94 sexual pose item templates.
- 23 expression pools.
- 24 composition pools.
- A new Krea2 resolution node with width/height/API aspect outputs.

## Architectural Finding

The project has a good functional map, but ownership is still mixed inside large
files:

- `prompt_builder.py` owns selection, character resolution, role graph logic,
  camera adaptation, pair assembly, and some final string cleanup.
- `krea_formatter.py` owns metadata parsing, cast naturalization, sexual action
  rewriting, POV rewriting, clothing cleanup, camera preservation, fallback
  parsing, and final prose assembly.
- `sdxl_formatter.py` owns tag assembly and style/quality presets.
- `caption_naturalizer.py` owns training-caption prose.
- Category JSON files own scalable pool content, but Python still owns several
  compatibility and role-graph decisions.

The biggest maintainability risk is not the number of pools. It is that
selection, semantic rewriting, and final text hygiene are too interleaved, so
when a prompt contains wrong text it is easy to patch the wrong layer.

## First Refactor Boundary

Generic text hygiene now has one home:

- `prompt_hygiene.py`

It should only handle route-agnostic cleanup:

- whitespace and punctuation normalization;
- empty field-label removal;
- repeated trigger prefix cleanup;
- duplicate comma-list item removal;
- adjacent duplicate sentence cleanup;
- simple dangling connector cleanup.

It must not make semantic decisions such as sexual action positioning, POV
geometry, clothing state, or model-specific tag weighting. Those stay in the
route-specific owner.

Current integration points:

- `prompt_builder.build_prompt`
- `prompt_builder.build_insta_of_pair`
- `krea_formatter.format_krea2_prompt`
- `sdxl_formatter.format_sdxl_prompt`
- `caption_naturalizer.naturalize_caption`

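The route-agnostic cleanup steps listed above could look roughly like the
sketch below. This is not the actual content of `prompt_hygiene.py`: every
function name, the `FIELD_LABELS` inventory, the `CONNECTORS` list, and the
`mytrigger` example value are assumptions made for illustration.

```python
import re

# Hypothetical label inventory; the real list would be shared with the formatters.
FIELD_LABELS = ("Scene", "Camera", "Clothing")
# Hypothetical connector words that should not be left dangling at the end.
CONNECTORS = ("and", "with", "while")


def normalize_spacing(text: str) -> str:
    """Collapse whitespace, tidy punctuation spacing, and trim stray separators."""
    text = re.sub(r"\s+", " ", text).strip()
    text = re.sub(r"\s+([,.;:])", r"\1", text)   # no space before punctuation
    text = re.sub(r",(?:\s*,)+", ",", text)       # ", ," -> ","
    return re.sub(r"^[,;\s]+|[,;\s]+$", "", text)


def drop_empty_labels(text: str, labels=FIELD_LABELS) -> str:
    """Remove labels such as 'Camera:' that are not followed by a value."""
    for label in labels:
        text = re.sub(rf"\b{re.escape(label)}:\s*(?=[,.;]|$)", "", text)
    return text


def collapse_trigger_prefix(text: str, trigger: str) -> str:
    """Keep a single leading trigger when it was prepended more than once."""
    match = re.match(rf"^(?:{re.escape(trigger)}(?:[\s,]+|$))+", text)
    if not match:
        return text
    rest = text[match.end():]
    return f"{trigger}, {rest}" if rest else trigger


def dedupe_comma_items(text: str) -> str:
    """Drop repeated comma-list items, comparing case-insensitively."""
    seen, kept = set(), []
    for item in (part.strip() for part in text.split(",")):
        if item and item.casefold() not in seen:
            seen.add(item.casefold())
            kept.append(item)
    return ", ".join(kept)


def dedupe_adjacent_sentences(text: str) -> str:
    """Drop a sentence when it repeats the sentence directly before it."""
    kept = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if not kept or sentence.casefold() != kept[-1].casefold():
            kept.append(sentence)
    return " ".join(kept)


def strip_dangling_connector(text: str, connectors=CONNECTORS) -> str:
    """Remove a trailing connector word, keeping any final punctuation mark."""
    words = "|".join(re.escape(word) for word in connectors)
    return re.sub(rf"[\s,]*\b(?:{words})\s*([.!?]?)$", r"\1", text,
                  flags=re.IGNORECASE)
```

None of these helpers inspects what the text means, which is the boundary the
section describes: a route can call them in any order without the helper
needing to know whether it is cleaning Krea prose, SDXL tags, or a caption.
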
## Target Organization

### Generation Layer

Owner: `prompt_builder.py` plus `categories/*.json`.

Keep here:

- category/subcategory/item selection;
- seed axis routing;
- character slot/profile resolution;
- scene/expression/composition pool selection;
- role graph creation from structured category axes;
- metadata row construction.

Move or isolate later:

- role graph generation for hardcore interaction categories into a dedicated
  module, for example `hardcore_role_graphs.py`;
- camera-scene adapters into `scene_camera_adapters.py`;
- category-library loading and inheritance helpers into `category_library.py`.

### Pair / Adapter Layer

Owner today: `build_insta_of_pair`.

Keep here:

- soft/hard row creation;
- continuity policy;
- softcore cast policy;
- pair-level camera routing;
- pair metadata shape.

Improve later:

- make a single pair metadata sanitizer that normalizes `softcore_row`,
  `hardcore_row`, pair prompts, negatives, captions, and camera fields;
- split pair assembly into small functions by phase:
  `build_soft_row`, `build_hard_row`, `resolve_pair_camera`,
  `resolve_pair_clothing`, `assemble_pair_metadata`.

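The phase split proposed above might be shaped like the skeleton below. Only
the five function names and the `softcore_row`/`hardcore_row` keys come from
this note; the signatures, the row fields, the `build_pair` wrapper, and the
placeholder policies are assumptions, not the current `build_insta_of_pair`
behavior.

```python
from typing import Any

Row = dict[str, Any]


def build_soft_row(seed: int, cast: list[str]) -> Row:
    # Placeholder row; the real builder would run category selection here.
    return {"seed": seed, "cast": list(cast), "camera": "front", "clothing": "casual"}


def build_hard_row(seed: int, cast: list[str]) -> Row:
    # Placeholder row that deliberately starts without a clothing field.
    return {"seed": seed, "cast": list(cast), "camera": "side"}


def resolve_pair_camera(soft: Row, hard: Row, same_camera: bool) -> None:
    # Assumed continuity policy: optionally reuse the soft row's camera.
    if same_camera:
        hard["camera"] = soft["camera"]


def resolve_pair_clothing(soft: Row, hard: Row) -> None:
    # Assumed policy: the hard row inherits clothing only when it has none.
    hard.setdefault("clothing", soft["clothing"])


def assemble_pair_metadata(soft: Row, hard: Row) -> dict[str, Row]:
    return {"softcore_row": soft, "hardcore_row": hard}


def build_pair(seed: int, cast: list[str], same_camera: bool = True) -> dict[str, Row]:
    """Run the phases in a fixed order so each one can be tested alone."""
    soft = build_soft_row(seed, cast)
    hard = build_hard_row(seed, cast)
    resolve_pair_camera(soft, hard, same_camera)
    resolve_pair_clothing(soft, hard)
    return assemble_pair_metadata(soft, hard)
```

With the phases separated this way, a wrong camera field points at
`resolve_pair_camera` rather than at one long assembly function.
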
### Krea2 Formatter Path

Owner: `krea_formatter.py`.

Keep here:

- Krea prose style;
- cast prose;
- hardcore action sentence rewriting;
- POV sentence rewriting;
- clothing naturalization;
- camera-scene preservation;
- fallback text parsing.

Improve later:

- split semantic blocks into modules:
  `krea_cast.py`, `krea_actions.py`, `krea_pov.py`, `krea_clothing.py`;
- add route-level smoke fixtures for representative metadata rows;
- make `_hardcore_action_sentence` dispatch by action family instead of long
  conditional chains.

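Dispatching by action family instead of a conditional chain could be done with
a small handler registry, sketched below. The registry, the decorator, the
family names, the row fields (`family`, `subject`, `partner`), and the
placeholder sentences are all hypothetical; this is not how
`_hardcore_action_sentence` is currently written.

```python
from typing import Callable

Handler = Callable[[dict], str]

# Hypothetical registry mapping an action family name to its sentence builder.
_ACTION_HANDLERS: dict[str, Handler] = {}


def action_family(name: str) -> Callable[[Handler], Handler]:
    """Register a handler for one action family."""
    def register(handler: Handler) -> Handler:
        _ACTION_HANDLERS[name] = handler
        return handler
    return register


@action_family("foreplay")
def _foreplay_sentence(row: dict) -> str:
    return f"{row['subject']} leans close to {row['partner']}."


@action_family("aftercare")
def _aftercare_sentence(row: dict) -> str:
    return f"{row['subject']} rests beside {row['partner']}."


def _generic_sentence(row: dict) -> str:
    # Fallback used when a row has no family or an unregistered one.
    return f"{row['subject']} is with {row['partner']}."


def action_sentence(row: dict) -> str:
    """Look up the handler by the row's family instead of chaining conditionals."""
    handler = _ACTION_HANDLERS.get(row.get("family", ""), _generic_sentence)
    return handler(row)
```

Adding a family then means adding one decorated function, and each handler can
be covered by its own smoke fixture.
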
### SDXL Formatter Path

Owner: `sdxl_formatter.py`.

Keep here:

- trigger behavior;
- style and quality presets;
- tag ordering;
- weighted explicit tags;
- negative-prompt assembly.

Improve later:

- move presets into data dictionaries or JSON so adding styles does not require
  editing formatter logic;
- add formatter profiles for Pony, SDXL photo, and flat vector;
- make fallback cleanup use the shared field-label inventory.

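As a rough illustration of data-driven presets, the sketch below keeps preset
content in JSON and leaves the formatter with lookup logic only. The JSON
layout, the preset names, the tag strings, and both function names are
assumptions; the real presets in `sdxl_formatter.py` may be organized
differently.

```python
import json

# Hypothetical preset data; in practice this would live in a JSON file.
PRESETS_JSON = """
{
  "style": {
    "photo": ["photorealistic", "natural lighting"],
    "flat_vector": ["flat vector illustration", "clean lines"]
  },
  "quality": {
    "standard": ["high quality"],
    "high": ["masterpiece", "best quality"]
  }
}
"""


def load_presets(raw: str = PRESETS_JSON) -> dict:
    """Parse preset data; adding a style only changes the data, not this code."""
    return json.loads(raw)


def preset_tags(presets: dict, style: str, quality: str) -> list[str]:
    """Return style tags followed by quality tags for the chosen presets."""
    try:
        return [*presets["style"][style], *presets["quality"][quality]]
    except KeyError as exc:
        raise ValueError(f"unknown preset: {exc.args[0]}") from exc
```

A profile for Pony, SDXL photo, or flat vector could then be another entry in
the same data rather than a new branch in the formatter.
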
### Naturalizer Path

Owner: `caption_naturalizer.py`.

Keep here:

- natural sentence caption assembly;
- training-caption trigger behavior;
- style-tail policy.

Improve later:

- share more metadata readers with Krea without sharing Krea prose;
- add a `caption_profile` option for concise/dense LoRA caption styles.

### Category JSON Path

Owner: `categories/*.json`.

Keep here:

- scalable prompt pool content;
- named scene/expression/composition pools;
- item templates and axes;
- direct category-specific wording.

Improve later:

- introduce optional `family` and `action_type` fields on item templates so
  Python filters do less keyword guessing;
- add `formatter_hint` fields only where needed, not globally;
- add a JSON audit that checks every referenced expression/composition/scene pool
  exists.

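The pool-reference audit mentioned above could be as small as the sketch below.
The JSON shape it assumes (a `subcategories` mapping whose entries hold an
`items` list, with optional `expression_pool`, `composition_pool`, and
`scene_pool` keys on each item) is a guess for illustration and may not match
the real category files.

```python
def find_missing_pool_refs(category: dict, pools: dict[str, set[str]]) -> list[str]:
    """Report item templates that reference a pool name that is not defined.

    `pools` maps a pool kind ("expression", "composition", "scene") to the
    set of pool names known for that kind.
    """
    problems = []
    for sub_name, sub in category.get("subcategories", {}).items():
        for index, item in enumerate(sub.get("items", [])):
            for kind in ("expression", "composition", "scene"):
                ref = item.get(f"{kind}_pool")
                # Items without a reference for this kind are simply skipped.
                if ref is not None and ref not in pools.get(kind, set()):
                    problems.append(
                        f"{sub_name}[{index}]: unknown {kind} pool {ref!r}"
                    )
    return problems
```

Returning a list of messages instead of raising on the first problem lets an
audit tool print every broken reference in one run.
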
### Node / UI Path

Owner: `__init__.py`, `loop_nodes.py`, `web/*.js`.

Keep here:

- ComfyUI node input/output declarations;
- widget behavior;
- button actions;
- dynamic input slots.

Improve later:

- split large node classes into files by family;
- keep node display names, return names, and docs in sync through the audit
  helper;
- add small endpoint tests for profile/accumulator/index-switch routes.

## Path-Specific Improvements

### Prompt Builder

Near-term:

- Final row hygiene is already done through `prompt_hygiene.py`.
- Add a metadata invariant checker for rows before return.
- Normalize every row with one function before JSON serialization.

Medium-term:

- Extract category loading and role graph logic.
- Convert keyword-heavy interaction filtering to template metadata.

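A metadata invariant checker of the kind listed under near-term work might
start as the sketch below. The required field names and types are
hypothetical; the real row schema is whatever `prompt_builder.py` emits today.

```python
import json
from typing import Any

# Hypothetical required fields and their expected types.
REQUIRED_FIELDS: dict[str, type] = {
    "category": str,
    "subcategory": str,
    "seed": int,
    "prompt": str,
}


def check_row(row: dict[str, Any]) -> list[str]:
    """Return a list of invariant violations; an empty list means the row passed."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in row:
            errors.append(f"missing field: {name}")
        elif not isinstance(row[name], expected):
            errors.append(
                f"{name} should be {expected.__name__}, got {type(row[name]).__name__}"
            )
    prompt = row.get("prompt")
    if isinstance(prompt, str) and not prompt.strip():
        errors.append("prompt is empty")
    # Rows are serialized later, so catch non-serializable values here.
    try:
        json.dumps(row)
    except (TypeError, ValueError) as exc:
        errors.append(f"row is not JSON-serializable: {exc}")
    return errors
```

Calling this just before a builder returns would surface a malformed row at
the layer that produced it, before any formatter tries to repair the text.
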
### Insta/OF Pair

Near-term:

- Normalize pair metadata with one helper.
- Confirm pair prompts, captions, and soft/hard rows carry the same sanitized
  scene/camera/clothing fields.

Medium-term:

- Make pair camera and clothing phases explicit subfunctions.
- Add smoke fixtures for same-cast, POV man, explicit nude, and different-camera
  modes.

### Krea2

Near-term:

- Final prose hygiene is already done through `prompt_hygiene.py`.
- Add tests for close foreplay, POV oral, POV penetration, aftercare, manual
  stimulation, and camera-scene preservation.

Medium-term:

- Dispatch action rewriting by action family.
- Split Krea semantic helpers into smaller modules.

### SDXL

Near-term:

- Final tag hygiene is already done through `prompt_hygiene.py`.
- Add smoke tests for trigger preservation and duplicate tag removal.

Medium-term:

- Make style/quality presets data-driven.

### Naturalizer

Near-term:

- Final prose hygiene is already done through `prompt_hygiene.py`.
- Verify training captions keep the trigger exactly once.

Medium-term:

- Add caption profiles for training and browsing use cases.

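The "trigger exactly once" check could be expressed as a small helper for a
smoke test, as sketched below. The function names and the `mytrigger` value
used in the usage lines are placeholders, and matching case-insensitively on
whole tokens is an assumption about how triggers should be compared.

```python
import re


def count_trigger(caption: str, trigger: str) -> int:
    """Count whole-token occurrences of the trigger, ignoring case."""
    pattern = rf"(?<!\w){re.escape(trigger)}(?!\w)"
    return len(re.findall(pattern, caption, flags=re.IGNORECASE))


def has_trigger_once(caption: str, trigger: str) -> bool:
    """True when the caption contains the trigger exactly one time."""
    return count_trigger(caption, trigger) == 1
```

For example, `has_trigger_once("mytrigger, a portrait by a window", "mytrigger")`
holds, while a caption that repeats the trigger or omits it fails the check.
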
### Camera / Scene

Near-term:

- Keep Qwen/orbit as camera source.
- Keep scene-camera adapters scoped by location family.
- Use the memory note in
  `/home/ethanfel/.codex/memories/scene-camera-system.md` when editing POV.

Medium-term:

- Move coworking adapter into a scene-camera adapter module.
- Build new adapters one location family at a time.

## Invariants To Preserve

- Metadata is the preferred formatter input.
- Prompt Builder should output structured rows even if raw prompt text is rough.
- Krea should fix prose and semantic action readability, not category selection.
- SDXL should produce tag-style output and preserve model triggers as requested.
- Naturalizer should output training-friendly captions without changing the
  selected content.
- Generic cleanup belongs in `prompt_hygiene.py`; semantic cleanup belongs in
  the owning route.

## Recommended Next Passes

1. Add metadata invariant checks and small smoke fixtures.
2. Split Krea action/POV/clothing helpers into separate modules.
3. Add category JSON pool reference validation to `tools/prompt_map_audit.py`.
4. Extract scene-camera adapters from `prompt_builder.py`.
5. Split `__init__.py` node classes by family after behavior is covered by smoke
   checks.