ComfyUI-Ethanfel-Prompt-Bui…/docs/prompt-architecture-improvement-plan.md
2026-06-26 23:53:34 +02:00


Prompt Architecture Improvement Plan

This is a working research note for organizing the prompt builder around the routing map in docs/prompt-pool-routing-map.md.

Current Branch Additions

The current branch adds two major surfaces:

  • SxCP Krea2 Resolution Selector in node_seed_resolution.py, with README notes.
  • Expanded hardcore interaction/manual/action pools in categories/sexual_poses.json, categories/expression_composition_pools.json, prompt_builder.py, and krea_formatter.py.

The map audit currently sees:

  • 15 sexual pose subcategories.
  • 94 sexual pose item templates.
  • 23 expression pools.
  • 24 composition pools.
  • A new Krea2 resolution node with width/height/API aspect outputs.

Architectural Finding

The project has a good functional map, but ownership is still mixed inside large files:

  • prompt_builder.py owns selection, character resolution, role graph logic, camera adaptation, pair assembly, and some final string cleanup.
  • krea_formatter.py owns metadata parsing, cast naturalization, sexual action rewriting, POV rewriting, clothing cleanup, camera preservation, fallback parsing, and final prose assembly.
  • sdxl_formatter.py owns tag assembly and style/quality presets.
  • caption_naturalizer.py owns training-caption prose.
  • Category JSON files own scalable pool content, but Python still owns several compatibility and role-graph decisions.

The biggest maintainability risk is not the number of pools. The risk is that selection, semantic rewriting, and final text hygiene are too interleaved. When a prompt has wrong text, it is easy to patch the wrong layer.

First Refactor Boundary

Generic text hygiene now has one home:

  • prompt_hygiene.py

It should only handle route-agnostic cleanup:

  • whitespace and punctuation normalization;
  • empty field-label removal;
  • repeated trigger prefix cleanup;
  • duplicate comma-list item removal;
  • adjacent duplicate sentence cleanup;
  • simple dangling connector cleanup.

It must not make semantic decisions such as sexual action positioning, POV geometry, clothing state, or model-specific tag weighting. Those stay with the route-specific owner. It also preserves ordinary words such as "composition" inside normal sentences; empty field-label cleanup is limited to standalone labels.
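
As a rough illustration of what route-agnostic cleanup can look like, the sketch below covers three of the listed responsibilities: whitespace and punctuation normalization, duplicate comma-list item removal, and adjacent duplicate sentence cleanup. The function names are hypothetical and are not taken from prompt_hygiene.py; the real module may structure this differently.

```python
import re


def normalize_spacing(text: str) -> str:
    """Collapse whitespace runs and tidy spacing around punctuation."""
    text = re.sub(r"\s+", " ", text).strip()
    text = re.sub(r"\s+([,.;:])", r"\1", text)  # no space before punctuation
    text = re.sub(r",(?:\s*,)+", ",", text)     # collapse repeated commas
    return text


def dedupe_comma_items(text: str) -> str:
    """Drop repeated comma-list items, keeping the first occurrence."""
    seen = set()
    kept = []
    for item in text.split(","):
        item = item.strip()
        key = item.lower()  # comparison is case-insensitive
        if item and key not in seen:
            seen.add(key)
            kept.append(item)
    return ", ".join(kept)


def dedupe_adjacent_sentences(text: str) -> str:
    """Remove a sentence when it repeats the one directly before it."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    kept = []
    for sentence in sentences:
        if sentence and (not kept or sentence.lower() != kept[-1].lower()):
            kept.append(sentence)
    return " ".join(kept)
```

None of these helpers inspects what the text means, which is the property the boundary above asks for: they can run on any route's output without knowing about cast, camera, or action semantics.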

Shared hardcore phrase cleanup now has one home:

  • hardcore_text_cleanup.py

It owns environment-anchor normalization used by both prompt generation and Krea formatting, including malformed surface joins and bed/sheet/couch anchors that should become model-neutral body-support language. It must stay route-neutral: no Krea prose, no SDXL tags, and no category selection logic.

Current integration points:

  • prompt_builder.build_prompt
  • prompt_builder.build_insta_of_pair
  • krea_formatter.format_krea2_prompt
  • sdxl_formatter.format_sdxl_prompt
  • caption_naturalizer.naturalize_caption

Target Organization

Generation Layer

Owner: prompt_builder.py plus categories/*.json.

Keep here:

  • category/subcategory/item selection;
  • seed axis routing;
  • character slot/profile resolution;
  • scene/expression/composition pool selection;
  • role graph creation from structured category axes;
  • metadata row construction.

Move or isolate later:

  • pair assembly and camera mutation helpers that still live in prompt_builder.py.

Already isolated:

  • JSON category loading, subcategory normalization, named scene/expression/composition pool loading, cast compatibility filtering, exact subcategory lookup, and inheritance-based pool merging live in category_library.py.
  • hardcore configured-cast role graph generation lives in hardcore_role_graphs.py; prompt_builder.py selects item/axis metadata and then asks that module for the source role graph.
  • fallback role graph wording lives in hardcore_role_fallback.py, covering solo rows, women-only rows, men-only rows, mixed group fallbacks, and support partner sentences.
  • interaction-style role graph wording lives in hardcore_role_interaction.py, covering foreplay, manual stimulation, body worship, clothing transitions, dominant guidance, camera performance, aftercare, and group coordination.
  • outercourse-specific role graph wording has started moving into action-family modules; hardcore_role_outercourse.py owns boobjob, testicle-sucking, penis-licking, handjob, and footjob body geometry.
  • oral-specific role graph wording lives in hardcore_role_oral.py, including direct POV viewer phrasing for kneeling, face-sitting, sixty-nine, edge-supported, side-lying, chair, standing, and reclining oral positions.
  • penetration-specific role graph wording lives in hardcore_role_penetration.py, covering the main vaginal penetration position families while Krea POV rewriting keeps first-person geometry stable.
  • anal/double-contact role graph wording lives in hardcore_role_anal.py, covering rear-entry anal variants and front/back double-contact source geometry.
  • climax role graph wording lives in hardcore_role_climax.py, covering ejaculation aftermath placement for face/body/ass, lap, open-thigh, side-lying, and front/back group layouts.
  • camera option schema, orbit/Qwen translation, config parsing, camera directive text, and camera caption text live in camera_config.py; camera-scene prose and coworking composition adaptation live in scene_camera_adapters.py; prompt_builder.py still owns row mutation.
  • shared hardcore environment-anchor cleanup lives in hardcore_text_cleanup.py and normalizes malformed pool joins before metadata reaches formatter routes.
  • shared hardcore action metadata lives in hardcore_action_metadata.py; custom rows now emit action_family, position_family, position_key, and position_keys so formatter routing and debugging do less keyword guessing. Krea, SDXL, and training-caption routes consume these fields when present.
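
The last item describes formatters preferring emitted metadata over keyword guessing. A minimal sketch of that preference order follows, assuming rows are plain dictionaries. Only the field names action_family, position_key, and position_keys come from this note; the helper names, the "prompt" field, and the fallback keyword table are placeholders for illustration.

```python
from typing import Any, List, Mapping, Tuple

# Hypothetical fallback table; the real inference rules are not shown in this note.
FALLBACK_KEYWORDS = {
    "kiss": "foreplay",
    "cuddle": "aftercare",
}


def resolve_action_family(row: Mapping[str, Any]) -> Tuple[str, str]:
    """Return (family, source), preferring explicit metadata over keywords."""
    family = str(row.get("action_family") or "").strip().lower()
    if family:
        return family, "metadata"
    text = str(row.get("prompt") or "").lower()
    for keyword, guessed in FALLBACK_KEYWORDS.items():
        if keyword in text:
            return guessed, "keyword"
    return "unknown", "none"


def resolve_position_keys(row: Mapping[str, Any]) -> List[str]:
    """Prefer the plural field, then fall back to the single key."""
    keys = row.get("position_keys")
    if isinstance(keys, (list, tuple)):
        return [str(key) for key in keys if key]
    single = row.get("position_key")
    return [str(single)] if single else []
```

Returning the source alongside the family makes it visible in debugging output whether a route was chosen from metadata or from a guess.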

Pair / Adapter Layer

Owner today: build_insta_of_pair.

Keep here:

  • pair route sequencing;
  • top-level continuity option handoff between row, camera, clothing, and output adapters.

Already isolated:

  • Insta/OF option normalization, softcore category/outfit/pose pools, partner outfit pools, clothing-continuity labels, negatives, and hardcore cast count policy live in pair_options.py; prompt_builder.py keeps public delegate wrappers for existing nodes and tests.
  • soft/hard row creation lives in pair_rows.py, including softcore expression override resolution, Woman A slot context application, soft outfit/pose overrides, POV row fields, and hardcore row creation.
  • pair-level cast/display context lives in pair_cast.py, including shared descriptors, same-cast softcore descriptor text, partner styling, platform and level labels, softcore cast presence text, and hard cast summary text.
  • pair-level camera routing lives in pair_camera.py, including soft/hard camera config selection, same-as-softcore mode, camera-detail override, same-room hard scene continuity, camera-aware composition mutation, POV camera suppression, and row/root camera metadata synchronization.
  • pair-level hardcore clothing continuity lives in pair_clothing.py, including action-aware body-access flags, conflicting outfit-piece cleanup, default visible-men clothing, character-clothing override handling, and final root clothing-state assembly.
  • final pair output assembly lives in pair_output.py, including soft/hard prompt strings, trigger preservation, negatives, captions, and root metadata shape.

Krea2 Formatter Path

Owner: krea_formatter.py.

Keep here:

  • Krea prose style;
  • Krea route orchestration;
  • camera-scene preservation;
  • fallback text parsing.

Already isolated:

  • krea_cast.py owns cast descriptor parsing, cast prose, label joining, and natural label replacement for formatter routes.
  • krea_clothing.py owns clothing-state cleanup and action-aware body-access wording for formatter routes.
  • krea_action_context.py owns shared action-family predicates, axis context text, climax detection, and detail-density normalization used by action and POV formatter routes.
  • hardcore_action_metadata.py owns shared action-family constants, normalization, and inference used by the builder and Krea formatter route.
  • krea_pov.py owns POV labels, POV label filtering, and POV camera/composition support text.
  • krea_detail.py owns generic detail-clause splitting, deduping, joining, and density limiting for Krea action prose.
  • krea_action_positions.py owns non-POV pose anchors, body-arrangement text, rear-entry detection, and action-position phrasing.
  • krea_action_details.py owns non-climax item/detail cleanup for foreplay, outercourse, oral, penetration, toy/double-contact, and anchor dedupe paths.
  • krea_action_climax.py owns climax-specific role/detail cleanup and aftermath view dedupe.
  • krea_action_dispatch.py owns non-POV role normalization, action-family classification, and family-specific detail cleanup.
  • krea_actions.py owns final non-POV hardcore action sentence assembly.
  • krea_pov_actions.py owns POV hardcore action sentence rewriting, first-person body geometry, and selected-position-axis priority before loose context fallback.

Improve later:

  • extend SDXL and caption routes to optionally consume action_family / position_family when ordering tags or caption clauses;
  • add route-level smoke fixtures for representative metadata rows.

SDXL Formatter Path

Owner: sdxl_formatter.py.

Keep here:

  • trigger behavior;
  • style and quality presets;
  • tag ordering;
  • weighted explicit tags;
  • negative-prompt assembly;
  • metadata-family tag hints from action_family, position_family, and position_keys.

Improve later:

  • move presets into data dictionaries or JSON so adding styles does not require editing formatter logic;
  • add formatter profiles for Pony, SDXL photo, and flat vector;
  • make fallback cleanup use the shared field-label inventory.
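
One way the first item could look is sketched below: presets live in JSON and the formatter only reads them. The preset names echo the profiles mentioned above, but the schema ("style" and "quality" lists), the tag values, and the function names are assumptions, not the current sdxl_formatter.py layout.

```python
import json
from typing import Dict, List

# Hypothetical preset data; in practice this could live in its own JSON file.
PRESETS_JSON = """
{
  "sdxl_photo": {"style": ["photograph"], "quality": ["high detail", "sharp focus"]},
  "flat_vector": {"style": ["flat vector illustration"], "quality": ["clean lines"]}
}
"""


def load_presets(raw: str) -> Dict[str, Dict[str, List[str]]]:
    """Parse preset JSON and check that each preset has both tag lists."""
    presets = json.loads(raw)
    for name, preset in presets.items():
        for field in ("style", "quality"):
            if not isinstance(preset.get(field), list):
                raise ValueError(f"preset {name!r} needs a list for {field!r}")
    return presets


def preset_tags(presets: Dict[str, Dict[str, List[str]]], name: str) -> List[str]:
    """Return style tags followed by quality tags for one preset."""
    preset = presets[name]
    return list(preset["style"]) + list(preset["quality"])
```

With this shape, adding a style is a data change; the formatter code only needs to change when tag ordering rules change.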

Naturalizer Path

Owner: caption_naturalizer.py.

Keep here:

  • natural sentence caption assembly;
  • training-caption trigger behavior;
  • style-tail policy;
  • metadata-family action labels from action_family and position_family.

Improve later:

  • share more metadata readers with Krea without sharing Krea prose;
  • add a caption_profile option for concise/dense LoRA caption styles.

Category JSON Path

Owner: categories/*.json.

Keep here:

  • scalable prompt pool content;
  • named scene/expression/composition pools;
  • item templates and axes;
  • direct category-specific wording.

Improve later:

  • introduce optional family and action_type fields on item templates so Python filters do less keyword guessing;
  • add formatter_hint fields only where needed, not globally;
  • keep tools/prompt_map_audit.py passing; it now checks referenced expression/composition/scene pools and item-template axes.
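
To illustrate the first item, here is a sketch of a template filter that trusts an explicit family field and only falls back to a keyword predicate for templates that have not been annotated yet. The template shape (a dict with "text" and an optional "family") and the function name are assumptions; the real item templates may carry different keys.

```python
from typing import Any, Callable, Dict, List, Optional

Template = Dict[str, Any]


def filter_templates(
    templates: List[Template],
    wanted_family: str,
    keyword_fallback: Optional[Callable[[Template], bool]] = None,
) -> List[Template]:
    """Select templates by explicit family, guessing only when it is absent."""
    selected = []
    for template in templates:
        family = template.get("family")
        if family is not None:
            # Annotated templates are never second-guessed by keywords.
            if family == wanted_family:
                selected.append(template)
        elif keyword_fallback is not None and keyword_fallback(template):
            selected.append(template)
    return selected


# Usage with placeholder templates.
templates = [
    {"text": "a slow embrace by the window", "family": "interaction"},
    {"text": "an embrace after the scene", "family": "aftercare"},
    {"text": "a close embrace on the balcony"},  # not annotated yet
]
matches = filter_templates(
    templates, "interaction", keyword_fallback=lambda t: "embrace" in t["text"]
)
```

The second template contains the keyword but is excluded because its explicit family says otherwise, which is the kind of misclassification the optional fields are meant to prevent.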

Node / UI Path

Owner: __init__.py, node_builder.py, node_seed_resolution.py, node_camera.py, node_character.py, node_hardcore_position.py, node_formatter.py, node_insta.py, node_route_config.py, node_profile_filter.py, loop_nodes.py, web/*.js.

Keep here:

  • ComfyUI node input/output declarations;
  • widget behavior;
  • button actions;
  • dynamic input slots;
  • direct and config-driven builder node declarations in node_builder.py;
  • seed and resolution utility node declarations in node_seed_resolution.py;
  • camera utility node declarations in node_camera.py;
  • character pool, slot, and profile node declarations in node_character.py;
  • hardcore position pool/filter node declarations in node_hardcore_position.py;
  • caption/Krea2/SDXL formatter node declarations in node_formatter.py;
  • Insta/OF options and prompt-pair node declarations in node_insta.py;
  • route/category/location/composition/cast config node declarations in node_route_config.py;
  • profile/filter/ethnicity-list node declarations in node_profile_filter.py.

Already isolated:

  • direct and config-driven prompt builder nodes live in node_builder.py, with registration maps imported by __init__.py.
  • seed/global-seed/seed-locker and SDXL/Krea2 resolution utility nodes live in node_seed_resolution.py, with registration maps imported by __init__.py.
  • camera/orbit/Qwen translator utility nodes live in node_camera.py, using camera_config.py for option lists and JSON builders, with registration maps imported by __init__.py.
  • hair, age/body/eyes/clothing pools, manual character details, character slots, and profile save/load nodes live in node_character.py, with registration maps imported by __init__.py.
  • hardcore position pool and action filter nodes live in node_hardcore_position.py, with registration maps imported by __init__.py.
  • caption naturalizer, Krea2 formatter, and SDXL formatter nodes live in node_formatter.py, with registration maps imported by __init__.py.
  • Insta/OF options and dual prompt-pair nodes live in node_insta.py, with registration maps imported by __init__.py.
  • category preset, location/composition pool, location theme, and cast config utility nodes live in node_route_config.py, with registration maps imported by __init__.py.
  • generation profile, advanced filter, and ethnicity list utility nodes live in node_profile_filter.py, with registration maps imported by __init__.py.
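
The "registration maps imported by __init__.py" pattern can be sketched as below. ComfyUI reads NODE_CLASS_MAPPINGS (and NODE_DISPLAY_NAME_MAPPINGS) from a custom node package; everything else here, including the per-module map names, the stand-in classes, and the merge helper, is hypothetical and only shows one way to combine per-family maps while catching key collisions.

```python
from typing import Any, Dict


def merge_registration_maps(*maps: Dict[str, Any]) -> Dict[str, Any]:
    """Merge per-module node maps, failing loudly on duplicate keys."""
    merged: Dict[str, Any] = {}
    for mapping in maps:
        for key, value in mapping.items():
            if key in merged:
                raise ValueError(f"duplicate node key: {key}")
            merged[key] = value
    return merged


# Stand-ins for classes that would be imported from the node_*.py modules.
class ExampleBuilderNode:
    pass


class ExampleCameraNode:
    pass


BUILDER_NODE_CLASS_MAPPINGS = {"ExampleBuilder": ExampleBuilderNode}
CAMERA_NODE_CLASS_MAPPINGS = {"ExampleCamera": ExampleCameraNode}

# What the package __init__.py would expose to ComfyUI.
NODE_CLASS_MAPPINGS = merge_registration_maps(
    BUILDER_NODE_CLASS_MAPPINGS,
    CAMERA_NODE_CLASS_MAPPINGS,
)
```

A plain dict.update() would silently let a later module overwrite an earlier node with the same key, so an explicit duplicate check is a cheap safeguard while classes are still being moved between files.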

Improve later:

  • split remaining large node classes into files by family;
  • keep node display names, return names, and docs in sync through the audit helper;
  • add small endpoint tests for profile/accumulator/index-switch routes.

Path-Specific Improvements

Prompt Builder

Near-term:

  • Final row hygiene is already done through prompt_hygiene.py.
  • Add a metadata smoke checker for representative generated rows and static formatter fixtures through tools/prompt_smoke.py.
  • Normalize every row with one function before JSON serialization.
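
The last item, a single normalization step before JSON serialization, might look roughly like this. The list of text fields and both function names are assumptions (scene_text appears elsewhere in this note; the others are placeholders), and the default cleaner only collapses whitespace where the real one would presumably call into prompt_hygiene.py.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical list of text fields that should be cleaned on every row.
TEXT_FIELDS = ("prompt", "scene_text", "caption")


def collapse_whitespace(text: str) -> str:
    """Placeholder cleaner: trim and collapse runs of whitespace."""
    return " ".join(text.split())


def normalize_row(
    row: Dict[str, Any],
    clean: Callable[[str], str] = collapse_whitespace,
) -> Dict[str, Any]:
    """Return a copy of the row with every known text field cleaned."""
    normalized = dict(row)  # leave the caller's row untouched
    for field in TEXT_FIELDS:
        value = normalized.get(field)
        if isinstance(value, str):
            normalized[field] = clean(value)
    return normalized


def serialize_row(row: Dict[str, Any]) -> str:
    """Normalize first, then serialize, so no caller can skip the cleanup."""
    return json.dumps(normalize_row(row), ensure_ascii=False, sort_keys=True)
```

Routing every row through one serializer means a missed cleanup shows up as a single bug in one place rather than as drift between routes.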

Medium-term:

  • Extract category loading and role graph logic.
  • Convert keyword-heavy interaction filtering to template metadata.

Insta/OF Pair

Near-term:

  • Normalize pair metadata with one helper.
  • Confirm pair prompts, captions, and soft/hard rows carry the same sanitized scene/camera/clothing fields.
  • Keep same-room pair continuity synchronized in both assembled prompt text and hardcore_row.scene_text; tools/prompt_smoke.py covers this drift case.

Medium-term:

  • Make pair camera and clothing phases explicit subfunctions.
  • Add smoke fixtures for same-cast, POV man, explicit nude, and different-camera modes.

Krea2

Near-term:

  • Final prose hygiene is already done through prompt_hygiene.py.
  • Add smoke coverage through tools/prompt_smoke.py for metadata-driven Krea2 formatting across built-in rows, hardcore rows, same-cast pairs, and POV pairs.
  • Cover camera-scene preservation through tools/prompt_smoke.py for single rows, split soft/hard pair cameras, and POV camera-scene routing.
  • Cover config-node routing through tools/prompt_smoke.py for category, cast, generation profile, seed lock, camera, location theme, and composition config.
  • Cover close foreplay and POV penetration Krea routes so raw labels, invalid surface grammar, normal third-person camera text, and composition punctuation drift are caught.
  • Cover POV outercourse, oral, penetration, anal, and front/back double-contact Krea routes so selected position geometry stays synchronized with metadata.
  • Cover generated climax routes through Krea, SDXL, and natural caption outputs so source aftermath placement and formatter details cannot drift apart.
  • Cover generated interaction routes through Krea, SDXL, and natural caption outputs so source contact/guidance/presentation wording stays metadata-driven.
  • Cover generated fallback role routes through Krea, SDXL, and natural caption outputs so solo and same-sex paths do not remain untested edge behavior.

Medium-term:

  • Dispatch action rewriting by action family.
  • Continue splitting remaining Krea semantic helpers into smaller modules.

SDXL

Near-term:

  • Final tag hygiene is already done through prompt_hygiene.py.
  • Add smoke tests for trigger preservation and duplicate tag removal through tools/prompt_smoke.py.

Medium-term:

  • Make style/quality presets data-driven.

Naturalizer

Near-term:

  • Final prose hygiene is already done through prompt_hygiene.py.
  • Verify training captions keep trigger exactly once through tools/prompt_smoke.py.
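
A check for "trigger exactly once" could be as small as the sketch below. The function names and the example trigger are hypothetical; this is not the logic in tools/prompt_smoke.py, only an illustration of a whole-word count plus a repair step that keeps one leading trigger.

```python
import re


def _trigger_pattern(trigger: str) -> str:
    """Match the trigger as a whole token, not inside another word."""
    return r"(?<!\w)" + re.escape(trigger) + r"(?!\w)"


def count_trigger(caption: str, trigger: str) -> int:
    """Count whole-token occurrences of the trigger, ignoring case."""
    return len(re.findall(_trigger_pattern(trigger), caption, flags=re.IGNORECASE))


def ensure_single_trigger(caption: str, trigger: str) -> str:
    """Strip every occurrence, then put a single trigger at the front."""
    pattern = _trigger_pattern(trigger) + r"\s*,?\s*"
    stripped = re.sub(pattern, "", caption, flags=re.IGNORECASE).strip(" ,")
    return f"{trigger}, {stripped}" if stripped else trigger
```

A smoke check can then assert count_trigger(caption, trigger) == 1 for every generated training caption, and the repair helper is only needed if the policy is to fix rather than fail.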

Medium-term:

  • Add caption profiles for training and browsing use cases.

Camera / Scene

Near-term:

  • Keep Qwen/orbit as camera source.
  • Keep scene-camera adapters scoped by location family.
  • Use the memory note in /home/ethanfel/.codex/memories/scene-camera-system.md when editing POV.
  • Keep scene_camera_adapters.py as the owner for location-aware camera prose; add new location families there one at a time.

Medium-term:

  • Build new adapters one location family at a time.

Invariants To Preserve

  • Metadata is the preferred formatter input.
  • Prompt Builder should output structured rows even if raw prompt text is rough.
  • Krea should fix prose and semantic action readability, not category selection.
  • SDXL should produce tag-style output and preserve model triggers as requested.
  • Naturalizer should output training-friendly captions without changing the selected content.
  • Generic cleanup belongs in prompt_hygiene.py; semantic cleanup belongs in the owning route.

Next Steps

  1. Continue splitting remaining __init__.py node classes by family after behavior is covered by smoke checks.
  2. Continue splitting the internals of hardcore_role_graphs.py by action family once generated edge cases are covered by smoke fixtures.
  3. Add more route-level smoke fixtures for generated edge cases that are not covered by the current static Krea/SDXL/caption metadata fixtures.