Extract formatter input parsing policy

2026-06-27 01:22:07 +02:00
parent b54b8b9421
commit 4c45d96472
7 changed files with 239 additions and 159 deletions
@@ -62,6 +62,23 @@ route-specific owner. It also preserves ordinary words such as `composition`
inside normal sentences; empty field-label cleanup is limited to standalone
labels.
Formatter input/fallback parsing now has one home:
- `formatter_input.py`
It owns route-neutral parsing shared by Krea2, SDXL, and natural-caption
routes:
- whitespace and punctuation normalization before formatter parsing;
- JSON row detection from `metadata_json` or source text;
- trigger-prefix stripping with route-specific trigger candidate lists;
- `Avoid:` positive/negative splitting for fallback text;
- prompt field extraction such as `Setting:` or `Composition:`;
- row-value fallback from metadata fields to labeled prompt text.
It must not make formatter-style decisions. Krea prose, SDXL tags, and training
caption sentence shape stay in their formatter modules.
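
The text-side responsibilities listed above can be sketched roughly as follows. This is a minimal illustration only: the function names (`normalize_text`, `strip_trigger`, `split_avoid`, `extract_field`) and the exact matching rules are assumptions, since the diff does not show the real `formatter_input.py` API.

```python
import re

# Hypothetical helpers; names and rules are illustrative assumptions,
# not the actual contents of formatter_input.py.


def normalize_text(text: str) -> str:
    """Collapse whitespace runs and drop spaces before punctuation."""
    text = re.sub(r"\s+", " ", text).strip()
    return re.sub(r"\s+([,.;:])", r"\1", text)


def strip_trigger(text: str, triggers: list[str]) -> str:
    """Remove the first matching trigger prefix; the caller passes the
    route-specific candidate list. Longer candidates are tried first."""
    for trigger in sorted(triggers, key=len, reverse=True):
        if text.lower().startswith(trigger.lower()):
            return text[len(trigger):].lstrip(" ,.;:")
    return text


def split_avoid(text: str) -> tuple[str, str]:
    """Split fallback text into (positive, negative) at the first 'Avoid:'."""
    positive, sep, negative = text.partition("Avoid:")
    return positive.strip(), (negative.strip() if sep else "")


def extract_field(text: str, label: str) -> str:
    """Return the value after 'Label:' up to the next capitalized label
    or the end of the text. Matching is case-sensitive, so an ordinary
    lowercase word such as 'composition' is not treated as a label."""
    pattern = re.compile(
        rf"\b{re.escape(label)}:\s*(.*?)(?=\s+[A-Z][a-z]+:|$)", re.DOTALL
    )
    match = pattern.search(text)
    return match.group(1).strip(" .;") if match else ""
```

None of these helpers decides how the output is phrased. They return plain strings for the Krea, SDXL, or caption formatter to shape, which is the boundary the paragraph above draws.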
Shared hardcore phrase cleanup now has one home:
- `hardcore_text_cleanup.py`
@@ -242,6 +259,9 @@ Already isolated:
- `krea_pov_actions.py` owns POV hardcore action sentence rewriting,
first-person body geometry, and selected-position-axis priority before loose
context fallback.
- `formatter_input.py` owns shared metadata/source JSON detection, trigger
stripping, prompt-field extraction, `Avoid:` splitting, and row-value
fallback for Krea, SDXL, and caption routes.
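
As a rough sketch of the JSON detection and row-value fallback described in the last bullet, under the assumption that a row is a flat JSON object and that labeled prompt values end at a period or newline. The names `detect_row` and `row_value` are hypothetical and do not come from the diff.

```python
import json
import re
from typing import Optional

# Hypothetical helpers; the real formatter_input.py signatures are not
# shown in this commit.


def detect_row(metadata_json: Optional[str], source_text: Optional[str]) -> dict:
    """Return the first candidate that parses as a JSON object, checking
    metadata_json before the source text. Returns {} when neither does."""
    for candidate in (metadata_json, source_text):
        if not candidate:
            continue
        candidate = candidate.strip()
        if not candidate.startswith("{"):
            continue
        try:
            row = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(row, dict):
            return row
    return {}


def row_value(row: dict, key: str, prompt_text: str, label: str) -> str:
    """Prefer a non-empty metadata field; otherwise fall back to the
    'Label:' value found in the prompt text (assumed to end at '.' or newline)."""
    value = row.get(key)
    if isinstance(value, str) and value.strip():
        return value.strip()
    match = re.search(rf"\b{re.escape(label)}:\s*([^.\n]*)", prompt_text)
    return match.group(1).strip() if match else ""
```

Keeping the fallback order in one function (metadata first, labeled prompt text second) means every route resolves a missing field the same way.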
Improve later:
@@ -262,6 +282,7 @@ Keep here:
- negative-prompt assembly.
- metadata-family tag hints from `action_family`, `position_family`, and
`position_keys`.
- shared formatter input parsing from `formatter_input.py`.
Improve later:
@@ -280,6 +301,7 @@ Keep here:
- training-caption trigger behavior;
- style-tail policy.
- metadata-family action labels from `action_family` and `position_family`.
- shared formatter input parsing from `formatter_input.py`.
Improve later: