Add Image Pool (Grid) node design doc
Approved design for an input-side ComfyUI node holding a curated pool of images with per-image masks and labels, selectable for output. Captures IO, managed-pool-folder storage, MaskEditor reuse, server routes, edge cases, and a 3-phase build plan.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -0,0 +1,8 @@
__pycache__/
*.py[cod]
.venv/
venv/
*.egg-info/
.pytest_cache/
.DS_Store
input/grid_pool/
@@ -0,0 +1,129 @@
# Image Pool (Grid): Design

Date: 2026-06-21
Status: Approved (brainstorming complete, ready for implementation plan)

## 1. Purpose & scope

An **input-side** ComfyUI node that holds a curated *pool* of images, each with its
own remembered mask and an editable label. The user picks an active image (or drives
it by index) and the node outputs that image + mask (+ index/count/label).

The node does **not** inpaint. It feeds downstream edit nodes (Klein / Flux Kontext).
It exists to remove two recurring annoyances in the dataset/inpaint workflow:

1. **Rewiring**: instead of swapping `LoadImage` nodes, keep one node holding many
   images and click the active one. Downstream wiring never changes.
2. **Re-masking**: each image remembers its own mask, so switching back to an image
   never means redrawing the mask.

Node name: **`Image Pool (Grid)`**.

### Non-goals (YAGNI for v1)

- Bulk folder load
- Batch-output mode (output the whole pool at once)
- Drag-to-reorder (deferred to phase 3)
- Cross-machine workflow portability (pool is disk-backed, local)

## 2. Node IO

| dir | name | type | notes |
|---|---|---|---|
| out | `IMAGE` | IMAGE | selected slot's image |
| out | `MASK` | MASK | selected slot's mask; all-zeros if none drawn (= nothing masked) |
| out | `index` | INT | the slot actually used |
| out | `count` | INT | total images in pool |
| out | `label` | STRING | selected slot's label ("" if unset) |
| widget | `index` | INT | `-1` = use grid-clicked active slot; `>=0` = force that slot (clamped to `0..count-1`) |
| hidden | `pool_id` | STRING | stable UUID, generated on node create, saved in workflow |

Selection rule: if `index` widget `== -1`, use `manifest.active`; else use
`clamp(index, 0, count-1)`. The actually-used index is echoed on the `index` output.
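
The selection rule can be sketched as a small pure function. This is an illustration, not the node's actual code: the name `resolve_slot_index` is hypothetical, and two behaviors are assumptions the design does not spell out: any negative widget value is treated like `-1`, and an empty pool reports `0` as a placeholder index.

```python
def resolve_slot_index(widget_index: int, active: int, count: int) -> int:
    """Return the slot index to use, following the selection rule."""
    if count <= 0:
        # Assumption: an empty pool has no valid slot, so 0 is a placeholder.
        return 0
    # -1 means "follow the grid-clicked slot"; other values force a slot.
    # Assumption: any negative value is treated like -1.
    wanted = active if widget_index < 0 else widget_index
    # Clamp into 0..count-1 so a stale index never goes out of range.
    return max(0, min(wanted, count - 1))
```

Clamping the stored `active` value as well keeps the node safe if the active slot was deleted since it was last clicked.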

## 3. Storage: managed pool folder

```
input/grid_pool/<pool_id>/
  manifest.json
  img_0001.png   img_0001.mask.png
  img_0002.png   ...
```

`manifest.json`:

```json
{
  "active": 0,
  "slots": [
    { "image": "img_0001.png", "mask": "img_0001.mask.png", "label": "", "added": 1718960000 }
  ]
}
```

- Workflow JSON stores only `pool_id` + the `index` widget → stays tiny.
- Pool survives restart (lives in ComfyUI's `input/`).
- `mask` may be `null`/absent until a mask is drawn.
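
One possible way to map a `pool_id` onto the folder layout above is sketched below. The helper name `resolve_pool_dir` and the `input_dir` parameter are hypothetical. Parsing the id with the standard `uuid` module is an assumption that follows from `pool_id` being a UUID: it rejects values that are not UUIDs before they are joined into a path, and it canonicalizes the id to lowercase hyphenated form.

```python
import os
import uuid


def resolve_pool_dir(input_dir: str, pool_id: str) -> str:
    """Return input/grid_pool/<pool_id>/ for a valid UUID pool id."""
    # uuid.UUID raises ValueError for anything that is not a UUID string,
    # so values such as "../x" never reach the filesystem.
    canonical = str(uuid.UUID(pool_id))
    return os.path.join(input_dir, "grid_pool", canonical)
```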

## 4. Components & data flow

### Python node (`__init__.py`)

- On execute: resolve pool dir from `pool_id`, read `manifest.json`, choose slot,
  load image + mask → tensors, return `(IMAGE, MASK, index, count, label)`.
- Empty pool → return a 1x1 black image + zero mask + `count=0` (never crash the graph).
- `IS_CHANGED` returns a hash of `(pool_id, chosen_index, image_mtime, mask_mtime)` so
  that editing a mask or replacing an image forces re-execution (otherwise ComfyUI
  caches the output and the new mask is never seen).
- Mask convention: load as single-channel float; if no mask file, emit zeros matching
  image H×W.
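
The `IS_CHANGED` hash described above could be computed along these lines. This is a sketch under assumptions: `pool_fingerprint` and `_mtime_ns` are hypothetical helper names, SHA-256 is an arbitrary choice of hash, and a missing file is represented by an mtime of `0`. The node's `IS_CHANGED` classmethod would resolve the chosen slot from the manifest and return this string.

```python
import hashlib
import os


def _mtime_ns(path):
    """Modification time in nanoseconds, or 0 when the path is unset or missing."""
    if not path:
        return 0
    try:
        return os.stat(path).st_mtime_ns
    except OSError:
        return 0


def pool_fingerprint(pool_id, chosen_index, image_path, mask_path):
    """Hash the values that should invalidate the cached node output."""
    parts = (
        pool_id,
        str(chosen_index),
        str(_mtime_ns(image_path)),
        str(_mtime_ns(mask_path)),
    )
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
```

Because the mask mtime goes from `0` to a real timestamp when a mask is first drawn, the first mask edit changes the fingerprint just like later edits do.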

### JS extension (`web/`)

A resizable **in-node DOM widget** rendering the thumbnail grid (scrollable so a big
pool doesn't blow up the node). Responsibilities:

- **Ingest**: paste (Ctrl+V on node), drag-drop files, upload button → POST to
  `/grid_pool/add`; server copies into the pool, appends a slot, returns manifest.
- **Select**: click thumbnail → `/grid_pool/active`, highlight active.
- **Mask**: brush button / double-click → push the slot image into `ComfyApp.clipspace`
  (`imgs`/`images`/`selectedIndex`), set `ComfyApp.clipspace_return_node = node`, call
  `openMaskEditor()`. The editor saves the alpha mask via `/upload/mask`; on return the
  node's `pasteFromClipspace()` fires → we extract the alpha and POST it to
  `/grid_pool/set_mask` to write the slot's `.mask.png`.
- **Label**: inline-editable caption under each thumbnail → `/grid_pool/label`.
- **Delete**: ✕ on thumbnail → `/grid_pool/remove`.
- Badges: active border, "has-mask" dot, slot index.

### Server routes (aiohttp, registered from Python)

Under `/grid_pool/*`: `add`, `remove`, `active`, `set_mask`, `label`, `list`.
All mutate `manifest.json` atomically and return the updated manifest.
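
A common way to get the atomic manifest update is to write a temporary file in the same directory and swap it into place with `os.replace`. The sketch below assumes that approach; the design does not name a mechanism, and `write_manifest_atomic` is a hypothetical helper name.

```python
import json
import os
import tempfile


def write_manifest_atomic(pool_dir, manifest):
    """Write manifest.json so readers never see a half-written file."""
    target = os.path.join(pool_dir, "manifest.json")
    # Temp file in the same directory keeps the final rename on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=pool_dir, prefix="manifest.", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(manifest, fh, indent=2)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_path, target)  # atomic swap of old and new manifest
    except BaseException:
        # Remove the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return target
```

This protects against torn writes only. Two routes doing read-modify-write at the same time would still need to be serialized, for example with a per-pool lock.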

## 5. UI approach

Chosen: **in-node grid** (thumbnails in the node body; resize node to see more,
scroll for large pools). Rejected alternative: modal "manage pool" gallery, which is
better for huge pools but takes more clicks and more UI code; revisit only if pools
get large.

## 6. Edge cases & error handling

- Empty pool → 1x1 black image + zero mask + `count=0`.
- `index >= count` → clamp; echo the clamped value on `index` output.
- Missing/corrupt manifest → rebuild from files on disk.
- Cloning a node copies `pool_id` → both nodes share one pool. Provide right-click
  **"Detach pool (new id)"** to split. v1 may just document the behavior.
- MaskEditor integration verified against the installed frontend: `openMaskEditor`,
  `copyToClipspace`/`pasteFromClipspace`, `clipspace_return_node`, and `/upload/mask`
  all exist. This is the standard "Open in MaskEditor" pattern used by many nodes.
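
The manifest rebuild could look roughly like the sketch below. It assumes the `img_NNNN.png` / `img_NNNN.mask.png` naming from the storage section, zero-padded numbers so that sorting by name restores slot order, and a reset of `active` to `0`. The helper name `rebuild_manifest` is hypothetical.

```python
import os
import re

# Matches img_0001.png but not img_0001.mask.png.
IMAGE_RE = re.compile(r"^img_\d+\.png$")


def rebuild_manifest(pool_dir):
    """Reconstruct a manifest from the image and mask files found in pool_dir."""
    names = set(os.listdir(pool_dir))
    slots = []
    for name in sorted(n for n in names if IMAGE_RE.match(n)):
        mask_name = name[: -len(".png")] + ".mask.png"
        slots.append({
            "image": name,
            "mask": mask_name if mask_name in names else None,
            "label": "",  # labels live only in the manifest and cannot be recovered
            "added": int(os.path.getmtime(os.path.join(pool_dir, name))),
        })
    # Assumption: the previous active slot is unknown, so fall back to slot 0.
    return {"active": 0, "slots": slots}
```

Labels and the original `added` timestamps are lost in a rebuild; the file mtime is only a stand-in for `added`.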

## 7. Phasing & testing

- **Phase 1**: storage + manifest + Python node + server routes + grid display +
  select + delete + labels (no masking). E2E: add images, pick one, it outputs
  image/mask(zeros)/index/count/label.
- **Phase 2**: MaskEditor integration + per-slot mask persistence + `IS_CHANGED`.
- **Phase 3**: polish: drag-reorder, "detach pool", badges.

Testing:
- pytest (Python): manifest read/write, atomic mutation, slot selection rule, tensor
  shapes/dtypes, zero-mask fallback, `IS_CHANGED` hashing, manifest rebuild.
- Manual checklist (JS/MaskEditor): ingest paths, select, mask round-trip, label edit,
  delete, persistence across restart.