When starting a scan export batch, delete old scan_export entries for
the same file+profile before writing new ones, and log a warning when
entries are replaced. This prevents stale entries from building up
across repeated scan exports.
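
A minimal sketch of the delete-then-insert step, assuming a hypothetical
`exports` table with `file`, `profile`, `path`, and `scan_export` columns
(the real schema is not shown in this commit). Both statements run in one
transaction so a failed insert does not leave the old entries deleted.

```python
import logging
import sqlite3

log = logging.getLogger(__name__)


def replace_scan_exports(conn, file, profile, new_paths):
    """Drop earlier scan_export rows for (file, profile), then add new ones.

    Table and column names are illustrative assumptions, not the project's schema.
    Returns the number of stale rows that were removed.
    """
    with conn:  # single transaction: commit on success, roll back on error
        cur = conn.execute(
            "DELETE FROM exports "
            "WHERE file = ? AND profile = ? AND scan_export = 1",
            (file, profile),
        )
        removed = cur.rowcount
        if removed > 0:
            log.warning(
                "Replacing %d existing scan_export entries for %s [%s]",
                removed, file, profile,
            )
        conn.executemany(
            "INSERT INTO exports (file, profile, path, scan_export) "
            "VALUES (?, ?, ?, 1)",
            [(file, profile, p) for p in new_paths],
        )
    return removed
```

Manual (non-scan) entries are left alone because the DELETE is restricted
to rows with scan_export = 1.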
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two bugs caused vid number collisions (multiple files sharing a vid_NNN):
1. "First gap" assignment (n=1; while vid_n in existing: n++) would
reuse deleted vid numbers. Changed to max(existing) + 1 so numbers
always increase.
2. LIMIT 1 without ORDER BY returned arbitrary rows when a file had
entries in multiple vid folders. Added ORDER BY rowid DESC for
deterministic latest-wins behavior.
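
Both fixes can be sketched as follows. This is an illustration under
assumptions: the `exports` table, its `vid` column, and the helper names
are hypothetical, and only the max+1 rule and the ORDER BY rowid DESC
clause come from this commit.

```python
import re
import sqlite3

VID_RE = re.compile(r"^vid_(\d+)$")


def next_vid_number(existing_names):
    """Return max(existing) + 1 instead of the first free gap."""
    nums = [int(m.group(1)) for name in existing_names if (m := VID_RE.match(name))]
    return max(nums, default=0) + 1


def latest_vid_for_file(conn, file):
    """Pick the most recently inserted vid folder for a file.

    Without ORDER BY, LIMIT 1 may return any matching row.
    """
    row = conn.execute(
        "SELECT vid FROM exports WHERE file = ? ORDER BY rowid DESC LIMIT 1",
        (file,),
    ).fetchone()
    return row[0] if row else None
```

With folders vid_001 and vid_003 present, the first-gap rule would hand
out 2 again, while max+1 yields 4.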
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Queue scan exports back-to-back: when an export is running, new
batches are queued and drain automatically on completion. Each batch
snapshots its state (file path, jobs, settings) so the user can
switch videos while exports run.
Also updates ScanWorker default and slider initial value to 0.50
to match the core threshold change.
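
The queue-and-drain behavior can be sketched roughly like this. The class
and field names are hypothetical, and the real implementation presumably
hooks into Qt worker signals rather than a plain callback.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportBatch:
    """Immutable snapshot taken at submit time (names are illustrative)."""
    file_path: str
    jobs: tuple
    settings: tuple  # sorted (key, value) pairs


class ExportQueue:
    def __init__(self, start_fn):
        self._start = start_fn      # callback that launches one batch
        self._pending = deque()
        self._running = None

    def submit(self, file_path, jobs, settings):
        # Copy mutable inputs so later UI changes cannot alter the batch.
        batch = ExportBatch(file_path, tuple(jobs), tuple(sorted(settings.items())))
        if self._running is None:
            self._run(batch)
        else:
            self._pending.append(batch)
        return batch

    def on_finished(self):
        """Call when the running export completes; starts the next batch."""
        self._running = None
        if self._pending:
            self._run(self._pending.popleft())

    def _run(self, batch):
        self._running = batch
        self._start(batch)
```

Because each batch copies its inputs, switching to another video after
submit() does not change what an already queued batch will export.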
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Set CQ/QP rate control (quality 28) for NVENC, VAAPI, QSV, and AMF
hardware encoders instead of relying on encoder defaults, which
produce unnecessarily large files.
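
As an illustration only, one way to map each encoder family to
constant-quality ffmpeg arguments. The flags below are standard ffmpeg
encoder options, but which ones this commit actually passes is an
assumption; only the quality value 28 comes from the message.

```python
QUALITY = 28  # value from this commit; the flag choice below is assumed


def rate_control_args(encoder, quality=QUALITY):
    """Return extra ffmpeg arguments selecting CQ/QP mode for a HW encoder."""
    q = str(quality)
    if encoder.endswith("_nvenc"):
        # Constant-quality VBR; -b:v 0 removes the bitrate cap.
        return ["-rc", "vbr", "-cq", q, "-b:v", "0"]
    if encoder.endswith("_vaapi"):
        return ["-rc_mode", "CQP", "-qp", q]
    if encoder.endswith("_qsv"):
        return ["-global_quality", q]
    if encoder.endswith("_amf"):
        return ["-rc", "cqp", "-qp_i", q, "-qp_p", q, "-qp_b", q]
    return []  # software or unknown encoders keep their own defaults
```

The returned list would be appended after `-c:v <encoder>` when building
the ffmpeg command line.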
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Manual clips now follow the same pattern as scan exports:
clip_003_m1_0.mp4 (manual) vs clip_003_a1_0.mp4 (auto-scan).
The clip number matches the vid folder number.
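
A small sketch of the naming scheme. The function name is hypothetical,
and reading the two trailing numbers as "area" and "index" is an
assumption based on the examples above.

```python
def clip_filename(vid_number, kind, area, index, ext="mp4"):
    """Build a clip name such as clip_003_m1_0.mp4.

    kind is "m" for manual clips or "a" for auto-scan clips; vid_number
    is the number of the vid_NNN folder the clip belongs to.
    """
    if kind not in ("m", "a"):
        raise ValueError(f"unknown clip kind: {kind!r}")
    return f"clip_{vid_number:03d}_{kind}{area}_{index}.{ext}"
```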
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
clip_001_a1_0 now matches vid_001 instead of using an independent
counter that created confusing double numbering.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
9 tasks covering node pack skeleton, all 5 nodes, frontend widget,
API routes, and integration testing. Uses ExecutionBlocker pattern
for the interactive VideoReview node.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
load_for_file and add_scan_results triggered N redundant timeline repaints
via tab_changed → _on_scan_regions_edited for each tab add/remove.
blockSignals(True) during programmatic tab operations eliminates the cascade.
Also adds EAT_LARGE embedding model (1024-dim) and updates design docs.
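
The signal-blocking part of this change can be sketched as a context
manager. The helper name is hypothetical; it relies only on
QObject.blockSignals(), which returns the previous blocked state.

```python
from contextlib import contextmanager


@contextmanager
def signals_blocked(widget):
    """Suppress a widget's signals during programmatic changes."""
    previous = widget.blockSignals(True)
    try:
        yield widget
    finally:
        # Restore the earlier state rather than forcing False, so nested
        # use does not unblock signals too early.
        widget.blockSignals(previous)
```

A caller would wrap the tab add/remove loop in `with signals_blocked(tabs):`
and then trigger a single repaint afterwards.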
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Export layout changed from clip_NNN group dirs to vid_NNN per-video folders
- Automatic DB migration rewrites old paths and moves files on startup
- Per-video counter with DB cross-check to prevent overwrites
- Changelog popup on version bump with "don't show again" checkbox
- Scan region resize now requires Shift+drag to prevent accidental edits
- Recalculate vid folder and counter on file load
- Add EAT_LARGE embedding model variant
- Update tests for new flat export path structure
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
pip install -r requirements.txt can pull CPU-only torchvision via
transitive dependencies (timm, ultralytics). Adding --extra-index-url
with the CUDA wheel index ensures all torch packages stay on the
correct build. Applied to both Linux and Windows setup scripts.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
tab_changed was only updating the export count, not the timeline overlay.
It now calls _on_scan_regions_edited, which refreshes both.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
EAT remote model code (worstchan/EAT-base_epoch30_finetune_AS2M) is
incompatible with transformers 5.x: it lacks the all_tied_weights_keys
attribute added in the v5 PreTrainedModel API.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the "skip if torch exists" guard so re-running the setup script
fixes a broken torchvision install.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
timm and ultralytics depend on torchvision. When pip install -r
requirements.txt resolves them, it pulls torchvision from PyPI (CPU
build), which is incompatible with CUDA torch and causes
"operator torchvision::nms does not exist" at import time.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use microsecond-precision timestamps to prevent version merging on
sub-second scans
- Clear undo stack when switching scan versions (stale row references)
- Parse timestamp labels robustly instead of hard-coded string slicing
- "Clear All" in hard negatives dialog respects active model filter
- Remove time.sleep from tests (no longer needed with microsecond timestamps)
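
A minimal sketch of the timestamp handling, assuming ISO-style labels.
The helper names and the exact label format are assumptions; only the
microsecond precision and the move away from string slicing come from
the list above.

```python
from datetime import datetime


def new_scan_timestamp(now=None):
    """Return a label with microsecond precision.

    Two scans within the same second get distinct labels as long as
    their microsecond fields differ.
    """
    if now is None:
        now = datetime.now()
    return now.isoformat(sep=" ", timespec="microseconds")


def parse_scan_timestamp(label):
    """Parse a label with or without fractional seconds; None if invalid."""
    try:
        return datetime.fromisoformat(label.strip())
    except ValueError:
        return None
```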
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New HardNegativesDialog shows all hard negatives in a table with model
filter dropdown, multi-select delete, and clear all. Accessible from
TrainDialog via "Manage..." button next to the hard negatives checkbox.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add source_model column to hard_negatives table with migration. New
get_hard_negatives() returns full rows, delete_hard_negatives_by_ids()
for bulk deletion. get_training_data() gains use_hard_negatives param.
TrainDialog has "Use hard negatives" checkbox. Scan panel passes current
model name when marking negatives.
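
The column migration might look roughly like the sketch below. The helper
name and the column declaration are assumptions; the check uses SQLite's
PRAGMA table_info so that running it again on an already migrated
database is a no-op.

```python
import sqlite3


def ensure_column(conn, table, column, decl):
    """Add a column if it is missing; return True when it was added.

    table, column, and decl are interpolated into SQL, so they must be
    trusted constants, never user input.
    """
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    conn.commit()
    return True
```

For example, `ensure_column(conn, "hard_negatives", "source_model",
"TEXT DEFAULT ''")` could run on every startup.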
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Each model tab now has a version combo showing scan history. When multiple
versions exist for a (file, model), users can switch between them to
compare results across training iterations. Added _current_table() and
_tab_table() helpers to unwrap the new container→table widget hierarchy.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add scan_timestamp column to scan_results. save_scan_results now inserts
with a timestamp and prunes versions beyond max_versions (default 5).
get_scan_results returns only the latest version by default, with optional
scan_timestamp parameter for loading specific versions. New get_scan_versions
method returns available versions for a (file, profile, model) tuple.
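
A rough sketch of insert-then-prune versioning. The column names other
than scan_timestamp and the exact signatures are assumptions, and
timestamps are assumed to sort lexicographically (as ISO-style strings do).

```python
import sqlite3


def get_scan_versions(conn, file, profile, model):
    """Return distinct scan timestamps for the tuple, newest first."""
    return [r[0] for r in conn.execute(
        "SELECT DISTINCT scan_timestamp FROM scan_results "
        "WHERE file = ? AND profile = ? AND model = ? "
        "ORDER BY scan_timestamp DESC",
        (file, profile, model),
    )]


def save_scan_results(conn, file, profile, model, regions, timestamp,
                      max_versions=5):
    """Insert one version, then drop versions beyond max_versions."""
    with conn:  # insert and prune in one transaction
        conn.executemany(
            "INSERT INTO scan_results "
            "(file, profile, model, start_s, end_s, score, scan_timestamp) "
            "VALUES (?, ?, ?, ?, ?, ?, ?)",
            [(file, profile, model, s, e, sc, timestamp) for s, e, sc in regions],
        )
        keep = get_scan_versions(conn, file, profile, model)[:max_versions]
        if keep:
            marks = ",".join("?" * len(keep))
            conn.execute(
                "DELETE FROM scan_results "
                "WHERE file = ? AND profile = ? AND model = ? "
                f"AND scan_timestamp NOT IN ({marks})",
                (file, profile, model, *keep),
            )
```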
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Ghost folders (scan-export-only) no longer appear in training dropdowns.
Also filters out 0-clip folders from get_training_stats.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move waveforms creation inside the else branch so AST and EAT
models (which have their own preprocessing) don't waste GPU memory.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Covers scan result versioning per model, hard negative management
dialog with training toggle, and ghost folder fix.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Calibration: cv=min(3,n_pos,n_neg_sample) could yield cv=1 (ValueError);
replaced with min_class >= 6 guard to skip calibration for tiny datasets
- AST: clarified chunks are already numpy arrays, use list(chunks) directly
- EAT: noted extract_features returns plain tensor (not tuple)
- Multi-layer: explicit notes on _w2v_model_name storing base name,
ml_cfg needed in _extract_w2v_targeted, embeddings_list vs embeddings
- Added AST to _ml_config layer_counts upfront in Task 2
- Added integration test for model switching (no-reload verification)
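
The calibration guard from the first item can be sketched as below. The
function name and return convention are hypothetical; the point is that
a fold count is only produced when the smaller class has at least 6
samples, so the old cv=1 case cannot occur.

```python
def calibration_folds(n_pos, n_neg, folds=3, min_per_class=6):
    """Return the CV fold count, or None to skip calibration.

    The previous min(3, n_pos, n_neg) could return 1, which is not a
    valid number of cross-validation folds.
    """
    min_class = min(n_pos, n_neg)
    if min_class < min_per_class:
        return None  # dataset too small: train without calibration
    return folds
```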
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Training cancel: connect finished signal to re-enable button (was stuck disabled)
- Waveform worker: disconnect stale signal and wait on file switch, clean up on close
- DatasetStatsDialog: numeric sort via DisplayRole, remove dead widget allocation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Details button in Train dialog opens a stats view showing:
- Class totals (positive/soft/negative) with colored balance bar
- Per-video table sortable by column
- Warnings for low clip counts, class imbalance, negative-only videos
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- WaveformWorker extracts low-res audio envelope via ffmpeg, drawn as
green polygon on timeline track
- _safe_disconnect() replaces bare TypeError catches for signal cleanup
- Train button toggles to Cancel during training, calls worker.cancel()
- Dynamic GPU batch sizing: 64 for ≥16GB VRAM, 32 for ≥8GB, 16 default
- Overlap warning before exporting clips that intersect existing markers
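
The batch-size tiers can be expressed as a small lookup. The function
name is hypothetical, and how the VRAM figure is obtained (for example
from the PyTorch device properties) is left out as an assumption.

```python
def pick_batch_size(vram_gb):
    """Map available VRAM in GB to a batch size: 64, 32, or 16."""
    if vram_gb is None:  # no GPU detected: use the default
        return 16
    if vram_gb >= 16:
        return 64
    if vram_gb >= 8:
        return 32
    return 16
```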
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fresh databases were missing the scan_export column, which broke the first export
- Threshold slider now filters existing scan results without rescanning
- N key toggles hard negative on selected scan regions
- All 59 tests passing
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- setup-windows.ps1 and setup_env.sh detect nvidia-smi for CUDA vs CPU PyTorch
- Startup logs Python version, venv path, PyTorch/CUDA/GPU, scikit-learn, librosa
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Prefetch next video's audio while GPU processes current embeddings
- Don't cancel Scan All when switching files in playlist
- Windows setup script now creates venv, installs PyTorch + requirements
- 8cut.bat auto-detects venv
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Show bright green playback position line in review mode
- Model history button next to scan model dropdown
- Skip backup on restore if identical timestamped copy already exists
- Auto-rescan when restoring a model version
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Scan regions can be disabled (Del/Backspace) instead of deleted, shown greyed out
- Resize scan regions by dragging timeline edges or editing table cells
- Grey ghost overlay shows trimmed portions of resized regions
- Ctrl+Z undo for disable, resize, drag, and negative toggle actions
- Fix training stats including scan-exported clips when checkbox unchecked
- Switch classifier to HistGradientBoostingClassifier (multi-threaded)
- Timestamped model saves with latest copy at base path
- Fix next-folder counter not detecting scan export folders
- Each scan area exports to its own numbered clip folder
- Platform-aware HW encoder detection (Linux/Windows/macOS)
- Auto-detect VAAPI render device instead of hardcoding
- Use shutil.move for cross-drive safety on Windows
- Comprehensive README rewrite with scan workflow documentation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fuse overlapping scan regions before display (merge adjacent 1s-hop windows)
- Hard negatives: mark false positives from scan panel for training feedback
- Toggle with "Add to Negatives" button, red text + red timeline regions
- Stored in dedicated hard_negatives table, always included in training
- Model versioning: auto-backup on retrain, right-click model combo to rollback
- Scan review mode: "Review" toggle hides spread/markers for free navigation
- Scan exports: saved to DB with scan_export flag, no timeline markers
- Training dialog checkbox to optionally include scan exports
- Single group folder per batch with area numbering (clip_042_a1_0.mp4)
- Export scan results: skip negatives, skip regions < 8s, respect spread
- Button shows estimated clip count, updates on spread/fuse/negative changes
- Timeline: reload scan regions on file load, "Clear all markers" context menu
- Default training model changed to HUBERT_XLARGE
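
Region fusing from the first item is a standard interval merge. A
minimal sketch, assuming regions are (start, end) pairs in seconds; the
function name and the optional gap parameter are hypothetical.

```python
def fuse_regions(regions, gap=0.0):
    """Merge regions that overlap or lie within `gap` seconds of each other."""
    merged = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1] + gap:
            # Overlaps or touches the previous region: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(r) for r in merged]
```

With 1s-hop windows, hits at (0, 8), (1, 9), and (2, 10) collapse into
one (0, 10) region, while a hit at (20, 28) stays separate.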
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>