Commit Graph

19 Commits

Author SHA1 Message Date
Ethanfel bc4ae21153 feat: color exported scan result rows green
Scan panel rows whose range contains an exported clip's start time
are colored green. Priority: disabled > negative > exported > default.
Exported state refreshes automatically after an auto-export batch
completes on the current file.
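
The priority rule above can be sketched as a small pure function. This is only an illustration: `row_state` and `contains_export` are hypothetical names, and treating the range as inclusive at both ends is an assumption, not something the commit states.

```python
def row_state(disabled: bool, negative: bool, exported: bool) -> str:
    """Return the winning state: disabled > negative > exported > default."""
    if disabled:
        return "disabled"
    if negative:
        return "negative"
    if exported:
        return "exported"  # the state the commit colors green
    return "default"


def contains_export(start: float, end: float, export_starts: list[float]) -> bool:
    """True when any exported clip's start time lies inside [start, end].

    Inclusive bounds are an assumption made for this sketch.
    """
    return any(start <= t <= end for t in export_starts)
```

A row covering 10.0 to 20.0 s with an exported clip starting at 12.0 s would resolve to "exported" unless it is also disabled or marked negative.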

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-21 12:50:12 +02:00
Ethanfel 4d99cf6015 feat: scan exports replace existing DB entries instead of accumulating
When starting a scan export batch, delete old scan_export entries for
the same file+profile before writing new ones. Logs a warning when
replacing. Prevents stale entry buildup from repeated scan exports.
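
A minimal sketch of the delete-then-insert pattern described above, using the standard sqlite3 module. The `exports` table and its column names are hypothetical; only the `scan_export` flag comes from the log.

```python
import sqlite3


def replace_scan_exports(conn, filename, profile, new_paths):
    """Delete old scan_export rows for (filename, profile), then insert new ones.

    Returns how many old rows were replaced so the caller can log a warning.
    Table and column names are assumptions made for this sketch.
    """
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "DELETE FROM exports "
            "WHERE filename = ? AND profile = ? AND scan_export = 1",
            (filename, profile),
        )
        replaced = cur.rowcount
        conn.executemany(
            "INSERT INTO exports (filename, profile, output_path, scan_export) "
            "VALUES (?, ?, ?, 1)",
            [(filename, profile, path) for path in new_paths],
        )
    return replaced
```

Running both statements in one transaction means a failed insert does not leave the file with no scan export entries at all.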

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-20 11:08:17 +02:00
Ethanfel b75fa85ff5 fix: vid counter reuse and non-deterministic lookup in get_vid_folder
Two bugs caused vid number collisions (multiple files sharing a vid_NNN):

1. "First gap" assignment (n=1; while vid_n in existing: n++) would
   reuse deleted vid numbers. Changed to max(existing) + 1 so numbers
   always increase.

2. LIMIT 1 without ORDER BY returned arbitrary rows when a file had
   entries in multiple vid folders. Added ORDER BY rowid DESC for
   deterministic latest-wins behavior.
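
The two fixes above can be illustrated with a short sketch. The helper names, the `exports` table, its columns, and the three-digit zero padding are assumptions for illustration; the real get_vid_folder may differ.

```python
import re
import sqlite3

VID_RE = re.compile(r"^vid_(\d+)$")


def first_gap(existing):
    """Old behaviour, for comparison: reuses the lowest free number."""
    n = 1
    while f"vid_{n:03d}" in existing:
        n += 1
    return n


def next_vid_number(existing):
    """New behaviour: max(existing) + 1, so deleted numbers are never reused."""
    numbers = [int(m.group(1)) for name in existing if (m := VID_RE.match(name))]
    return max(numbers, default=0) + 1


def latest_vid_folder(conn, filename):
    """Return the most recently inserted vid folder for a file, or None."""
    row = conn.execute(
        "SELECT vid_folder FROM exports WHERE filename = ? "
        "ORDER BY rowid DESC LIMIT 1",  # without ORDER BY the row is arbitrary
        (filename,),
    ).fetchone()
    return row[0] if row else None
```

With folders vid_001 and vid_003 present (vid_002 deleted), `first_gap` yields 2 while `next_vid_number` yields 4.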

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-20 11:00:57 +02:00
Ethanfel 6c1d42adfe feat: vid folder layout, changelog popup, shift-to-resize, DB migration
- Export layout changed from clip_NNN group dirs to vid_NNN per-video folders
- Automatic DB migration rewrites old paths and moves files on startup
- Per-video counter with DB cross-check to prevent overwrites
- Changelog popup on version bump with "don't show again" checkbox
- Scan region resize now requires Shift+drag to prevent accidental edits
- Recalculate vid folder and counter on file load
- Add EAT_LARGE embedding model variant
- Update tests for new flat export path structure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-19 17:01:37 +02:00
Ethanfel 5d45b8d8eb fix: timestamp collision, undo stack invalidation, label parsing, filter-aware clear
- Use microsecond-precision timestamps to prevent version merging on
  sub-second scans
- Clear undo stack when switching scan versions (stale row references)
- Parse timestamp labels robustly instead of hard-coded string slicing
- "Clear All" in hard negatives dialog respects active model filter
- Remove time.sleep from tests (no longer needed with microsecond timestamps)
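
A minimal sketch of microsecond-precision timestamps and robust label parsing with the standard library. The function names and the space-separated ISO format are assumptions; the log does not show the format actually stored.

```python
from datetime import datetime


def scan_timestamp(now=None):
    """Format a timestamp with microseconds so sub-second scans stay distinct."""
    if now is None:
        now = datetime.now()
    return now.isoformat(sep=" ", timespec="microseconds")


def parse_scan_timestamp(label):
    """Parse a stored label without relying on fixed string slices.

    fromisoformat accepts labels with or without a fractional part.
    """
    return datetime.fromisoformat(label)
```

Two scans inside the same second now produce different keys, which is why tests no longer need time.sleep to keep versions apart.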

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-19 15:36:31 +02:00
Ethanfel edc5784ba6 feat: hard negative source_model tracking, training toggle
Add source_model column to hard_negatives table with migration. New
get_hard_negatives() returns full rows, and new delete_hard_negatives_by_ids()
handles bulk deletion. get_training_data() gains use_hard_negatives param.
TrainDialog has "Use hard negatives" checkbox. Scan panel passes current
model name when marking negatives.
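
One common way to add a column with a migration in SQLite is to check PRAGMA table_info before issuing ALTER TABLE. The sketch below assumes that approach; `ensure_column` is a hypothetical helper, not the project's migration code.

```python
import sqlite3


def ensure_column(conn, table, column, decl):
    """Add `column` to `table` if missing. Returns True when it was added.

    `table`, `column` and `decl` are interpolated into SQL, so they must be
    trusted constants, never user input.
    """
    # table_info rows are (cid, name, type, notnull, dflt_value, pk)
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    with conn:
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    return True
```

Calling `ensure_column(conn, "hard_negatives", "source_model", "TEXT DEFAULT ''")` on every startup is idempotent, so fresh and existing databases end up with the same schema.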

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-19 15:27:11 +02:00
Ethanfel 4fb2ae144f feat: scan result history — keep N versions per (file, model)
Add scan_timestamp column to scan_results. save_scan_results now inserts
with a timestamp and prunes versions beyond max_versions (default 5).
get_scan_results returns only the latest version by default, with optional
scan_timestamp parameter for loading specific versions. New get_scan_versions
method returns available versions for a (file, profile, model) tuple.
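
The pruning step might look roughly like the following. Column names other than scan_timestamp are assumed, and the sketch relies on timestamps being stored as text that sorts chronologically.

```python
import sqlite3


def prune_versions(conn, filename, profile, model, max_versions=5):
    """Keep only the newest `max_versions` scan versions for one key."""
    key = (filename, profile, model)
    with conn:
        conn.execute(
            """
            DELETE FROM scan_results
            WHERE filename = ? AND profile = ? AND model = ?
              AND scan_timestamp NOT IN (
                  SELECT DISTINCT scan_timestamp FROM scan_results
                  WHERE filename = ? AND profile = ? AND model = ?
                  ORDER BY scan_timestamp DESC
                  LIMIT ?
              )
            """,
            key + key + (max_versions,),
        )
```

The subquery selects the timestamps to keep, so every row belonging to an older version is removed in a single statement.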

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-19 15:18:28 +02:00
Ethanfel 2614a765d5 fix: get_export_folders respects scan_export filter
Ghost folders (scan-export-only) no longer appear in training dropdowns.
Also filters out 0-clip folders from get_training_stats.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-19 15:16:49 +02:00
Ethanfel 2b7dfb330d fix: DB schema missing scan_export column, add threshold filter and N hotkey
- Fresh databases were missing scan_export column — broke first export
- Threshold slider now filters existing scan results without rescanning
- N key toggles hard negative on selected scan regions
- All 59 tests passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-19 12:45:14 +02:00
Ethanfel 6ddfcde8ee feat: disable/resize scan regions, undo, training fixes, cross-platform cleanup
- Scan regions can be disabled (Del/Backspace) instead of deleted, shown greyed out
- Resize scan regions by dragging timeline edges or editing table cells
- Grey ghost overlay shows trimmed portions of resized regions
- Ctrl+Z undo for disable, resize, drag, and negative toggle actions
- Fix training stats including scan-exported clips when checkbox unchecked
- Switch classifier to HistGradientBoostingClassifier (multi-threaded)
- Timestamped model saves with latest copy at base path
- Fix next-folder counter not detecting scan export folders
- Each scan area exports to its own numbered clip folder
- Platform-aware HW encoder detection (Linux/Windows/macOS)
- Auto-detect VAAPI render device instead of hardcoding
- Use shutil.move for cross-drive safety on Windows
- Comprehensive README rewrite with scan workflow documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-18 20:34:56 +02:00
Ethanfel b161412d94 feat: scan workflow — region fusion, hard negatives, review mode, versioned models
- Fuse overlapping scan regions before display (merge adjacent 1s-hop windows)
- Hard negatives: mark false positives from scan panel for training feedback
  - Toggle with "Add to Negatives" button, red text + red timeline regions
  - Stored in dedicated hard_negatives table, always included in training
- Model versioning: auto-backup on retrain, right-click model combo to rollback
- Scan review mode: "Review" toggle hides spread/markers for free navigation
- Scan exports: saved to DB with scan_export flag, no timeline markers
  - Training dialog checkbox to optionally include scan exports
  - Single group folder per batch with area numbering (clip_042_a1_0.mp4)
- Export scan results: skip negatives, skip regions < 8s, respect spread
  - Button shows estimated clip count, updates on spread/fuse/negative changes
- Timeline: reload scan regions on file load, "Clear all markers" context menu
- Default training model changed to HUBERT_XLARGE
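
Region fusion and the minimum-length export filter from the list above can be sketched as a standard interval merge. Representing regions as (start, end) tuples in seconds and the function names are assumptions; the 8 s threshold comes from the commit message.

```python
def fuse_regions(regions, gap=0.0):
    """Merge overlapping or touching (start, end) regions.

    `gap` optionally also merges regions separated by at most that many seconds.
    """
    fused = []
    for start, end in sorted(regions):
        if fused and start <= fused[-1][1] + gap:
            # Extend the previous region instead of starting a new one.
            fused[-1] = (fused[-1][0], max(fused[-1][1], end))
        else:
            fused.append((start, end))
    return fused


def exportable(regions, min_length=8.0):
    """Drop regions shorter than `min_length` seconds."""
    return [(s, e) for s, e in regions if e - s >= min_length]
```

Three 8 s windows starting at 0, 1 and 2 s fuse into a single region from 0 to 10 s, while a separate window at 20 to 28 s stays on its own.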

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-18 18:43:05 +02:00
Ethanfel 5a9e068903 fix: 6 bugs — profile isolation, export stashing, auto-negative guard
- Stash profile and crop_center at export start for async safety
- Scope get_group/delete_group by profile to prevent cross-profile leaks
- Guard auto-negative sampling when no markers exist (prevents flood)
- Wrap ffmpeg subprocess with clean timeout error message
- Fix scan-all panel reload to use stashed profile, not live value
- Remove dead warnings import

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-18 16:28:51 +02:00
Ethanfel 6870e5aaf3 feat: scan results panel, model switching, batch scan, and training improvements
- Replace librosa with direct ffmpeg subprocess for 10x faster audio loading
- Add ScanResultsPanel with per-model tabs, seek-on-click, delete, and export
- Persist scan results in DB per (filename, profile, model)
- Add model selector dropdown to switch between trained embedding models
- Add "Scan All" button for batch scanning playlist videos
- Support manual negative examples via negative class folder
- Configurable auto-negative margin (default 30s, 0 = disabled)
- Deduplicate nearby training markers (8s min gap)
- Parallel audio loading with ThreadPoolExecutor during training
- Progress callbacks from training for UI status updates
- Cache bypass in scan_video (skip audio loading when embeddings cached)
- Move all caches (models, embeddings, downloads) into project directory
- Add 8cut.sh launcher script with auto venv/conda detection
- Fix 11 bugs across thread safety, signal handling, and state management

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-18 16:12:52 +02:00
Ethanfel e1789d4e71 fix: bug audit — broken test imports, training data overlap, cleanup
- Fix test_utils.py importing build_annotation_json_path from main
  instead of core.annotations (all 59 tests pass now)
- Fix get_training_data double-counting clips at same start_time
  in both positive and soft sets — subtract positive from soft
- Add cancel_flag to train_classifier so training can be interrupted
  between videos (TrainWorker passes self as cancel_flag)
- Remove orphaned core/export.py (was for deleted server API)
- Remove stale Dockerfile and docker-compose.yml (referenced server)
- Clean up leftover server/__pycache__ and client/ build artifacts
- Add torch to requirements.txt (was only mentioned in comments)
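
The double-counting fix for get_training_data amounts to a set difference. A tiny sketch, assuming clips are keyed by start_time values that compare exactly; `split_training_sets` is a hypothetical name.

```python
def split_training_sets(positive_starts, soft_starts):
    """Return (positive, soft) with any shared start times kept only in positive."""
    positive = set(positive_starts)
    soft = set(soft_starts) - positive  # subtract positive from soft
    return positive, soft
```

A clip starting at 8.0 s that appears in both inputs is then counted once, as a positive.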

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-18 12:55:58 +02:00
Ethanfel 12ed183f1b feat: integrate training UI, BEATs model, and clean up legacy code
- Remove legacy distance-mode scanning (build_profile, _similarity, etc.)
  and hand-crafted intensity features — pipeline is now embedding-only
- Integrate Microsoft BEATs as embedding option alongside wav2vec2/HuBERT
- Add TrainDialog with positive class selector, model picker, video dir
  fallback, and live training stats
- Add TrainWorker QThread with cancel support and proper lifecycle cleanup
- Add source_path column to DB for robust source video tracking
- Add get_export_folders/get_training_data/get_training_stats to DB
- Wire source_path in all export DB writes (_on_clip_done, _on_auto_clip_done)
- Cancel scan/train workers in closeEvent to prevent use-after-free crashes
- Add setup_env.sh supporting both conda and python venv (CUDA 12.8)
- Update requirements.txt with all actual dependencies
- Update 8cut_train.py with --positive flag for new DB-driven training

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-18 11:52:27 +02:00
Ethanfel fd42791c9f feat: add get_all_export_paths to ProcessedDB
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-17 08:55:39 +02:00
Ethanfel 39f873bec2 fix: server bug fixes from review
- DB: add threading.Lock on all write methods and multi-step reads
- export.py: check audio extraction return code, raise on failure
- routes/export: counter race condition fix with _counter_lock
- routes/export: delete validation accepts EXPORT_DIR_suffix siblings
- routes/export: evict old finished jobs to prevent unbounded growth
- client plan: fix 10 bugs (mpv IPC, encodePath, input_path sep, etc.)
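
The locking fix in the first bullet can be illustrated with a small wrapper around a shared sqlite3 connection. `LockedDB`, the `processed` table, and the method names are hypothetical; this is not the ProcessedDB implementation.

```python
import sqlite3
import threading


class LockedDB:
    """Serialize access to one shared connection with a threading.Lock."""

    def __init__(self, path=":memory:"):
        # check_same_thread=False lets worker threads use the connection;
        # the lock below is what makes that safe.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()
        with self._lock, self._conn:
            self._conn.execute("CREATE TABLE IF NOT EXISTS processed (filename TEXT)")

    def add(self, filename):
        with self._lock, self._conn:  # lock first, then transaction
            self._conn.execute(
                "INSERT INTO processed (filename) VALUES (?)", (filename,)
            )

    def count(self):
        with self._lock:
            return self._conn.execute("SELECT COUNT(*) FROM processed").fetchone()[0]
```

Multi-step reads would take the same lock for their whole duration, so a concurrent write cannot land between the steps.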

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 19:53:38 +02:00
Ethanfel 5b7a55a05d fix: second-pass review bugs in server and core
- ExportRunner: stop batch on first error (was continuing, overwriting
  error status with done)
- Export route: validate input_path against MEDIA_DIRS
- Export route: validate encoder, portrait_ratio, folder_suffix, name
- Export route: fix format check for WebP sequence
- Export route: add _ separator in folder_suffix (match GUI)
- Export route: use realpath consistently in delete endpoint
- Export route: drop runner ref on completion (prevent memory leak)
- ProcessedDB: use cursor-level row_factory (thread-safe)
- WebSocket: catch all exceptions in connect, cleanup in finally
- Dockerfile: use uvicorn[standard] for websockets support

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 14:10:27 +02:00
Ethanfel 72f6a4e8f5 feat: create core/db module with ProcessedDB
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 13:38:20 +02:00