8-cut

Author	SHA1	Message	Date
Ethanfel	cb2060beb8	docs: add ComfyUI-8cut implementation plan 9 tasks covering node pack skeleton, all 5 nodes, frontend widget, API routes, and integration testing. Uses ExecutionBlocker pattern for the interactive VideoReview node. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 19:44:49 +02:00
Ethanfel	0db412baf4	docs: add ComfyUI-8cut node pack design Tensor-free video scanning workflow for remote browser access. 5 nodes (LoadVideo, AudioScan, VideoReview, TrainModel, ExportClips) with custom types passing file paths instead of image tensors. Reuses entire core/ package unchanged. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 19:41:17 +02:00
Ethanfel	876026d1f6	fix: block spurious tab signals during scan panel load to prevent slow file switching load_for_file and add_scan_results triggered N redundant timeline repaints via tab_changed → _on_scan_regions_edited for each tab add/remove. blockSignals(True) during programmatic tab operations eliminates the cascade. Also adds EAT_LARGE embedding model (1024-dim) and updates design docs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 19:06:26 +02:00
Ethanfel	6c1d42adfe	feat: vid folder layout, changelog popup, shift-to-resize, DB migration - Export layout changed from clip_NNN group dirs to vid_NNN per-video folders - Automatic DB migration rewrites old paths and moves files on startup - Per-video counter with DB cross-check to prevent overwrites - Changelog popup on version bump with "don't show again" checkbox - Scan region resize now requires Shift+drag to prevent accidental edits - Recalculate vid folder and counter on file load - Add EAT_LARGE embedding model variant - Update tests for new flat export path structure Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 17:01:37 +02:00
Ethanfel	d8b3972bdc	fix: ensure setup scripts use correct PyTorch index for transitive deps pip install -r requirements.txt can pull CPU-only torchvision via transitive dependencies (timm, ultralytics). Adding --extra-index-url with the CUDA wheel index ensures all torch packages stay on the correct build. Applied to both Linux and Windows setup scripts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 16:21:36 +02:00
Ethanfel	bd345abca2	fix: refresh timeline scan regions when switching model tabs tab_changed was only updating export count, not the timeline overlay. Now calls _on_scan_regions_edited which refreshes both. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 16:16:12 +02:00
Ethanfel	7d6fee9df1	fix: copy read-only numpy array before torch conversion in EAT preprocessing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 16:13:34 +02:00
Ethanfel	fd043f4172	fix: pin transformers<5.0 for EAT model compatibility EAT remote model code (worstchan/EAT-base_epoch30_finetune_AS2M) is incompatible with transformers 5.x — missing all_tied_weights_keys attribute added in the v5 PreTrainedModel API. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 16:11:18 +02:00
Ethanfel	3c3b1d74bb	fix: always reinstall torch stack on Windows re-runs Remove the "skip if torch exists" guard so re-running the setup script fixes a broken torchvision install. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 16:09:44 +02:00
Ethanfel	a3c657c66e	fix: install torchvision from CUDA index to prevent ABI mismatch timm and ultralytics depend on torchvision. When pip install -r requirements.txt resolves them, it pulls torchvision from PyPI (CPU build) which is incompatible with CUDA torch, causing "operator torchvision::nms does not exist" at import time. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 16:08:35 +02:00
Ethanfel	5d45b8d8eb	fix: timestamp collision, undo stack invalidation, label parsing, filter-aware clear - Use microsecond-precision timestamps to prevent version merging on sub-second scans - Clear undo stack when switching scan versions (stale row references) - Parse timestamp labels robustly instead of hard-coded string slicing - "Clear All" in hard negatives dialog respects active model filter - Remove time.sleep from tests (no longer needed with microsecond timestamps) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 15:36:31 +02:00
Ethanfel	e6db83f00b	feat: hard negatives management dialog with filter and bulk delete New HardNegativesDialog shows all hard negatives in a table with model filter dropdown, multi-select delete, and clear all. Accessible from TrainDialog via "Manage..." button next to the hard negatives checkbox. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 15:28:18 +02:00
Ethanfel	edc5784ba6	feat: hard negative source_model tracking, training toggle Add source_model column to hard_negatives table with migration. New get_hard_negatives() returns full rows, delete_hard_negatives_by_ids() for bulk deletion. get_training_data() gains use_hard_negatives param. TrainDialog has "Use hard negatives" checkbox. Scan panel passes current model name when marking negatives. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 15:27:11 +02:00
Ethanfel	8ed9fbf557	feat: scan version selector in results panel Each model tab now has a version combo showing scan history. When multiple versions exist for a (file, model), users can switch between them to compare results across training iterations. Added _current_table() and _tab_table() helpers to unwrap the new container→table widget hierarchy. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 15:22:46 +02:00
Ethanfel	4fb2ae144f	feat: scan result history — keep N versions per (file, model) Add scan_timestamp column to scan_results. save_scan_results now inserts with a timestamp and prunes versions beyond max_versions (default 5). get_scan_results returns only the latest version by default, with optional scan_timestamp parameter for loading specific versions. New get_scan_versions method returns available versions for a (file, profile, model) tuple. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 15:18:28 +02:00
Ethanfel	2614a765d5	fix: get_export_folders respects scan_export filter Ghost folders (scan-export-only) no longer appear in training dropdowns. Also filters out 0-clip folders from get_training_stats. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 15:16:49 +02:00
Ethanfel	c020c0dfec	fix: avoid unnecessary GPU tensor allocation for AST/EAT models Move waveforms creation inside the else branch so AST and EAT models (which have their own preprocessing) don't waste GPU memory. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 14:53:05 +02:00
Ethanfel	e7b791fbfa	docs: add scan history & hard negative management design + plan Covers scan result versioning per model, hard negative management dialog with training toggle, and ghost folder fix. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 14:51:17 +02:00
Ethanfel	f5361a963e	feat: calibrate classifier probabilities with isotonic regression Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 14:00:38 +02:00
Ethanfel	8fb8581816	feat: add EAT (Efficient Audio Transformer) embedding model Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 14:00:09 +02:00
Ethanfel	5b25e85e98	feat: add AST (Audio Spectrogram Transformer) embedding model Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 13:55:29 +02:00
Ethanfel	e3f133ef84	feat: multi-layer extraction for HuBERT/Wav2Vec2 models Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 13:53:55 +02:00
Ethanfel	4736f150b0	deps: add transformers and timm for AST/EAT models Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 13:52:19 +02:00
Ethanfel	52aa982aa2	docs: fix bugs in audio pipeline implementation plan - Calibration: cv=min(3,n_pos,n_neg_sample) could yield cv=1 (ValueError); replaced with min_class >= 6 guard to skip calibration for tiny datasets - AST: clarified chunks are already numpy arrays, use list(chunks) directly - EAT: noted extract_features returns plain tensor (not tuple) - Multi-layer: explicit notes on _w2v_model_name storing base name, ml_cfg needed in _extract_w2v_targeted, embeddings_list vs embeddings - Added AST to _ml_config layer_counts upfront in Task 2 - Added integration test for model switching (no-reload verification) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 13:41:42 +02:00
Ethanfel	07457d0d6f	docs: audio pipeline improvements implementation plan Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 13:30:38 +02:00
Ethanfel	c5d613fc5f	docs: audio pipeline improvements design — multi-layer, AST, EAT, calibration Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 13:28:32 +02:00
Ethanfel	7855ea62c2	fix: training cancel button re-enable, waveform worker cleanup, stats table sort - Training cancel: connect finished signal to re-enable button (was stuck disabled) - Waveform worker: disconnect stale signal and wait on file switch, clean up on close - DatasetStatsDialog: numeric sort via DisplayRole, remove dead widget allocation Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 13:03:14 +02:00
Ethanfel	70be5974cf	feat: dataset statistics dialog with per-video breakdown and class balance Details button in Train dialog opens a stats view showing: - Class totals (positive/soft/negative) with colored balance bar - Per-video table sortable by column - Warnings for low clip counts, class imbalance, negative-only videos Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 12:55:42 +02:00
Ethanfel	a0286d5cf9	feat: waveform overlay, signal safety, training cancel, dynamic batch size, duplicate detection - WaveformWorker extracts low-res audio envelope via ffmpeg, drawn as green polygon on timeline track - _safe_disconnect() replaces bare TypeError catches for signal cleanup - Train button toggles to Cancel during training, calls worker.cancel() - Dynamic GPU batch sizing: 64 for ≥16GB VRAM, 32 for ≥8GB, 16 default - Overlap warning before exporting clips that intersect existing markers Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 12:53:48 +02:00
Ethanfel	2b7dfb330d	fix: DB schema missing scan_export column, add threshold filter and N hotkey - Fresh databases were missing scan_export column — broke first export - Threshold slider now filters existing scan results without rescanning - N key toggles hard negative on selected scan regions - All 59 tests passing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-19 12:45:14 +02:00
Ethanfel	518554f788	fix: keep PowerShell window open after setup completes or errors Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 22:19:49 +02:00
Ethanfel	282156e8ed	feat: auto-detect GPU in setup scripts, log environment at startup - setup-windows.ps1 and setup_env.sh detect nvidia-smi for CUDA vs CPU PyTorch - Startup logs Python version, venv path, PyTorch/CUDA/GPU, scikit-learn, librosa Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 22:12:45 +02:00
Ethanfel	3417a0f603	fix: crash when switching folder in train dialog (signal recursion) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 22:00:23 +02:00
Ethanfel	cd0552197f	feat: prefetch audio during Scan All, fix file-switch interruption, fix Windows setup - Prefetch next video's audio while GPU processes current embeddings - Don't cancel Scan All when switching files in playlist - Windows setup script now creates venv, installs PyTorch + requirements - 8cut.bat auto-detects venv Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 21:50:33 +02:00
Ethanfel	7dffcb08eb	feat: interruptible Scan All — stop after current video, resume later Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 21:37:47 +02:00
Ethanfel	93bcb23fa7	docs: document embedding cache and fast rescan loop Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 21:26:14 +02:00
Ethanfel	eda7826a40	fix: safe PATH fallback for Windows DLL loading, deduplicate model restore Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 21:14:44 +02:00
Ethanfel	e7e20b0fe6	fix: review mode playback line, model restore dedup, auto-rescan on rollback - Show bright green playback position line in review mode - Model history button next to scan model dropdown - Skip backup on restore if identical timestamped copy already exists - Auto-rescan when restoring a model version Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 21:05:40 +02:00
Ethanfel	814ef946eb	fix: add missing shortcuts to help dialog (disable, undo, drag resize) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 20:47:52 +02:00
Ethanfel	2e738df9ae	docs: rewrite install guide with venv steps and dataset sizing guidance Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 20:40:23 +02:00
Ethanfel	6ddfcde8ee	feat: disable/resize scan regions, undo, training fixes, cross-platform cleanup - Scan regions can be disabled (Del/Backspace) instead of deleted, shown greyed out - Resize scan regions by dragging timeline edges or editing table cells - Grey ghost overlay shows trimmed portions of resized regions - Ctrl+Z undo for disable, resize, drag, and negative toggle actions - Fix training stats including scan-exported clips when checkbox unchecked - Switch classifier to HistGradientBoostingClassifier (multi-threaded) - Timestamped model saves with latest copy at base path - Fix next-folder counter not detecting scan export folders - Each scan area exports to its own numbered clip folder - Platform-aware HW encoder detection (Linux/Windows/macOS) - Auto-detect VAAPI render device instead of hardcoding - Use shutil.move for cross-drive safety on Windows - Comprehensive README rewrite with scan workflow documentation Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 20:34:56 +02:00
Ethanfel	b161412d94	feat: scan workflow — region fusion, hard negatives, review mode, versioned models - Fuse overlapping scan regions before display (merge adjacent 1s-hop windows) - Hard negatives: mark false positives from scan panel for training feedback - Toggle with "Add to Negatives" button, red text + red timeline regions - Stored in dedicated hard_negatives table, always included in training - Model versioning: auto-backup on retrain, right-click model combo to rollback - Scan review mode: "Review" toggle hides spread/markers for free navigation - Scan exports: saved to DB with scan_export flag, no timeline markers - Training dialog checkbox to optionally include scan exports - Single group folder per batch with area numbering (clip_042_a1_0.mp4) - Export scan results: skip negatives, skip regions < 8s, respect spread - Button shows estimated clip count, updates on spread/fuse/negative changes - Timeline: reload scan regions on file load, "Clear all markers" context menu - Default training model changed to HUBERT_XLARGE Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 18:43:05 +02:00
Ethanfel	5a9e068903	fix: 6 bugs — profile isolation, export stashing, auto-negative guard - Stash profile and crop_center at export start for async safety - Scope get_group/delete_group by profile to prevent cross-profile leaks - Guard auto-negative sampling when no markers exist (prevents flood) - Wrap ffmpeg subprocess with clean timeout error message - Fix scan-all panel reload to use stashed profile, not live value - Remove dead warnings import Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 16:28:51 +02:00
Ethanfel	6870e5aaf3	feat: scan results panel, model switching, batch scan, and training improvements - Replace librosa with direct ffmpeg subprocess for 10x faster audio loading - Add ScanResultsPanel with per-model tabs, seek-on-click, delete, and export - Persist scan results in DB per (filename, profile, model) - Add model selector dropdown to switch between trained embedding models - Add "Scan All" button for batch scanning playlist videos - Support manual negative examples via negative class folder - Configurable auto-negative margin (default 30s, 0 = disabled) - Deduplicate nearby training markers (8s min gap) - Parallel audio loading with ThreadPoolExecutor during training - Progress callbacks from training for UI status updates - Cache bypass in scan_video (skip audio loading when embeddings cached) - Move all caches (models, embeddings, downloads) into project directory - Add 8cut.sh launcher script with auto venv/conda detection - Fix 11 bugs across thread safety, signal handling, and state management Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 16:12:52 +02:00
Ethanfel	f597ff29e8	chore: move model storage into project models/ directory Models now live in <project>/models/ instead of ~/.8cut_models/ so everything stays self-contained. Updated .gitignore to exclude models/, .venv/, .joblib, and .pt. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 13:05:20 +02:00
Ethanfel	e1789d4e71	fix: bug audit — broken test imports, training data overlap, cleanup - Fix test_utils.py importing build_annotation_json_path from main instead of core.annotations (all 59 tests pass now) - Fix get_training_data double-counting clips at same start_time in both positive and soft sets — subtract positive from soft - Add cancel_flag to train_classifier so training can be interrupted between videos (TrainWorker passes self as cancel_flag) - Remove orphaned core/export.py (was for deleted server API) - Remove stale Dockerfile and docker-compose.yml (referenced server) - Clean up leftover server/__pycache__ and client/ build artifacts - Add torch to requirements.txt (was only mentioned in comments) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 12:55:58 +02:00
Ethanfel	7834b1d05c	chore: remove server and client — unused in desktop app Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 12:49:20 +02:00
Ethanfel	12ed183f1b	feat: integrate training UI, BEATs model, and clean up legacy code - Remove legacy distance-mode scanning (build_profile, _similarity, etc.) and hand-crafted intensity features — pipeline is now embedding-only - Integrate Microsoft BEATs as embedding option alongside wav2vec2/HuBERT - Add TrainDialog with positive class selector, model picker, video dir fallback, and live training stats - Add TrainWorker QThread with cancel support and proper lifecycle cleanup - Add source_path column to DB for robust source video tracking - Add get_export_folders/get_training_data/get_training_stats to DB - Wire source_path in all export DB writes (_on_clip_done, _on_auto_clip_done) - Cancel scan/train workers in closeEvent to prevent use-after-free crashes - Add setup_env.sh supporting both conda and python venv (CUDA 12.8) - Update requirements.txt with all actual dependencies - Update 8cut_train.py with --positive flag for new DB-driven training Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-18 11:52:27 +02:00
Ethanfel	f2c38aee79	feat: rewrite audio scan with MFCC+delta+spectral contrast pipeline Root cause of poor discrimination: MFCC[0] (energy) dominated the feature vector, making cosine similarity see all audio as similar. Changes: - Skip MFCC[0], use 12 coefficients instead of 20 - Add delta MFCCs for temporal dynamics - Add 7-band spectral contrast for tonal vs noise quality - Switch from cosine similarity to euclidean-distance-based score - Pre-compute STFT once for whole file (10-20x faster) - Vectorized sliding window via cumulative sums (no Python loop) - Lower sample rate 22050→16000 Hz (faster, no quality loss) - 62-dim feature vector (was 40-dim mean+std of raw MFCCs) - Default threshold 0.05 (new similarity scale) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-17 15:28:44 +02:00
Ethanfel	8ab5bdba77	fix: use mean+std MFCC vectors (40-dim) for better discrimination Mean-only vectors were too similar across different audio segments, causing everything to match even at threshold 0.99. Adding std captures temporal dynamics and makes the similarity scores much more spread out. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-17 09:27:11 +02:00

401 Commits
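
Commit a0286d5cf9 above gives the dynamic GPU batch sizing rule as "64 for ≥16GB VRAM, 32 for ≥8GB, 16 default". A minimal sketch of that threshold rule follows. The function name `pick_batch_size` and its `vram_gb` parameter are hypothetical and do not come from the repository. The log does not say how the application queries VRAM, so the caller is assumed to supply the value in GB.

```python
def pick_batch_size(vram_gb: float) -> int:
    """Map available GPU memory (in GB) to a batch size.

    Thresholds are taken from the commit message; the function itself
    is an illustrative assumption, not the project's actual code.
    """
    if vram_gb >= 16:
        return 64  # ≥16 GB VRAM
    if vram_gb >= 8:
        return 32  # ≥8 GB VRAM
    return 16      # default for smaller GPUs


if __name__ == "__main__":
    for gb in (24, 16, 12, 8, 4):
        print(f"{gb} GB -> batch size {pick_batch_size(gb)}")
```

Checking the larger threshold first keeps the branches in order, so each value falls into exactly one tier.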