Models now live in <project>/models/ instead of ~/.8cut_models/ so
everything stays self-contained. Updated .gitignore to exclude
models/, .venv/, *.joblib, and *.pt.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Fix test_utils.py, which imported build_annotation_json_path from main
rather than core.annotations (all 59 tests now pass)
- Fix get_training_data double-counting clips whose start_time appears
in both the positive and soft sets by subtracting positive from soft
- Add cancel_flag to train_classifier so training can be interrupted
between videos (TrainWorker passes self as cancel_flag)
- Remove orphaned core/export.py (was for deleted server API)
- Remove stale Dockerfile and docker-compose.yml (referenced server)
- Clean up leftover server/__pycache__ and client/ build artifacts
- Add torch to requirements.txt (was only mentioned in comments)
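
For reference, the double-counting fix amounts to a set difference keyed on
start_time. A minimal sketch, assuming start times are compared exactly;
split_training_clips is a hypothetical helper, not the actual
get_training_data code:

```python
def split_training_clips(positive_starts, soft_starts):
    """Return (positive, soft) start-time sets with no overlap.

    Hypothetical helper: the real get_training_data reads from the DB.
    """
    positive = set(positive_starts)
    # Subtract positives so a clip at the same start_time is counted once.
    soft = set(soft_starts) - positive
    return positive, soft


positive, soft = split_training_clips([0.0, 8.0, 16.0], [8.0, 24.0])
print(sorted(positive), sorted(soft))  # [0.0, 8.0, 16.0] [24.0]
```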

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Remove legacy distance-mode scanning (build_profile, _similarity, etc.)
and hand-crafted intensity features; the pipeline is now embedding-only
- Integrate Microsoft BEATs as embedding option alongside wav2vec2/HuBERT
- Add TrainDialog with positive class selector, model picker, video dir
fallback, and live training stats
- Add TrainWorker QThread with cancel support and proper lifecycle cleanup
- Add source_path column to DB for robust source video tracking
- Add get_export_folders/get_training_data/get_training_stats to DB
- Wire source_path in all export DB writes (_on_clip_done, _on_auto_clip_done)
- Cancel scan/train workers in closeEvent to prevent use-after-free crashes
- Add setup_env.sh supporting both conda and python venv (CUDA 12.8)
- Update requirements.txt with all actual dependencies
- Update 8cut_train.py with --positive flag for new DB-driven training
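
The cancel support can be pictured as a flag object polled between videos.
This is only a sketch: the `cancelled` attribute, the CancelFlag class, and
train_on_videos are assumptions for illustration, and the real TrainWorker
is a QThread whose interface may differ:

```python
class CancelFlag:
    """Hypothetical stand-in for the worker object passed as cancel_flag."""

    def __init__(self):
        self.cancelled = False  # assumed attribute name


def train_on_videos(videos, cancel_flag=None):
    """Process videos one at a time, checking for cancellation in between."""
    processed = []
    for video in videos:
        # Check before starting each video, never in the middle of one.
        if cancel_flag is not None and cancel_flag.cancelled:
            break
        processed.append(video)  # placeholder for per-video feature work
    return processed


flag = CancelFlag()
all_done = train_on_videos(["a.mp4", "b.mp4"], flag)
flag.cancelled = True
none_done = train_on_videos(["a.mp4", "b.mp4"], flag)
print(all_done, none_done)  # ['a.mp4', 'b.mp4'] []
```

Polling only between videos keeps the loop simple, at the cost of waiting
for the current video to finish before the cancel takes effect.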

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Root cause of poor discrimination: MFCC[0] (energy) dominated the
feature vector, making cosine similarity see all audio as similar.
Changes:
- Skip MFCC[0], use 12 coefficients instead of 20
- Add delta MFCCs for temporal dynamics
- Add 7-band spectral contrast for tonal vs noise quality
- Switch from cosine similarity to euclidean-distance-based score
- Pre-compute STFT once for whole file (10-20x faster)
- Vectorized sliding window via cumulative sums (no Python loop)
- Lower sample rate 22050→16000 Hz (faster, no quality loss)
- 62-dim feature vector (was 40-dim mean+std of raw MFCCs)
- Default threshold 0.05 (new similarity scale)
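
A rough illustration of the cumulative-sum window statistics and a
distance-based score. Assumptions: 31 per-frame features (12 MFCCs + 12
deltas + 7 contrast bands), which would give the 62-dim mean+std vector,
and a 1 / (1 + distance) mapping chosen only for illustration; the actual
score formula is not stated here. Random data stands in for real features:

```python
import numpy as np


def sliding_mean_std(frames, win):
    """Mean and std over every window of `win` frames, without a Python loop.

    frames: array of shape (n_frames, n_feats)
    returns: array of shape (n_frames - win + 1, 2 * n_feats)
    """
    zeros = np.zeros((1, frames.shape[1]))
    # Prefix sums of x and x^2, with a leading zero row so that
    # c[i + win] - c[i] is the sum over frames i .. i + win - 1.
    c1 = np.vstack([zeros, np.cumsum(frames, axis=0)])
    c2 = np.vstack([zeros, np.cumsum(frames ** 2, axis=0)])
    mean = (c1[win:] - c1[:-win]) / win
    # Population variance; clamp tiny negatives caused by rounding.
    var = np.maximum((c2[win:] - c2[:-win]) / win - mean ** 2, 0.0)
    return np.hstack([mean, np.sqrt(var)])


def distance_score(a, b):
    """Illustrative similarity in (0, 1]: 1.0 means identical vectors."""
    return 1.0 / (1.0 + np.linalg.norm(a - b))


rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 31))  # stand-in for per-frame features
feats = sliding_mean_std(frames, win=50)
print(feats.shape)  # (151, 62)
```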

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Mean-only vectors were too similar across different audio segments,
causing everything to match even at threshold 0.99. Adding std
captures temporal dynamics and makes the similarity scores much
more spread out.
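
A toy example of the effect, assuming cosine similarity over per-segment
summary vectors. The two 2-feature segments are made up: they share a mean
but differ in variability, so mean-only vectors look identical while
mean+std vectors do not:

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Two made-up segments of 4 frames x 2 features with identical means.
steady = np.tile([1.0, 2.0], (4, 1))
varying = np.array([[0.0, 2.0], [2.0, 2.0], [0.0, 2.0], [2.0, 2.0]])

mean_only = cosine(steady.mean(axis=0), varying.mean(axis=0))
with_std = cosine(
    np.hstack([steady.mean(axis=0), steady.std(axis=0)]),
    np.hstack([varying.mean(axis=0), varying.std(axis=0)]),
)
print(round(mean_only, 3), round(with_std, 3))  # 1.0 0.913
```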
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>