8-cut

Ethanfel f2c38aee79 feat: rewrite audio scan with MFCC+delta+spectral contrast pipeline

Root cause of poor discrimination: MFCC[0] (energy) dominated the
feature vector, so cosine similarity scored nearly all audio as similar.
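
The effect described above can be reproduced with a small sketch. The
vectors below are made-up, MFCC-like numbers (not taken from the project),
chosen so that index 0 is a large energy term and the remaining
coefficients point in opposite directions:

```python
import math


def cosine_similarity(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical values for illustration only: index 0 plays the role of
# MFCC[0] (energy); the other entries stand in for higher coefficients.
clip_a = [-300.0, 20.0, -15.0, 10.0, -5.0]
clip_b = [-310.0, -18.0, 12.0, -9.0, 6.0]

# With the energy term included, the two clips look almost identical.
sim_with_energy = cosine_similarity(clip_a, clip_b)

# With the energy term dropped, the same clips look nearly opposite.
sim_without_energy = cosine_similarity(clip_a[1:], clip_b[1:])

print(f"with MFCC[0]:    {sim_with_energy:.3f}")
print(f"without MFCC[0]: {sim_without_energy:.3f}")
```

Because the energy coefficient is an order of magnitude larger than the
rest, it fixes the direction of both vectors, and the coefficients that
actually differ barely move the score.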

Changes:
- Skip MFCC[0], use 12 coefficients instead of 20
- Add delta MFCCs for temporal dynamics
- Add 7-band spectral contrast for tonal vs noise quality
- Switch from cosine similarity to Euclidean-distance-based score
- Pre-compute STFT once for whole file (10-20x faster)
- Vectorized sliding window via cumulative sums (no Python loop)
- Lower sample rate 22050→16000 Hz (faster, no quality loss)
- 62-dim feature vector (was 40-dim mean+std of raw MFCCs)
- Default threshold 0.05 (new similarity scale)
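
A minimal sketch of two of the items above, the cumulative-sum sliding
window and a distance-based score, using NumPy only. The function names,
the 31-features-per-frame layout (12 MFCCs + 12 deltas + 7 contrast
bands, which would give 62 dims after mean+std) and the 1 / (1 + d)
score mapping are assumptions for illustration; the commit does not spell
out the exact formula or layout.

```python
import numpy as np


def sliding_mean_std(features, win):
    """Mean and std over every window of `win` consecutive frames.

    features: array of shape (n_frames, n_dims).
    Returns two arrays of shape (n_frames - win + 1, n_dims).
    Uses cumulative sums, so there is no Python loop over windows.
    """
    features = np.asarray(features, dtype=np.float64)
    zeros = np.zeros((1, features.shape[1]))
    # Prepend a zero row so that csum[k] is the sum of the first k frames.
    csum = np.vstack([zeros, np.cumsum(features, axis=0)])
    csum_sq = np.vstack([zeros, np.cumsum(features ** 2, axis=0)])
    win_sum = csum[win:] - csum[:-win]
    win_sq = csum_sq[win:] - csum_sq[:-win]
    mean = win_sum / win
    # Clamp tiny negative values caused by floating-point cancellation.
    var = np.maximum(win_sq / win - mean ** 2, 0.0)
    return mean, np.sqrt(var)


def similarity_scores(window_vectors, reference):
    """Map Euclidean distance to a score in (0, 1]; 1.0 means identical.

    The 1 / (1 + d) mapping is an assumed example, not the project's formula.
    """
    dist = np.linalg.norm(window_vectors - reference, axis=1)
    return 1.0 / (1.0 + dist)


# Stand-in for per-frame features (random numbers, not real audio).
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 31))

mean, std = sliding_mean_std(frames, win=40)
vectors = np.hstack([mean, std])          # one 62-dim vector per window
scores = similarity_scores(vectors, vectors[0])
print(vectors.shape, scores[:3])
```

Each window costs one subtraction per cumulative array, which is what
removes the per-window loop; the trade-off is some loss of floating-point
precision in the variance, hence the clamp before the square root.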

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-04-17 15:28:44 +02:00

__init__.py

feat: project skeleton

2026-04-06 11:11:25 +02:00

test_audio_scan.py

feat: rewrite audio scan with MFCC+delta+spectral contrast pipeline

2026-04-17 15:28:44 +02:00

test_utils.py

feat: add apply_keyframes_to_jobs helper

2026-04-14 15:57:27 +02:00