8-cut

Files

T

Ethanfel f2c38aee79 feat: rewrite audio scan with MFCC+delta+spectral contrast pipeline

Root cause of poor discrimination: MFCC[0] (energy) dominated the
feature vector, making cosine similarity see all audio as similar.

Changes:
- Skip MFCC[0], use 12 coefficients instead of 20
- Add delta MFCCs for temporal dynamics
- Add 7-band spectral contrast for tonal vs noise quality
- Switch from cosine similarity to euclidean-distance-based score
- Pre-compute STFT once for whole file (10-20x faster)
- Vectorized sliding window via cumulative sums (no Python loop)
- Lower sample rate 22050→16000 Hz (faster, no quality loss)
- 62-dim feature vector (was 40-dim mean+std of raw MFCCs)
- Default threshold 0.05 (new similarity scale)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-04-17 15:28:44 +02:00

__init__.py

feat: create core/paths module with shared path helpers

2026-04-16 13:34:17 +02:00

annotations.py

feat: create core/annotations module

2026-04-16 13:38:47 +02:00

audio_scan.py

feat: rewrite audio scan with MFCC+delta+spectral contrast pipeline

2026-04-17 15:28:44 +02:00

db.py

feat: add get_all_export_paths to ProcessedDB