Commit Graph

6 Commits

Author SHA1 Message Date
Ethanfel d70a4d2123 docs: add audio dataset pipeline implementation plan 2026-04-09 14:02:46 +02:00
Ethanfel 82fb7a0009 docs: note AudioX shows no perceptual quality gain on V2A vs SelVA
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-07 09:12:00 +02:00
Ethanfel af4777d2d7 docs: add AudioX vs SelVA evaluation
Architecture comparison, capability matrix, integration cost estimate,
LoRA training difficulty analysis, and license implications.
Verdict: SelVA remains preferred for V2A + LoRA fine-tuning; AudioX
adds value for music generation, inpainting, and text-to-audio tasks.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-07 09:11:09 +02:00
Ethanfel 21ed93d3ee docs: add audio dataset pipeline reference doc
Full research notes on cleaning, augmentation, and quality metrics for
generative model training. Covers LUFS normalization, AudioSep, waveform
augmentation (pitch shift, RIR, EQ), latent mixup, DNSMOS gating, tool
install commands, and key paper references.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 13:37:48 +02:00
Ethanfel 83b1da9520 chore: remove all PrismAudio code from main branch
- Delete prismaudio_core/, data_utils/, scripts/, docs/plans/
- Delete PrismAudio nodes (feature_extractor, feature_loader, model_loader, sampler, text_only)
- Delete PrismAudio workflows (video_to_audio, text_to_audio)
- Clean nodes/utils.py: rename PRISMAUDIO_CATEGORY → SELVA_CATEGORY, remove unused helpers
- Strip PrismAudio-only deps from requirements.txt

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 17:58:31 +02:00
Ethanfel c9364c4ec2 docs: initial design and implementation plan 2026-03-27 16:57:15 +01:00