LoRA quality improvements addressing the intruder-dimension problem:
1. PiSSA initialization (arXiv:2404.02948): initialize A and B from the
top-r SVD of the pretrained weight. Training starts on-manifold, which
eliminates intruder dimensions at init. The base weight stores the
residual W_res = W - B@A*scale.
2. rsLoRA scaling (arXiv:2312.03732): scale the update by
alpha/sqrt(rank) instead of alpha/rank. Prevents gradient collapse at
high ranks (128+).
3. Post-training Spectral Surgery (arXiv:2603.03995): SVD of trained
LoRA update, gradient-sensitivity reweighting to suppress remaining
intruder dimensions. Runs automatically after training completes.
4. alpha default changed to 2*rank (was 1*rank). Produces fewer intruder
dimensions per arXiv:2410.21228.
5. weight_decay reduced from 1e-2 to 0.0 (standard for LoRA, prevents
erasing learned style weights).
6. random.choices replaced with random.sample when batch_size <= dataset
size (eliminates duplicate samples per batch).
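
A minimal sketch of the batch-sampling change in item 6. The function name
sample_batch and the fallback to random.choices for oversized batches are
assumptions for illustration, not the project's actual code.

```python
import random


def sample_batch(dataset, batch_size, rng=random):
    """Pick a batch, avoiding duplicates whenever the dataset is large enough."""
    if batch_size <= len(dataset):
        # Sampling without replacement: each item appears at most once.
        return rng.sample(dataset, batch_size)
    # Too few items for a duplicate-free batch (assumed fallback):
    # sample with replacement instead.
    return rng.choices(dataset, k=batch_size)
```

random.sample raises ValueError when asked for more items than the
population holds, which is why the size check guards it.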
PiSSA checkpoints include base weights (residual). Loader/evaluator
updated to handle both standard and PiSSA checkpoint formats.
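
The PiSSA init and the residual stored in the checkpoint can be sketched
as follows, using NumPy in place of the training framework. The function
name pissa_init, the even split of singular values between A and B, and
the use of the rsLoRA scale here are assumptions; the commit only states
W_res = W - B@A*scale.

```python
import numpy as np


def pissa_init(W, rank, alpha):
    """Return (A, B, W_res, scale) so that W_res + B @ A * scale == W at init."""
    scale = alpha / np.sqrt(rank)  # rsLoRA scaling (assumed to apply here)
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    # Split each top singular value evenly between B and A, pre-dividing by
    # scale so the scaled product reproduces the top-r part of W.
    sqrt_s = np.sqrt(S[:rank] / scale)
    B = U[:, :rank] * sqrt_s          # shape (out_features, rank)
    A = sqrt_s[:, None] * Vt[:rank]   # shape (rank, in_features)
    W_res = W - (B @ A) * scale       # residual kept as the frozen base weight
    return A, B, W_res, scale
```

Because the frozen weight is W_res rather than W, a checkpoint written
this way has to carry the residual alongside A and B.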
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- _eval_sample gains clip_idx param (default 0, backward compatible)
- Evaluator loops over all dataset clips per adapter, saves one WAV per clip
- Reference metrics computed for all clips and averaged
- Comparison chart and summary use avg_metrics across all clips
- Eliminates bias from evaluating on an unrepresentative single clip
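
A rough sketch of the per-clip averaging described above. The helper name
average_metrics and the metric keys in the usage note are hypothetical;
it assumes every clip reports the same set of numeric metrics.

```python
from statistics import fmean


def average_metrics(per_clip_metrics):
    """Average each metric over all clips; input is a list of metric dicts."""
    if not per_clip_metrics:
        return {}
    # Assumes all clips share the keys of the first clip.
    keys = per_clip_metrics[0].keys()
    return {key: fmean(m[key] for m in per_clip_metrics) for key in keys}
```

For example, average_metrics([{"centroid": 1.0}, {"centroid": 3.0}])
yields {"centroid": 2.0}.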
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
shutil.copy2 was writing FLAC data to a file named reference.wav, which
made the file unplayable. The reference is now copied as reference<ext>,
keeping the source extension (.flac, .wav, etc.).
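
One way to keep the source extension when copying the reference clip is
sketched below. The function name copy_reference and its arguments are
assumptions for illustration only.

```python
import shutil
from pathlib import Path


def copy_reference(src, output_dir):
    """Copy src into output_dir as reference<ext>, preserving the extension."""
    src = Path(src)
    dest = Path(output_dir) / f"reference{src.suffix}"
    dest.parent.mkdir(parents=True, exist_ok=True)
    # copy2 copies bytes and metadata unchanged; it never transcodes audio,
    # so the destination name must match the source format.
    shutil.copy2(src, dest)
    return dest
```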
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Loads the first clip's original audio (same clip used for inference),
copies it to output_dir/reference.wav, runs spectral metrics and
saves a spectrogram. Appears first in the comparison chart so generated
samples can be judged against the target sound.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Generates audio samples from a list of adapters against a fixed reference
clip, collects spectral metrics for each, and outputs a comparison bar
chart + eval_summary.json. Useful for comparing sweep candidates before
committing to a next round of training.
JSON format: name, data_dir, output_dir, steps, seed, adapters[{id, path}].
Empty path = baseline (no LoRA).
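
An illustrative config and loader for the JSON format above. All values
in EXAMPLE_CONFIG, the loader name load_eval_config, the is_baseline
field, and the choice to treat every key as required are assumptions.

```python
import json

REQUIRED_KEYS = ("name", "data_dir", "output_dir", "steps", "seed", "adapters")

# Hypothetical values; only the key names come from the format description.
EXAMPLE_CONFIG = """
{
  "name": "sweep-compare",
  "data_dir": "data/clips",
  "output_dir": "eval_out",
  "steps": 50,
  "seed": 42,
  "adapters": [
    {"id": "baseline", "path": ""},
    {"id": "rank64", "path": "checkpoints/rank64"}
  ]
}
"""


def load_eval_config(text):
    """Parse the eval config and flag which adapter entries are the baseline."""
    cfg = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in cfg]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    for adapter in cfg["adapters"]:
        # An empty path marks the baseline run (no LoRA applied).
        adapter["is_baseline"] = not adapter.get("path")
    return cfg
```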
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>