Commit Graph

6 Commits

Author SHA1 Message Date
Ethanfel 09b3b94ddd feat: add batch_size parameter to training (default 4)
Replaces single-sample steps with batched sampling via random.choices().
Tensors are stacked to [B, T, C] before the forward pass; t is now [B].
Default grad_accum lowered to 1 since real batching gives stable gradients.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 23:36:12 +02:00
Ethanfel 4806daa4ca chore: lower default warmup_steps from 500 to 100
500 warmup steps is 25% of a 2000-step run — too long. 100 steps lets
the full lr kick in much earlier without sacrificing stability.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 22:51:27 +02:00
Ethanfel 2f4641247a feat: add resume support to train_lora.py
Step checkpoints now save optimizer state, scheduler state, and step
number alongside the LoRA weights. Pass --resume path/to/adapter_stepXXXXX.pt
to continue training from that checkpoint. --steps always means total steps,
so resuming from 1000 with --steps 2000 trains 1000 more steps.

adapter_final.pt format is unchanged (state_dict + meta only) so
SelvaLoraLoader remains compatible.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 16:59:30 +02:00
Ethanfel 8e9114b92c docs: add clip length and scalable dataset size recommendations
- Clip length section: fixed 8s duration, padding/trim behavior, per-sound-type
  strategies (continuous, short events, repeating, onset placement).
- Dataset size table: 5-10 / 15-30 / 30-60 / 60-150 / 150-300 / 300+ clips
  with scenario and expected result for each tier.
- Note on diversity vs quantity.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 16:34:50 +02:00
Ethanfel 63b4391573 fix: named .npz files always start at _001
dog_bark_001.npz, dog_bark_002.npz instead of dog_bark.npz, dog_bark_001.npz.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 15:44:26 +02:00
Ethanfel 89af5a468c docs: add LoRA training guide
Covers dataset preparation (ComfyUI feature extraction + clean audio),
training CLI reference, tuning guide (rank/steps/lr), adapter loading
in ComfyUI, and troubleshooting.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 15:43:09 +02:00