1be07a80d2
- Add lr_schedule param (constant|cosine) to SelvaLoraTrainer
- Cosine decays LR from initial value to ~0 after warmup, preventing the oscillation observed at steps 6000-8000 with lr=2e-4 flat
- Wire lr_schedule through scheduler _PARAM_DEFAULTS and _train_inner call
- Add g5_r128_lr_2e4_cosine and g5_r128_lr_3e4_cosine to r128_sweet_spot sweep

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
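
For reference, a minimal sketch of what a constant|cosine switch with warmup might compute per step. This is not the SelvaLoraTrainer code: the function name `lr_at_step`, its parameters, and the linear warmup shape are assumptions made for illustration. The commit only states that cosine decays the LR from its initial value to ~0 after warmup.

```python
import math


def lr_at_step(step, base_lr, warmup_steps, total_steps, schedule="cosine"):
    """Hypothetical helper: learning rate at a given 0-indexed step.

    Assumes linear warmup (not confirmed by the commit), followed by either
    a flat LR ("constant") or a half-cosine decay to ~0 ("cosine").
    """
    if schedule not in ("constant", "cosine"):
        raise ValueError(f"unknown lr_schedule: {schedule!r}")

    # Assumed linear warmup: ramp from base_lr / warmup_steps up to base_lr.
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps

    if schedule == "constant":
        return base_lr

    # Cosine decay over the steps remaining after warmup.
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions, the cosine variant starts at the full `base_lr` once warmup ends, is at half of it midway through the decay window, and reaches ~0 at `total_steps`. The constant variant stays at `base_lr` throughout.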