feat: SelVA Skip Experiment node + save partial scalars on skip

- New node: SelVA Skip Experiment. It writes skip_current.flag from the UI;
  queue it in a second workflow tab while the scheduler is running
- SkipExperiment now carries the partial loss, grad-norm, and spectral data
  as an attribute on the exception, so the scheduler saves all collected
  scalars in the summary
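
The flag handshake described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the node's actual code: `request_skip`, `consume_skip`, and `experiments_root` are hypothetical names. The only details taken from this commit are the `skip_current.flag` filename and the fact that the trainer looks for it in the parent of its output directory and deletes it once seen.

```python
from pathlib import Path

FLAG_NAME = "skip_current.flag"  # filename from the commit; everything else is assumed


def request_skip(experiments_root: Path) -> Path:
    """UI side (hypothetical): create the flag next to the per-experiment output dirs."""
    experiments_root.mkdir(parents=True, exist_ok=True)
    flag = experiments_root / FLAG_NAME
    flag.touch(exist_ok=True)
    return flag


def consume_skip(output_dir: Path) -> bool:
    """Trainer side: report the flag once and delete it, so only one experiment is skipped."""
    flag = output_dir.parent / FLAG_NAME
    if flag.exists():
        flag.unlink()
        return True
    return False
```

Deleting the flag as soon as it is observed means a single request skips only the experiment that is currently running; the next one starts with a clean state.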

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 13:10:43 +02:00
parent 264dc49d42
commit 58e1985af2
4 changed files with 69 additions and 4 deletions
+8 -1
@@ -758,7 +758,14 @@ class SelvaLoraTrainer:
 skip_flag = output_dir.parent / "skip_current.flag"
 if skip_flag.exists():
     skip_flag.unlink()
-    raise SkipExperiment(f"skip_current.flag detected at step {step} — skipping to next experiment")
+    exc = SkipExperiment(f"skip_current.flag detected at step {step} — skipping to next experiment")
+    exc.partial = {
+        "loss_history": list(loss_history),
+        "grad_norm_history": list(grad_norm_history),
+        "spectral_metrics": dict(spectral_metrics),
+        "stopped_at_step": step,
+    }
+    raise exc
 avg = running_loss / log_interval
 loss_history.append(avg)
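
For context, the consuming side of this change might look like the sketch below. It is written under assumptions: the scheduler code is not part of this hunk, so `run_training`, `run_experiment`, and the summary layout are hypothetical, and `SkipExperiment` is redefined here as a stand-in only so the example runs on its own. The `partial` keys match the ones set in the hunk above.

```python
import json
from pathlib import Path


class SkipExperiment(Exception):
    """Stand-in for the project's exception, defined here so the sketch is self-contained."""


def run_training(steps_before_skip: int) -> None:
    """Hypothetical trainer stub: collects a few scalars, then skips with them attached."""
    loss_history = [1.0 / (i + 1) for i in range(steps_before_skip)]
    exc = SkipExperiment(f"skipped at step {steps_before_skip}")
    exc.partial = {
        "loss_history": list(loss_history),
        "grad_norm_history": [],
        "spectral_metrics": {},
        "stopped_at_step": steps_before_skip,
    }
    raise exc


def run_experiment(summary_path: Path, steps_before_skip: int = 3) -> dict:
    """Catch the skip and persist whatever scalars were collected (assumed summary layout)."""
    summary = {"status": "completed"}
    try:
        run_training(steps_before_skip)
    except SkipExperiment as exc:
        # getattr keeps this working for exceptions raised without a .partial attribute
        partial = getattr(exc, "partial", {})
        summary = {"status": "skipped", "reason": str(exc), **partial}
    summary_path.write_text(json.dumps(summary, indent=2))
    return summary
```

Reading the attribute with `getattr(exc, "partial", {})` is a defensive choice for this sketch: any code path that still raises a bare `SkipExperiment` would produce a summary without scalars rather than an `AttributeError`.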