feat: SelVA Skip Experiment node + save partial scalars on skip
- New node: SelVA Skip Experiment. It writes skip_current.flag from the UI; queue it in a second workflow tab while the scheduler is running.
- SkipExperiment now attaches partial loss/grad/spectral data to the exception, so the scheduler saves all collected scalars in the summary.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
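The flag mechanism in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the node's actual code: `request_skip`, `sweep_dir`, and `experiment_01` are hypothetical names, and only the file name `skip_current.flag` and the lookup path `output_dir.parent / "skip_current.flag"` come from this commit.

```python
from pathlib import Path
import tempfile

FLAG_NAME = "skip_current.flag"  # file name taken from the diff


def request_skip(sweep_dir: Path) -> Path:
    """Create the skip flag in the directory that holds the experiment output dirs.

    Hypothetical helper: the real node's inputs and outputs are not shown in this commit.
    """
    flag = sweep_dir / FLAG_NAME
    # An empty file is enough, since the trainer only checks for existence.
    flag.touch(exist_ok=True)
    return flag


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sweep_dir = Path(tmp)
        output_dir = sweep_dir / "experiment_01"  # hypothetical per-experiment directory
        output_dir.mkdir()
        flag = request_skip(sweep_dir)
        # Same path expression the trainer uses when it polls for the flag.
        print(flag == output_dir.parent / FLAG_NAME, flag.exists())
```

Because the trainer deletes the flag as soon as it sees it, one flag file skips exactly one experiment; a later experiment needs a new request.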
@@ -758,7 +758,14 @@ class SelvaLoraTrainer:
 skip_flag = output_dir.parent / "skip_current.flag"
 if skip_flag.exists():
     skip_flag.unlink()
-    raise SkipExperiment(f"skip_current.flag detected at step {step} — skipping to next experiment")
+    exc = SkipExperiment(f"skip_current.flag detected at step {step} — skipping to next experiment")
+    exc.partial = {
+        "loss_history": list(loss_history),
+        "grad_norm_history": list(grad_norm_history),
+        "spectral_metrics": dict(spectral_metrics),
+        "stopped_at_step": step,
+    }
+    raise exc

 avg = running_loss / log_interval
 loss_history.append(avg)
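On the receiving side, a scheduler could read the attached data along these lines. This is a hedged sketch under stated assumptions: the `SkipExperiment` class below is a stand-in (its real definition is not part of this hunk), and `fake_train` and `run_experiment` are hypothetical names used only to show the catch-and-merge pattern with the `partial` keys from the diff.

```python
import json


class SkipExperiment(Exception):
    """Stand-in for the project's exception; the real definition is not in this diff."""

    partial = None  # class-level default so handlers can always read .partial


def fake_train(skip_at_step: int) -> dict:
    """Hypothetical training loop that mimics the raise site in the hunk."""
    loss_history, grad_norm_history, spectral_metrics = [], [], {}
    for step in range(1, 11):
        loss_history.append(1.0 / step)
        grad_norm_history.append(0.5)
        if step == skip_at_step:
            exc = SkipExperiment(f"skip requested at step {step}")
            # Copy the containers so the handler receives a snapshot.
            exc.partial = {
                "loss_history": list(loss_history),
                "grad_norm_history": list(grad_norm_history),
                "spectral_metrics": dict(spectral_metrics),
                "stopped_at_step": step,
            }
            raise exc
    return {"loss_history": loss_history}


def run_experiment(skip_at_step: int) -> dict:
    """Hypothetical scheduler step: keep partial scalars when a run is skipped."""
    try:
        result = fake_train(skip_at_step)
        return {"status": "completed", **result}
    except SkipExperiment as exc:
        # Fall back to an empty dict if the exception carries no partial data.
        partial = getattr(exc, "partial", None) or {}
        return {"status": "skipped", "reason": str(exc), **partial}


if __name__ == "__main__":
    print(json.dumps(run_experiment(skip_at_step=3), indent=2))
```

Reading `partial` through `getattr` with a fallback keeps the handler working for any `SkipExperiment` raised elsewhere without the attribute.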