docs: add observed batching results to training guide

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 00:05:16 +02:00
parent 1f8cd6f930
commit 39984f73c2
+2
@@ -199,6 +199,8 @@ The table below gives a rough scaling guide. Quality and diversity of recordings
Higher batch size gives smoother loss curves and faster convergence. If you have headroom, prefer larger batches over more steps.
**Observed results:** batch 16 reached in ~2600 steps the loss that batch 1 needed 8000+ steps to reach, and its loss curve was almost perfectly smooth. On a 24 GB GPU, batch 16 is the recommended default for `large_44k`.
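
To put the observed numbers in perspective, the sketch below compares step count with total samples processed. It assumes one batch per optimizer step with no gradient accumulation, and the helper name `samples_seen` is hypothetical, not part of the training code. Because the batch 1 figure is reported as "8000+", both ratios are approximate.

```python
def samples_seen(batch_size: int, steps: int) -> int:
    """Total training examples processed, assuming one batch per step."""
    return batch_size * steps


# Figures taken from the observed results above.
small = samples_seen(batch_size=1, steps=8000)   # "8000+" steps, so a lower bound
large = samples_seen(batch_size=16, steps=2600)  # "~2600" steps, approximate

step_ratio = 8000 / 2600      # how many times fewer steps batch 16 needed
sample_ratio = large / small  # how many times more samples batch 16 processed

print(f"step reduction: {step_ratio:.1f}x")
print(f"samples processed: {large} vs {small} ({sample_ratio:.1f}x)")
```

Under these assumptions, batch 16 needs about 3x fewer steps but processes about 5x more samples, so the gain shows up as wall-clock time only when the GPU can run the larger batch without a proportional slowdown per step.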
### Rank
| Rank | Use case |