From 39984f73c292f7036533e1df95ee074ee05dddad Mon Sep 17 00:00:00 2001
From: Ethanfel <ethan.fel@ts-pc.fr>
Date: Mon, 6 Apr 2026 00:05:16 +0200
Subject: [PATCH] docs: add observed batching results to training guide

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 LORA_TRAINING.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/LORA_TRAINING.md b/LORA_TRAINING.md
index 3d32605..cac7fc8 100644
--- a/LORA_TRAINING.md
+++ b/LORA_TRAINING.md
@@ -199,6 +199,8 @@ The table below gives a rough scaling guide. Quality and diversity of recordings
 
 Higher batch size gives smoother loss curves and faster convergence. If you have headroom, prefer larger batches over more steps.
 
+**Observed results:** batch 16 reached in ~2600 steps the loss that batch 1 needed 8000+ steps to reach, and its loss curve was nearly perfectly smooth. On a 24 GB GPU, batch 16 is the recommended default for `large_44k`.
+
 ### Rank
 
 | Rank | Use case |