c0d95ce356
First TI sweep covering the three most impactful axes:

- token_count group: n_tokens 4 / 8 / 16 (capacity vs overfitting)
- learning_rate group: 5e-4 / 1e-3 / 2e-3 with n_tokens=4
- warm_init group: n4 and n8 seeded from 'mechanical impact sound design'

7 experiments total, 3000 steps each, same data_dir as LoRA sweeps.
n4_baseline (lr=1e-3, random init) is the primary reference point.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>