First TI sweep covering the three most impactful axes:
- token_count group: n_tokens 4 / 8 / 16 (capacity vs overfitting)
- learning_rate group: 5e-4 / 1e-3 / 2e-3 with n_tokens=4
- warm_init group: n4 and n8 seeded from 'mechanical impact sound design'
7 experiments total, 3000 steps each, same data_dir as LoRA sweeps.
n4_baseline (lr=1e-3, random init) is the primary reference point.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>