45fced55bc
L2-SP anchors trainable params to their pretrained values. GAFilter is a newly initialized module (identity FIR filter) with no pretrained values — anchoring it to identity initialization would resist learning. Exclude gafilter params from the L2-SP loss so they train freely. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>