Newer open_clip versions create nn.MultiheadAttention with
batch_first=True, but STAR's embedder unconditionally permutes its
input to [seq, batch, embed]. The text encoder then raises a
RuntimeError (attn_mask shape mismatch). The patch checks batch_first
at runtime and permutes only when the attention module expects
sequence-first input.
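A minimal sketch of the runtime check, not the patch itself. The helper
names to_attention_layout and from_attention_layout are hypothetical,
and the sketch assumes the embedder holds its tensor as
[batch, seq, embed] before the attention call. It relies only on the
batch_first attribute that nn.MultiheadAttention exposes and on the
tensor's permute method:

```python
def to_attention_layout(x, attn):
    """Return x in the layout that `attn` expects.

    Assumes x arrives as [batch, seq, embed]. `attn` is any module that
    may carry a boolean `batch_first` attribute, as nn.MultiheadAttention
    does. (Hypothetical helper, for illustration only.)
    """
    # Modules without the attribute are treated as sequence-first.
    if getattr(attn, "batch_first", False):
        return x  # already [batch, seq, embed]; do not permute
    return x.permute(1, 0, 2)  # [batch, seq, embed] -> [seq, batch, embed]


def from_attention_layout(x, attn):
    """Undo to_attention_layout so the caller gets [batch, seq, embed] back."""
    if getattr(attn, "batch_first", False):
        return x
    return x.permute(1, 0, 2)  # swapping axes 0 and 1 is its own inverse
```

Reading the flag from the module instead of comparing open_clip version
strings keeps the check valid for both older and newer releases.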

Patches in patches/ are applied automatically to the STAR submodule on
startup; a patch that is already applied is skipped without error.

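One way such an idempotent apply step could look, as a sketch under
stated assumptions: it assumes the patches are *.patch files applied
with git apply, which this message does not confirm, and the function
name apply_patches and its run parameter are hypothetical. A patch that
reverses cleanly is taken to be already applied:

```python
import subprocess
from pathlib import Path


def apply_patches(patch_dir, repo_dir, run=subprocess.run):
    """Apply each *.patch in patch_dir to repo_dir, skipping applied ones.

    Hypothetical helper; `run` is injectable so the logic can be
    exercised without a real git checkout.
    """
    results = {}
    for patch in sorted(Path(patch_dir).glob("*.patch")):
        path = str(patch.resolve())
        # If the patch can be reversed cleanly, its changes are already present.
        check = run(
            ["git", "apply", "--reverse", "--check", path],
            cwd=repo_dir,
            capture_output=True,
        )
        if check.returncode == 0:
            results[patch.name] = "skipped"
            continue
        applied = run(["git", "apply", path], cwd=repo_dir, capture_output=True)
        results[patch.name] = "applied" if applied.returncode == 0 else "failed"
    return results
```

Returning a per-patch status instead of raising lets startup continue
and report which patches were skipped, applied, or failed.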
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>