Bundle sparse_sage Triton kernel for block-sparse attention
Without block-sparse attention, the model falls back to full (dense) attention, which attends to distant, irrelevant information and causes ghosting artifacts; the FlashVSR paper explicitly requires block-sparse attention. The kernel is vendored from the SageAttention team (Apache 2.0) and is pure Triton, with no CUDA C++.

Import chain: local sparse_sage → external sageattn.core → SDPA fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
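A minimal sketch of that import chain, assuming PyTorch's scaled_dot_product_attention as the SDPA fallback; the wrapper's signature and keyword handling are illustrative, not the repo's actual code:

import torch.nn.functional as F

try:
    # First choice: the vendored pure-Triton kernel bundled by this commit.
    from flashvsr_arch.models.sparse_sage import sparse_sageattn
except ImportError:
    try:
        # Second choice: an externally installed SageAttention package.
        from sageattn.core import sparse_sageattn
    except ImportError:
        # Last resort: dense SDPA. Runs everywhere, but is not block-sparse,
        # so the ghosting artifacts described above can reappear.
        def sparse_sageattn(q, k, v, **kwargs):
            return F.scaled_dot_product_attention(q, k, v)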
flashvsr_arch/models/sparse_sage/__init__.py  (new file, +3)
@@ -0,0 +1,3 @@
+from .core import sparse_sageattn
+
+__all__ = ["sparse_sageattn"]