An installed-but-broken cupy (e.g. incompatible with NumPy 2.5, which
removed the 'bool8' alias) raises a TypeError during its own import, not
an ImportError. The narrow `except ImportError` guard let that propagate
and crashed the entire node import chain.
Broaden the guard to `except Exception` in all three CUDA-kernel modules
so any import-time failure disables cupy and falls back to the
pure-PyTorch implementations.
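A minimal sketch of the broadened guard, assuming a module-level flag
(HAS_CUPY is an illustrative name, not necessarily the one used in the
three modules):

```python
# Hypothetical module-level guard; HAS_CUPY is an illustrative flag name.
try:
    import cupy
    HAS_CUPY = True
except Exception:
    # Catch more than ImportError: a broken install can raise TypeError
    # (or other errors) while cupy's own import is running.
    cupy = None
    HAS_CUPY = False
```

Callers would then check the flag before taking the CUDA-kernel path and
use the pure-PyTorch implementation otherwise.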
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Wrap _pytorch_softsplat and _pytorch_costvol with torch.compile
for a ~6x speedup on ROCm/non-cupy setups. Fall back to eager
execution if compilation fails.
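One way the fallback could look, as a sketch only. compile_with_fallback
is a hypothetical helper, and the compiler is passed in as an argument so
the example runs without PyTorch; in the real modules it would be
torch.compile. Because compilation is typically triggered on the first
call rather than at wrap time, the first call is guarded too:

```python
import functools


def compile_with_fallback(fn, compiler):
    """Prefer compiler(fn); fall back to eager fn if compiling fails."""
    state = {"impl": None}  # chosen implementation, decided on first call

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if state["impl"] is None:
            try:
                compiled = compiler(fn)
                # Lazy compilers only fail once they are actually called.
                result = compiled(*args, **kwargs)
                state["impl"] = compiled
                return result
            except Exception:
                # Any failure pins the eager function; a genuine error in
                # fn itself is re-raised by the eager call below.
                state["impl"] = fn
        return state["impl"](*args, **kwargs)

    return wrapper


# Assumed usage in the real modules (requires PyTorch 2.x):
#   _pytorch_softsplat = compile_with_fallback(_pytorch_softsplat,
#                                              torch.compile)
```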
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
cupy.cuda.get_cuda_path() can return None when CUDA_HOME is not set
and cupy can't auto-detect it. Fall back to /usr/local/cuda.
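A sketch of the fallback, assuming a small helper (resolve_cuda_path is
a hypothetical name; the lookup function is passed in so the example
does not need cupy installed):

```python
def resolve_cuda_path(get_cuda_path, default="/usr/local/cuda"):
    """Return the detected CUDA path, or the default if none is found."""
    path = get_cuda_path()
    # None (or an empty string) means detection failed.
    return path if path else default


# Assumed usage in the real modules:
#   cuda_path = resolve_cuda_path(cupy.cuda.get_cuda_path)
```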
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Make the cupy import optional so the module loads without cupy installed.
Replace the @cupy.memoize decorator with a simple dict cache to avoid a
crash at import time when cupy is missing. Add _pytorch_softsplat(), which
uses scatter_add_, as a fallback when cupy is unavailable or tensors are
on CPU.
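The dict cache could be as small as the sketch below. _kernel_cache and
cached_kernel are illustrative names, and the key layout is an
assumption; the point is that nothing here touches cupy at import time,
unlike a decorator that is evaluated when the module loads:

```python
_kernel_cache = {}  # maps a hashable key to an already-built kernel


def cached_kernel(key, build):
    """Return the cached object for key, building it once on first use."""
    if key not in _kernel_cache:
        _kernel_cache[key] = build()
    return _kernel_cache[key]
```

The build callable would compile the kernel (and may use cupy), but it
only runs when a kernel is actually requested.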
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SGM-VFI combines local flow estimation with sparse global matching
(GMFlow) to handle large motion and occlusion-heavy scenes. Adds 3 new
nodes: Load SGM-VFI Model, SGM-VFI Interpolate, SGM-VFI Segment
Interpolate. Architecture files vendored from MCG-NJU/SGM-VFI with
device-awareness fixes (no hardcoded .cuda()), relative imports, and
debug code removed. README updated with model comparison table.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>