feat: add data_utils package with FeaturesUtils implementation

Creates data_utils/v2a_utils/feature_utils_288.py with FeaturesUtils:
- T5-Gemma text encoding via transformers
- VideoPrism video encoding via JAX videoprism package
- Synchformer visual encoder loading from checkpoint

Also fixes extract_features.py to add plugin root to sys.path so
data_utils is importable in the subprocess venv.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-03-27 20:14:34 +01:00
parent f32456a142
commit 878025450a
4 changed files with 245 additions and 5 deletions
+7 -5
View File
@@ -1,14 +1,10 @@
#!/usr/bin/env python3
"""
Standalone PrismAudio feature extraction script.
Run in a separate conda env with JAX/TF installed.
Runs in a separate Python env with JAX/TF installed (auto-created by PrismAudioFeatureExtractor).
Usage:
python extract_features.py --video input.mp4 --cot_text "description..." --output features.npz
Setup:
conda env create -f environment.yml
conda activate prismaudio-extract
"""
import argparse
@@ -17,6 +13,12 @@ import sys
import numpy as np
import torch
# Add plugin root to sys.path so data_utils (and prismaudio_core) are importable
_SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
_PLUGIN_DIR = os.path.dirname(_SCRIPT_DIR)
if _PLUGIN_DIR not in sys.path:
sys.path.insert(0, _PLUGIN_DIR)
def main():
parser = argparse.ArgumentParser(description="PrismAudio feature extraction")