08d73773c5
Adds PrismAudioLoRATrainer and PrismAudioLoRALoader nodes enabling low-rank adaptation of the DiT on paired (video features + audio) datasets.

- LoRALinear wraps nn.Linear with trainable lora_A/lora_B matrices
- Rectified flow training loop with fp16 GradScaler, AdamW, cfg dropout
- Checkpoint saving every N steps + _config.json metadata alongside weights
- _unapply_lora restores base model state after training completes
- Weight-merge loader: delta_W added in-place, no deep copy overhead
- Three target presets: attn_only, attn_ffn (default), full

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
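The in-place weight merge and its reversal mentioned above can be illustrated with a minimal sketch. This is not the repository's code: it uses plain Python lists instead of tensors, the function names are hypothetical, and both the `alpha / rank` scaling and the shapes (`lora_A` as rank x in_features, `lora_B` as out_features x rank) are assumptions based on the common LoRA convention.

```python
# Hypothetical sketch of a LoRA weight merge; names and scaling are assumptions,
# and nested lists stand in for the tensors the real nodes would use.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora_in_place(weight, lora_A, lora_B, alpha, rank):
    """Add delta_W = (alpha / rank) * lora_B @ lora_A to weight without copying it."""
    scale = alpha / rank
    delta = matmul(lora_B, lora_A)  # shape: out_features x in_features
    for i, row in enumerate(delta):
        for j, d in enumerate(row):
            weight[i][j] += scale * d
    return weight


def unmerge_lora_in_place(weight, lora_A, lora_B, alpha, rank):
    """Subtract the same delta_W, restoring the base weights."""
    scale = alpha / rank
    delta = matmul(lora_B, lora_A)
    for i, row in enumerate(delta):
        for j, d in enumerate(row):
            weight[i][j] -= scale * d
    return weight


base = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # out_features=2, in_features=3
A = [[1.0, 2.0, 3.0]]                      # rank=1, in_features=3
B = [[0.5], [-1.0]]                        # out_features=2, rank=1
merged = merge_lora_in_place(base, A, B, alpha=2.0, rank=1)
```

Because the update is written into the existing weight, the merged object is the same list that was passed in, which is where the "no deep copy overhead" property comes from in this sketch. Calling `unmerge_lora_in_place` with the same arguments reverses it.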
13 lines
178 B
Plaintext
einops>=0.7.0
einops-exts
safetensors
huggingface_hub
transformers>=4.52.3
k-diffusion>=0.1.1
alias-free-torch
descript-audio-codec
vector-quantize-pytorch
scipy
tqdm
torchaudio