Initial release: ComfyUI-MisoTTS (modernized CSM 8B)

Modernized MisoTTS integration for ComfyUI with no torchtune/moshi:
- vendored plain-torch Llama backbone (csm_llama), parity-verified Δ=0 vs torchtune
- transformers.MimiModel codec (bit-identical codes to moshi), drops moshi/bnb/sphn
- low-memory loader: streams 32GB fp32 checkpoint to GPU in bf16 (~18GB VRAM)
- nodes: Model Loader, Generate (audiobook chunking + voice anchoring), EPUB Loader
- pin-free requirements; runs on modern torch / Blackwell GPUs
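
The checkpoint and VRAM figures in the loader bullet can be sanity-checked with simple arithmetic. This is a rough sketch, not the loader itself: the helper name is hypothetical, and it only assumes that fp32 stores 4 bytes per parameter and bf16 stores 2.

```python
# Hypothetical helper, not part of this commit: estimates the size of the
# weights after casting an fp32 checkpoint to bf16.
FP32_BYTES_PER_PARAM = 4
BF16_BYTES_PER_PARAM = 2


def bf16_footprint_gb(fp32_checkpoint_gb: float) -> float:
    """Return the approximate bf16 weight size for a given fp32 checkpoint size."""
    return fp32_checkpoint_gb * BF16_BYTES_PER_PARAM / FP32_BYTES_PER_PARAM


weights_gb = bf16_footprint_gb(32)  # the 32GB fp32 checkpoint from the message
print(f"bf16 weights: ~{weights_gb:.0f}GB")
```

Halving 32GB gives about 16GB for the weights alone. The commit message does not itemize the gap to the quoted ~18GB of VRAM.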

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-06 23:37:54 +02:00
commit f7a6f7790d
13 changed files with 1110 additions and 0 deletions
@@ -0,0 +1,15 @@
from .nodes import MisoTTSModelLoader, MisoTTSGenerate, MisoTTSEpubLoader

NODE_CLASS_MAPPINGS = {
    "MisoTTSModelLoader": MisoTTSModelLoader,
    "MisoTTSGenerate": MisoTTSGenerate,
    "MisoTTSEpubLoader": MisoTTSEpubLoader,
}

NODE_DISPLAY_NAME_MAPPINGS = {
    "MisoTTSModelLoader": "MisoTTS Model Loader",
    "MisoTTSGenerate": "MisoTTS Generate",
    "MisoTTSEpubLoader": "MisoTTS EPUB Loader",
}

__all__ = ["NODE_CLASS_MAPPINGS", "NODE_DISPLAY_NAME_MAPPINGS"]
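
For context, the two dictionaries above are what ComfyUI reads to register custom nodes: one maps an internal node ID to a class, the other maps the same ID to a label shown in the UI. The sketch below shows the general shape such a class takes and a small consistency check between the two mappings. `ExampleEchoNode`, its category string, and `check_mappings` are hypothetical illustrations, not code from this commit's `nodes` module.

```python
class ExampleEchoNode:
    """Hypothetical node used only for illustration; not part of this commit."""

    @classmethod
    def INPUT_TYPES(cls):
        # ComfyUI calls this to learn which inputs the node exposes.
        return {"required": {"text": ("STRING", {"multiline": True, "default": ""})}}

    RETURN_TYPES = ("STRING",)   # one output socket of type STRING
    FUNCTION = "run"             # name of the method ComfyUI invokes
    CATEGORY = "audio/example"   # assumed menu location, chosen arbitrarily

    def run(self, text):
        # Node methods return a tuple matching RETURN_TYPES.
        return (text.strip(),)


EXAMPLE_CLASS_MAPPINGS = {"ExampleEchoNode": ExampleEchoNode}
EXAMPLE_DISPLAY_NAME_MAPPINGS = {"ExampleEchoNode": "Example Echo Node"}


def check_mappings(class_map, display_map):
    """Return the node IDs that appear in only one of the two mappings."""
    return sorted(set(class_map) ^ set(display_map))


print(check_mappings(EXAMPLE_CLASS_MAPPINGS, EXAMPLE_DISPLAY_NAME_MAPPINGS))
```

A check like `check_mappings` catches the common mistake of renaming a node ID in one dictionary but not the other; an empty list means the keys agree.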