ComfyUI-Omnivoice

Author	SHA1	Message	Date
Ethanfel	8805665a22	Add seed parameter to OmniVoice Generate for consistent voice across chunks OmniVoice chunks long text internally; each chunk is a separate diffusion pass with different random noise, causing voice drift between paragraphs. Setting the same seed before each generate() call anchors the RNG state and keeps the voice consistent. seed=0 means random (default behaviour). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 18:53:58 +02:00
Ethanfel	4c42322c6f	Expand voice presets to 8 voices (3 female, 5 male) All transcribed via whisper-medium. Sources: Chatterbox demo GCS bucket (ResembleAI) and F5-TTS repo (SWivid). Female: Shadowheart, American actress, Podcast host Male: Nature, Old Hollywood, Rick Sanchez, Stewie Griffin, Harvey Keitel, Conan O'Brien Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 18:51:43 +02:00
Ethanfel	c109e860a8	Add transcript for Shadowheart preset (transcribed via whisper-medium) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 18:39:14 +02:00
Ethanfel	75e74075f5	Restore Shadowheart preset; user will transcribe via Whisper node Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 18:33:52 +02:00
Ethanfel	8de201a4c9	Add OmniVoice Voice Preset node with two female voice samples Two built-in presets, auto-downloaded and cached to ComfyUI/models/omnivoice/presets/: - "Nature – female, warm" (F5-TTS basic_ref_en.wav, transcript included) - "Shadowheart – female, expressive" (Chatterbox demo, connect Whisper for transcript) Outputs ref_audio (AUDIO) and ref_text (STRING) — wire directly into OmniVoice Generate. Updated default workflow to use this node. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 18:19:29 +02:00
Ethanfel	d779526225	Preserve paragraph breaks in EPUB text extraction get_text(separator=' ') collapsed all paragraphs into one line. Now inserts \n\n at block-level element boundaries (p, h1-h6, div, li, br, tr) before extraction, then normalises whitespace. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 18:06:41 +02:00
Ethanfel	b52edcfd84	Remove local path option from model loader Models always download to ComfyUI/models/omnivoice/ via HuggingFace. Local path added unnecessary complexity; users who want a custom path can symlink into the models directory. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 18:02:55 +02:00
Ethanfel	cd0f7aff07	Add default voice cloning workflow Model Loader → Load Audio → OmniVoice Generate → Save Audio. Connect a Whisper node to ref_text for auto-transcription. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 18:01:18 +02:00
Ethanfel	8d77dd6cd5	Remove torchcodec workaround; recommend Whisper node for ref_text Users should connect a ComfyUI Whisper node to ref_text instead of relying on omnivoice's internal ASR. Removes the error-catch workaround and updates the tooltip accordingly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 17:49:25 +02:00
Ethanfel	a3fb88e559	Restore install.py for omnivoice --no-deps only requirements.txt cannot install omnivoice (it would pull in torch==2.8.* and break ComfyUI). install.py now does exactly one thing: install omnivoice --no-deps, skipped if already present. All other deps remain in requirements.txt for ComfyUI Manager to handle normally. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 17:45:24 +02:00
Ethanfel	dbb3207df1	Replace install.py with standard requirements.txt install.py was running arbitrary pip installs as part of node loading, which is dangerous in a shared venv. Standard approach: requirements.txt lists the safe deps (transformers, accelerate, soundfile, etc.); omnivoice itself must be installed once manually with --no-deps to avoid overwriting ComfyUI's torch. README documents this clearly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 17:44:52 +02:00
Ethanfel	e8e8943692	Remove transformers upper bound cap from install.py The cap was wrong — it would downgrade transformers in shared venvs and break other nodes. The torchcodec issue is handled in code now. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 17:36:06 +02:00
Ethanfel	30f46fc3ef	Revert transformers cap; catch torchcodec ASR failure with clear message install.py: restore transformers>=5.0.0 (capping it would break other nodes). generator.py: catch the torchcodec RuntimeError that fires when ref_text is blank and transformers 5.x auto-transcription requires missing FFmpeg libs. Raises a human-readable error telling the user to fill in ref_text manually. Also updates the ref_text tooltip to recommend providing it explicitly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 17:35:54 +02:00
Ethanfel	d6ff42dc7c	Cap transformers below 5.0 to avoid torchcodec ASR crash transformers 5.x unconditionally imports torchcodec in its ASR pipeline preprocess step, which crashes in environments without FFmpeg shared libs. 4.x does not have this dependency. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 17:32:50 +02:00
Ethanfel	5dfaa0b300	Replace torchaudio.save with soundfile.write; add EPUB loader node - nodes/generator.py: swap torchaudio.save for soundfile.write to avoid torchcodec/FFmpeg dependency crash in environments without FFmpeg shared libs - nodes/epub_loader.py: new OmniVoiceEpubLoader node for loading EPUB chapters - tests/test_epub_loader.py: 8 tests for the EPUB loader - install.py: add beautifulsoup4 to runtime deps - __init__.py, nodes/__init__.py: register OmniVoiceEpubLoader Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 17:24:18 +02:00
Ethanfel	5366f6992e	feat: add tooltips with inline tag reference to generator node inputs	2026-04-05 15:35:21 +02:00
Ethanfel	f647a24988	fix: add install.py to prevent omnivoice from overwriting ComfyUI's torch	2026-04-05 14:53:33 +02:00
Ethanfel	b273f8f2d7	refactor: remove redundant condition and rename shadowed waveform variable	2026-04-05 10:41:08 +02:00
Ethanfel	0760e60373	chore: remove torch/torchaudio from requirements (omnivoice declares them)	2026-04-05 10:39:44 +02:00
Ethanfel	0ffd624471	fix: protect os.unlink in finally block from masking original exceptions	2026-04-05 10:38:38 +02:00
Ethanfel	a2c542a2bc	fix: move output waveform to CPU and cast sample_rate to int	2026-04-05 10:34:53 +02:00
Ethanfel	49b1ee5c16	docs: add README	2026-04-05 10:31:40 +02:00
Ethanfel	808580b771	fix: guard omnivoice import in loader.py so node classes are importable without package Wrap `from omnivoice import OmniVoice` in a try/except ImportError, setting OmniVoice=None when absent. Add a clear runtime ImportError in load_model() so users get an actionable message. Allows `from nodes.loader import OmniVoiceModelLoader` to succeed outside of pytest (where conftest.py injects the mock) while keeping all 13 tests green. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 09:11:09 +02:00
Ethanfel	18fe6359cf	fix: add input validation and cpu() guard in OmniVoiceGenerate Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 09:09:52 +02:00
Ethanfel	95712e5504	feat: add OmniVoiceGenerate node with voice cloning, design, and auto modes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 09:07:20 +02:00
Ethanfel	7e94733b21	docs: document private pytest API usage in conftest	2026-04-05 09:05:53 +02:00
Ethanfel	11beba1c47	fix: clean up omnivoice import guard and __init__ error masking Remove OmniVoice = None fallback in nodes/loader.py so missing omnivoice gives a clear ImportError instead of a confusing AttributeError. Restore __init__.py to clean form without the try/except that silently swallowed real import errors. Add omnivoice mock to conftest.py and register a pytest plugin that prevents pytest from treating the project root as a Package node (which would try to import __init__.py outside a package context and fail on the relative import). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 09:03:34 +02:00
Ethanfel	069169485d	feat: add OmniVoiceModelLoader node Implements OmniVoiceModelLoader with INPUT_TYPES, RETURN_TYPES, and load_model supporting both HuggingFace auto-download and local path sources. Adds TDD test suite and pytest infrastructure (conftest.py, pytest.ini) to enable testing outside ComfyUI without omnivoice installed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 08:52:26 +02:00
Ethanfel	0ed43a83ca	feat: scaffold ComfyUI-Omnivoice node package	2026-04-05 08:43:17 +02:00
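
Note on commit 808580b771: the message describes wrapping the omnivoice import in try/except ImportError, falling back to None, and raising a clear ImportError at load time. The sketch below shows roughly what that pattern could look like. It is not the repository's code: `ensure_available` and `require_omnivoice` are hypothetical helper names, and model construction is left out because the log does not show the omnivoice API.

```python
# Guarded import: the module stays importable when the optional package is absent.
try:
    from omnivoice import OmniVoice
except ImportError:
    OmniVoice = None  # checked at load time, not at import time


def ensure_available(cls, package):
    """Return cls, or raise an actionable ImportError if the guarded import failed.

    Hypothetical helper, written for illustration only.
    """
    if cls is None:
        raise ImportError(
            f"The '{package}' package is not installed. "
            f"Install it with: pip install {package} --no-deps"
        )
    return cls


def require_omnivoice():
    """What a load_model() method might call first (assumed structure)."""
    return ensure_available(OmniVoice, "omnivoice")
```

With this shape, importing the node module never fails because of the missing package, and the user only sees an error, with install instructions, when a model is actually requested.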

29 Commits
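
Note on commit d779526225: the message describes inserting \n\n at block-level element boundaries (p, h1-h6, div, li, br, tr) before text extraction and then normalising whitespace. The sketch below illustrates that idea with the standard library's html.parser instead of the get_text() call named in the commit, so it is a substitute technique and not the repository's implementation. `ParagraphTextExtractor` and `extract_text` are hypothetical names.

```python
import re
from html.parser import HTMLParser

# Block-level tags taken from the list in the commit message.
BLOCK_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6", "div", "li", "br", "tr"}


class ParagraphTextExtractor(HTMLParser):
    """Collects text and marks every block boundary with a blank line."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append("\n\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n\n")

    def handle_data(self, data):
        # Collapse whitespace inside text nodes so that only the inserted
        # markers can introduce paragraph breaks.
        self.parts.append(re.sub(r"\s+", " ", data))


def extract_text(html: str) -> str:
    parser = ParagraphTextExtractor()
    parser.feed(html)
    parser.close()
    raw = "".join(parser.parts)
    # Normalise: split on the markers, trim each paragraph, drop empty ones.
    paragraphs = [p.strip() for p in re.split(r"\n{2,}", raw)]
    return "\n\n".join(p for p in paragraphs if p)
```

For example, `extract_text("<h1>Title</h1><p>First  line\nwraps.</p>")` yields the heading and the paragraph separated by one blank line, rather than a single collapsed line.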