The model can tell Chinese from English from the text itself, so
neither node needs a separate language signal.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Voice Design now outputs (instruct, language) — wire language directly
into Generate to avoid setting it in two places. Generate's language
input is now a STRING (accepts the connection or manual 'auto').
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Generate: language dropdown (auto/English/Chinese), passed only in
voice_design and auto_voice modes where it selects the instruct vocab
- VoiceDesign: Chinese mode with dialect/age/pitch/gender dropdowns
using the model's validated Chinese instruct vocabulary (full-width comma separator)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The model's _resolve_instruct() validates against a fixed vocabulary.
Only 10 accents are supported — removed all unsupported additions.
Updated tooltip to reflect actual constraints.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
If instruct is set alongside ref_audio, it is now forwarded to
model.generate() — allowing accent/style transfer on top of the
cloned voice identity. The model may or may not honour both simultaneously.
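
The forwarding described above could look roughly like the sketch below. The helper name build_generate_kwargs is hypothetical, and the keyword names passed to model.generate() (text, ref_audio, ref_text, instruct) are assumptions based on the names used in this log, not a confirmed signature.

```python
def build_generate_kwargs(text, ref_audio=None, ref_text=None, instruct=None):
    """Collect keyword arguments for a model.generate() call (hypothetical helper)."""
    kwargs = {"text": text}
    if ref_audio is not None:
        # Voice cloning: pass the reference clip and, if given, its transcript.
        kwargs["ref_audio"] = ref_audio
        if ref_text:
            kwargs["ref_text"] = ref_text
    if instruct and instruct.strip():
        # Forward instruct even when ref_audio is set, instead of dropping it.
        kwargs["instruct"] = instruct.strip()
    return kwargs
```

A blank or whitespace-only instruct is left out, so plain cloning calls are unchanged.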
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Language: ~170 world languages with type-to-filter dropdown
Accent: 50+ regional varieties grouped by area
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OmniVoiceVoiceDesign: structured dropdowns for gender/age/pitch/accent
that compose into an instruct string — wire to Generate's instruct input.
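
A minimal sketch of how dropdown values might compose into an instruct string. The function name, the "none" placeholder, and the example values are all hypothetical; the real node's vocabulary and separator may differ.

```python
def compose_instruct(gender="", age="", pitch="", accent="", separator=", "):
    """Join the selected dropdown values into one instruct string (hypothetical helper)."""
    cleaned = [p.strip() for p in (gender, age, pitch, accent)]
    # Skip dropdowns left empty or set to the placeholder "none".
    chosen = [p for p in cleaned if p and p.lower() != "none"]
    return separator.join(chosen)
```

For example, compose_instruct("female", "none", "low pitch", "british accent") skips the unset age field and joins the remaining three values.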
OmniVoiceGenerate: new optional language dropdown (auto + 11 languages)
and guidance_scale (CFG, default 2.0) parameters.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Compiles the model graph on first generation (~30-60s warmup) then
speeds up all subsequent generations in the session. Recommended for
audiobook pipelines. Default off.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Catching bare Exception was silently swallowing real resampling errors.
Only ImportError should trigger the interpolate fallback.
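
The narrowed exception handling can be sketched as follows. The names resample_with_fallback and linear_interpolate are hypothetical, and plain Python lists stand in for audio tensors; the actual code resamples with torchaudio.

```python
def resample_with_fallback(samples, orig_sr, target_sr, primary, fallback):
    """Try the primary resampler; fall back only when its dependency is missing."""
    try:
        return primary(samples, orig_sr, target_sr)
    except ImportError:
        # Only a missing dependency selects the fallback path.
        # Any other exception (bad shape, bad rate) propagates to the caller.
        return fallback(samples, orig_sr, target_sr)


def linear_interpolate(samples, orig_sr, target_sr):
    """Pure-Python linear interpolation, standing in for the interpolate fallback."""
    if not samples:
        return []
    n_out = max(1, round(len(samples) * target_sr / orig_sr))
    if n_out == 1 or len(samples) == 1:
        return [samples[0]] * n_out
    step = (len(samples) - 1) / (n_out - 1)
    out = []
    for i in range(n_out):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```

With a bare `except Exception`, a genuine resampling bug would silently produce interpolated audio; here it surfaces as the original error.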
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- _resample: squeeze batch dim before torchaudio.Resample (expected 2D)
- weight scaling: each clip now trims to natural_length*weight samples,
dropping the broken target_per_unit double-multiplication
- empty trimmed guard: raise clear error when all weights are 0
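
A rough sketch of the weight trimming and the empty guard, using Python lists in place of audio tensors. The function name trim_by_weight and the clamping of weights to [0, 1] are assumptions for illustration.

```python
def trim_by_weight(clips, weights):
    """Trim each clip to natural_length * weight samples; drop clips that end up empty."""
    if len(clips) != len(weights):
        raise ValueError("clips and weights must have the same length")
    trimmed = []
    for clip, weight in zip(clips, weights):
        # Assumption: clamp to [0, 1] so a clip is never extended past its natural length.
        w = min(max(float(weight), 0.0), 1.0)
        keep = int(len(clip) * w)
        if keep > 0:
            trimmed.append(clip[:keep])
    if not trimmed:
        # Guard: with every weight at 0 there is nothing left to combine.
        raise ValueError(
            "All clips were trimmed to zero length; set at least one weight above 0."
        )
    return trimmed
```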
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OmniVoice chunks long text internally; each chunk is a separate diffusion
pass with different random noise, causing voice drift between paragraphs.
Setting the same seed before each generate() call anchors the RNG state
and keeps the voice consistent. seed=0 means random (default behaviour).
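
The per-call re-seeding can be illustrated with a small sketch. It uses Python's random module and a fake generate function so it runs without the model; generate_chunks and fake_generate are hypothetical names.

```python
import random


def generate_chunks(chunks, generate_fn, seed=0, seed_fn=random.seed):
    """Call generate_fn once per chunk, re-seeding before each call when seed != 0."""
    outputs = []
    for chunk in chunks:
        if seed != 0:
            # Re-seed before every call so each chunk starts from the same RNG state.
            seed_fn(seed)
        outputs.append(generate_fn(chunk))
    return outputs


def fake_generate(chunk):
    """Stand-in for model.generate(): returns the 'noise' drawn for this chunk."""
    return [random.random() for _ in range(3)]
```

With a non-zero seed every chunk draws identical noise; with seed=0 the draws differ from chunk to chunk. In a PyTorch pipeline, seed_fn would typically be torch.manual_seed.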
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
get_text(separator=' ') collapsed all paragraphs into one line.
Now inserts \n\n at block-level element boundaries (p, h1-h6, div,
li, br, tr) before extraction, then normalises whitespace.
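
The idea can be sketched with the standard library's html.parser rather than the get_text() call the change itself modifies, so this is a swapped-in technique and not the actual implementation. BlockTextExtractor and html_to_text are hypothetical names, and script/style content is not filtered out.

```python
import re
from html.parser import HTMLParser

BLOCK_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6", "div", "li", "br", "tr"}


class BlockTextExtractor(HTMLParser):
    """Collect text, emitting a paragraph break at block-level element boundaries."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append("\n\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    parser = BlockTextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    text = re.sub(r"[ \t\r\f\v]+", " ", text)  # collapse runs of spaces and tabs
    text = re.sub(r" *\n *", "\n", text)       # trim spaces around newlines
    text = re.sub(r"\n{2,}", "\n\n", text)     # keep at most one blank line
    return text.strip()
```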
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Models always download to ComfyUI/models/omnivoice/ via HuggingFace.
Local path added unnecessary complexity; users who want a custom path
can symlink into the models directory.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Users should connect a ComfyUI Whisper node to ref_text instead of
relying on omnivoice's internal ASR. Removes the error-catch workaround
and updates the tooltip accordingly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
install.py: restore transformers>=5.0.0 (capping it would break other nodes).
generator.py: catch the torchcodec RuntimeError that fires when ref_text is
blank and transformers 5.x auto-transcription requires missing FFmpeg libs.
Raises a human-readable error telling the user to fill in ref_text manually.
Also updates the ref_text tooltip to recommend providing it explicitly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wrap `from omnivoice import OmniVoice` in a try/except ImportError, setting
OmniVoice=None when absent. Add a clear runtime ImportError in load_model()
so users get an actionable message. Allows `from nodes.loader import
OmniVoiceModelLoader` to succeed outside of pytest (where conftest.py injects
the mock) while keeping all 13 tests green.
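
A minimal sketch of the guarded import pattern. The helper name require_omnivoice and the error wording are hypothetical; in the node, load_model() would perform this check before constructing the model.

```python
try:
    from omnivoice import OmniVoice  # real import when the package is installed
except ImportError:
    OmniVoice = None  # defer the failure until a model is actually requested


def require_omnivoice():
    """Return the OmniVoice class, or raise an actionable ImportError if it is absent."""
    if OmniVoice is None:
        raise ImportError(
            "The 'omnivoice' package is not installed. "
            "Install it in the ComfyUI Python environment and restart."
        )
    return OmniVoice
```

Importing the module now succeeds either way; the error only appears when a model is requested.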
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove OmniVoice = None fallback in nodes/loader.py so missing omnivoice
gives a clear ImportError instead of a confusing AttributeError. Restore
__init__.py to clean form without the try/except that silently swallowed
real import errors. Add omnivoice mock to conftest.py and register a
pytest plugin that prevents pytest from treating the project root as a
Package node (which would try to import __init__.py outside a package
context and fail on the relative import).
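
The conftest.py mock could be set up along these lines. This is a sketch only: install_omnivoice_mock is a hypothetical name, and the pytest plugin that handles the project-root Package node is not shown.

```python
import sys
import types
from unittest.mock import MagicMock


def install_omnivoice_mock():
    """Register a stand-in 'omnivoice' module unless one is already loaded."""
    if "omnivoice" not in sys.modules:
        fake = types.ModuleType("omnivoice")
        fake.OmniVoice = MagicMock(name="OmniVoice")
        sys.modules["omnivoice"] = fake
    return sys.modules["omnivoice"]


# Run at import time so the mock exists before any test module imports the nodes.
install_omnivoice_mock()
```

Because the entry lives in sys.modules, a later `from omnivoice import OmniVoice` in the code under test resolves to the mock without the real package being installed.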
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements OmniVoiceModelLoader with INPUT_TYPES, RETURN_TYPES, and
load_model supporting both HuggingFace auto-download and local path
sources. Adds TDD test suite and pytest infrastructure (conftest.py,
pytest.ini) to enable testing outside ComfyUI without omnivoice installed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>