- Generate: language dropdown (auto/English/Chinese), passed only in
voice_design and auto_voice modes, where it selects the instruct vocabulary
- VoiceDesign: Chinese mode with dialect/age/pitch/gender dropdowns
using the model's validated Chinese instruct vocabulary (full-width comma, ，)
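A minimal sketch of the language forwarding described above. The helper
name build_generate_kwargs and the text keyword are hypothetical, and
treating "auto" as "omit the argument" is an assumption rather than
confirmed node behaviour:

```python
# Hypothetical helper, not the node's actual code.
LANGUAGE_MODES = ("voice_design", "auto_voice")  # mode names taken from this commit message


def build_generate_kwargs(mode, text, language="auto"):
    """Assemble keyword arguments for a generate call (illustrative only)."""
    kwargs = {"text": text}
    # Forward language only in the modes where it selects the instruct vocabulary.
    # Assumption: "auto" means the argument is left out entirely.
    if mode in LANGUAGE_MODES and language != "auto":
        kwargs["language"] = language
    return kwargs
```

In any other mode the dropdown value is ignored, so the call looks the
same as it did before the dropdown existed.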
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The model's _resolve_instruct() validates against a fixed vocabulary.
Only 10 accents are supported, so all unsupported additions were removed.
Updated tooltip to reflect actual constraints.
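For illustration, a fixed-vocabulary check could look roughly like the
sketch below. This is not the model's _resolve_instruct(); the function
name find_unsupported and the vocabulary entries are placeholders, not
the model's real term list:

```python
# Placeholder terms for illustration only; the real vocabulary lives in the model.
ILLUSTRATIVE_VOCAB = {"female", "male", "low pitch", "high pitch"}


def find_unsupported(instruct, vocab=ILLUSTRATIVE_VOCAB):
    """Return the comma-separated terms of `instruct` that are not in `vocab`."""
    # Accept both the ASCII comma and the full-width comma as separators.
    terms = [t.strip().lower() for t in instruct.replace("，", ",").split(",")]
    return [t for t in terms if t and t not in vocab]
```

A dropdown that only offers entries from the validated vocabulary never
produces a non-empty result here, which is the point of removing the
unsupported accents.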
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
If instruct is set alongside ref_audio, it is now forwarded to
model.generate(), allowing accent/style transfer on top of the
cloned voice identity. The model may or may not honour both simultaneously.
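A rough sketch of the forwarding, assuming a hypothetical helper
build_clone_kwargs and a text keyword; only ref_audio, ref_text and
instruct are names taken from these commits:

```python
# Hypothetical helper, not the node's actual code.
def build_clone_kwargs(text, ref_audio, ref_text=None, instruct=None):
    """Collect arguments for a voice-cloning generate call (illustrative only)."""
    kwargs = {"text": text, "ref_audio": ref_audio}
    if ref_text:
        kwargs["ref_text"] = ref_text
    # Forward instruct only when the user actually supplied one.
    if instruct and instruct.strip():
        kwargs["instruct"] = instruct.strip()
    return kwargs
```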
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Language: ~170 world languages with type-to-filter dropdown
Accent: 50+ regional varieties grouped by area
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OmniVoiceVoiceDesign: structured dropdowns for gender/age/pitch/accent
that compose into an instruct string; wire its output to Generate's instruct input.
OmniVoiceGenerate: new optional language dropdown (auto + 11 languages)
and guidance_scale (CFG, default 2.0) parameters.
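The composition step can be sketched as below. The function name
compose_instruct, the separator default, and the rule that empty
selections are skipped are all assumptions for illustration:

```python
# Hypothetical helper, not the node's actual code.
def compose_instruct(gender="", age="", pitch="", accent="", sep=", "):
    """Join the non-empty dropdown selections into one instruct string."""
    parts = [p.strip() for p in (gender, age, pitch, accent)]
    return sep.join(p for p in parts if p)
```

Passing sep="，" would give the full-width-comma form used by the
Chinese vocabulary.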
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OmniVoice chunks long text internally; each chunk is a separate diffusion
pass with different random noise, causing voice drift between paragraphs.
Setting the same seed before each generate() call anchors the RNG state
and keeps the voice consistent. seed=0 means random (default behaviour).
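A minimal sketch of the re-seeding idea. It uses the standard library's
random module as a stand-in so it runs anywhere; with the real model the
equivalent call would presumably be torch.manual_seed(seed). The helper
generate_chunks and the generate callable are hypothetical:

```python
import random


def generate_chunks(chunks, generate, seed=0):
    """Call `generate` once per chunk, resetting the RNG before each call."""
    outputs = []
    for chunk in chunks:
        if seed != 0:
            # Same seed before every call, so each pass starts from the same
            # RNG state. seed=0 leaves the RNG alone (random behaviour).
            random.seed(seed)
        outputs.append(generate(chunk))
    return outputs
```

With a non-zero seed, a generate function that draws its noise from the
seeded RNG sees identical noise for every chunk, which is what keeps the
voice from drifting.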
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Users should connect a ComfyUI Whisper node to ref_text instead of
relying on omnivoice's internal ASR. Removes the error-catch workaround
and updates the tooltip accordingly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
install.py: restore transformers>=5.0.0 (capping it would break other nodes).
generator.py: catch the torchcodec RuntimeError that fires when ref_text is
blank and transformers 5.x auto-transcription requires missing FFmpeg libs.
Raises a human-readable error telling the user to fill in ref_text manually.
Also updates the ref_text tooltip to recommend providing it explicitly.
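The error handling can be sketched roughly as follows. The wrapper name
generate_with_ref, the generate callable, and matching on the substring
"torchcodec" in the exception message are assumptions, not the actual
generator.py code:

```python
# Hypothetical wrapper, not the actual generator.py code.
def generate_with_ref(generate, ref_text="", **kwargs):
    """Run `generate`, translating the auto-transcription failure into a clear error."""
    try:
        return generate(ref_text=ref_text, **kwargs)
    except RuntimeError as exc:
        # Assumption: the failure can be recognised by "torchcodec" in its message.
        if not ref_text.strip() and "torchcodec" in str(exc).lower():
            raise RuntimeError(
                "Automatic transcription of the reference audio failed "
                "(FFmpeg libraries missing). Please fill in ref_text manually."
            ) from exc
        raise  # unrelated errors pass through unchanged
```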
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>