8 Commits

Author SHA1 Message Date
Ethanfel 95cf706b19 feat: add multi-speaker generation with JS-powered dynamic slots
- Add OmniVoiceSpeaker node (label + ref_audio + ref_text → OMNIVOICE_SPEAKER)
- Add OmniVoiceSpeakers node (roster with dynamic speaker_N inputs driven by
  num_speakers INT widget; slots expand/collapse via ComfyUI JS extension)
- Add web/multi_speaker.js: ComfyUI extension that hooks onNodeCreated and
  onConfigure to sync speaker_N inputs in real time (max 8 speakers)
- Extend OmniVoiceGenerate with optional speakers (OMNIVOICE_SPEAKERS) input;
  when connected it routes each paragraph to the assigned speaker and
  concatenates the results — supports alternate_paragraphs and tagged_speakers modes
- Remove OmniVoiceMultiSpeakerGenerate (generation now lives in the existing
  Generate node)
- Refactor generator.py: extract _write_tmp_wav helper, add _tensors_to_audio

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 09:08:23 +02:00
Ethanfel c1558efad9 feat: add Voice Design node + language and guidance_scale to Generate
OmniVoiceVoiceDesign: structured dropdowns for gender/age/pitch/accent
that compose into an instruct string — wire to Generate's instruct input.

OmniVoiceGenerate: new optional language dropdown (auto + 11 languages)
and guidance_scale (CFG, default 2.0) parameters.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 20:02:06 +02:00
Ethanfel c7c7123068 feat: add OmniVoice Mix Voices node for blended speaker cloning
Concatenates 2-3 reference audio clips (with per-voice duration weights)
to create a blended speaker embedding. Merges transcripts for ref_text.
Handles mismatched sample rates and mono conversion automatically.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 19:03:43 +02:00
Ethanfel 8de201a4c9 Add OmniVoice Voice Preset node with two female voice samples
Two built-in presets, auto-downloaded and cached to ComfyUI/models/omnivoice/presets/:
- "Nature – female, warm" (F5-TTS basic_ref_en.wav, transcript included)
- "Shadowheart – female, expressive" (Chatterbox demo, connect Whisper for transcript)

Outputs ref_audio (AUDIO) and ref_text (STRING) — wire directly into
OmniVoice Generate. Updated default workflow to use this node.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 18:19:29 +02:00
Ethanfel 5dfaa0b300 Replace torchaudio.save with soundfile.write; add EPUB loader node
- nodes/generator.py: swap torchaudio.save for soundfile.write to avoid
  torchcodec/FFmpeg dependency crash in environments without FFmpeg shared libs
- nodes/epub_loader.py: new OmniVoiceEpubLoader node for loading EPUB chapters
- tests/test_epub_loader.py: 8 tests for the EPUB loader
- install.py: add beautifulsoup4 to runtime deps
- __init__.py, nodes/__init__.py: register OmniVoiceEpubLoader

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 17:24:18 +02:00
Ethanfel 11beba1c47 fix: clean up omnivoice import guard and __init__ error masking
Remove OmniVoice = None fallback in nodes/loader.py so missing omnivoice
gives a clear ImportError instead of a confusing AttributeError. Restore
__init__.py to clean form without the try/except that silently swallowed
real import errors. Add omnivoice mock to conftest.py and register a
pytest plugin that prevents pytest from treating the project root as a
Package node (which would try to import __init__.py outside a package
context and fail on the relative import).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 09:03:34 +02:00
Ethanfel 069169485d feat: add OmniVoiceModelLoader node
Implements OmniVoiceModelLoader with INPUT_TYPES, RETURN_TYPES, and
load_model supporting both HuggingFace auto-download and local path
sources. Adds TDD test suite and pytest infrastructure (conftest.py,
pytest.ini) to enable testing outside ComfyUI without omnivoice installed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 08:52:26 +02:00
Ethanfel 0ed43a83ca feat: scaffold ComfyUI-Omnivoice node package 2026-04-05 08:43:17 +02:00