Talking-head Models¶
This page is the selection overview for talking-head backends. OpenTalking owns session orchestration, TTS, events, and WebRTC; model weight loading, GPU/NPU scheduling, and inference throughput belong to the selected backend.
Recommended Paths¶
| Model | Backend | Best for | Evidence level | Details |
|---|---|---|---|---|
mock |
mock |
First run, CI, API/WebRTC debugging | Built in, verified | Mock |
wav2lip |
local / omnirt |
First real lip-sync model | Local adapter is built in; OmniRT path verified | Local / OmniRT |
musetalk |
local / omnirt / direct_ws |
MuseTalk quality with either in-process startup or an external service | Local adapter is built in; OmniRT/direct_ws paths documented | Local / OmniRT |
quicktalk |
local / omnirt |
Local realtime adapter and service deployment reference | Local is built in; OmniRT path documented | Local / Apple Silicon / OmniRT |
fasterliveportrait |
omnirt |
Single-GPU realtime audio-driven portrait with pasteback | Documented | FasterLivePortrait |
flashtalk |
omnirt |
High-quality private GPU/NPU deployment | OmniRT/Ascend path verified | FlashTalk |
flashhead |
direct_ws |
Existing standalone FlashHead service | Documented | FlashHead |
Backend Behavior¶
| Backend | What OpenTalking expects | Typical models |
|---|---|---|
mock |
No external runtime; always available. | mock |
local |
Adapter can be imported in-process and dependencies are satisfied. | wav2lip, quicktalk, musetalk |
direct_ws |
The model service exposes its own WebSocket URL. | flashhead, custom single-model services |
omnirt |
OmniRT exposes /v1/audio2video/{model}. |
wav2lip, musetalk, fasterliveportrait, flashtalk |
Common Setup¶
export DIGITAL_HUMAN_HOME="$HOME/digital-human"
export OMNIRT_MODEL_ROOT="$DIGITAL_HUMAN_HOME/models"
export OPENTALKING_HOME="${OPENTALKING_HOME:-$DIGITAL_HUMAN_HOME/opentalking}"
export OMNIRT_HOME="${OMNIRT_HOME:-$DIGITAL_HUMAN_HOME/omnirt}"
export FASTERLIVEPORTRAIT_HOME="${FASTERLIVEPORTRAIT_HOME:-$DIGITAL_HUMAN_HOME/FasterLivePortrait}"
mkdir -p "$DIGITAL_HUMAN_HOME" "$OMNIRT_MODEL_ROOT"
cd "$DIGITAL_HUMAN_HOME"
Recommended layout:
$DIGITAL_HUMAN_HOME/
├── opentalking/
├── omnirt/ # Optional, only for backend: omnirt
├── models/
│ ├── wav2lip/
│ ├── SoulX-FlashTalk-14B/
│ ├── chinese-wav2vec2-base/
│ ├── quicktalk/
│ └── FasterLivePortrait/
├── logs/
└── run/
Download tools:
Common model sources:
Common Startup Combinations¶
The commands below only use existing repository entrypoints. No additional scripts are required.
OpenTalking local: QuickTalk and Wav2Lip in one frontend¶
In the default configuration, wav2lip already uses the local backend. The command
below only overrides quicktalk to local, so the same frontend can select both
quicktalk and wav2lip:
cd "$OPENTALKING_HOME"
uv sync --extra dev --extra models --python 3.11
export OPENTALKING_TORCH_DEVICE=cuda:0
export OPENTALKING_QUICKTALK_ASSET_ROOT="$OPENTALKING_HOME/models/quicktalk"
export OPENTALKING_QUICKTALK_WORKER_CACHE=1
export OPENTALKING_WAV2LIP_MODEL_ROOT="$OPENTALKING_HOME/models/wav2lip"
export OPENTALKING_WAV2LIP_DEVICE=cuda
export OPENTALKING_WAV2LIP_BATCH_SIZE=16
export OPENTALKING_WAV2LIP_MAX_LONG_EDGE=832
export OPENTALKING_WAV2LIP_FACE_DET_DEVICE=cpu
bash scripts/start_unified.sh --backend local --model quicktalk --api-port 8210 --web-port 5280
OmniRT: QuickTalk and Wav2Lip behind one endpoint¶
OpenTalking configures a single OMNIRT_ENDPOINT. To use both quicktalk and
wav2lip through OmniRT from the same frontend, enable both runtimes in the same
OmniRT process:
cd "$OMNIRT_HOME"
uv sync --extra server --extra wav2lip-cuda --extra quicktalk-cuda --python 3.11
source .venv/bin/activate
export OMNIRT_MODEL_ROOT="$DIGITAL_HUMAN_HOME/models"
export OMNIRT_ALLOWED_FRAME_ROOTS="$OPENTALKING_HOME/examples/avatars"
export OMNIRT_WAV2LIP_RUNTIME=1
export OMNIRT_WAV2LIP_MODELS_DIR="$OMNIRT_MODEL_ROOT"
export OMNIRT_WAV2LIP_CHECKPOINT="$OMNIRT_MODEL_ROOT/wav2lip/wav2lip384.pth"
export OMNIRT_WAV2LIP_DEVICE=cuda
export OMNIRT_WAV2LIP_FACE_DET_DEVICE=cpu
export OMNIRT_WAV2LIP_BATCH_SIZE=16
export OMNIRT_WAV2LIP_MAX_LONG_EDGE=832
export OMNIRT_WAV2LIP_PRELOAD=1
export OMNIRT_QUICKTALK_RUNTIME=1
export OMNIRT_QUICKTALK_MODEL_ROOT="$OMNIRT_MODEL_ROOT/quicktalk"
export OMNIRT_QUICKTALK_CHECKPOINT="$OMNIRT_MODEL_ROOT/quicktalk/checkpoints/quicktalk.pth"
export OMNIRT_QUICKTALK_DEVICE=cuda:0
export OMNIRT_QUICKTALK_HUBERT_DEVICE=cuda:0
export OMNIRT_QUICKTALK_MAX_LONG_EDGE=900
export OMNIRT_QUICKTALK_MAX_TEMPLATE_SECONDS=1
omnirt serve-avatar-ws --host 0.0.0.0 --port 9000 --backend cuda
Then start OpenTalking from another terminal. In the default configuration,
quicktalk already uses the omnirt backend. The command below only overrides
wav2lip to omnirt:
cd "$OPENTALKING_HOME"
bash scripts/start_unified.sh \
--backend omnirt \
--model wav2lip \
--omnirt http://127.0.0.1:9000 \
--api-port 8310 \
--web-port 5380
Common Verification¶
For OmniRT-backed models:
Common Status Values¶
| Status | Meaning | Action |
|---|---|---|
connected=true |
The backend is usable for sessions. | Choose a matching avatar and model in the browser. |
reason=not_configured |
Endpoint or WebSocket URL is empty. | Configure OMNIRT_ENDPOINT or the model-specific WS_URL. |
reason=omnirt_unavailable |
OmniRT reachability or model registration issue. | Check OmniRT /v1/audio2video/models, model list, and logs. |
reason=local_adapter_missing |
Configured as local, but no local adapter is registered. |
Switch backend or add a local adapter. |