MuseTalk
Support Status
| Item | Value |
|---|---|
| Model ID | musetalk |
| Backend | omnirt, direct_ws, or local |
| Evidence level | Local adapter is wired; local mode runs official MuseTalk preprocessing before session initialization |
| Best for | Teams that need MuseTalk quality while keeping startup orchestration in OpenTalking |
Recommended Hardware
A single GPU or a remote model service. In local mode, use a CUDA GPU; the
first session for an avatar also loads DWPose, face parsing, and the VAE for
official preprocessing.
Weights
Upstream sources:
- TMElyralab/MuseTalk
- MuseTalk on Hugging Face
- ModelScope search for MuseTalk
- Modelers search for MuseTalk
For local mode, place these weights under DIGITAL_HUMAN_HOME/models, or point
OPENTALKING_MUSETALK_MODEL_ROOT at an equivalent directory:
models/
  musetalk/
    musetalk.json
    pytorch_model.bin
  sd-vae-ft-mse/
    config.json
    diffusion_pytorch_model.bin
    diffusion_pytorch_model.safetensors
  whisper/
    tiny.pt
  dwpose/
    dw-ll_ucoco_384.pth
  face-parse-bisenet/
    79999_iter.pth
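The layout above can be checked before starting local mode. The following is a minimal sketch, not part of OpenTalking: the helper names `resolve_model_root` and `missing_weights` are hypothetical, reading `DIGITAL_HUMAN_HOME` from the environment is an assumption, and so is treating every listed file as required.

```python
import os
from pathlib import Path

# File list copied from the layout above. Treating every entry as required
# is an assumption of this sketch, not a documented OpenTalking rule.
EXPECTED_WEIGHTS = [
    "musetalk/musetalk.json",
    "musetalk/pytorch_model.bin",
    "sd-vae-ft-mse/config.json",
    "sd-vae-ft-mse/diffusion_pytorch_model.bin",
    "sd-vae-ft-mse/diffusion_pytorch_model.safetensors",
    "whisper/tiny.pt",
    "dwpose/dw-ll_ucoco_384.pth",
    "face-parse-bisenet/79999_iter.pth",
]


def resolve_model_root(env=None):
    """Return the weights directory.

    OPENTALKING_MUSETALK_MODEL_ROOT wins if set; otherwise fall back to
    DIGITAL_HUMAN_HOME/models (assumed here to be an environment variable).
    """
    env = os.environ if env is None else env
    override = env.get("OPENTALKING_MUSETALK_MODEL_ROOT")
    if override:
        return Path(override)
    home = env.get("DIGITAL_HUMAN_HOME")
    if not home:
        raise RuntimeError(
            "Set OPENTALKING_MUSETALK_MODEL_ROOT or DIGITAL_HUMAN_HOME"
        )
    return Path(home) / "models"


def missing_weights(model_root):
    """Return the expected weight files that are absent under model_root."""
    root = Path(model_root)
    return [rel for rel in EXPECTED_WEIGHTS if not (root / rel).is_file()]
```

Calling `missing_weights(resolve_model_root())` returns an empty list when every listed file is present, and otherwise the relative paths that still need to be downloaded.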
Directory Layout
In omnirt and direct_ws modes, the external service owns the MuseTalk runtime.
In local mode, OpenTalking loads the weights directly and needs the official
MuseTalk source checkout for avatar preprocessing:
DIGITAL_HUMAN_HOME/
  models/
  model-repos/
    MuseTalk/
      musetalk/utils/preprocessing.py
      musetalk/utils/blending.py
  runtimes/
    musetalk-preprocess/
      venv/bin/python
The runtimes/musetalk-preprocess/venv environment must contain the full
OpenMMLab stack, in particular a full mmcv build that provides the compiled
mmcv._ext extension; mmcv-lite is not enough for official preprocessing. The
main OpenTalking .venv may still use mmcv-lite for local MuseTalk realtime
inference. Official preprocessing runs under the interpreter set in
OPENTALKING_MUSETALK_PREPROCESS_PYTHON, or under the default Python path shown
above when that variable is unset.
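Whether the preprocessing interpreter has a full mmcv build can be probed by asking it to import `mmcv._ext`. The sketch below is illustrative only: `resolve_preprocess_python` and `can_import` are hypothetical helpers, and reading `DIGITAL_HUMAN_HOME` from the environment to build the default path is an assumption.

```python
import os
import subprocess
from pathlib import Path


def resolve_preprocess_python(env=None):
    """Return the interpreter used for official preprocessing.

    OPENTALKING_MUSETALK_PREPROCESS_PYTHON wins if set; otherwise use the
    default location from the directory layout. Taking DIGITAL_HUMAN_HOME
    from the environment is an assumption of this sketch.
    """
    env = os.environ if env is None else env
    override = env.get("OPENTALKING_MUSETALK_PREPROCESS_PYTHON")
    if override:
        return Path(override)
    home = Path(env.get("DIGITAL_HUMAN_HOME", "."))
    return home / "runtimes" / "musetalk-preprocess" / "venv" / "bin" / "python"


def can_import(python, module):
    """Return True if the given interpreter can import the given module."""
    try:
        result = subprocess.run(
            [str(python), "-c", f"import {module}"],
            capture_output=True,
            text=True,
            timeout=120,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Interpreter missing, not executable, or the import hung.
        return False
    return result.returncode == 0
```

If `can_import(resolve_preprocess_python(), "mmcv._ext")` returns `False`, the environment most likely has mmcv-lite or no mmcv at all, which matches the `No module named 'mmcv._ext'` symptom listed under Common Errors.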
Configuration
OmniRT path:
Local path:
Start
Point OpenTalking at an OmniRT service that exposes MuseTalk:
Local mode:
bash scripts/start_unified.sh --backend local --model musetalk --api-port 18000 --web-port 18173 --host 0.0.0.0
The command checks the local MuseTalk inference dependencies. When a user
enters a conversation and creates a session, OpenTalking checks the selected
avatar. If prepared/prepared_info.json is missing, or it does not record
source_preprocess=musetalk_official, OpenTalking first runs official MuseTalk
preprocessing, writes the assets to the avatar's prepared/ directory, and then
loads the session.
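The decision described above can be sketched as a small check. This is not OpenTalking's implementation: the function name is hypothetical, and it assumes that prepared_info.json is a JSON object with a top-level `source_preprocess` key. Treating an unreadable or malformed file as stale is also an assumption.

```python
import json
from pathlib import Path


def needs_official_preprocessing(avatar_dir):
    """Return True if the avatar lacks assets from official preprocessing."""
    info_path = Path(avatar_dir) / "prepared" / "prepared_info.json"
    if not info_path.is_file():
        return True
    try:
        info = json.loads(info_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        # Assumption: an unreadable or malformed file counts as stale.
        return True
    if not isinstance(info, dict):
        return True
    # Assumed key name, taken from source_preprocess=musetalk_official above.
    return info.get("source_preprocess") != "musetalk_official"
```

Running this against an avatar directory before opening a session gives a quick indication of whether the first session will pay the one-time preprocessing cost.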
/models Verification
When OmniRT or the local runtime provides the model, /models should report
connected=true for musetalk. For local mode:
Common Errors
| Symptom | Action |
|---|---|
| reason=omnirt_unavailable | Check that OmniRT reports /v1/audio2video/musetalk. |
| No module named 'mmcv._ext' | The preprocessing Python lacks full OpenMMLab dependencies; use an OPENTALKING_MUSETALK_PREPROCESS_PYTHON environment with full mmcv. |
| Session fails during preprocessing | Check that OPENTALKING_MUSETALK_REPO points to the official MuseTalk source and that dwpose and face-parse-bisenet weights exist. |
| Avatar mismatch | Use an avatar with model_type: musetalk. |