QuickTalk OmniRT Deployment
OmniRT mode runs QuickTalk inference outside the OpenTalking process. Use it when multiple models share one service endpoint, GPU dependencies need isolation, or inference runs on a separate machine.
Use Cases
- OpenTalking owns sessions, TTS, and WebRTC while QuickTalk is served externally.
- One OmniRT endpoint needs to expose quicktalk, wav2lip, and other models.
- Web-service resources and inference GPU resources need separate scaling.
Weight Preparation
OmniRT reads $OMNIRT_MODEL_ROOT/quicktalk by default:
Terminal
export DIGITAL_HUMAN_HOME="$HOME/digital-human"
export OMNIRT_MODEL_ROOT="$DIGITAL_HUMAN_HOME/models"
mkdir -p "$OMNIRT_MODEL_ROOT/quicktalk/checkpoints"
uv pip install -U "huggingface_hub[cli]"
export HF_ENDPOINT="${HF_ENDPOINT:-https://hf-mirror.com}"
hf download datascale-ai/quicktalk \
quicktalk.pth \
repair.npy \
chinese-hubert-large/config.json \
chinese-hubert-large/preprocessor_config.json \
chinese-hubert-large/pytorch_model.bin \
--local-dir "$OMNIRT_MODEL_ROOT/quicktalk/checkpoints"
Confirm that quicktalk.pth, repair.npy, the HuBERT files (chinese-hubert-large), and InsightFace buffalo_l all exist under the QuickTalk model directory, $OMNIRT_MODEL_ROOT/quicktalk. Prepare InsightFace as shown in Local.
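The confirmation step can be scripted. The following is a minimal sketch, assuming the files sit exactly where the hf download command above placed them. check_quicktalk_weights is a hypothetical helper name, not an OmniRT command, and InsightFace buffalo_l is left out because its location is not given in this section.

```shell
# Sketch: report which of the downloaded QuickTalk files are missing or empty.
# check_quicktalk_weights is a hypothetical helper, not part of OmniRT.
check_quicktalk_weights() {
  local ckpt_dir="$1" missing=0 rel
  for rel in \
    quicktalk.pth \
    repair.npy \
    chinese-hubert-large/config.json \
    chinese-hubert-large/preprocessor_config.json \
    chinese-hubert-large/pytorch_model.bin
  do
    # -s is true only for files that exist and are non-empty, so an
    # interrupted download that left a zero-byte file is also reported.
    if [ ! -s "$ckpt_dir/$rel" ]; then
      echo "missing or empty: $ckpt_dir/$rel" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A typical call would be check_quicktalk_weights "$OMNIRT_MODEL_ROOT/quicktalk/checkpoints" && echo "weights OK".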
Start Command
Start OmniRT first. The commands assume $OMNIRT_HOME points to your OmniRT checkout:
Terminal
cd "$OMNIRT_HOME"
uv sync --extra server --extra quicktalk-cuda --python 3.11
source .venv/bin/activate
export OMNIRT_QUICKTALK_RUNTIME=1
export OMNIRT_QUICKTALK_MODEL_ROOT="$OMNIRT_MODEL_ROOT/quicktalk"
export OMNIRT_QUICKTALK_CHECKPOINT="$OMNIRT_MODEL_ROOT/quicktalk/checkpoints/quicktalk.pth"
export OMNIRT_QUICKTALK_DEVICE=cuda:0
export OMNIRT_QUICKTALK_HUBERT_DEVICE=cuda:0
export OMNIRT_QUICKTALK_MAX_LONG_EDGE=900
export OMNIRT_QUICKTALK_MAX_TEMPLATE_SECONDS=1
omnirt serve-avatar-ws --host 0.0.0.0 --port 9000 --backend cuda
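Since OmniRT is started first, it can help to wait until its endpoint answers before launching OpenTalking. The sketch below polls a URL with curl. wait_for_http is a hypothetical helper, and the retry count and delay are arbitrary defaults rather than OmniRT settings.

```shell
# Sketch: poll a URL until it returns an HTTP success status or attempts run out.
# wait_for_http is a hypothetical helper; the defaults (30 tries, 2 s apart) are arbitrary.
wait_for_http() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; -s silences progress output.
    if curl -fs --max-time 5 -o /dev/null "$url"; then
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "gave up on $url after $attempts attempts" >&2
  return 1
}
```

For example, from a second terminal: wait_for_http http://127.0.0.1:9000/v1/audio2video/models && echo "OmniRT is up".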
Then start OpenTalking:
Terminal
cd "$DIGITAL_HUMAN_HOME/opentalking"
bash scripts/start_unified.sh \
--backend omnirt \
--model quicktalk \
--omnirt http://127.0.0.1:9000 \
--api-port 8310 \
--web-port 5380
Verification
Terminal
curl -fsS http://127.0.0.1:9000/v1/audio2video/models | jq
curl -fsS http://127.0.0.1:8310/models | jq '.statuses[] | select(.id=="quicktalk")'
OpenTalking should report backend=omnirt and connected=true.
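For scripted checks, the OpenTalking response can be tested rather than read by eye. This is a sketch that assumes the status entries expose fields named backend and connected. Those field names are inferred from the expected values above and are not confirmed. quicktalk_status_ok is a hypothetical helper.

```shell
# Sketch: read the /models JSON on stdin and succeed only when the quicktalk
# entry reports backend "omnirt" and connected true.
# Assumption: the fields are named .backend and .connected.
quicktalk_status_ok() {
  # any(generator; condition) emits true or false; jq -e turns false into a
  # non-zero exit status. The trailing ? tolerates a missing .statuses array.
  jq -e 'any(.statuses[]?;
             .id == "quicktalk" and .backend == "omnirt" and .connected == true)' \
    >/dev/null
}
```

Usage: curl -fsS http://127.0.0.1:8310/models | quicktalk_status_ok && echo "quicktalk is served through OmniRT".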
Common Errors
| Symptom | Action |
|---|---|
| reason=omnirt_unavailable | Check the OmniRT port, OMNIRT_ENDPOINT, and /v1/audio2video/models. |
| OmniRT does not list quicktalk | Check OMNIRT_QUICKTALK_RUNTIME=1, checkpoint paths, and startup logs. |
| Slow first frame or high VRAM | Tune OMNIRT_QUICKTALK_MAX_LONG_EDGE, HuBERT device, or prewarm strategy. |
| Avatar asset unavailable | Check that the selected avatar is uploaded, readable, and the session configuration is complete. |
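
When OmniRT does not list quicktalk, the first two checks in that row can be run in one step. This is a rough sketch that only inspects the current shell's environment, so it is meaningful only when run in the same shell, or with the same exports, used to start OmniRT. check_quicktalk_env is a hypothetical helper. The two variable names come from the start command above.

```shell
# Sketch: verify the runtime flag and checkpoint path exported before
# `omnirt serve-avatar-ws`. check_quicktalk_env is a hypothetical helper.
check_quicktalk_env() {
  local status=0
  if [ "${OMNIRT_QUICKTALK_RUNTIME:-}" != "1" ]; then
    echo "OMNIRT_QUICKTALK_RUNTIME is not set to 1" >&2
    status=1
  fi
  if [ ! -f "${OMNIRT_QUICKTALK_CHECKPOINT:-}" ]; then
    echo "checkpoint not found: ${OMNIRT_QUICKTALK_CHECKPOINT:-(unset)}" >&2
    status=1
  fi
  return "$status"
}
```

If the function returns non-zero, fix the reported variable before looking at the startup logs.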