Wav2Lip Local Deployment
Local mode uses OpenTalking's built-in Wav2Lip adapter. It is the lightest path for validating real lip sync and works well for single-GPU demos and avatar-asset checks.
Use Cases
- Take the first step from mock to a real talking-head model.
- Run inference inside the OpenTalking process without deploying OmniRT.
- Use built-in or custom shared avatars, and let the Wav2Lip flow consume reference images or frame assets as needed.
Weight Preparation
Terminal
cd "$DIGITAL_HUMAN_HOME/opentalking"
mkdir -p models/wav2lip
uv pip install -U "huggingface_hub[cli]"
export HF_ENDPOINT="${HF_ENDPOINT:-https://hf-mirror.com}"
hf download Pypa/wav2lip384 \
  wav2lip384.pth \
  --local-dir models/wav2lip
hf download rippertnt/wav2lip \
  s3fd.pth \
  --local-dir models/wav2lip
stat models/wav2lip/wav2lip384.pth
stat models/wav2lip/s3fd.pth
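An interrupted download can leave a missing or zero-byte file that only surfaces later as a startup error. The sketch below folds the two checks above into a single exit status. It assumes the two filenames used in this section; check_wav2lip_weights is a hypothetical helper, not part of OpenTalking.

```shell
# Hypothetical helper: succeed only if both weight files exist and are non-empty.
# Usage: check_wav2lip_weights <model_root>
check_wav2lip_weights() {
  cw_root="$1"
  cw_missing=0
  for cw_file in wav2lip384.pth s3fd.pth; do
    # -s is true only when the file exists and has a size greater than zero
    if [ ! -s "$cw_root/$cw_file" ]; then
      echo "missing or empty: $cw_root/$cw_file" >&2
      cw_missing=1
    fi
  done
  return "$cw_missing"
}

# Example: check_wav2lip_weights models/wav2lip && echo "weights OK"
```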
Start Command
Terminal
cd "$DIGITAL_HUMAN_HOME/opentalking"
uv sync --extra dev --extra models --python 3.11
export OPENTALKING_WAV2LIP_MODEL_ROOT="$DIGITAL_HUMAN_HOME/opentalking/models/wav2lip"
export OPENTALKING_WAV2LIP_DEVICE=cuda
export OPENTALKING_WAV2LIP_BATCH_SIZE=16
export OPENTALKING_WAV2LIP_MAX_LONG_EDGE=832
export OPENTALKING_WAV2LIP_FACE_DET_DEVICE=cpu
bash scripts/start_unified.sh --backend local --model wav2lip --api-port 8210 --web-port 5280
Open http://localhost:5280, choose an available avatar, and select the wav2lip model.
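If the start script is launched from a different shell than the one where the variables were exported, the settings are silently lost. A minimal preflight sketch follows. require_env is a hypothetical helper, and treating all five variables as mandatory is an assumption, since OpenTalking may fall back to defaults for some of them.

```shell
# Hypothetical helper: report every named variable that is not exported
# with a non-empty value in the current shell.
# Usage: require_env <NAME>...
require_env() {
  re_missing=0
  for re_name in "$@"; do
    # printenv only sees exported variables, which is what a child script inherits
    if [ -z "$(printenv "$re_name")" ]; then
      echo "not exported or empty: $re_name" >&2
      re_missing=1
    fi
  done
  return "$re_missing"
}

# Example (variable list taken from the start command above):
# require_env OPENTALKING_WAV2LIP_MODEL_ROOT OPENTALKING_WAV2LIP_DEVICE \
#   OPENTALKING_WAV2LIP_BATCH_SIZE OPENTALKING_WAV2LIP_MAX_LONG_EDGE \
#   OPENTALKING_WAV2LIP_FACE_DET_DEVICE
```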
Verification
Terminal
curl -fsS http://127.0.0.1:8210/health
curl -s http://127.0.0.1:8210/models | jq '.statuses[] | select(.id=="wav2lip")'
Expect backend=local and connected=true. The first load initializes the checkpoint, S3FD, and avatar cache, which can take tens of seconds.
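A health check run immediately after startup may fail simply because the API is not listening yet. One way to handle this is a small retry loop, sketched below. wait_until is a hypothetical helper, and the attempt count and delay in the example are illustrative assumptions rather than values from OpenTalking.

```shell
# Hypothetical helper: rerun a command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay_seconds> <command> [args...]
wait_until() {
  wu_attempts="$1"
  wu_delay="$2"
  shift 2
  wu_try=1
  while :; do
    # Discard the probe's output; only its exit status matters here
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    if [ "$wu_try" -ge "$wu_attempts" ]; then
      return 1
    fi
    wu_try=$((wu_try + 1))
    sleep "$wu_delay"
  done
}

# Example: poll the health endpoint for up to about a minute.
# wait_until 30 2 curl -fsS http://127.0.0.1:8210/health && echo "API is up"
```

A passing health check only shows that the API answers. The /models query above is still the place to confirm backend=local and connected=true.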
Common Errors
| Symptom | Action |
|---|---|
| Checkpoint not found | Check that OPENTALKING_WAV2LIP_MODEL_ROOT points to the directory containing wav2lip384.pth and s3fd.pth. |
| Out of GPU memory | Lower OPENTALKING_WAV2LIP_BATCH_SIZE or OPENTALKING_WAV2LIP_MAX_LONG_EDGE. |
| Slow first frame | Set OPENTALKING_PREWARM_AVATARS=singer for common avatars. |
| Enhancement mode fails | easy_enhanced requires GFPGAN and OPENTALKING_WAV2LIP_GFPGAN_CHECKPOINT. |
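
For the out-of-memory row, a reasonable first step is to halve the batch size and, if that is not enough, lower the long-edge cap as well. The values below are illustrative assumptions, not settings recommended by OpenTalking.

```shell
# Illustrative values only: half the batch size of 16 used in the start
# command, and a long-edge cap below the 832 used there.
export OPENTALKING_WAV2LIP_BATCH_SIZE=8
export OPENTALKING_WAV2LIP_MAX_LONG_EDGE=640
```

Rerun the start command from the same shell so the new values take effect.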