Install from source¶
A source installation provides the most flexibility and is required for development work and for Ascend NPU deployment. The procedure differs in detail depending on the target scenario; this page documents each variant.
If your environment matches the Docker-supported configurations, the Docker Compose installation is also a viable choice and may be operationally simpler.
Common steps¶
These steps are shared by all source installations.
1. Clone the repository¶
The repository assumes a parent directory that also contains the OmniRT checkout for deployments that use real talking-head models. The recommended layout:
$DIGITAL_HUMAN_HOME/
├── opentalking/
├── omnirt/
└── models/
    ├── wav2lip/
    ├── SoulX-FlashTalk-14B/
    └── chinese-wav2vec2-base/
Set the environment variable:
export DIGITAL_HUMAN_HOME=/opt/digital_human # or your preferred path
export OMNIRT_MODEL_ROOT="$DIGITAL_HUMAN_HOME/models"
2. Install Python dependencies¶
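The primary path is uv with the dev extra, matching the commands used in the Updates and Troubleshooting sections:
cd "$DIGITAL_HUMAN_HOME/opentalking"
uv sync --extra dev --python 3.11
source .venv/bin/activate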
The [dev] extra installs runtime dependencies plus ruff, pytest,
pytest-asyncio, pytest-cov, and related development tooling.
If you need the compatibility fallback instead:
python3 -m venv .venv
source .venv/bin/activate
pip install --index-url https://pypi.tuna.tsinghua.edu.cn/simple -e ".[dev]"
Notes:
- The lockfile is validated with Python 3.11.
- When PyAV resolves to a wheel, only a runtime ffmpeg is required.
- If you move to an unvalidated Python or PyAV combination and trigger a source build, you will also need ffmpeg 7, pkg-config, and a C compiler.
3. Install frontend dependencies¶
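Assuming the apps/web workspace referenced in the Updates section:
cd "$DIGITAL_HUMAN_HOME/opentalking/apps/web"
npm ci
cd ../..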
4. Configure environment¶
Edit .env and set the minimum required variables:
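The canonical variable names are listed in Configuration; the names below are hypothetical and serve only as an illustration of the shape of the file:
# Hypothetical names -- see Configuration for the canonical variables
OPENTALKING_LLM_API_KEY=sk-...
OPENTALKING_TTS_API_KEY=sk-...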
Both variables receive the same DashScope API key. The complete configuration reference is in Configuration.
5. Verify the installation¶
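A minimal smoke test, assuming the virtualenv from step 2 is active:
python -c "import opentalking"   # no output means the package resolved
ffmpeg -version | head -n 1      # confirms the runtime ffmpeg dependency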
Scenario: development with CPU and mock synthesis¶
For frontend development, API iteration, and orchestration changes on a workstation without GPU access.
Run the unified server¶
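Assuming the start_all.sh quickstart script used in the GPU scenarios also drives the mock path:
cd "$DIGITAL_HUMAN_HOME/opentalking"
bash scripts/quickstart/start_all.sh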
This launches the OpenTalking unified server on http://127.0.0.1:8000 and the Vite
development server on http://localhost:5173. The mock synthesis path returns
placeholder frames; no inference service is required.
For backend hot-reload during development:
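One plausible sketch, assuming the backend is an ASGI application; the module path opentalking.app:app is hypothetical:
uvicorn opentalking.app:app --reload --host 127.0.0.1 --port 8000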
The frontend is started separately:
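Assuming the standard Vite dev script in apps/web:
cd "$DIGITAL_HUMAN_HOME/opentalking/apps/web"
npm run dev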
System resources required:
- 1–2 GB of RAM for the unified process.
- No GPU.
- Network access to the configured language model and TTS endpoints.
Scenario: single GPU with Wav2Lip¶
For evaluation on a single NVIDIA 3090 or equivalent 24 GB GPU using the lightweight
wav2lip model.
Install OmniRT¶
OmniRT is the inference runtime. Clone it next to OpenTalking:
cd "$DIGITAL_HUMAN_HOME"
git clone https://github.com/datascale-ai/omnirt.git
cd omnirt
uv sync --extra server --python 3.11
Download model weights¶
mkdir -p "$OMNIRT_MODEL_ROOT/wav2lip"
# Place wav2lip384.pth and s3fd.pth at $OMNIRT_MODEL_ROOT/wav2lip/
# Refer to the OmniRT documentation for current download locations.
Start OmniRT¶
bash "$DIGITAL_HUMAN_HOME/opentalking/scripts/quickstart/start_omnirt_wav2lip.sh" --device cuda
The script handles dependency installation, environment variable setup, and starts
OmniRT on http://127.0.0.1:9000. The script writes the process ID to
$DIGITAL_HUMAN_HOME/run/omnirt-wav2lip.pid and logs to
$DIGITAL_HUMAN_HOME/logs/omnirt-wav2lip.log.
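To confirm the service came up, follow the log the script writes:
tail -f "$DIGITAL_HUMAN_HOME/logs/omnirt-wav2lip.log"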
Configure OpenTalking¶
Append to .env:
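The variable name below is hypothetical; Configuration lists the canonical one:
# Hypothetical name -- see Configuration for the canonical variable
OPENTALKING_OMNIRT_URL=http://127.0.0.1:9000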
Run OpenTalking¶
cd "$DIGITAL_HUMAN_HOME/opentalking"
bash scripts/quickstart/start_all.sh --omnirt http://127.0.0.1:9000
In the frontend, select the wav2lip model when creating a session.
System resources required:
- 1 NVIDIA GPU with 24 GB of VRAM.
- 16 GB of RAM.
- 5 GB of disk for the wav2lip checkpoints.
Scenario: single GPU with FlashTalk¶
For evaluation on a single NVIDIA 4090 or A100 using the SoulX FlashTalk-14B model.
The steps are identical to the wav2lip scenario, with the following changes:
bash "$DIGITAL_HUMAN_HOME/opentalking/scripts/quickstart/start_omnirt_flashtalk.sh" --device cuda
Model weights:
- SoulX-FlashTalk-14B/ (~28 GB) at $OMNIRT_MODEL_ROOT/.
- chinese-wav2vec2-base/ (~400 MB) at $OMNIRT_MODEL_ROOT/.
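Mirroring the wav2lip layout:
mkdir -p "$OMNIRT_MODEL_ROOT/SoulX-FlashTalk-14B" "$OMNIRT_MODEL_ROOT/chinese-wav2vec2-base"
# Refer to the OmniRT documentation for current download locations.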
System resources required:
- 1 NVIDIA GPU with at least 22 GB of free VRAM (4090 24 GB or A100 40 GB).
- 32 GB of RAM.
- 35 GB of disk for the FlashTalk checkpoints and the wav2vec2 base model.
Lower-VRAM configurations may be achieved by tuning the parameters documented in Configuration → FlashTalk rendering parameters.
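For instance, reducing the knobs named in Troubleshooting (the values here are illustrative only):
# Illustrative values -- tune per Configuration → FlashTalk rendering parameters
OPENTALKING_FLASHTALK_FRAME_NUM=8
OPENTALKING_FLASHTALK_SAMPLE_STEPS=4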
Scenario: Ascend 910B¶
For NPU production deployment. Requires CANN 8.0 or later.
Verify CANN installation¶
test -f /usr/local/Ascend/ascend-toolkit/set_env.sh && echo "CANN present"
source /usr/local/Ascend/ascend-toolkit/set_env.sh
npu-smi info
Install OpenTalking¶
Complete the OpenTalking installation from the common steps first, preferably with Python 3.11. If mainland-China PyPI mirrors are faster in your network, set them before installing:
export UV_INDEX_URL=https://pypi.tuna.tsinghua.edu.cn/simple
export PIP_INDEX_URL=https://pypi.tuna.tsinghua.edu.cn/simple
The OmniRT installation requires the NPU-specific PyTorch wheel; the deployment script handles that part.
Deploy¶
source /usr/local/Ascend/ascend-toolkit/set_env.sh
cd "$DIGITAL_HUMAN_HOME/opentalking"
bash scripts/deploy_ascend_910b.sh
The script:
- Sources the CANN environment file.
- Verifies the sibling omnirt/ checkout, the OmniRT virtualenv, and the wav2lip model directory.
- Configures the NPU-specific environment variables (OMNIRT_WAV2LIP_DEVICE=npu, OMNIRT_WAV2LIP_FACE_DET_DEVICE=cpu).
- Starts OmniRT via scripts/quickstart/start_omnirt_wav2lip.sh --device npu.
Supported models¶
| Model | Status on Ascend 910B |
|---|---|
| mock | Supported |
| wav2lip | Supported via OmniRT --backend ascend |
| flashtalk | Supported |
| musetalk | Not currently ported |
System resources required:
- 1 Ascend 910B card (Atlas 800T or equivalent server).
- CANN 8.0 or later.
- The torch-npu package, installed by the deployment script.
Scenario: API and Worker split¶
For production deployments that require horizontal Worker scaling or component isolation. The architecture and operational characteristics are documented in Deployment.
Prerequisites¶
In addition to the common installation:
- Redis 6 or later, reachable from both the API and Worker processes.
- A process manager (systemd, supervisor, Kubernetes Deployment).
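A quick reachability check from each host, using the standard Redis CLI:
redis-cli -h <redis-host> -p 6379 ping   # expect PONG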
Configure¶
The relevant environment variables (see Configuration §3):
OPENTALKING_REDIS_URL=redis://<redis-host>:6379/0
OPENTALKING_API_HOST=0.0.0.0
OPENTALKING_API_PORT=8000
OPENTALKING_WORKER_HOST=0.0.0.0
OPENTALKING_WORKER_PORT=9001
OPENTALKING_WORKER_URL=http://<worker-host>:9001
Run¶
The API and Worker processes are started separately:
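A sketch using hypothetical console scripts, modeled on the opentalking-unified entry point shown in the systemd example below; check the project scripts for the real commands:
# Hypothetical entry points
.venv/bin/opentalking-api      # binds OPENTALKING_API_HOST:OPENTALKING_API_PORT
.venv/bin/opentalking-worker   # binds OPENTALKING_WORKER_HOST:OPENTALKING_WORKER_PORT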
Multiple Worker processes may be started across hosts; each Worker subscribes to the same Redis bus.
Scenario: production deployment¶
For single-host production deployments using source installation:
- Install OpenTalking and OmniRT as described in the appropriate hardware scenario.
- Configure .env according to Configuration → Production deployment.
- Wrap the relevant commands in a process manager. An example systemd unit (the commands to enable it follow this list):
[Unit]
Description=OpenTalking unified server
After=network.target redis.service
Requires=redis.service
[Service]
Type=simple
User=opentalking
WorkingDirectory=/opt/digital_human/opentalking
EnvironmentFile=/opt/digital_human/opentalking/.env
ExecStart=/opt/digital_human/opentalking/.venv/bin/opentalking-unified
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
- Work through the production checklist documented in Deployment → Production checklist.
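Once the example unit is installed (for instance at /etc/systemd/system/opentalking.service, an assumed path), enable it the standard systemd way:
sudo systemctl daemon-reload
sudo systemctl enable --now opentalking.service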
Updates¶
To update an existing source installation:
cd "$DIGITAL_HUMAN_HOME/opentalking"
git pull
uv sync --extra dev --python 3.11
source .venv/bin/activate
cd apps/web && npm ci && cd ../..
Database schema migrations are applied automatically at process startup.
Troubleshooting¶
| Symptom | Resolution |
|---|---|
| ModuleNotFoundError: opentalking | Activate the virtual environment with source .venv/bin/activate or run uv sync --extra dev --python 3.11; the fallback is pip install --index-url https://pypi.tuna.tsinghua.edu.cn/simple -e ".[dev]". |
| ffmpeg: not found during TTS decoding | Install ffmpeg. macOS: brew install ffmpeg. Debian/Ubuntu: apt install ffmpeg. |
| torch.cuda.is_available() returns False | Verify the NVIDIA driver, the CUDA Toolkit, and that the installed torch package matches the CUDA version. |
| OmniRT exits with CUDA out of memory | Lower OPENTALKING_FLASHTALK_FRAME_NUM, OPENTALKING_FLASHTALK_SAMPLE_STEPS, or the output resolution. See Configuration → FlashTalk rendering parameters. |
| npu-smi: command not found | The CANN toolkit is not on the PATH. Source /usr/local/Ascend/ascend-toolkit/set_env.sh. |
| Port 8000 already in use | Override the bound port via --api-port on the start script or OPENTALKING_API_PORT in .env. |
Uninstallation¶
To remove a source installation:
cd "$DIGITAL_HUMAN_HOME"
rm -rf opentalking omnirt models
# Optional: remove the log and PID directories
rm -rf "$DIGITAL_HUMAN_HOME/logs" "$DIGITAL_HUMAN_HOME/run"
The local SQLite database referenced by OPENTALKING_SQLITE_PATH is also removed if
it resides under the repository.