Installation
OpenTalking ships two installation methods: from source and Docker Compose. The appropriate method depends on two questions: where the work happens (development, single-machine production, or multi-machine production) and what hardware is available (CPU only, NVIDIA GPU, or Ascend NPU).
This page presents the decision matrix and links to the detailed instructions. For a streamlined first-run procedure, see the Quickstart.
Choosing an installation method
| Use case | Hardware | Recommended method | Detailed guide |
|---|---|---|---|
| Local development, frontend changes, API iteration | Any | Source install + mock synthesis | From source |
| CPU evaluation | CPU | Source install + mock synthesis | From source |
| Evaluation on a single GPU machine | NVIDIA 3090 / 4090 / A100 (CUDA 12) | Source install + model-specific backend | From source → single GPU |
| Evaluation on Ascend NPU | Huawei 910B (CANN 8.0+) | Source install on the host CANN environment | From source → Ascend 910B |
| Continuous integration | CPU | Source install or Docker Compose, depending on reproducibility needs | From source or Docker Compose → CPU profile |
| Production single-host deployment | Linux + GPU or NPU | Source install or Docker, depending on operations preference | From source → Production or Docker Compose |
| Production multi-host deployment with horizontal Worker scaling | Linux + GPU or NPU | Source install, API/Worker split, external Redis | From source → API and Worker split; Deployment |
Platform support matrix
| Platform | Synthesis backends | Notes |
|---|---|---|
| macOS (Apple Silicon and Intel) | mock | Suitable for orchestration and frontend development. Real talking-head models are not supported on macOS. |
| Linux x86_64 + CUDA 12 | mock, wav2lip, musetalk, flashtalk, flashhead, quicktalk | Primary deployment target. |
| Linux aarch64 + Ascend 910B (CANN 8.0+) | mock, wav2lip, flashtalk | NPU production deployment path. |
| Windows | mock (WSL2 recommended) | Not part of the continuous integration matrix. |
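The support matrix above can also be expressed as a lookup table, which may be handy for a preflight check before selecting a synthesis backend. The following is a minimal sketch: the platform keys (`"macos"`, `"linux-x86_64-cuda12"`, and so on) and the function names are illustrative assumptions, not identifiers defined by OpenTalking.

```python
from __future__ import annotations

# Backends per platform, transcribed from the support matrix above.
# The dictionary keys are hypothetical labels chosen for this sketch.
SUPPORTED_BACKENDS: dict[str, frozenset[str]] = {
    "macos": frozenset({"mock"}),
    "linux-x86_64-cuda12": frozenset(
        {"mock", "wav2lip", "musetalk", "flashtalk", "flashhead", "quicktalk"}
    ),
    "linux-aarch64-ascend910b": frozenset({"mock", "wav2lip", "flashtalk"}),
    "windows": frozenset({"mock"}),
}


def is_supported(platform_key: str, backend: str) -> bool:
    """Return True if the matrix lists `backend` for `platform_key`."""
    return backend in SUPPORTED_BACKENDS.get(platform_key, frozenset())


def portable_backends() -> frozenset[str]:
    """Return the backends that every platform in the matrix supports."""
    return frozenset.intersection(*SUPPORTED_BACKENDS.values())
```

For example, `is_supported("linux-aarch64-ascend910b", "musetalk")` evaluates to `False`, and `portable_backends()` contains only `mock`, which is why the mock backend is the recommended choice for development on any machine.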
Common prerequisites
Independent of the installation method, the following components are required:
- A DashScope (Bailian) API key for the default language model (qwen-flash) and speech recognition (paraformer-realtime-v2). Other OpenAI-compatible endpoints may be used; see Configuration §1.
- WebRTC-compatible client. The bundled frontend has been tested against Chromium-based browsers. Safari requires additional CORS configuration.
Source-installation additional requirements:
- Python 3.10 or later (3.11 recommended).
- Node.js 18 or later for the frontend toolchain.
- ffmpeg for the text-to-speech decoding stage.
- Optionally Redis 6 or later for the API/Worker split deployment.
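The source-install requirements listed above can be checked before starting an installation. The sketch below is not part of OpenTalking; the function names are hypothetical, and it assumes `node` and `ffmpeg` are expected on `PATH`. Redis is not checked because it is optional and may run on another host.

```python
from __future__ import annotations

import re
import shutil
import subprocess
import sys


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a string such as 'v18.19.0' into (18, 19, 0) for comparison."""
    numbers = []
    for piece in text.strip().lstrip("vV").split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first component without a leading number
        numbers.append(int(match.group()))
    return tuple(numbers)


def check_source_prerequisites() -> dict[str, bool]:
    """Report which source-install prerequisites are satisfied locally."""
    results = {"python>=3.10": sys.version_info >= (3, 10)}

    node_ok = False
    node = shutil.which("node")
    if node is not None:
        # `node --version` prints a string such as "v18.19.0".
        completed = subprocess.run(
            [node, "--version"],
            capture_output=True, text=True, check=False, timeout=10,
        )
        node_ok = parse_version(completed.stdout) >= (18,)
    results["node>=18"] = node_ok

    results["ffmpeg on PATH"] = shutil.which("ffmpeg") is not None
    return results
```

Calling `check_source_prerequisites()` returns a dictionary keyed by requirement, so any entry that is `False` points at the component to install first.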
Docker Compose is a deployment packaging option, not the lightest evaluation path. Use it when repeatable images, containerized service boundaries, or production-like operations are more important than first-run simplicity.
Docker-installation additional requirements:
- Docker Engine 20.10 or later and the Compose v2 plugin.
- NVIDIA Container Toolkit when running the GPU profile.
Verification
Regardless of the installation method, the orchestrator can be verified with the following requests once it is running:
```bash
curl -s http://127.0.0.1:8000/health
# {"status":"ok"}
curl -s http://127.0.0.1:8000/models | jq
# Lists available synthesis backends.
```
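In scripts or CI jobs, the orchestrator may need a few seconds before it answers, so a polling loop can be more convenient than a single request. The following is a minimal sketch using only the Python standard library. It relies on the `/health` and `/models` endpoints and the `{"status":"ok"}` response shown above; the function names, timeouts, and retry interval are assumptions made for illustration.

```python
import json
import time
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # address used in the curl requests above


def is_healthy(body: bytes) -> bool:
    """Return True when the body is a JSON object with status "ok"."""
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def wait_until_healthy(
    base_url: str = BASE_URL, timeout_s: float = 30.0, interval_s: float = 1.0
) -> bool:
    """Poll /health until it reports ok or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
                if is_healthy(resp.read()):
                    return True
        except OSError:
            # Covers refused connections, timeouts, and HTTP errors
            # (urllib.error.URLError is a subclass of OSError).
            pass
        time.sleep(interval_s)
    return False


def list_models(base_url: str = BASE_URL):
    """Fetch /models and return the decoded JSON without interpreting it."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=5) as resp:
        return json.loads(resp.read())
```

A typical use is to call `wait_until_healthy()` after starting the orchestrator and, if it returns `True`, print the result of `list_models()`. The schema of the `/models` response is not described on this page, so the sketch returns it unchanged.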
Next steps
- From source — install from a git checkout. Covers development, production, and Ascend variants.
- Docker Compose — install with the packaged Docker stack for reproducible deployments.
- Configuration — required environment configuration after installation.
- Deployment — selecting a runtime topology.