Installation

OpenTalking ships two installation methods: a source install and Docker Compose. The right method depends on two questions: where the work happens (development, single-machine production, or multi-machine production) and what hardware is available (CPU only, NVIDIA GPU, or Ascend NPU).

This page presents the decision matrix and links to the detailed instructions. For a streamlined first-run procedure, see the Quickstart.

Choosing an installation method

| Use case | Hardware | Recommended method | Detailed guide |
|---|---|---|---|
| Local development, frontend changes, API iteration | Any | Source install + mock synthesis | From source |
| CPU evaluation | CPU | Source install + mock synthesis | From source |
| Evaluation on a single GPU machine | NVIDIA 3090 / 4090 / A100 (CUDA 12) | Source install + model-specific backend | From source → single GPU |
| Evaluation on Ascend NPU | Huawei 910B (CANN 8.0+) | Source install on the host CANN environment | From source → Ascend 910B |
| Continuous integration | CPU | Source install or Docker Compose, depending on reproducibility needs | From source or Docker Compose → CPU profile |
| Production single-host deployment | Linux + GPU or NPU | Source install or Docker, depending on operations preference | From source → Production or Docker Compose |
| Production multi-host deployment with horizontal Worker scaling | Linux + GPU or NPU | Source install, API/Worker split, external Redis | From source → API and Worker split and Deployment |

Platform support matrix

| Platform | Synthesis backends | Notes |
|---|---|---|
| macOS (Apple Silicon and Intel) | mock, experimental quicktalk local on Apple Silicon | Suitable for orchestration and frontend development. QuickTalk local can be tested on Apple Silicon with quicktalk-cpu; see QuickTalk on Apple Silicon for the full path. Realtime production paths still target Linux GPU/NPU or OmniRT. |
| Linux x86_64 + CUDA 12 | mock, wav2lip, musetalk, flashtalk, flashhead, quicktalk | Primary deployment target. |
| Linux aarch64 + Ascend 910B (CANN 8.0+) | mock, wav2lip, flashtalk | NPU production deployment path. |
| Windows | mock (WSL2 recommended) | Not part of the continuous integration matrix. |

Common prerequisites

Independent of the installation method, the following components are required:

  • A DashScope (Bailian) API key for the default language model (qwen-flash) and speech recognition (paraformer-realtime-v2). Other OpenAI-compatible endpoints may be used; see Configuration §1.
  • A WebRTC-compatible client. The bundled frontend has been tested against Chromium-based browsers. Safari requires additional CORS configuration.

Additional requirements for a source install:

  • Python 3.10 or later (3.11 recommended).
  • Node.js 18 or later for the frontend toolchain.
  • ffmpeg for the text-to-speech decoding stage.
  • Optionally Redis 6 or later for the API/Worker split deployment.
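
The source-install requirements above can be checked before starting. The following is a minimal sketch in POSIX sh, not a script shipped with OpenTalking: the function names (version_ge, check_tool, check_source_prereqs) are hypothetical, and it assumes that python3 and node are the command names on the PATH. Versions are compared field by field as numbers, because a plain string comparison would rank 3.9 above 3.10.

```shell
#!/bin/sh
# Sketch: check the source-install prerequisites (hypothetical helper names).

version_ge() {
  # Succeeds when dotted numeric version $1 is >= $2, comparing field by field.
  have=$1 want=$2
  while [ -n "$want" ]; do
    h=${have%%.*} w=${want%%.*}
    [ "${h:-0}" -gt "$w" ] && return 0
    [ "${h:-0}" -lt "$w" ] && return 1
    # Drop the leading field from each version (empty when none remain).
    case $have in *.*) have=${have#*.} ;; *) have= ;; esac
    case $want in *.*) want=${want#*.} ;; *) want= ;; esac
  done
  return 0
}

check_tool() {
  # check_tool NAME FOUND_VERSION MINIMUM: print one status line; fail if missing or too old.
  name=$1 found=$2 minimum=$3
  if [ -z "$found" ]; then
    echo "MISSING  $name (need >= $minimum)"
    return 1
  elif version_ge "$found" "$minimum"; then
    echo "OK       $name $found"
  else
    echo "TOO OLD  $name $found (need >= $minimum)"
    return 1
  fi
}

check_source_prereqs() {
  rc=0
  py=$(python3 -c 'import sys; print("%d.%d.%d" % sys.version_info[:3])' 2>/dev/null)
  node_v=$(node -p 'process.versions.node' 2>/dev/null)
  check_tool "Python" "$py" 3.10 || rc=1
  check_tool "Node.js" "$node_v" 18 || rc=1
  if command -v ffmpeg >/dev/null 2>&1; then
    echo "OK       ffmpeg"
  else
    echo "MISSING  ffmpeg"
    rc=1
  fi
  # Redis is only needed for the API/Worker split, so its absence is not an error.
  command -v redis-server >/dev/null 2>&1 || echo "NOTE     redis-server not found (optional)"
  return "$rc"
}

check_source_prereqs || echo "Install the missing tools before continuing."
```

The script prints one line per tool. The Redis check only reports presence; it does not verify the 6.0 minimum.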

Docker Compose is a deployment packaging option, not the lightest evaluation path. Use it when repeatable images, containerized service boundaries, or production-like operations are more important than first-run simplicity.

Additional requirements for a Docker install:

  • Docker Engine 20.10 or later and the Compose v2 plugin.
  • NVIDIA Container Toolkit when running the GPU profile.
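
A comparable check for the Docker requirements is sketched below. It is an illustration only: engine_ok is a hypothetical helper, and the presence of the nvidia-ctk command is used as a rough indicator that the NVIDIA Container Toolkit is installed, which does not prove that the Docker runtime has been configured to use it.

```shell
#!/bin/sh
# Sketch: report whether the Docker prerequisites appear to be met.

engine_ok() {
  # Succeeds when a Docker Engine version such as "24.0.7" is at least 20.10.
  major=${1%%.*}
  rest=${1#*.}
  minor=${rest%%.*}
  [ "$major" -gt 20 ] || { [ "$major" -eq 20 ] && [ "$minor" -ge 10 ]; }
}

# Both variables stay empty if Docker or the Compose v2 plugin is unavailable.
engine=$(docker version --format '{{.Server.Version}}' 2>/dev/null)
compose=$(docker compose version --short 2>/dev/null)

if [ -z "$engine" ]; then
  echo "MISSING  Docker Engine (or the daemon is not reachable)"
elif engine_ok "$engine"; then
  echo "OK       Docker Engine $engine"
else
  echo "TOO OLD  Docker Engine $engine (need >= 20.10)"
fi

if [ -n "$compose" ]; then
  echo "OK       Compose plugin $compose"
else
  echo "MISSING  Compose v2 plugin"
fi

# Only relevant for the GPU profile.
if command -v nvidia-ctk >/dev/null 2>&1; then
  echo "OK       NVIDIA Container Toolkit CLI found"
else
  echo "NOTE     nvidia-ctk not found (needed only for the GPU profile)"
fi
```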

Verification

Regardless of the installation method, the orchestrator can be verified with the following requests once it is running:

terminal
curl -s http://127.0.0.1:8000/health
# {"status":"ok"}

curl -s http://127.0.0.1:8000/models | jq
# Lists available synthesis backends.
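
In scripts such as CI jobs, the health request usually has to be repeated until the orchestrator has finished starting. The following is a minimal sketch, assuming that any successful HTTP response from /health means the service is ready; wait_for_health and its default attempt count and delay are hypothetical, not part of OpenTalking.

```shell
#!/bin/sh
# Sketch: poll a health URL until it answers or the attempts run out.

wait_for_health() {
  # wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
  url=$1 attempts=${2:-30} delay=${3:-1}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP error statuses; --max-time bounds each request.
    if curl -fsS --max-time 2 "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    # Sleep only between attempts, not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "no healthy response from $url after $attempts attempt(s)" >&2
  return 1
}

# Example (not run here):
#   wait_for_health http://127.0.0.1:8000/health && curl -s http://127.0.0.1:8000/models | jq
```

The function returns non-zero when the deadline passes, so a calling script can stop instead of sending requests to a service that never came up.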

Next steps

  • From source — install from a git checkout. Covers development, production, and Ascend variants.
  • Docker Compose — install with the packaged Docker stack for reproducible deployments.
  • Configuration — required environment configuration after installation.
  • Deployment — selecting a runtime topology.