OpenTalking

datascale-ai/opentalking

QuickTalk

QuickTalk is the realtime-oriented talking-head model path in OpenTalking. Use it for low-latency digital-human conversations and fast local GPU trials. This page is a mode-selection overview; weights, startup commands, and verification live in the deployment-mode pages below.

Support Status

| Item | Value |
| --- | --- |
| Model ID | `quicktalk` |
| Backend | `local` / `omnirt` |
| Evidence level | Local adapter is built in and verified; OmniRT service path is documented |
| Best for | Realtime speaking avatars, low-latency validation, local or service-hosted inference |
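
As a minimal sketch of how a caller might guard the model/backend pair listed above, the snippet below validates a backend name against the two values in the table. `ModelSelection` and `select_quicktalk` are hypothetical names introduced for illustration; they are not part of the OpenTalking API.

```python
from dataclasses import dataclass

# Values copied from the Support Status table.
QUICKTALK_MODEL_ID = "quicktalk"
QUICKTALK_BACKENDS = ("local", "omnirt")


@dataclass(frozen=True)
class ModelSelection:
    """Hypothetical container for a model/backend pair (not an OpenTalking type)."""

    model_id: str
    backend: str


def select_quicktalk(backend: str) -> ModelSelection:
    """Return a QuickTalk selection, rejecting backends the table does not list."""
    normalized = backend.strip().lower()
    if normalized not in QUICKTALK_BACKENDS:
        raise ValueError(
            f"unsupported backend {backend!r}; expected one of {QUICKTALK_BACKENDS}"
        )
    return ModelSelection(model_id=QUICKTALK_MODEL_ID, backend=normalized)


if __name__ == "__main__":
    print(select_quicktalk("local"))
```

Normalizing case and whitespace before the lookup is a design choice of this sketch, not documented OpenTalking behavior.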

Benchmark Reference

The numbers below are summarized from the Benchmark page. Steady FPS measures model-generation throughput, not full user-perceived latency; STT, LLM, TTS, queueing, and WebRTC still affect the complete experience.

| Hardware | Backend | Output | Steady FPS | First-turn total (ms) | TTFV (ms) | Peak inference VRAM (GB) |
| --- | --- | --- | --- | --- | --- | --- |
| RTX 3090 | OmniRT | 540×900 / 25fps | 29.23 | 3356.019 | 1800.524 | 1.662 |
| RTX 4090 | OmniRT | 540×900 / 25fps | 46.921 | 2561.146 | 1064.825 | 1.838 |
| NPU 910B2 | OmniRT | 540×900 / 25fps | 29.66 | 3212.053 | 1782.861 | 2.473 |
| RTX 3050 Laptop | OmniRT | 306×512 / 25fps | 20.695 | 4243.26 | 2661 | 1.396 |
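
One way to read the Steady FPS column is to divide it by the 25 fps output rate, which gives a rough generation headroom factor. The sketch below does that arithmetic on the table rows. `realtime_factor` and the "keeps up" / "falls behind" labels are assumptions made for illustration, and the result covers model generation only, not the rest of the pipeline.

```python
# (hardware, steady FPS, output frame rate) copied from the benchmark table.
ROWS = [
    ("RTX 3090", 29.23, 25),
    ("RTX 4090", 46.921, 25),
    ("NPU 910B2", 29.66, 25),
    ("RTX 3050 Laptop", 20.695, 25),
]


def realtime_factor(steady_fps: float, output_fps: float) -> float:
    """Ratio of generated frames per second to frames the output stream consumes."""
    return steady_fps / output_fps


if __name__ == "__main__":
    for hardware, steady_fps, output_fps in ROWS:
        factor = realtime_factor(steady_fps, output_fps)
        # A factor below 1.0 means generation alone is slower than the output rate.
        status = "keeps up" if factor >= 1.0 else "falls behind"
        print(f"{hardware}: {factor:.2f}x ({status})")
```

By this measure the RTX 3050 Laptop row, at 20.695 FPS against a 25 fps output, is the only one whose factor is below 1.0.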

Choose a Deployment Mode

| Mode | Best for | Entry |
| --- | --- | --- |
| Local | Single-machine CUDA, in-process adapter, fastest real-chain validation | QuickTalk Local Deployment |
| Apple Silicon | Weight, manifest, and WebUI flow checks on macOS | QuickTalk Apple Silicon Deployment |
| OmniRT | Isolating inference from OpenTalking, or sharing one model endpoint across runtimes | QuickTalk OmniRT Deployment |
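
The table can be read as a small decision rule. The sketch below encodes one possible reading of it. `choose_quicktalk_mode` and its parameters are hypothetical, and the order of the checks is an assumption rather than documented OpenTalking behavior. It relies only on the standard-library `platform` module, which reports `Darwin` / `arm64` on Apple Silicon Macs.

```python
import platform
from typing import Optional


def choose_quicktalk_mode(
    has_cuda: bool,
    wants_remote_inference: bool = False,
    system: Optional[str] = None,
    machine: Optional[str] = None,
) -> Optional[str]:
    """Map a host description to a mode name from the table, or None if no row fits."""
    system = system or platform.system()
    machine = machine or platform.machine()

    # OmniRT row: inference isolated from OpenTalking or shared across runtimes.
    if wants_remote_inference:
        return "OmniRT"
    # Apple Silicon row: flow checks on macOS (arm64 Macs report Darwin/arm64).
    if system == "Darwin" and machine == "arm64":
        return "Apple Silicon"
    # Local row: single-machine CUDA with the in-process adapter.
    if has_cuda:
        return "Local"
    return None


if __name__ == "__main__":
    print(choose_quicktalk_mode(has_cuda=True, system="Linux", machine="x86_64"))
```

Returning `None` when no row matches keeps the sketch from guessing a fallback that the table does not state.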

Related Pages

- Support Matrix: compare QuickTalk with other model-chain backends.
- Avatar Assets: understand shared avatar assets and session selection.
- Local Audio + QuickTalk: full local SenseVoice, CosyVoice, and QuickTalk chain.