OpenTalking

datascale-ai/opentalking


MuseTalk

MuseTalk is the higher-quality video-avatar lip-sync path in OpenTalking. Compared with Wav2Lip, it has heavier dependencies and preprocessing; compared with QuickTalk, it is more quality-oriented and useful when you already have a MuseTalk runtime. This page explains when to choose MuseTalk and which deployment mode to use.

Support Status

| Item | Value |
| --- | --- |
| Model ID | `musetalk` |
| Backend | `local` / `omnirt` / `direct_ws` |
| Evidence level | Local adapter is wired; local mode runs official MuseTalk preprocessing before session initialization |
| Best for | Higher-quality lip sync, video avatars, existing MuseTalk runtimes |
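
As a minimal sketch of how the values in this table might be checked before starting a session: only the model ID `musetalk` and the three backend strings come from the table. The function name and constants are hypothetical and are not part of OpenTalking's API.

```python
# Values taken from the Support Status table.
MUSETALK_MODEL_ID = "musetalk"
MUSETALK_BACKENDS = ("local", "omnirt", "direct_ws")


def check_musetalk_backend(backend: str) -> str:
    """Hypothetical helper: normalize a backend name and reject unlisted ones."""
    normalized = backend.strip().lower()
    if normalized not in MUSETALK_BACKENDS:
        allowed = ", ".join(MUSETALK_BACKENDS)
        raise ValueError(
            f"backend {backend!r} is not listed for {MUSETALK_MODEL_ID}; "
            f"expected one of: {allowed}"
        )
    return normalized


if __name__ == "__main__":
    print(check_musetalk_backend(" OmniRT "))
```

A check like this only catches typos in the backend name; it says nothing about whether the chosen backend is actually installed or reachable.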

Benchmark Reference

The numbers below are summarized from the Benchmark page. Steady FPS measures model-generation throughput, not full user-perceived latency; STT, LLM, TTS, queueing, and WebRTC also affect the end-to-end experience.

| Hardware | Backend | Output | Steady FPS | First-turn total (ms) | TTFV (ms) | Peak inference VRAM (GB) |
| --- | --- | --- | --- | --- | --- | --- |
| RTX 3090 | OmniRT | 512×512 / 25 fps | 28.868 | 3235.518 | 1769.484 | 5.078 |
| RTX 4090 | OmniRT | 512×512 / 25 fps | 24.767 | 3605.564 | 2095.522 | 5.203 |
| NPU 910B2 | OmniRT | 512×512 / 25 fps | 12.276 | 5781.453 | 4211.721 | 8.754 |
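
One way to read the Steady FPS column is to compare it with the 25 fps output rate. The sketch below does that arithmetic with the figures from the table. The "realtime factor" name and the 1.0 threshold are illustrative assumptions, not metrics defined by the benchmark.

```python
# Steady FPS values copied from the benchmark table (OmniRT backend, 512x512 output).
OUTPUT_FPS = 25.0
STEADY_FPS = {
    "RTX 3090": 28.868,
    "RTX 4090": 24.767,
    "NPU 910B2": 12.276,
}


def realtime_factor(steady_fps: float, output_fps: float = OUTPUT_FPS) -> float:
    """Ratio of generation throughput to the output frame rate (assumed metric)."""
    return steady_fps / output_fps


if __name__ == "__main__":
    for hardware, fps in STEADY_FPS.items():
        factor = realtime_factor(fps)
        verdict = "at or above" if factor >= 1.0 else "below"
        print(f"{hardware}: {factor:.2f}x, {verdict} the 25 fps output rate")
```

With these numbers the ratios come out to roughly 1.15 for the RTX 3090, 0.99 for the RTX 4090, and 0.49 for the NPU 910B2. A ratio under 1.0 means generation alone is slower than playback, before any STT, LLM, TTS, or transport overhead is counted.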

Choose a Deployment Mode

| Mode | Best for | Entry |
| --- | --- | --- |
| Local | Single-machine CUDA, with OpenTalking running official avatar preprocessing | MuseTalk Local Deployment |
| OmniRT | Isolating MuseTalk dependencies from the main OpenTalking process | MuseTalk OmniRT Deployment |
| Direct WebSocket | Connecting an existing MuseTalk-compatible service directly | See Runtime Backends |
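
The table can be read as a small decision rule. The sketch below encodes one possible reading. The function, its parameters, and the order in which the conditions are checked are all assumptions made for illustration, and the returned strings assume the three modes map to the `local`, `omnirt`, and `direct_ws` backend names.

```python
def suggest_musetalk_mode(has_existing_service: bool, isolate_dependencies: bool) -> str:
    """Hypothetical helper mapping the table's "Best for" column to a backend name."""
    if has_existing_service:
        # An existing MuseTalk-compatible service can be connected directly.
        return "direct_ws"
    if isolate_dependencies:
        # Keep MuseTalk dependencies out of the main OpenTalking process.
        return "omnirt"
    # Otherwise assume a single CUDA machine where OpenTalking runs preprocessing.
    return "local"


if __name__ == "__main__":
    print(suggest_musetalk_mode(has_existing_service=False, isolate_dependencies=True))
```

The priority given to an existing service over dependency isolation is a design choice of this sketch; the linked deployment pages remain the reference for each mode's actual requirements.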

When to Choose Another Model

- For the lightest way to validate real lip sync, see Wav2Lip.
- For lower-latency realtime speech, see QuickTalk.
- For a high-quality, heavyweight service path, see FlashTalk.