Support Matrix¶

This page summarizes what is currently built into OpenTalking, what is only documented as an integration path, and what has repo-grounded validation evidence. Use it as the decision page before following the deeper setup guides.

How to read the matrix¶

Status	Meaning
Built-in	The capability is implemented directly in this repository and can be selected in the current product surface.
Documented	The integration path is documented, but runtime availability still depends on your external service or weights.
Validated	The repository docs or tests include concrete validation evidence for the path.
Planned	The architectural boundary exists, but the local runtime is not bundled yet.

End-to-End Capability Matrix¶

Layer	Option	Integration shape	Default / recommendation	Status	Notes
LLM	DashScope `qwen-flash`	OpenAI-compatible endpoint	Default first-run path	Built-in, Validated	This is the repo's default quickstart path.
LLM	OpenAI-compatible endpoints	`OPENTALKING_LLM_BASE_URL`	Use when already standard in your environment	Built-in, Documented	Covers OpenAI, vLLM, Ollama, DeepSeek, and similar servers.
STT	DashScope Paraformer realtime	Provider adapter	Default microphone path	Built-in, Validated	Required for the default voice-input flow.
STT	SenseVoiceSmall	Local FunASR adapter	Local speech-input path	Built-in, Validated	CPU-capable and suitable for short realtime utterances.
TTS	Edge TTS	Local provider adapter	Default first-run path	Built-in, Validated	Lightest path; no API key required.
TTS	DashScope Qwen realtime TTS	Provider adapter	Recommended when you want hosted Chinese realtime TTS	Built-in, Documented	Also used for voice-cloning-related workflows.
TTS	Local CosyVoice3 0.5B	Local CosyVoice service / adapter	Local voice and cloning path	Built-in, Validated	Uses `local_cosyvoice`; the standalone service is recommended.
TTS	CosyVoice service	Provider adapter / remote service	Use for custom voice service deployments	Built-in, Documented	Requires a reachable CosyVoice service and, in some flows, `OPENTALKING_PUBLIC_BASE_URL`.
TTS	ElevenLabs	Provider adapter	Use for hosted multilingual voices	Built-in, Documented	Requires API key and voice id.
Avatar	Built-in example avatars	Local asset bundles	Default first-run path	Built-in, Validated	Good for `mock`, Wav2Lip, and other documented flows.
Avatar	Custom uploaded portraits	`/avatars/custom`	Use when you want quick custom avatars	Built-in, Documented	Best compatibility today is with Wav2Lip-style image avatars.
Avatar	Model-specific manifests	Local asset bundles	Required for QuickTalk / FlashHead / FlashTalk matching	Built-in, Documented	`model_type` must match the selected synthesis model.

Talking-Head Model Matrix¶

Model	Backend choices	Repo default	Validation level	Recommended hardware path	Current guidance
`mock`	`mock`	`mock`	Built-in, Validated	CPU	Fastest full-pipeline self-test; no model weights.
`wav2lip`	`local`, `omnirt`, `direct_ws`	`local`	Local adapter is built in and covered by tests; OmniRT compatibility path is documented	CPU-capable; OmniRT compatibility path uses a single GPU or Ascend 910B	Best first lightweight talking-head validation path.
`musetalk`	`omnirt`, `direct_ws`, `local`	`omnirt`	Local adapter is built in and runs official preprocessing before session initialization; OmniRT and direct WebSocket paths remain documented	Single GPU or remote model service	Use `local` for single-machine validation when weights and OpenMMLab preprocessing dependencies are installed; use OmniRT for service isolation.
`quicktalk`	`local`, `omnirt`	`omnirt`	The local adapter is built in and validated in the real chain; the OmniRT compatibility path remains documented	CUDA GPU	Use `--backend local` for the single-machine path; use OmniRT when service isolation is needed.
`fasterliveportrait`	`omnirt`	`omnirt`	Documented	Single CUDA GPU with TensorRT	Realtime JoyVASA audio driving plus FasterLivePortrait pasteback through OmniRT `/v1/audio2video/fasterliveportrait`.
`flashtalk`	`omnirt`, legacy `direct_ws` fallback	`omnirt`	OmniRT path documented, Ascend path validated	4090-class GPU or Ascend 910B multi-card	High-quality path for heavyweight deployment.
`flashhead`	`direct_ws`	`direct_ws`	Documented	External FlashHead service	OpenTalking acts as the orchestrator and client, not the model host.

Backend Behavior Matrix¶

Backend	What OpenTalking expects	Connected when	Typical models
`mock`	No external runtime	Always	`mock`
`local`	In-process adapter/runtime	The adapter imports and dependencies are satisfied	`wav2lip`, `quicktalk`, `musetalk`
`direct_ws`	Model-specific remote service	A model-specific WebSocket URL is configured	`flashhead`, custom single-model services
`omnirt`	OmniRT `/v1/audio2video/{model}`	OmniRT is reachable and reports the model	`wav2lip`, `musetalk`, `quicktalk`, `fasterliveportrait`, `flashtalk`

Validation Notes¶

Path	Evidence in the repo/docs
`mock`	Quickstart and `/models` examples show the full self-test path.
`wav2lip + local`	Built-in adapter registration, `/models` `reason=local_runtime`, and local render tests.
`musetalk + local`	Built-in adapter registration, local MuseTalk tests, and official avatar preprocessing before session initialization.
`wav2lip + omnirt`	Startup scripts and `/models` status semantics remain documented for the checkpoint-backed compatibility path.
`sensevoice + local_cosyvoice + quicktalk local`	Local STT/TTS providers, the QuickTalk local adapter, frontend provider selection, and custom-avatar flow are covered by tests or real-chain validation.
`quicktalk + omnirt`	Retained as a compatible service-hosted path; prefer `quicktalk + local` for single-machine deployment.
`fasterliveportrait + omnirt`	The FasterLivePortrait guide covers JoyVASA/chinese-hubert-base checkpoints, TensorRT startup, `/v1/audio2video/fasterliveportrait`, frontend controls, and hot updates.
`flashtalk + omnirt`	Documented startup scripts, legacy fallback behavior, and README validation notes for Ascend 910B2 x8.
`flashhead + direct_ws`	Configured integration path plus the `/models` `reason=direct_ws` example in the talking-head guide.

Recommended First Paths¶

Use mock to validate the browser, API, LLM, STT, TTS, and WebRTC path.
Use local wav2lip when you want the lightest talking-head validation path.
Use Local STT/TTS + QuickTalk when you want local speech input, local speech synthesis, and QuickTalk realtime video.
Use local musetalk when you want MuseTalk quality on one CUDA machine and can install the preprocessing dependencies.
Use QuickTalk Local for single-machine realtime audio2video on CUDA, or QuickTalk with OmniRT for service isolation.
Use fasterliveportrait when you want realtime audio-driven portrait pasteback on a single CUDA GPU.
Use flashtalk when quality matters more than deployment weight.
Use flashhead only when you already operate a FlashHead service.

Next Pages¶

Frontend Entry¶

After the model or backend service is running, use the OpenTalking WebUI:

Terminal

cd "$OPENTALKING_HOME"
bash scripts/quickstart/start_frontend.sh --api-port 8000 --web-port 5173 --host 0.0.0.0

For a remote server, forward your local browser port to the server 5173, then open http://127.0.0.1:5173.