Wav2Lip is the recommended first real lip-sync model path in OpenTalking. It is lighter than heavyweight talking-head models and is useful when moving from mock to real video output and testing the end-to-end audio-driven video chain.
The numbers below are summarized from Benchmark. Steady FPS is model-generation throughput, not full user-perceived latency; STT, LLM, TTS, queueing, and WebRTC still affect the complete experience.