FasterLivePortrait / JoyVASA¶

Support Status¶

Item	Value
Model ID	`fasterliveportrait`
Backend	`omnirt`
Evidence level	Documented; realtime path exposed through the OmniRT runtime
Best for	Single-GPU realtime audio-driven portrait avatars, original-image pasteback, frontend amplitude hot updates

Common Errors¶

Symptom	Action
`/models` shows `runtime_not_enabled`	Ensure OmniRT was started with `OMNIRT_FASTLIVEPORTRAIT_RUNTIME=1`, then check checkpoint paths and `logs/omnirt`.
Audio driving has no lip motion	Check `JoyVASA/motion_generator`, `JoyVASA/motion_template`, and `chinese-hubert-base/pytorch_model.bin`.
Generation reports an ONNXRuntime `GridSample` error	Re-run `uv sync --extra server --extra fasterliveportrait --python 3.11`, confirm `import tensorrt` works, and start with `OMNIRT_FASTLIVEPORTRAIT_CFG=configs/trt_infer.yaml`.
Browser sees the model but session creation fails	Select an avatar whose `model_type` matches `fasterliveportrait`, or prepare a matching avatar bundle.

FasterLivePortrait also runs through the OmniRT audio2video compatibility path. OpenTalking owns sessions, TTS/audio streaming, WebRTC playback, and frontend parameter updates. OmniRT keeps FasterLivePortrait and JoyVASA resident and exposes /v1/audio2video/fasterliveportrait.

This path is intended for single-GPU realtime avatars. The default live profile uses 25fps, one-second audio chunks, a 448px width, and pasteback into the original avatar image. Full-body uploads are still driven through the detected face region; body motion is not synthesized by this runtime.

1. Prepare code and weights¶

You need a FasterLivePortrait source checkout and a real checkpoint directory. If you do not want symlinks, copy or download the files directly into the model root.

terminal

if [ ! -d "$FASTERLIVEPORTRAIT_HOME/.git" ]; then
  git clone https://github.com/KlingAIResearch/LivePortrait.git "$FASTERLIVEPORTRAIT_HOME"
fi

mkdir -p "$OMNIRT_MODEL_ROOT/FasterLivePortrait/checkpoints"

The checkpoint directory must include at least:

$OMNIRT_MODEL_ROOT/FasterLivePortrait/checkpoints/
  JoyVASA/
    motion_generator/motion_generator_hubert_chinese.pt
    motion_template/motion_template.pkl
  chinese-hubert-base/
    config.json
    preprocessor_config.json
    pytorch_model.bin
  liveportrait/ or appearance_feature_extractor.onnx and the other FasterLivePortrait ONNX/TRT files

If the model files already exist elsewhere, copy real files with rsync:

terminal

rsync -a /path/to/FasterLivePortrait/checkpoints/ \
  "$OMNIRT_MODEL_ROOT/FasterLivePortrait/checkpoints/"

Preflight check:

terminal

test -f "$OMNIRT_MODEL_ROOT/FasterLivePortrait/checkpoints/JoyVASA/motion_generator/motion_generator_hubert_chinese.pt"
test -f "$OMNIRT_MODEL_ROOT/FasterLivePortrait/checkpoints/JoyVASA/motion_template/motion_template.pkl"
test -f "$OMNIRT_MODEL_ROOT/FasterLivePortrait/checkpoints/chinese-hubert-base/pytorch_model.bin"

2. Prepare the OmniRT environment¶

terminal

cd "$OMNIRT_HOME"
export UV_DEFAULT_INDEX="${UV_DEFAULT_INDEX:-https://pypi.tuna.tsinghua.edu.cn/simple}"
export UV_CACHE_DIR="${UV_CACHE_DIR:-$DIGITAL_HUMAN_HOME/.uv-cache}"
uv sync --extra server --extra fasterliveportrait --python 3.11

The realtime FasterLivePortrait path uses TensorRT by default. The fasterliveportrait extra installs onnxruntime-gpu, tensorrt-cu12, tensorrt-cu12-bindings, and tensorrt-cu12-libs. The TensorRT libs wheel is about 4GB, so keep UV_CACHE_DIR on a data disk with enough space; do not let it fall back to a small /root/.cache/uv.

Before deployment, verify that uv run python -c "import tensorrt as trt; print(trt.__version__)" prints a version.

3. Start the OmniRT FasterLivePortrait runtime¶

terminal

cd "$OMNIRT_HOME"
OMNIRT_FASTLIVEPORTRAIT_RUNTIME=1 \
OMNIRT_FASTLIVEPORTRAIT_LOAD_MODELS=1 \
OMNIRT_FASTLIVEPORTRAIT_ROOT="$FASTERLIVEPORTRAIT_HOME" \
OMNIRT_FASTLIVEPORTRAIT_CHECKPOINTS_DIR="$OMNIRT_MODEL_ROOT/FasterLivePortrait/checkpoints" \
OMNIRT_FASTLIVEPORTRAIT_CFG=configs/trt_infer.yaml \
OMNIRT_FASTLIVEPORTRAIT_DEVICE=cuda:0 \
OMNIRT_FASTLIVEPORTRAIT_JPEG_QUALITY=85 \
uv run omnirt serve-avatar-ws --host 0.0.0.0 --port 9000 --backend cuda

Verify OmniRT reports the model:

terminal

curl -s http://127.0.0.1:9000/v1/audio2video/models | jq '.statuses[] | select(.id=="fasterliveportrait")'

Expected status:

{"id":"fasterliveportrait","connected":true,"reason":"fasterliveportrait_runtime"}

4. Configure and start OpenTalking¶

OpenTalking configures fasterliveportrait as backend: omnirt by default. The realtime profile lives in configs/synthesis/fasterliveportrait.yaml; common defaults are:

configs/synthesis/fasterliveportrait.yaml

width: 448
fps: 25
chunk_samples: 16000
emit_frames_per_chunk: 25
head_motion_multiplier: 0.3
pose_motion_multiplier: 0.35
yaw_multiplier: 0.85
pitch_multiplier: 1.0
roll_multiplier: 0.85
animation_region: lip
expression_multiplier: 1.0
mouth_open_multiplier: 1.25
mouth_corner_multiplier: 0.85
cheek_jaw_multiplier: 0.9
driving_multiplier: 1.0
cfg_scale: 4.0
flag_relative_motion: true
flag_stitching: true
head_only_pasteback: false

Start OpenTalking against OmniRT:

terminal

cd "$OPENTALKING_HOME"
OMNIRT_ENDPOINT=http://127.0.0.1:9000 \
OPENTALKING_OMNIRT_ENDPOINT=http://127.0.0.1:9000 \
uv run opentalking-unified --host 0.0.0.0 --port 8000

Frontend:

terminal

cd "$OPENTALKING_HOME/apps/web"
npm ci
VITE_BACKEND_PORT=8000 npm run dev -- --host 0.0.0.0 --port 5173

Verify OpenTalking sees the model:

terminal

curl -s http://127.0.0.1:8000/models | jq '.statuses[] | select(.id=="fasterliveportrait")'

Expected status:

{"id":"fasterliveportrait","backend":"omnirt","connected":true,"reason":"omnirt"}

5. Frontend controls and hot updates¶

After selecting FasterLivePortrait, the frontend shows a parameter panel. Before a session starts, clicking Apply stores values for the next session. During a session, clicking Apply sends a hot update and takes effect on the next audio chunk without restarting the conversation.

Parameter	Effect	Suggested range
`head_motion_multiplier`	Overall head motion amplitude	default 0.3, common 0.2-0.8
`pose_motion_multiplier`	pitch/yaw/roll amplitude; lower this first when the head sways too much	0.2-0.5
`yaw_multiplier`	Left/right head turn amplitude	default 0.85, common 0.6-1.0
`pitch_multiplier`	Up/down nod amplitude	default 1.0, common 0.7-1.1
`roll_multiplier`	Side tilt amplitude	default 0.85, common 0.6-1.0
`animation_region`	FLP animation region; realtime defaults to mouth-only to reduce wide eyes and exaggerated full-face motion	default `lip`; use `all` for full expression
`expression_multiplier`	Overall expression and lip amplitude	default 1.0, common 0.9-1.2
`mouth_open_multiplier`	Mouth opening amplitude	default 1.25, common 1.0-1.4
`mouth_corner_multiplier`	Mouth-corner movement	default 0.85, common 0.7-1.0
`cheek_jaw_multiplier`	Cheek and jaw movement	default 0.9, common 0.7-1.1
`driving_multiplier`	Overall keypoint driving amplitude	0.8-1.2
`cfg_scale`	JoyVASA audio-following strength	default 4.0, common 3.5-4.5

Start with head_motion_multiplier=0.3, pose_motion_multiplier=0.35, yaw_multiplier=0.85, roll_multiplier=0.85, animation_region=lip, expression_multiplier=1.0, mouth_open_multiplier=1.25, mouth_corner_multiplier=0.85, cheek_jaw_multiplier=0.9, cfg_scale=4.0, and keep flag_relative_motion=true. If the head sways left/right, lower yaw_multiplier to 0.7. If the mouth looks pursed or the smile is too strong, lower mouth_corner_multiplier to 0.75. Switch the region from lip to all only when you need richer facial expression. Do not improve speed by dropping mouth-open frames.

6. Performance check¶

terminal

cd "$OMNIRT_HOME"
uv run python scripts/bench_fasterliveportrait_ws.py \
  --url ws://127.0.0.1:9000/v1/audio2video/fasterliveportrait \
  --duration 30 \
  --chunk-samples 16000

For single-GPU realtime use, watch first packet latency, per-chunk render time, output fps, and whether the browser queue keeps growing. If 448px width cannot stay above 25fps, drop to 416px. Use 480px or 540px only for quality-first runs, not as the realtime default.