FlashTalk¶
Support Status¶
| Item | Value |
|---|---|
| Model ID | flashtalk |
| Backend | omnirt, with legacy direct_ws fallback |
| Evidence level | OmniRT path documented; Ascend path has validation records |
| Best for | High-quality private deployment, heavy models, multi-card GPU/NPU |
Recommended Hardware¶
CUDA evaluation should use a 4090/A100-class GPU. Ascend 910B deployment should run in the host CANN environment.
Weights¶
Primary Hugging Face sources:
Terminal
hf download Soul-AILab/SoulX-FlashTalk-14B --local-dir "$OMNIRT_MODEL_ROOT/SoulX-FlashTalk-14B"
hf download TencentGameMate/chinese-wav2vec2-base --local-dir "$OMNIRT_MODEL_ROOT/chinese-wav2vec2-base"
For domestic mirrors, search ModelScope or Modelers for SoulX-FlashTalk-14B and
chinese-wav2vec2-base.
Directory Layout¶
$OMNIRT_MODEL_ROOT/
├── SoulX-FlashTalk-14B/
├── chinese-wav2vec2-base/
└── SoulX-FlashTalk/ # Optional for custom CUDA/manual paths
Configuration¶
Legacy WebSocket fallback:
Start¶
CUDA:
Terminal
bash scripts/quickstart/start_omnirt_flashtalk.sh --device cuda --nproc 1
bash scripts/quickstart/start_all.sh --omnirt http://127.0.0.1:9000
Ascend:
Terminal
source /usr/local/Ascend/ascend-toolkit/set_env.sh
bash scripts/quickstart/start_omnirt_flashtalk.sh --device npu --nproc 8
/models Verification¶
Expected:
Common Errors¶
| Symptom | Action |
|---|---|
| Cold startup is slow | Inspect OmniRT/FlashTalk logs and separate dependency setup, weight load, and worker init. |
| CUDA OOM | Lower OPENTALKING_FLASHTALK_FRAME_NUM, OPENTALKING_FLASHTALK_SAMPLE_STEPS, or resolution. |
| NPU import failure | Confirm CANN is sourced and torch_npu, driver, and CANN versions match. |
reason=not_configured |
Configure OMNIRT_ENDPOINT or run start_all.sh --omnirt .... |