Skip to content

SenseVoice Local Deployment

SenseVoiceSmall is the recommended local speech-recognition model for OpenTalking. Use it for private deployments, short realtime utterances, and the local audio + QuickTalk path.

Use Cases

  • Microphone audio should not be sent to an external STT service.
  • Short realtime recognition should run on CPU.
  • STT, TTS, and QuickTalk local need to run on one machine for validation.

Weight Preparation

Terminal
cd "$OPENTALKING_HOME"
uv sync --extra dev --extra models --extra local-audio --python 3.11

python scripts/download_local_audio_models.py \
  --root ./avatar_models/local-audio \
  --model sensevoice-small

Configuration

.env
OPENTALKING_STT_DEFAULT_PROVIDER=sensevoice
OPENTALKING_STT_ENABLED_PROVIDERS=sensevoice,dashscope
OPENTALKING_STT_SENSEVOICE_MODEL=iic/SenseVoiceSmall
OPENTALKING_STT_SENSEVOICE_MODEL_DIR=./avatar_models/local-audio/iic__SenseVoiceSmall
OPENTALKING_STT_SENSEVOICE_DEVICE=cpu

Start Command

Terminal
cd "$OPENTALKING_HOME"
bash scripts/start_unified.sh --backend mock --model mock --api-port 8000 --web-port 5173

Verification

Terminal
curl -fsS http://127.0.0.1:8000/health
curl -s http://127.0.0.1:8000/api/runtime/status | jq

Then select microphone input in the WebUI and confirm that STT results and LLM responses appear in the event stream.

Common Errors

Symptom Action
Model directory not found Check that OPENTALKING_STT_SENSEVOICE_MODEL_DIR points to the downloaded directory.
Recognition latency is high Validate short utterances on CPU first; use a dedicated STT service for long audio or high concurrency.
API STT key errors Local SenseVoice does not read the DashScope key; confirm that the frontend selected local STT.