API Reference
OpenTalking exposes a REST, Server-Sent Events, and WebSocket interface defined in
apps/api/routes/. The interface is organized into the following groups:
| Group | Purpose | Documentation |
|---|---|---|
| Health and Models | Liveness probes, queue introspection, and capability discovery. | Health and Models |
| Avatars | Avatar bundle catalog and custom avatar upload. | Avatars |
| Sessions | Session lifecycle, conversational interaction, WebRTC signaling. | Sessions |
| TTS and Voices | One-off TTS preview and cloned voice management. | TTS and Voices |
| Events and Streaming | Server-Sent Events stream and audio WebSocket protocol. | Events and Streaming |
Base URL and authentication
The default base URL is http://localhost:8000. Production deployments terminate TLS
and authentication at an upstream reverse proxy. OpenTalking itself does not require
authentication for its routes; deployments that expose the API publicly should
implement authentication at the gateway layer.
The OPENTALKING_CORS_ORIGINS environment variable controls which origins may issue
cross-origin requests; see Configuration §3.
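As a minimal sketch of how a client might address the API, the helper below builds (but does not send) a request against the base URL. The function name `build_request` and the bearer-token header are assumptions for illustration: OpenTalking itself does not check credentials, so the header only matters if the gateway in front of it expects one.

```python
import json
import urllib.request
from typing import Optional

DEFAULT_BASE_URL = "http://localhost:8000"  # default base URL stated above


def build_request(
    path: str,
    *,
    base_url: str = DEFAULT_BASE_URL,
    method: str = "GET",
    body: Optional[dict] = None,
    gateway_token: Optional[str] = None,
) -> urllib.request.Request:
    """Build a urllib request for an API path without sending it."""
    # Join base URL and path with exactly one slash between them.
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    if gateway_token is not None:
        # Assumption: the reverse proxy expects a bearer token.
        # OpenTalking itself does not inspect this header.
        req.add_header("Authorization", f"Bearer {gateway_token}")
    return req
```

Passing the result to `urllib.request.urlopen` would send it; keeping construction separate makes the URL and headers easy to inspect first.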
Request and response conventions
Unless otherwise noted, request and response bodies use application/json. Multipart
uploads (multipart/form-data) are used by avatar and voice-cloning endpoints.
- Successful responses return HTTP `200` with a JSON body.
- Validation errors return HTTP `400` with a `{"detail": "..."}` payload.
- Missing resources return HTTP `404`.
- Authentication and authorization errors raised by upstream services (DashScope, OmniRT) are translated to HTTP `502` with the upstream error text in `detail`.
- Server-side errors return HTTP `500` with a generic message; the actual exception is logged.
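The status conventions above could be handled on the client side roughly as follows. This is a sketch, not part of OpenTalking: `ApiError`, `parse_response`, and the labels in `ERROR_KINDS` are hypothetical names, and the code assumes only what the list states (a `detail` field may be present on error bodies).

```python
import json
from typing import Any


class ApiError(Exception):
    """Raised for any non-200 response; carries the status and detail text."""

    def __init__(self, status: int, kind: str, detail: str) -> None:
        super().__init__(f"{status} {kind}: {detail}")
        self.status = status
        self.kind = kind
        self.detail = detail


# Status codes named in the conventions above, mapped to hypothetical labels.
ERROR_KINDS = {
    400: "validation",
    404: "not_found",
    500: "server",
    502: "upstream",
}


def parse_response(status: int, raw_body: bytes) -> Any:
    """Return the decoded JSON body for HTTP 200, otherwise raise ApiError."""
    try:
        payload = json.loads(raw_body.decode("utf-8")) if raw_body else None
    except (UnicodeDecodeError, json.JSONDecodeError):
        payload = None  # body was not JSON; fall back to an empty detail
    if status == 200:
        return payload
    detail = ""
    if isinstance(payload, dict):
        detail = str(payload.get("detail", ""))
    raise ApiError(status, ERROR_KINDS.get(status, "unexpected"), detail)
```

Separating `upstream` (502) from `server` (500) lets a caller decide whether a retry is worthwhile, since a 502 reflects a DashScope or OmniRT failure rather than a fault in OpenTalking itself.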
Identifier formats:
- `session_id`: UUID4 string assigned at session creation.
- `avatar_id`: slug-format string (alphanumeric, hyphen, underscore, CJK characters), as defined by the avatar's `manifest.json`.
- voice `entry_id`: integer primary key from the SQLite voice catalog.
- `job_id`: UUID4 string assigned to a FlashTalk offline bundle job.
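A client may want to check identifiers before sending them. The validators below are a sketch under stated assumptions: the helper names are hypothetical, and "CJK characters" is approximated by the CJK Unified Ideographs block (U+4E00 to U+9FFF), which may be narrower than what the server accepts.

```python
import re
import uuid

# Assumption: CJK is approximated by U+4E00..U+9FFF; the server-side rule
# may accept a wider range.
_AVATAR_ID_RE = re.compile(r"[A-Za-z0-9_\-\u4e00-\u9fff]+")


def is_uuid4(value: str) -> bool:
    """True if value parses as a UUID whose version field is 4."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    return parsed.version == 4


def is_avatar_id(value: str) -> bool:
    """True if value is a non-empty slug of letters, digits, '-', '_', or CJK."""
    return _AVATAR_ID_RE.fullmatch(value) is not None


def is_entry_id(value: object) -> bool:
    """True for integers; bool is excluded because it subclasses int."""
    return isinstance(value, int) and not isinstance(value, bool)
```

`is_uuid4` applies to both `session_id` and `job_id`. Note that `uuid.UUID` also accepts forms without hyphens, so this check is looser than a strict string-format match.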
Endpoint summary
| Method | Path | Group |
|---|---|---|
| GET | `/health` | Health and Models |
| GET | `/healthz` | Health and Models |
| GET | `/queue/status` | Health and Models |
| GET | `/models` | Health and Models |
| GET | `/avatars` | Avatars |
| GET | `/avatars/{avatar_id}` | Avatars |
| GET | `/avatars/{avatar_id}/preview` | Avatars |
| POST | `/avatars/custom` | Avatars |
| DELETE | `/avatars/{avatar_id}` | Avatars |
| POST | `/sessions` | Sessions |
| POST | `/sessions/customize` | Sessions |
| POST | `/sessions/customize/prompt` | Sessions |
| POST | `/sessions/customize/reference` | Sessions |
| GET | `/sessions/{session_id}` | Sessions |
| POST | `/sessions/{session_id}/start` | Sessions |
| POST | `/sessions/{session_id}/speak` | Sessions |
| POST | `/sessions/{session_id}/chat` | Sessions |
| POST | `/sessions/{session_id}/transcribe` | Sessions |
| POST | `/sessions/{session_id}/speak_audio` | Sessions |
| POST | `/sessions/{session_id}/speak_flashtalk_audio` | Sessions |
| POST | `/sessions/{session_id}/interrupt` | Sessions |
| POST | `/sessions/{session_id}/webrtc/offer` | Sessions |
| POST | `/sessions/{session_id}/flashtalk-recording/start` | Sessions |
| POST | `/sessions/{session_id}/flashtalk-recording/stop` | Sessions |
| GET | `/sessions/{session_id}/flashtalk-recording` | Sessions |
| POST | `/sessions/{session_id}/flashtalk-offline-bundle` | Sessions |
| GET | `/sessions/{session_id}/flashtalk-offline-bundle/{job_id}` | Sessions |
| GET | `/sessions/{session_id}/flashtalk-offline-bundle/{job_id}/download` | Sessions |
| DELETE | `/sessions/{session_id}` | Sessions |
| WS | `/sessions/{session_id}/speak_audio_stream` | Events and Streaming |
| GET | `/sessions/{session_id}/events` | Events and Streaming |
| POST | `/tts/preview` | TTS and Voices |
| GET | `/voices` | TTS and Voices |
| POST | `/voices/clone` | TTS and Voices |
| DELETE | `/voices/{entry_id}` | TTS and Voices |
| GET | `/voice-uploads/{token}` | TTS and Voices |
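Because `avatar_id` may contain CJK characters, a client should percent-encode identifiers when filling in the path templates above. The helpers below are a sketch with hypothetical names (`avatar_path`, `session_path`); they rely only on the standard library's `urllib.parse.quote`.

```python
from urllib.parse import quote


def _with_suffix(path: str, suffix: str) -> str:
    """Append an optional sub-path such as 'preview' or 'webrtc/offer'."""
    suffix = suffix.strip("/")
    return f"{path}/{suffix}" if suffix else path


def avatar_path(avatar_id: str, suffix: str = "") -> str:
    """Return /avatars/{avatar_id}[/suffix] with the ID percent-encoded."""
    return _with_suffix("/avatars/" + quote(avatar_id, safe=""), suffix)


def session_path(session_id: str, suffix: str = "") -> str:
    """Return /sessions/{session_id}[/suffix] with the ID percent-encoded."""
    return _with_suffix("/sessions/" + quote(session_id, safe=""), suffix)
```

For example, `avatar_path("数字人", "preview")` yields `/avatars/%E6%95%B0%E5%AD%97%E4%BA%BA/preview`, while ASCII slugs such as `demo-avatar` pass through unchanged.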
Typical request sequence
A complete client interaction typically follows the sequence below.
```mermaid
sequenceDiagram
    participant C as Client
    participant API as API
    C->>API: GET /models
    API-->>C: { models: [...] }
    C->>API: GET /avatars
    API-->>C: [ AvatarSummary, ... ]
    C->>API: POST /sessions
    API-->>C: { session_id, status: "created" }
    C->>API: POST /sessions/{id}/webrtc/offer
    API-->>C: { sdp: <answer>, type: "answer" }
    C->>API: GET /sessions/{id}/events (SSE)
    note over C,API: stream remains open
    C->>API: POST /sessions/{id}/chat
    API-->>C: 200 (events delivered via SSE)
    C->>API: DELETE /sessions/{id}
    API-->>C: 200
```
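The sequence above can be sketched in Python with the HTTP transport injected as a callable, which keeps the ordering logic testable without a running server. Several details here are assumptions, not documented behavior: the request body fields (`avatar_id`, `text`, and an offer shaped like the answer) are hypothetical, as are the names `Send` and `run_conversation`. The OpenAPI specification is the place to confirm the real field names.

```python
from typing import Any, Callable, Optional

# A transport takes (method, path, json_body) and returns the decoded JSON reply.
Send = Callable[[str, str, Optional[dict]], Any]


def run_conversation(send: Send, avatar_id: str, offer_sdp: str, text: str) -> dict:
    """Walk through the request sequence, omitting the long-lived SSE stream."""
    models = send("GET", "/models", None)
    avatars = send("GET", "/avatars", None)

    # Assumption: the session body carries an avatar_id field.
    session = send("POST", "/sessions", {"avatar_id": avatar_id})
    session_id = session["session_id"]
    base = f"/sessions/{session_id}"
    try:
        # Assumption: the offer body mirrors the answer shape ({sdp, type}).
        answer = send("POST", f"{base}/webrtc/offer",
                      {"sdp": offer_sdp, "type": "offer"})
        # A real client would open GET {base}/events here and keep it open
        # so that chat output can arrive over SSE.
        # Assumption: the chat body carries a text field.
        send("POST", f"{base}/chat", {"text": text})
    finally:
        # Always release the session, even if signaling or chat fails.
        send("DELETE", base, None)
    return {
        "models": models,
        "avatars": avatars,
        "session_id": session_id,
        "answer": answer,
    }
```

The `try`/`finally` ensures the `DELETE` is issued whenever a session was created, so a failed WebRTC negotiation does not leave a session allocated on the server.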
OpenAPI specification
A complete OpenAPI 3.x specification is generated by FastAPI and is available at the following endpoints:
- Interactive documentation (Swagger UI): `<base>/docs`
- Alternative interactive documentation (ReDoc): `<base>/redoc`
- Raw specification (JSON): `<base>/openapi.json`
The OpenAPI specification is authoritative for exact field-level types and validation rules. Generated client SDKs and integration tests should source their definitions from the specification rather than this documentation.
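One way to source definitions from the specification is to enumerate its operations and compare them with what a client expects. The sketch below works on an already-loaded specification dictionary and relies only on the standard OpenAPI layout (`paths` mapping each path to per-method operation objects); `list_operations` is a hypothetical helper name.

```python
from typing import List, Tuple

# HTTP methods that may appear as keys of an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec: dict) -> List[Tuple[str, str]]:
    """Return (METHOD, path) pairs from an OpenAPI document, sorted by path."""
    operations = []
    for path, path_item in spec.get("paths", {}).items():
        for key in path_item:
            # Path items can also hold non-method keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations, key=lambda op: (op[1], op[0]))
```

The dictionary can be obtained by fetching `<base>/openapi.json` and decoding it with `json.load`; an integration test could then assert that every route it depends on appears in the returned list.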
Source files
| File | Routes |
|---|---|
| `apps/api/routes/health.py` | `/health`, `/healthz`, `/queue/status` |
| `apps/api/routes/models.py` | `/models` |
| `apps/api/routes/avatars.py` | `/avatars/*` |
| `apps/api/routes/sessions.py` | `/sessions/*` |
| `apps/api/routes/tts_preview.py` | `/tts/preview` |
| `apps/api/routes/voices.py` | `/voices/*`, `/voice-uploads/{token}` |
| `apps/api/routes/events.py` | `/sessions/{id}/events` (SSE) |
| `apps/api/schemas/session.py` | Request and response models for sessions. |
| `apps/api/schemas/avatar.py` | `AvatarSummary` response model. |