# Welcome to OmniRT
OmniRT is a unified runtime for image, video, and audio-driven avatar models. Every task surface speaks the same `GenerateRequest` / `GenerateResult` / `RunReport` contract, shares the same CLI and Python API, runs the same request-validation flow, and runs on pluggable hardware backends.
Where you start depends on what you want to do:
- 🚀 Run a model — the shortest path from install to a working `text2image` request.
- 📘 Build an application — CLI / Python API, presets, service schema, deployment guides.
## OmniRT is stable with
- One request contract — `GenerateRequest` / `GenerateResult` / `RunReport` cover every public task surface
- Backend-neutral runtime — the same request validates and runs on `cuda`, `ascend`, and `cpu-stub`
- Clear task surfaces — `text2image`, `image2image`, `text2video`, `image2video`, and `audio2video` are all public APIs
- Standardized artifacts — images export as PNG, videos as MP4, and every run ships a `RunReport`
- Self-describing models — the registry exposes `min_vram_gb`, recommended presets, etc. via `omnirt models`
- Offline friendly — local model directories, HF repo ids, and single-file weights are all first-class
## OmniRT is flexible with
- Three entry points — Python API, CLI (`omnirt generate / validate / models`), and FastAPI server
- 16+ model families — SD1.5 / SDXL / SVD / FLUX / FLUX2 / WAN / AnimateDiff / ChronoEdit / FlashTalk …
- China-region friendly — ModelScope, HF-Mirror, offline snapshots, and internal mirrors work out of the box
- Async dispatch — `queue`/`worker`/`policies` for batched requests and multi-model queues
- Pluggable telemetry — `middleware.telemetry` plugs into your observability stack
- Safe defaults — `--dry-run` and `validate` catch misconfigurations before you burn GPU time
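The queue/worker split mentioned above is a standard pattern. This is a minimal, generic sketch of it in plain `asyncio` — it is not OmniRT's actual `queue`/`worker`/`policies` code, whose APIs are not shown on this page:

```python
import asyncio

# Generic queue/worker dispatch loop; OmniRT's real modules add batching,
# policies, and multi-model queues on top of this basic shape.
async def worker(name: str, queue: asyncio.Queue, done: list):
    while True:
        request = await queue.get()
        done.append(f"{name} handled {request}")  # stand-in for running a model
        queue.task_done()

async def dispatch(requests):
    queue = asyncio.Queue()
    done = []
    workers = [asyncio.create_task(worker(f"w{i}", queue, done)) for i in range(2)]
    for r in requests:
        queue.put_nowait(r)
    await queue.join()   # wait until every queued request is handled
    for w in workers:
        w.cancel()
    return done

results = asyncio.run(dispatch(["req-1", "req-2", "req-3"]))
```

Because workers pull from a shared queue, adding capacity is just starting more worker tasks; the submission side never changes.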
## Model map
OmniRT supports model families spanning:
- Image generation — SD1.5, SD2.1, SDXL, SD3, FLUX, FLUX2, Qwen-Image
- Video generation — SVD, SVD-XT, AnimateDiff-SDXL, WAN 2.2 T2V/I2V, CogVideoX, Hunyuan-Video, LTX2, ChronoEdit
- Avatar generation — SoulX-FlashTalk
- Generalist image editing — Generalist Image family
See the full registry at Supported Models or run `omnirt models` locally.
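Self-describing metadata is what makes the registry queryable before you download anything. A hypothetical sketch of what `omnirt models`-style entries enable — the entries, VRAM numbers, and helper below are invented for illustration, with only `min_vram_gb` taken from this page:

```python
# Invented registry entries; only the min_vram_gb field name comes from the docs.
REGISTRY = {
    "sd15":          {"task": "text2image",  "min_vram_gb": 4},
    "sdxl-base-1.0": {"task": "text2image",  "min_vram_gb": 10},
    "svd":           {"task": "image2video", "min_vram_gb": 16},
}

def models_that_fit(vram_gb: float) -> list[str]:
    """List registry ids whose minimum VRAM requirement fits the given budget."""
    return sorted(m for m, meta in REGISTRY.items()
                  if meta["min_vram_gb"] <= vram_gb)

print(models_that_fit(12))  # the 16 GB video model is filtered out
```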
## Public task surfaces today
| Task | Inputs | Output | Representative models |
|---|---|---|---|
| `text2image` | prompt | PNG | `sd15`, `sdxl-base-1.0`, `flux2.dev`, `qwen-image` |
| `image2image` | prompt + image | PNG | `sd15`, `sd21`, `sdxl-base-1.0`, `sdxl-refiner-1.0` |
| `text2video` | prompt | MP4 | `wan2.2-t2v-14b`, `cogvideox-2b`, `hunyuan-video` |
| `image2video` | prompt + first frame | MP4 | `svd`, `svd-xt`, `wan2.2-i2v-14b`, `ltx2-i2v` |
| `audio2video` | audio + portrait | MP4 | `soulx-flashtalk-14b` |
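The Inputs column above is exactly the kind of thing an up-front `validate` / `--dry-run` pass can check before any GPU time is spent. A minimal, hypothetical version of such a pre-flight check — the table, input names, and error messages are illustrative, not OmniRT's real validation flow:

```python
# Hypothetical pre-flight check in the spirit of `omnirt validate` / `--dry-run`.
# Required inputs mirror the Inputs column of the task-surface table.
REQUIRED_INPUTS = {
    "text2image":  {"prompt"},
    "image2image": {"prompt", "image"},
    "text2video":  {"prompt"},
    "image2video": {"prompt", "first_frame"},
    "audio2video": {"audio", "portrait"},
}

def validate_request(task: str, provided: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the request can run."""
    if task not in REQUIRED_INPUTS:
        return [f"unknown task surface: {task}"]
    missing = REQUIRED_INPUTS[task] - provided
    return [f"{task}: missing required input '{name}'" for name in sorted(missing)]

print(validate_request("image2video", {"prompt"}))
# One problem is reported: the first-frame image was never provided.
```

A check like this costs milliseconds, which is why running it by default before dispatch is cheap insurance.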
Stable boundary

`inpaint`, `edit`, and `video2video` have runtime plumbing in place but are still evolving as public task surfaces. See support status.
## Dig deeper
- Roadmap — what we plan to support next
- Architecture — how the interface, engine, executors, and telemetry layers fit together
- Domestic deployment — ModelScope / HF-Mirror / offline snapshots