Welcome to OmniRT

Digital-human multimodal runtime with deployable Ascend / 910B adaptation and CUDA-compatible paths.

OmniRT is a unified generation runtime for digital-human pipelines. Voice generation, audio-driven avatars, avatar assets, idle video material, and post-processing share the same GenerateRequest / GenerateResult / RunReport contract, CLI / Python API, request validation flow, and hardware backend abstraction.
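To make the shared contract concrete, here is a minimal sketch of how a request, result, and report might fit together. Only the three class names come from the text above; every field, the `fake_generate` helper, and the output path layout are hypothetical stand-ins, not OmniRT's actual schema.

```python
from dataclasses import dataclass, field
from typing import Any


# Hypothetical stand-ins: the class names appear in the docs, but all
# fields below are assumptions made for illustration only.
@dataclass
class GenerateRequest:
    task: str                      # e.g. "text2audio"
    model: str                     # e.g. "cosyvoice3-triton-trtllm"
    backend: str = "cpu-stub"      # "ascend", "cuda", or "cpu-stub"
    inputs: dict[str, Any] = field(default_factory=dict)


@dataclass
class RunReport:
    backend: str
    duration_s: float


@dataclass
class GenerateResult:
    artifacts: list[str]
    report: RunReport


def fake_generate(req: GenerateRequest) -> GenerateResult:
    """Stub executor: returns a placeholder artifact path and runs no model."""
    ext = {"text2audio": "wav", "audio2video": "mp4"}.get(req.task, "bin")
    return GenerateResult(
        artifacts=[f"out/{req.model}.{ext}"],
        report=RunReport(backend=req.backend, duration_s=0.0),
    )


req = GenerateRequest(
    task="text2audio",
    model="cosyvoice3-triton-trtllm",
    inputs={"prompt": "hello"},
)
res = fake_generate(req)
print(res.artifacts, res.report.backend)
```

The point of the sketch is the shape: one request type goes in, and every run returns its artifacts together with a report, regardless of task or backend.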

General image and video models that are already integrated remain available, but the project no longer measures progress by model count. The main line of work is an end-to-end digital-human pipeline that is deployable, reproducible, and benchmarkable. Ascend / 910B is the priority path for private deployment adaptation, while CUDA remains the mainstream backend for development, validation, and compatibility.

OmniRT is stable with

  • Clear digital-human line — TTS, talking avatars, avatar assets, idle video, and post-processing are the highest-priority path
  • Reproducible Ascend / 910B path — runtime profiles, resident workers, real-hardware smoke tests, benchmarks, and deployment notes move together
  • One request contract — GenerateRequest / GenerateResult / RunReport cover batch generation surfaces
  • Backend-neutral runtime — the same request validates and runs on ascend, cuda, and cpu-stub; CUDA stays the mainstream compatibility path
  • Clear task surfaces — text2audio, audio2video, and asset / material generation share the same API shape
  • Standardized artifacts — images export as PNG, audio as WAV, and videos as MP4; every run ships a RunReport
  • Self-describing models — the registry exposes min_vram_gb, recommended presets, etc. via omnirt models
  • Offline friendly — local model directories, HF repo ids, and single-file weights are all first-class

OmniRT is flexible with

  • Three entry points — Python API, CLI (omnirt generate / validate / models), and FastAPI server
  • Focused core models — FlashTalk / FlashHead / LiveAct / CosyVoice / SenseVoice / SoulX-Podcast are the current validation line
  • China-region friendly — ModelScope, HF-Mirror, offline snapshots and internal mirrors work out of the box
  • Async dispatch — queue / worker / policies for batched requests and multi-model queues
  • Pluggable telemetry — middleware.telemetry plugs into your observability stack
  • Safe defaults — --dry-run and validate catch misconfigurations before you burn GPU time
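The CLI entry point listed above can also be driven from a script. The sketch below shells out to `omnirt models`, the one subcommand invocation the text spells out; the `list_models` wrapper is a hypothetical helper, and no additional flags or output format are assumed.

```python
import shutil
import subprocess
from typing import Optional


def list_models() -> Optional[str]:
    """Return the raw output of `omnirt models`, or None if the CLI is absent.

    Hypothetical convenience wrapper; it does not parse the output because
    the output format is not specified here.
    """
    if shutil.which("omnirt") is None:
        return None  # CLI not installed or not on PATH
    proc = subprocess.run(
        ["omnirt", "models"],
        capture_output=True,
        text=True,
        check=False,  # leave error handling to the caller
    )
    return proc.stdout


print(list_models() or "omnirt not found on PATH")
```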

Model Maintenance Boundary

OmniRT now maintains models in three tiers:

  • Core: the digital-human path. Requires real-hardware smoke tests, benchmarks, and deployment docs. Examples: soulx-flashtalk-14b, soulx-liveact-14b, soulx-flashhead-1.3b, cosyvoice3-triton-trtllm, sensevoice-small, and soulx-podcast-1.7b.
  • Adjacent: avatar assets, backgrounds, idle video, and other digital-human production inputs, for example sdxl-base-1.0, flux2.dev, qwen-image, svd-xt, and wan2.2-*.
  • Experimental: existing general image / video integrations that are no longer headline promises. They keep registry entries, basic tests, and opportunistic maintenance.

See the full registry at Supported Models, and the digital-human priority boundary at Support Status.
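As an illustration of the tiering, the example models named above can be written down as a small lookup. This is a sketch only: the `TIERS` dict and `tier_of` helper are hypothetical, the lists contain just the examples given here (not the full registry), and treating every unlisted model as experimental is an assumption of the sketch.

```python
from fnmatch import fnmatchcase

# Example model ids copied from the tier descriptions above.
# "wan2.2-*" is kept as a glob pattern, as written in the text.
TIERS = {
    "core": [
        "soulx-flashtalk-14b",
        "soulx-liveact-14b",
        "soulx-flashhead-1.3b",
        "cosyvoice3-triton-trtllm",
        "sensevoice-small",
        "soulx-podcast-1.7b",
    ],
    "adjacent": ["sdxl-base-1.0", "flux2.dev", "qwen-image", "svd-xt", "wan2.2-*"],
}


def tier_of(model_id: str) -> str:
    """Return the tier whose example list matches model_id.

    Falls back to "experimental" for anything unlisted (an assumption).
    """
    for tier, patterns in TIERS.items():
        if any(fnmatchcase(model_id, pattern) for pattern in patterns):
            return tier
    return "experimental"


print(tier_of("soulx-flashtalk-14b"))
print(tier_of("wan2.2-t2v-14b"))
```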

Public task surfaces today

| Task | Inputs | Output | Representative models |
| --- | --- | --- | --- |
| text2image | prompt | PNG | sdxl-base-1.0, flux2.dev, qwen-image |
| image2image | prompt + image | PNG | sdxl-base-1.0, sdxl-refiner-1.0 |
| text2audio | prompt | WAV | cosyvoice3-triton-trtllm, indextts, soulx-podcast-1.7b |
| audio2text | audio | TXT | sensevoice-small |
| text2video | prompt | MP4 | wan2.2-t2v-14b, animate-diff-sdxl |
| image2video | prompt + first-frame | MP4 | svd, svd-xt, wan2.2-i2v-14b |
| audio2video | audio + portrait | MP4 | soulx-flashtalk-14b, soulx-flashhead-1.3b, soulx-liveact-14b |
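The table above can be read as a lookup from task to required inputs and output format. The following sketch transcribes it into a dict and checks a request for missing inputs before anything runs. The input key names (for example `first_frame`) and the `missing_inputs` helper are assumptions for illustration, not OmniRT's real parameter names or validation flow.

```python
# Transcribed from the task table: task -> (required inputs, output format).
# Key names such as "first_frame" are hypothetical spellings of the
# table's "first-frame" column entry.
TASK_SURFACES = {
    "text2image": ({"prompt"}, "PNG"),
    "image2image": ({"prompt", "image"}, "PNG"),
    "text2audio": ({"prompt"}, "WAV"),
    "audio2text": ({"audio"}, "TXT"),
    "text2video": ({"prompt"}, "MP4"),
    "image2video": ({"prompt", "first_frame"}, "MP4"),
    "audio2video": ({"audio", "portrait"}, "MP4"),
}


def missing_inputs(task: str, provided: set) -> set:
    """Return the required inputs that the caller has not supplied."""
    if task not in TASK_SURFACES:
        raise ValueError(f"unknown task surface: {task}")
    required, _output_format = TASK_SURFACES[task]
    return required - provided


print(missing_inputs("audio2video", {"audio"}))
print(missing_inputs("text2image", {"prompt"}))
```

A check of this kind is cheap to run up front, which is the same idea as validating a request before spending accelerator time on it.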

Stable boundary

inpaint, edit, and video2video have runtime plumbing in place but are still evolving as public task surfaces. See Support Status.

Dig deeper

  • Roadmap — digital-human priorities and where general-model support is being scaled back
  • Architecture — how the interface, engine, executors, and telemetry layers fit together
  • Domestic deployment — ModelScope / HF-Mirror / offline snapshots