
# Deployment

OmniRT's deployment surface splits along two axes: hardware backend and network environment.

| Scenario | Start here |
| --- | --- |
| NVIDIA GPU production | CUDA deployment |
| Ascend Atlas / 910 / 910B | Ascend backend |
| Domestic / intranet / offline | Domestic deployment |
| Containerized (Docker / k8s) | Docker & containers |
| Gateway + workers + Redis / OTLP | Distributed serving |

## Validate before deploying

Before touching real hardware, run `omnirt validate` and `omnirt generate --dry-run` against `--backend cpu-stub` to confirm your request contract and model registry. See Quickstart.
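The check above can be scripted as a pre-deployment gate. This is a minimal sketch that assumes the `omnirt` CLI is on `PATH`; it uses only the commands and flags named in this page and degrades gracefully when the CLI is not installed.

```shell
# Pre-deployment smoke check (sketch): exercise the request contract and
# model registry against the CPU stub backend, with no real hardware.
if command -v omnirt >/dev/null 2>&1; then
  # Commands as documented above; cpu-stub avoids touching GPUs/NPUs.
  omnirt validate --backend cpu-stub &&
    omnirt generate --dry-run --backend cpu-stub
  status="validated"
else
  echo "omnirt CLI not found on PATH; skipping pre-deployment check"
  status="skipped"
fi
```

Wiring this into CI means a deployment image is never built from a registry or contract that the stub backend rejects.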

If your target deployment needs async jobs, cross-process job state, Prometheus scraping, or remote workers, continue with Distributed serving.