# Deployment
OmniRT's deployment surface splits along two axes: hardware backend and network environment.

| Scenario | Start here |
|---|---|
| NVIDIA GPU production | CUDA deployment |
| Ascend Atlas / 910 / 910B | Ascend backend |
| Domestic / intranet / offline | Domestic deployment |
| Containerized (Docker / k8s) | Docker & containers |
| Gateway + workers + Redis / OTLP | Distributed serving |
> **Validate before deploying**
>
> Before touching real hardware, run `omnirt validate` and `omnirt generate --dry-run` with `--backend cpu-stub` to confirm your request contract and model registry. See Quickstart.
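The pre-deployment check above can be sketched as a short shell session. This is a minimal sketch using only the commands and flags named in the note; treat the exact flag spelling and ordering as assumptions to verify against your installed OmniRT version.

```shell
# Validate the request contract and model registry against the
# stub backend -- no GPU or NPU hardware required.
omnirt validate --backend cpu-stub

# Dry-run a generation request to exercise the request path
# end to end without touching real hardware.
omnirt generate --dry-run --backend cpu-stub
```

If both commands succeed, proceed to the hardware-specific guide for your scenario from the table above.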
If your target deployment needs async jobs, cross-process job state, Prometheus scraping, or remote workers, continue with Distributed serving.