Onwinds is an OpenAI-compatible gateway for teams and production systems. Your organisation controls models, routing, governance, and spend centrally, and your users and your apps share the same control plane.
Onwinds exposes three public endpoints. They are OpenAI-compatible, so existing SDKs work with a base URL swap.
`GET /v1/models` returns the tenant’s logical models.
`POST /v1/chat/completions` supports streaming and enforces policy and spend limits before tokens flow.
`POST /v1/embeddings` routes embeddings through your allowlist and bills deterministically.
Use your Onwinds API key with a standard OpenAI client. Set the base URL to `https://api.onwinds.com/v1`.
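As a minimal sketch of the base URL swap, the snippet below builds a chat completion request with only the Python standard library. The base URL is an assumption inferred from the curl example in this guide, `build_chat_request` and `send` are hypothetical helper names, and the model name is the one used in that same example. An OpenAI SDK client would normally take the same key and base URL through its own configuration.

```python
import json
import urllib.request

# Assumption: base URL inferred from the curl example in this guide.
BASE_URL = "https://api.onwinds.com/v1"


def build_chat_request(prompt, api_key, model="gpt-5-mini"):
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {
        "model": model,
        "max_tokens": 256,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # The API key is tenant-scoped, so no tenant ID goes in the body.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the parsed JSON body (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a valid key, `send(build_chat_request("Write a concise incident update.", api_key))` should return an OpenAI-style completion object, assuming the response shape matches the OpenAI API.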
Governance is enforced per request. Use these headers to require EU-only routing and a stricter retention posture.
Set `x-onwinds-region-required` to `eu` to require an EU-compliant route. If no compliant route exists, the request does not run (fail-closed).
Set `x-onwinds-retention-mode` to `zdr` to request Zero Data Retention for Onwinds logging paths while preserving billing metadata.
curl https://api.onwinds.com/v1/chat/completions \
-H "Authorization: Bearer ow_…" \
-H "Content-Type: application/json" \
-H "x-onwinds-region-required: eu" \
-H "x-onwinds-retention-mode: zdr" \
-d '{"model":"gpt-5-mini","stream":true,"max_tokens":256,"messages":[{"role":"user","content":"Write a concise incident update."}]}'

Embeddings are tenant scoped. Use the embeddings endpoint with the logical model `onwinds/embeddings`.
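An embeddings call might look like the sketch below. It assumes the endpoint lives at `/embeddings` under the base URL used in the curl example and that the response follows the OpenAI embeddings shape (a `data` list of items with `index` and `embedding`). `build_embeddings_request` and `extract_vectors` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: base URL inferred from the curl example in this guide.
BASE_URL = "https://api.onwinds.com/v1"


def build_embeddings_request(texts, api_key, model="onwinds/embeddings"):
    """Build (but do not send) an OpenAI-style embeddings request."""
    body = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        f"{BASE_URL}/embeddings",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_vectors(response_body):
    """Return embedding vectors in input order.

    Assumes the OpenAI response shape: {"data": [{"index": 0, "embedding": [...]}]}.
    """
    items = sorted(response_body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```

Sorting by `index` keeps each vector aligned with its input text even if the items arrive out of order.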
List the logical models your tenant has enabled and allowlisted.
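A minimal sketch of listing models, assuming the standard OpenAI path `/models` under the base URL from the curl example and the OpenAI list shape (a `data` array of objects with an `id`). `build_models_request` and `model_ids` are hypothetical helper names.

```python
import urllib.request

# Assumption: base URL inferred from the curl example in this guide.
BASE_URL = "https://api.onwinds.com/v1"


def build_models_request(api_key):
    """Build (but do not send) a request that lists the tenant's logical models."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def model_ids(response_body):
    """Extract model IDs from an OpenAI-style list response."""
    return [model["id"] for model in response_body.get("data", [])]
```

Checking the returned IDs before sending traffic is a cheap way to confirm that a model is enabled and allowlisted for your tenant.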
Requests are always executed in a tenant context. API keys are tenant-scoped and the gateway derives context server-side; clients do not provide tenant IDs in the payload.
Users sign in to the workspace and the platform applies organisation policy (allowed models, governance, and spend) automatically.
Apps use an API key on the OpenAI-compatible endpoints. Each request is attributed and billed deterministically.
Onwinds returns structured errors and attaches request identifiers to help you debug failures deterministically.
On failure, you’ll receive a machine-readable and (when available) a .
Responses include request identifiers (for example via / ) so incidents are debuggable without guesswork.
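One way to capture everything needed for a bug report is sketched below. Because this guide does not spell out the error field or header names here, the code assumes none of them: it keeps the HTTP status, every response header, and the parsed error body. `describe_failure` and `send` are hypothetical helper names.

```python
import json
import urllib.error
import urllib.request


def describe_failure(status, headers, raw_body):
    """Collect failure details without assuming specific field or header names."""
    try:
        parsed = json.loads(raw_body)
    except (ValueError, TypeError):
        parsed = None  # Body was not JSON (for example, a proxy error page).
    return {
        "status": status,
        "headers": dict(headers),  # Request identifiers should appear here.
        "error": parsed,
        "raw": raw_body if parsed is None else None,
    }


def send(request):
    """Send a request and raise with full failure details on an HTTP error."""
    try:
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    except urllib.error.HTTPError as exc:
        body = exc.read().decode("utf-8", "replace")
        details = describe_failure(exc.code, exc.headers.items(), body)
        raise RuntimeError(f"Onwinds request failed: {details}") from exc
```

Logging the whole header set, not a single named header, means the request identifier is captured whichever header carries it.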