Omixa API Docs
OpenAI-compatible resale API backed by Azure deployments, wallet billing, streaming responses, token accounting, and detailed usage monitoring.
Overview
Base URL
All developer endpoints are versioned under /api/v1.
https://www.omixa.cloud/api/v1
Chat, embeddings, and images use OpenAI-style request bodies while Omixa handles billing and Azure key routing.
Authentication
Bearer
Use your developer API key as a bearer token. Optional HMAC signatures can be enforced by setting OMIXA_REQUIRE_SIGNATURE=true.
Authorization: Bearer omx_live_xxx
Content-Type: application/json
X-Omixa-Timestamp: 1780660800
X-Omixa-Signature: sha256=<hmac_sha256(timestamp.method./api/v1/path.raw_body)>
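The signature header above can be computed along the following lines. This is a minimal sketch, not Omixa's reference implementation: it assumes the signed string is the timestamp, HTTP method, path, and raw body joined with ".", that the method is upper-cased, that the digest is hex-encoded, and that the HMAC key is a separate signing secret. The function name `sign_request` and the secret value are hypothetical.

```python
import hashlib
import hmac
import time


def sign_request(secret: str, method: str, path: str, raw_body: bytes, timestamp: int) -> dict:
    """Return the two signature headers for a single request.

    Assumptions (not confirmed by the docs): parts are joined with ".",
    the method is upper-cased, and the digest is hex-encoded.
    """
    message = b".".join([
        str(timestamp).encode(),
        method.upper().encode(),
        path.encode(),
        raw_body,  # sign the exact bytes that will be sent
    ])
    digest = hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()
    return {
        "X-Omixa-Timestamp": str(timestamp),
        "X-Omixa-Signature": f"sha256={digest}",
    }


body = b'{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}]}'
headers = sign_request(
    "signing_secret_here",  # hypothetical secret
    "POST",
    "/api/v1/chat/completions",
    body,
    int(time.time()),
)
```

Signing the raw bytes rather than a re-serialized JSON object avoids mismatches caused by key ordering or whitespace differences between client and server.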
Models Catalog
Pricing
| Method | Path | Purpose |
|---|---|---|
| GET | /api/v1/models | List enabled models with modality, provider, and prices. |
| GET | /catalog | Visual developer catalog inside the dashboard. |
Chat Completions
POST
curl -X POST https://www.omixa.cloud/api/v1/chat/completions \
-H "Authorization: Bearer omx_live_xxx" \
-H "Content-Type: application/json" \
-d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}]}'
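The same request can be issued from Python using only the standard library. This is a sketch under the assumption that the endpoint accepts the body shown in the curl example; `build_chat_request` and `send` are illustrative helper names, not part of any Omixa SDK.

```python
import json
import urllib.request

BASE_URL = "https://www.omixa.cloud/api/v1"


def build_chat_request(api_key: str, model: str, user_text: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


request = build_chat_request("omx_live_xxx", "gpt-4o-mini", "Hello")
# reply = send(request)  # performs the network call
```

If the response follows the OpenAI chat completion shape, which is an assumption here, the reply text would be found at `reply["choices"][0]["message"]["content"]`.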
Streaming
SSE
curl -N -X POST https://www.omixa.cloud/api/v1/chat/completions \
-H "Authorization: Bearer omx_live_xxx" \
-H "Accept: text/event-stream" \
-H "Content-Type: application/json" \
-d '{"model":"gpt-4o-mini","stream":true,"messages":[{"role":"user","content":"Write a short test."}]}'
When stream is true, Omixa returns OpenAI-compatible SSE frames as they arrive. Azure frames are passed through directly, while Vertex Gemini uses native streamGenerateContent, and its output is normalized into chat.completion.chunk events that end with the final data: [DONE] marker.
For the lowest first-token latency, keep AZURE_STREAM_CHUNK_BYTES=128, AZURE_STREAM_HEARTBEAT_SECONDS=1, and AZURE_STREAM_FLUSH_PADDING_BYTES=8192. Omixa sends SSE comment heartbeats while providers are still thinking, so proxies and browsers do not treat the connection as idle.
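A client consuming this stream has to skip the comment heartbeats and stop at the [DONE] marker. The sketch below shows one way to do that over an iterable of decoded text lines. It assumes each `data:` payload fits on a single line and that chunks carry text under `choices[].delta.content`, as in OpenAI-style streams; `iter_chunks` and `collect_text` are illustrative names.

```python
import json
from typing import Iterable, Iterator


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed chunk objects from SSE lines until data: [DONE]."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line or line.startswith(":"):
            continue  # blank frame separator or SSE comment heartbeat
        if not line.startswith("data:"):
            continue  # ignore other SSE fields such as event: or id:
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield json.loads(data)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the delta content of every chunk in the stream."""
    parts = []
    for chunk in iter_chunks(lines):
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)
```

In practice the lines would come from the HTTP response body, for example by iterating over the response object and decoding each line as UTF-8.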
In production, disable response buffering, gzip, and HTTP/3/QUIC for /api/v1/chat/completions and /playground/chat/completions at the CDN/proxy layer. The Laravel response also sets X-Accel-Buffering: no, Cache-Control: no-transform, and Alt-Svc: clear.
Reasoning effort is normalized per Azure model family before the request reaches Azure. Original gpt-5 models support minimal, gpt-5.1 supports none, and newer models such as gpt-5.4 and gpt-5.5 support none through xhigh. gpt-5-pro supports only high.
{
  "model": "gpt-5.5",
  "stream": true,
  "reasoning_effort": "xhigh",
  "verbosity": "medium",
  "service_tier": "priority"
}
service_tier: "priority" is Azure Priority processing. Enable Priority processing on eligible Global Standard or Data Zone Standard deployments in Microsoft Foundry, or send service_tier per request. If the deployment does not support Priority processing, turn off Fast mode and Omixa will omit the priority request.
Embeddings
POST
curl -X POST https://www.omixa.cloud/api/v1/embeddings \
-H "Authorization: Bearer omx_live_xxx" \
-H "Content-Type: application/json" \
-d '{"model":"text-embedding-3-small","input":"Knowledge base text"}'
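Embedding vectors are typically compared with cosine similarity. The sketch below assumes the response follows the OpenAI embeddings shape, with vectors under `data[].embedding` and an `index` field per item. That response shape is an assumption, and both helper names are illustrative.

```python
import math
from typing import List


def extract_embeddings(response: dict) -> List[List[float]]:
    """Return the embedding vectors ordered by their index field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Sorting by `index` keeps each vector aligned with its input when several texts are embedded in one request.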
Image Generation
POST
curl -X POST https://www.omixa.cloud/api/v1/images/generations \
-H "Authorization: Bearer omx_live_xxx" \
-H "Content-Type: application/json" \
-d '{"model":"gpt-image-1","prompt":"A black and white futuristic API dashboard","n":1}'
Wallet and Billing
Ledger
| Method | Path | Purpose |
|---|---|---|
| GET | /api/v1/wallet | Current balance and locked balance. |
| POST | /api/v1/wallet/top-up | Manual/demo top-up endpoint. |
| GET | /api/v1/wallet/transactions | Full wallet movement history. |
Usage Reports
Live
| Method | Path | Purpose |
|---|---|---|
| GET | /api/v1/usage/requests | Per-request logs, tokens, cost, and latency. |
| GET | /api/v1/usage/rollups | Aggregated hourly and daily usage. |
| GET | /usage | Visual dashboard reports and live polling. |
Errors
JSON
{
  "error": {
    "code": "insufficient_balance",
    "message": "Wallet balance is not enough for this request."
  }
}
Wallet balance is not enough.
Developer key exceeded rate limits.
No healthy Azure deployment available.
Upstream or internal processing failure.
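On the client side, the error envelope can be turned into a typed exception so callers can branch on the code field. This is a minimal sketch: `OmixaError` and `parse_error` are hypothetical names, and `unknown_error` is a placeholder used here for bodies that do not match the envelope, not a code documented by Omixa.

```python
import json


class OmixaError(Exception):
    """Carries the HTTP status plus the code and message from the error body."""

    def __init__(self, status: int, code: str, message: str):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message


def parse_error(status: int, body: bytes) -> OmixaError:
    """Build an OmixaError from a response body, tolerating malformed input."""
    try:
        error = json.loads(body)["error"]
        return OmixaError(status, error["code"], error["message"])
    except (ValueError, KeyError, TypeError):
        # Not valid JSON, or not shaped like {"error": {"code", "message"}}.
        return OmixaError(status, "unknown_error", body.decode("utf-8", "replace"))
```

A caller might, for example, prompt for a wallet top-up when `err.code == "insufficient_balance"` and surface everything else as a generic failure.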
Azure Deployments
Admin
Add multiple deployments for the same model to distribute load across Azure resources, regions, and keys. Omixa checks RPM, TPM, and hourly token windows before routing.
php artisan omixa:add-azure-deployment gpt-4o-mini eastus-key-1 \
https://YOUR_RESOURCE.openai.azure.com gpt-4o-mini AZURE_KEY_HERE \
--region=eastus --rpm=120 --tpm=200000 --hourly-tokens=1000000
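To illustrate what checking RPM, TPM, and hourly token windows before routing can look like, here is a simplified sketch. It is not Omixa's actual router: the `Deployment` fields, the least-loaded tie-break, and the `pick_deployment` helper are all assumptions made for illustration, and the usage counters are assumed to be maintained elsewhere.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Deployment:
    name: str
    rpm_limit: int            # requests per minute
    tpm_limit: int            # tokens per minute
    hourly_token_limit: int   # tokens per hour
    requests_this_minute: int = 0
    tokens_this_minute: int = 0
    tokens_this_hour: int = 0

    def has_room(self, estimated_tokens: int) -> bool:
        """True if one more request of this size fits in every window."""
        return (
            self.requests_this_minute + 1 <= self.rpm_limit
            and self.tokens_this_minute + estimated_tokens <= self.tpm_limit
            and self.tokens_this_hour + estimated_tokens <= self.hourly_token_limit
        )


def pick_deployment(deployments: List[Deployment], estimated_tokens: int) -> Optional[Deployment]:
    """Pick the eligible deployment with the lowest per-minute token load."""
    candidates = [d for d in deployments if d.has_room(estimated_tokens)]
    if not candidates:
        return None  # no deployment has capacity in all three windows
    return min(candidates, key=lambda d: d.tokens_this_minute / max(d.tpm_limit, 1))


pool = [
    Deployment("eastus-key-1", rpm_limit=120, tpm_limit=200000, hourly_token_limit=1000000),
]
chosen = pick_deployment(pool, estimated_tokens=1500)
```

Registering several deployments for one model gives a selection step like this more candidates to choose from, which is what spreads load across resources, regions, and keys.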