Omixa API

Omixa API Docs

An OpenAI-compatible resale API backed by Azure deployments, with wallet billing, streaming responses, token accounting, and detailed usage monitoring.

Overview

Base URL
Base path

All developer endpoints are versioned under /api/v1.

https://www.omixa.cloud/api/v1

Compatibility

Chat, embeddings, and images use OpenAI-style request bodies while Omixa handles billing and Azure key routing.

Authentication

Bearer

Use your developer API key as a bearer token. Optional HMAC signatures can be enforced with OMIXA_REQUIRE_SIGNATURE=true.

Authorization: Bearer omx_live_xxx
Content-Type: application/json
X-Omixa-Timestamp: 1780660800
X-Omixa-Signature: sha256=<hmac_sha256(timestamp.method./api/v1/path.raw_body)>
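
The signature header above can be computed roughly as follows. This is a minimal sketch, not the confirmed Omixa scheme: it assumes the four parts are joined with literal dots in the order shown, that the HTTP method is uppercased, that the digest is hex-encoded, and that a separate signing secret is used as the HMAC key. The function name sign_request and the secret parameter are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Dict, Optional


def sign_request(secret: str, method: str, path: str, raw_body: bytes,
                 timestamp: Optional[int] = None) -> Dict[str, str]:
    """Build the two signature headers for one request.

    Assumptions (not confirmed by the docs): parts are dot-joined,
    the method is uppercased, and the digest is hex-encoded.
    """
    ts = str(int(time.time()) if timestamp is None else timestamp)
    # timestamp.method./api/v1/path.raw_body, as bytes
    message = b".".join([ts.encode(), method.upper().encode(), path.encode(), raw_body])
    digest = hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()
    return {
        "X-Omixa-Timestamp": ts,
        "X-Omixa-Signature": "sha256=" + digest,
    }
```

Sign the exact bytes that are sent as the request body; re-serializing the JSON after signing can change whitespace or key order and invalidate the signature.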

Models Catalog

Pricing
Method  Path            Purpose
GET     /api/v1/models  List enabled models with modality, provider, and prices.
GET     /catalog        Visual developer catalog inside the dashboard.

Chat Completions

POST
curl -X POST https://www.omixa.cloud/api/v1/chat/completions \
  -H "Authorization: Bearer omx_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}]}'
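
The same call can be made from Python with only the standard library. The sketch below builds the request shown in the curl example; the helper name build_chat_request is hypothetical, and the shape of the response body is assumed to follow the OpenAI chat completions format.

```python
import json
import urllib.request
from typing import Dict, List

BASE_URL = "https://www.omixa.cloud/api/v1"


def build_chat_request(api_key: str, model: str,
                       messages: List[Dict[str, str]]) -> urllib.request.Request:
    """Build (but do not send) a POST to /chat/completions."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=body,
        method="POST",
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
    )


# Sending is left to the caller, for example:
#   req = build_chat_request("omx_live_xxx", "gpt-4o-mini",
#                            [{"role": "user", "content": "Hello"}])
#   with urllib.request.urlopen(req, timeout=60) as resp:
#       reply = json.load(resp)
```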

Streaming

SSE
curl -N -X POST https://www.omixa.cloud/api/v1/chat/completions \
  -H "Authorization: Bearer omx_live_xxx" \
  -H "Accept: text/event-stream" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","stream":true,"messages":[{"role":"user","content":"Write a short test."}]}'

When stream is true, Omixa returns OpenAI-compatible SSE frames as they arrive. Azure frames are passed through directly. Vertex Gemini responses are streamed through the native streamGenerateContent method and normalized into chat.completion.chunk events, ending with the final data: [DONE] marker.
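
On the client side, the stream can be consumed with a small line-based parser. The sketch below assumes each data: frame carries one JSON object on a single line and that text arrives in choices[].delta.content, as in OpenAI's chat.completion.chunk format. It skips SSE comment lines (those starting with a colon) and stops at data: [DONE]. The function names are hypothetical.

```python
import json
from typing import Dict, Iterable, Iterator


def iter_chunks(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield parsed chat.completion.chunk objects from raw SSE lines."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith(":"):
            continue  # blank frame separators and SSE comment lines
        if not line.startswith("data:"):
            continue  # ignore other SSE fields such as event: or id:
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream marker
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the delta.content pieces of a whole stream."""
    parts = []
    for chunk in iter_chunks(lines):
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)
```

Any iterable of decoded text lines works as input, for example the lines of an open HTTP response decoded as UTF-8.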

For the lowest first-token latency, keep AZURE_STREAM_CHUNK_BYTES=128, AZURE_STREAM_HEARTBEAT_SECONDS=1, and AZURE_STREAM_FLUSH_PADDING_BYTES=8192. Omixa sends SSE comment heartbeats while providers are still thinking, so proxies and browsers do not treat the connection as idle.

In production, disable response buffering, gzip, and HTTP/3/QUIC for /api/v1/chat/completions and /playground/chat/completions at the CDN/proxy layer. The Laravel response also sets X-Accel-Buffering: no, Cache-Control: no-transform, and Alt-Svc: clear.

Reasoning speed (the reasoning_effort value) is normalized per Azure model family before the request reaches Azure. The original gpt-5 models accept the value "minimal", gpt-5.1 accepts the value "none", and newer models such as gpt-5.4 and gpt-5.5 accept the range from "none" through "xhigh". gpt-5-pro accepts only "high".

{
  "model": "gpt-5.5",
  "stream": true,
  "reasoning_effort": "xhigh",
  "verbosity": "medium",
  "service_tier": "priority"
}

service_tier: "priority" is Azure Priority processing. Enable Priority processing on eligible Global Standard or Data Zone Standard deployments in Microsoft Foundry, or send service_tier per request. If the deployment does not support Priority processing, turn off Fast mode and Omixa will omit the priority request.

Embeddings

POST
curl -X POST https://www.omixa.cloud/api/v1/embeddings \
  -H "Authorization: Bearer omx_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{"model":"text-embedding-3-small","input":"Knowledge base text"}'

Image Generation

POST
curl -X POST https://www.omixa.cloud/api/v1/images/generations \
  -H "Authorization: Bearer omx_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-image-1","prompt":"A black and white futuristic API dashboard","n":1}'

Wallet and Billing

Ledger
Method  Path                         Purpose
GET     /api/v1/wallet               Current balance and locked balance.
POST    /api/v1/wallet/top-up        Manual/demo top-up endpoint.
GET     /api/v1/wallet/transactions  Full wallet movement history.

Usage Reports

Live
Method  Path                    Purpose
GET     /api/v1/usage/requests  Per-request logs, tokens, cost, and latency.
GET     /api/v1/usage/rollups   Aggregated hourly and daily usage.
GET     /usage                  Visual dashboard reports and live polling.

Errors

JSON
{
  "error": {
    "code": "insufficient_balance",
    "message": "Wallet balance is not enough for this request."
  }
}

Status  Meaning
402     Wallet balance is not enough.
429     Developer key exceeded rate limits.
503     No healthy Azure deployment available.
5xx     Upstream or internal processing failure.
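
A client can use the status code and the error envelope together to decide what to do next. The sketch below is one possible policy, not something the API prescribes: it treats 429 and all 5xx responses as retryable and everything else, including 402, as requiring action before retrying. The function name classify_error and the retry rule are assumptions.

```python
import json
from typing import Dict, Optional


def classify_error(status: int, body: str) -> Dict[str, Optional[object]]:
    """Extract error.code / error.message and suggest whether to retry."""
    try:
        err = json.loads(body).get("error", {})
    except (ValueError, AttributeError):
        err = {}  # body was not JSON, or not a JSON object
    if not isinstance(err, dict):
        err = {}
    # Assumed policy: rate limits and server-side failures may succeed later;
    # 402 needs a wallet top-up first, so retrying unchanged will not help.
    retry = status == 429 or 500 <= status < 600
    return {
        "status": status,
        "code": err.get("code"),
        "message": err.get("message"),
        "retry": retry,
    }
```

When retrying, back off between attempts rather than resending immediately, especially after a 429.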

Azure Deployments

Admin

Add many deployments for the same model to distribute load across Azure resources, regions, and keys. Omixa checks RPM, TPM, and hourly token windows before routing.

php artisan omixa:add-azure-deployment gpt-4o-mini eastus-key-1 \
  https://YOUR_RESOURCE.openai.azure.com gpt-4o-mini AZURE_KEY_HERE \
  --region=eastus --rpm=120 --tpm=200000 --hourly-tokens=1000000
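
To illustrate the kind of check described above, the sketch below filters deployments by RPM, TPM, and hourly token headroom and then picks one. It is an illustration only, not Omixa's routing code: the Deployment class, its counter fields, the token estimate, and the "least loaded by TPM" tie-break are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Deployment:
    name: str
    rpm_limit: int
    tpm_limit: int
    hourly_token_limit: int
    requests_this_minute: int = 0  # hypothetical rolling counters
    tokens_this_minute: int = 0
    tokens_this_hour: int = 0

    def has_headroom(self, estimated_tokens: int) -> bool:
        """True if one more request of this size fits in all three windows."""
        return (
            self.requests_this_minute + 1 <= self.rpm_limit
            and self.tokens_this_minute + estimated_tokens <= self.tpm_limit
            and self.tokens_this_hour + estimated_tokens <= self.hourly_token_limit
        )


def pick_deployment(deployments: List[Deployment],
                    estimated_tokens: int) -> Optional[Deployment]:
    """Return the eligible deployment with the lowest TPM utilisation."""
    eligible = [d for d in deployments if d.has_headroom(estimated_tokens)]
    if not eligible:
        return None  # the caller would answer with a 503 in this case
    return min(eligible, key=lambda d: d.tokens_this_minute / max(d.tpm_limit, 1))
```

With limits like those in the command above (--rpm=120 --tpm=200000 --hourly-tokens=1000000), a deployment that has already served 120 requests in the current minute would be skipped in favour of another deployment registered for the same model.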