Omixa API Documentation

Overview

Base URL

Base path

All developer endpoints are versioned under /api/v1.

https://www.omixa.cloud/api/v1

Compatibility

The chat, embeddings, and image endpoints accept OpenAI-style request bodies, while Omixa handles billing and Azure key routing.

Authentication

Bearer

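Send your developer API key as a bearer token in the Authorization header. HMAC request signatures are optional unless the server sets OMIXA_REQUIRE_SIGNATURE=true, in which case every request must also carry the X-Omixa-Timestamp and X-Omixa-Signature headers.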

Authorization: Bearer omx_live_xxx
Content-Type: application/json
X-Omixa-Timestamp: 1780660800
X-Omixa-Signature: sha256=<hmac_sha256(timestamp.method./api/v1/path.raw_body)>
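
To make the signature header above concrete, here is a minimal Python sketch. It rests on several assumptions this page does not confirm: the four parts are joined with literal dots in the order shown, the method is upper-cased, the digest is hex-encoded, and the HMAC key is a separate signing secret (`signing_secret` is a placeholder name, since the page does not say which secret is used). Treat it as an illustration to check against your Omixa configuration, not as the reference implementation.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign_request(signing_secret: str, timestamp: int, method: str,
                 path: str, raw_body: bytes) -> str:
    """Build an X-Omixa-Signature value.

    Assumed message layout: b"<timestamp>.<METHOD>.<path>." + raw_body,
    signed with HMAC-SHA256 and hex-encoded.
    """
    message = f"{timestamp}.{method.upper()}.{path}.".encode("utf-8") + raw_body
    digest = hmac.new(signing_secret.encode("utf-8"), message, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def signed_headers(api_key: str, signing_secret: str, method: str, path: str,
                   raw_body: bytes, timestamp: Optional[int] = None) -> dict:
    """Return the four headers shown above for one request."""
    ts = int(time.time()) if timestamp is None else timestamp
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "X-Omixa-Timestamp": str(ts),
        "X-Omixa-Signature": sign_request(signing_secret, ts, method, path, raw_body),
    }
```

Sign the exact bytes you send as the body. Re-serializing the JSON after signing can change whitespace or key order and would invalidate the signature.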

Models Catalog

Pricing

Method	Path	Purpose
GET	`/api/v1/models`	List enabled models with modality, provider, and prices.
GET	`/catalog`	Visual developer catalog inside the dashboard.

Chat Completions

POST

curl -X POST https://www.omixa.cloud/api/v1/chat/completions \
  -H "Authorization: Bearer omx_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}]}'
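
The same request can be sketched in Python using only the standard library. The helper names `build_chat_request` and `send` are illustrative, not part of any Omixa SDK, and the response is returned as parsed JSON without assuming its exact fields.

```python
import json
import urllib.request

BASE_URL = "https://www.omixa.cloud/api/v1"


def build_chat_request(api_key: str, model: str, user_text: str) -> urllib.request.Request:
    """Prepare (but do not send) a chat completion request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `send(build_chat_request("omx_live_xxx", "gpt-4o-mini", "Hello"))` would perform the same call as the curl command above. A non-2xx status raises `urllib.error.HTTPError`.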

Streaming

SSE

curl -N -X POST https://www.omixa.cloud/api/v1/chat/completions \
  -H "Authorization: Bearer omx_live_xxx" \
  -H "Accept: text/event-stream" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","stream":true,"messages":[{"role":"user","content":"Write a short test."}]}'

When stream is true, Omixa returns OpenAI-compatible SSE frames as they arrive. Azure frames are passed through unchanged. Vertex Gemini responses come from the native streamGenerateContent call and are normalized into chat.completion.chunk events, ending with the final data: [DONE] marker.

For the lowest first-token latency, keep AZURE_STREAM_CHUNK_BYTES=128, AZURE_STREAM_HEARTBEAT_SECONDS=1, and AZURE_STREAM_FLUSH_PADDING_BYTES=8192. Omixa sends SSE comment heartbeats while providers are still thinking, so proxies and browsers do not treat the connection as idle.
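
On the client side, the stream described above can be consumed with a small parser. The sketch below assumes each event carries a single `data:` line (multi-line SSE events are not handled) and that chunks follow the OpenAI `chat.completion.chunk` shape with text under `choices[].delta.content`. Heartbeats are skipped because SSE comments start with a colon.

```python
import json
from typing import Iterable, Iterator


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON payloads from SSE lines until the [DONE] marker."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line or line.startswith(":"):
            continue  # blank event separator or SSE comment (heartbeat)
        if not line.startswith("data:"):
            continue  # ignore other SSE fields such as "event:" or "id:"
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end of stream
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the delta.content pieces of a streamed completion."""
    parts = []
    for chunk in iter_chunks(lines):
        for choice in chunk.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)
```

`lines` can be any iterable of decoded text lines, for example the lines of an HTTP response body decoded as UTF-8.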

In production, disable response buffering, gzip, and HTTP/3/QUIC for /api/v1/chat/completions and /playground/chat/completions at the CDN/proxy layer. The Laravel response also sets X-Accel-Buffering: no, Cache-Control: no-transform, and Alt-Svc: clear.

Reasoning speed (the reasoning_effort field) is normalized per Azure model family before the request reaches Azure. Original gpt-5 models support minimal, gpt-5.1 supports none, and newer models such as gpt-5.4 and gpt-5.5 support none through xhigh. gpt-5-pro supports only high.

{
  "model": "gpt-5.5",
  "stream": true,
  "reasoning_effort": "xhigh",
  "verbosity": "medium",
  "service_tier": "priority"
}

Setting service_tier to "priority" requests Azure Priority processing. Enable Priority processing on eligible Global Standard or Data Zone Standard deployments in Microsoft Foundry, or send service_tier per request. If the deployment does not support Priority processing, turn off Fast mode so that Omixa omits the priority request.

Embeddings

POST

curl -X POST https://www.omixa.cloud/api/v1/embeddings \
  -H "Authorization: Bearer omx_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{"model":"text-embedding-3-small","input":"Knowledge base text"}'

Image Generation

POST

curl -X POST https://www.omixa.cloud/api/v1/images/generations \
  -H "Authorization: Bearer omx_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-image-1","prompt":"A black and white futuristic API dashboard","n":1}'

Wallet and Billing

Ledger

Method	Path	Purpose
GET	`/api/v1/wallet`	Current balance and locked balance.
POST	`/api/v1/wallet/top-up`	Manual/demo top-up endpoint.
GET	`/api/v1/wallet/transactions`	Full wallet movement history.

Usage Reports

Live

Method	Path	Purpose
GET	`/api/v1/usage/requests`	Per-request logs, tokens, cost, and latency.
GET	`/api/v1/usage/rollups`	Aggregated hourly and daily usage.
GET	`/usage`	Visual dashboard reports and live polling.

Errors

JSON

{
  "error": {
    "code": "insufficient_balance",
    "message": "Wallet balance is not enough for this request."
  }
}

Status	Meaning
402	Wallet balance is not enough.
429	Developer key exceeded rate limits.
503	No healthy Azure deployment available.
5xx	Upstream or internal processing failure.
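
A client can turn the error body and status codes above into a small structured value. In the sketch below, the `retryable` flag is an assumption rather than documented Omixa behavior: it treats 429 and 5xx (including 503) as worth retrying with backoff, and 402 as requiring a wallet top-up first. The `ApiError` and `parse_error` names are illustrative.

```python
import json
from dataclasses import dataclass


@dataclass
class ApiError:
    status: int
    code: str
    message: str
    retryable: bool


def parse_error(status: int, body: bytes) -> ApiError:
    """Decode an error response, tolerating bodies that are not the expected JSON."""
    try:
        error = json.loads(body).get("error", {})
    except (ValueError, AttributeError):
        error = {}  # not JSON, or JSON that is not an object
    if not isinstance(error, dict):
        error = {}
    # Assumption: rate limits and server-side failures may succeed on retry.
    retryable = status == 429 or 500 <= status <= 599
    return ApiError(
        status=status,
        code=str(error.get("code", "unknown")),
        message=str(error.get("message", "")),
        retryable=retryable,
    )
```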

Azure Deployments

Admin

Add multiple deployments for the same model to distribute load across Azure resources, regions, and keys. Before routing a request, Omixa checks each deployment's RPM, TPM, and hourly token windows.

php artisan omixa:add-azure-deployment gpt-4o-mini eastus-key-1 \
  https://YOUR_RESOURCE.openai.azure.com gpt-4o-mini AZURE_KEY_HERE \
  --region=eastus --rpm=120 --tpm=200000 --hourly-tokens=1000000
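
To illustrate the idea of checking RPM, TPM, and hourly token windows before routing, here is a hypothetical Python sketch. It is not Omixa's router: the `Deployment` fields, the token estimate, and the least-loaded tie-break are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Deployment:
    name: str
    rpm_limit: int
    tpm_limit: int
    hourly_token_limit: int
    requests_this_minute: int = 0
    tokens_this_minute: int = 0
    tokens_this_hour: int = 0

    def has_headroom(self, estimated_tokens: int) -> bool:
        """True if one more request of this size fits in all three windows."""
        return (
            self.requests_this_minute + 1 <= self.rpm_limit
            and self.tokens_this_minute + estimated_tokens <= self.tpm_limit
            and self.tokens_this_hour + estimated_tokens <= self.hourly_token_limit
        )


def pick_deployment(deployments: Sequence[Deployment],
                    estimated_tokens: int) -> Optional[Deployment]:
    """Return the eligible deployment with the lowest per-minute token load."""
    candidates = [d for d in deployments if d.has_headroom(estimated_tokens)]
    if not candidates:
        return None  # corresponds to the 503 "no healthy deployment" case
    return min(candidates, key=lambda d: d.tokens_this_minute / max(d.tpm_limit, 1))
```

With several deployments registered under the limits from the command above (120 RPM, 200000 TPM, 1000000 hourly tokens), a deployment that has already served 120 requests in the current minute would be skipped in favor of one with spare capacity.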
