GPT Audio

Developer documentation

GPT Audio supports speech, transcription, translation, and voice generation workflows.

Model Reference

Audio and speech models

Covers text-to-speech, GPT audio chat, realtime sessions, transcription-oriented catalog entries, and voice settings.

Endpoint: https://www.omixa.cloud/api/v1/audio

gpt-audio

Audio
Context window: 128,000 tokens
Max output: 16,384 tokens

Input per 1M tokens: $2.50
Output per 1M tokens: $10.00
Minimum hold: $0.01
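
As a rough worked example of the rates above, the sketch below estimates the token cost of a single request. The function name `estimate_cost` and the token counts are illustrative, not part of Omixa's API. How the minimum hold is applied is not specified here, so the sketch leaves it out.

```python
from decimal import Decimal

# Rates copied from the pricing rows above (USD per 1M tokens).
INPUT_RATE = Decimal("2.50")
OUTPUT_RATE = Decimal("10.00")
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated token cost in USD (minimum hold not modeled)."""
    input_cost = INPUT_RATE * input_tokens / TOKENS_PER_UNIT
    output_cost = OUTPUT_RATE * output_tokens / TOKENS_PER_UNIT
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 1,000 input tokens and 500 output tokens.
    print(estimate_cost(1_000, 500))
```

`Decimal` is used instead of `float` so that small per-request amounts are not distorted by binary rounding.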

Integration reference

Connect GPT Audio

Use Omixa's unified endpoint and your workspace API key. Provider routing, billing, failover, and usage records are handled by Omixa.

POST https://www.omixa.cloud/api/v1/audio

Provider: azure-openai
Endpoint type: azure_openai_audio
Context window: 128,000 tokens
Max output: 16,384 tokens

Request schema

Request fields

Only send options supported by this model. Required fields and accepted values are listed below.

Field	Type	Required	Accepted values	Description
`model`	string	Yes	gpt-audio	Use `gpt-audio`. Omixa resolves the active provider route and failover key automatically.
`task`	string	No	auto, tts, audio_chat, realtime, transcription, translation	Audio workflow. Omixa maps it to speech, GPT audio chat, or realtime setup.
`input`	string	Yes	Any valid value	Text to speak, text prompt for audio chat, or realtime session instruction.
`voice`	string	No	alloy, ash, ballad, coral, echo, sage, shimmer, verse, marin, cedar	Voice name. Omixa maps common OpenAI voices to Google Gemini TTS voices when needed.
`response_format`	string	No	mp3, wav, opus, flac, pcm16, aac	Returned audio format.
`speed`	number	No	0.25-4	Speech speed for TTS routes.
`instructions`	string	No	Any valid value	Voice style or realtime/audio-chat behavior instructions.
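
The table above can be turned into a client-side check before a request is sent. The following is a minimal sketch, not Omixa's own validation: the helper name `validate_payload` is hypothetical, and treating the speed range as inclusive, rejecting an empty `input`, and flagging unknown fields are assumptions drawn from the table rather than documented server behavior.

```python
# Accepted values copied from the request field table.
ALLOWED = {
    "task": {"auto", "tts", "audio_chat", "realtime", "transcription", "translation"},
    "voice": {"alloy", "ash", "ballad", "coral", "echo", "sage",
              "shimmer", "verse", "marin", "cedar"},
    "response_format": {"mp3", "wav", "opus", "flac", "pcm16", "aac"},
}
KNOWN_FIELDS = {"model", "task", "input", "voice",
                "response_format", "speed", "instructions"}


def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems found in the payload (empty if none)."""
    errors = []
    if payload.get("model") != "gpt-audio":
        errors.append("model must be 'gpt-audio'")
    text = payload.get("input")
    if not isinstance(text, str) or not text:
        errors.append("input is required and must be a non-empty string")
    # Enumerated string fields are optional, but must match the table if present.
    for field, allowed in ALLOWED.items():
        if field in payload:
            value = payload[field]
            if not isinstance(value, str) or value not in allowed:
                errors.append(f"{field} must be one of {sorted(allowed)}")
    if "speed" in payload:
        speed = payload["speed"]
        is_number = isinstance(speed, (int, float)) and not isinstance(speed, bool)
        if not is_number or not 0.25 <= speed <= 4:
            errors.append("speed must be a number between 0.25 and 4")
    if "instructions" in payload and not isinstance(payload["instructions"], str):
        errors.append("instructions must be a string")
    # Assumption: fields outside the table are treated as mistakes.
    for field in sorted(set(payload) - KNOWN_FIELDS):
        errors.append(f"unknown field: {field}")
    return errors
```

Calling `validate_payload(payload)` before the HTTP request and refusing to send when the list is non-empty keeps obvious mistakes from reaching the endpoint.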

Ready to send

Payload and response

Start with this model-safe payload and expect the normalized Omixa response shape shown below it.

Example JSON payload

{
    "model": "gpt-audio",
    "task": "audio_chat",
    "input": "Create a short spoken answer explaining the benefits of one endpoint.",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1
}

Response shape

{
    "object": "audio.speech|audio.chat|realtime.session",
    "data": {
        "content_type": "audio/mpeg",
        "audio_base64": "<base64-audio>"
    },
    "usage": {
        "estimated_audio_minutes": "0.100000"
    }
}
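
To use the returned audio, the `audio_base64` value has to be decoded back to bytes. The sketch below assumes the field holds standard base64 and that the parsed response follows the shape above. The helper name `save_audio` is hypothetical, and some responses (for example a realtime session) may not carry audio at all, so the function raises in that case.

```python
import base64
from pathlib import Path


def save_audio(response_body: dict, path: str) -> int:
    """Decode data.audio_base64 and write it to path; return the byte count."""
    data = response_body.get("data") or {}
    encoded = data.get("audio_base64")
    if not encoded:
        raise ValueError("response has no data.audio_base64 field")
    audio_bytes = base64.b64decode(encoded)
    Path(path).write_bytes(audio_bytes)
    return len(audio_bytes)
```

With a parsed response in hand, `save_audio(response.json(), "answer.mp3")` would write the clip to disk. The file extension should match the requested `response_format`.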

Language examples

Copy-ready integration code

Replace the example API key with a workspace key and keep model-specific fields unchanged unless the table above marks them optional.

cURL

curl -X POST https://www.omixa.cloud/api/v1/audio \
  -H "Authorization: Bearer omx_live_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-audio",
    "task": "audio_chat",
    "input": "Create a short spoken answer explaining the benefits of one endpoint.",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1
}'

JavaScript fetch

const response = await fetch('https://www.omixa.cloud/api/v1/audio', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer omx_live_xxx',
    'Content-Type': 'application/json'
  },
  // Serialize the payload instead of hand-escaping a JSON string.
  body: JSON.stringify({
    model: 'gpt-audio',
    task: 'audio_chat',
    input: 'Create a short spoken answer explaining the benefits of one endpoint.',
    voice: 'alloy',
    response_format: 'mp3',
    speed: 1
  })
});
const data = await response.json();

Python requests

import requests

response = requests.post(
    'https://www.omixa.cloud/api/v1/audio',
    headers={'Authorization': 'Bearer omx_live_xxx', 'Content-Type': 'application/json'},
    json={
        "model": "gpt-audio",
        "task": "audio_chat",
        "input": "Create a short spoken answer explaining the benefits of one endpoint.",
        "voice": "alloy",
        "response_format": "mp3",
        "speed": 1,
    },
)
print(response.json())

PHP cURL

$ch = curl_init('https://www.omixa.cloud/api/v1/audio');
curl_setopt_array($ch, [
    CURLOPT_POST => true,
    CURLOPT_HTTPHEADER => ['Authorization: Bearer omx_live_xxx', 'Content-Type: application/json'],
    CURLOPT_POSTFIELDS => '{
    "model": "gpt-audio",
    "task": "audio_chat",
    "input": "Create a short spoken answer explaining the benefits of one endpoint.",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1
}',
    CURLOPT_RETURNTRANSFER => true,
]);
$response = curl_exec($ch);

C# HttpClient

using var client = new HttpClient();
client.DefaultRequestHeaders.Authorization = new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", "omx_live_xxx");
var json = @"{
    ""model"": ""gpt-audio"",
    ""task"": ""audio_chat"",
    ""input"": ""Create a short spoken answer explaining the benefits of one endpoint."",
    ""voice"": ""alloy"",
    ""response_format"": ""mp3"",
    ""speed"": 1
}";
var response = await client.PostAsync("https://www.omixa.cloud/api/v1/audio", new StringContent(json, System.Text.Encoding.UTF8, "application/json"));
var body = await response.Content.ReadAsStringAsync();

Go net/http

payload := []byte(`{
    "model": "gpt-audio",
    "task": "audio_chat",
    "input": "Create a short spoken answer explaining the benefits of one endpoint.",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1
}`)
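// Assumes imports "bytes", "log", and "net/http", and that this runs inside a function.
req, err := http.NewRequest(http.MethodPost, "https://www.omixa.cloud/api/v1/audio", bytes.NewReader(payload))
if err != nil {
	log.Fatal(err)
}
req.Header.Set("Authorization", "Bearer omx_live_xxx")
req.Header.Set("Content-Type", "application/json")
resp, err := http.DefaultClient.Do(req)
if err != nil {
	log.Fatal(err)
}
defer resp.Body.Close() // release the connection once the body has been read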

Production checklist

Operational notes

Authenticate with `Authorization: Bearer omx_live_xxx`.
Omixa handles provider keys, routing, billing, failover, and usage recording behind this endpoint.
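
In production the key is usually read from the environment rather than hard-coded. The sketch below shows one way to build the request headers. The environment variable name `OMIXA_API_KEY` and the helper `build_headers` are assumptions made for illustration, not names defined by Omixa.

```python
import os

API_URL = "https://www.omixa.cloud/api/v1/audio"


def build_headers(env_var: str = "OMIXA_API_KEY") -> dict:
    """Build request headers from a workspace key stored in the environment."""
    # OMIXA_API_KEY is a hypothetical variable name chosen for this example.
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"set {env_var} to your workspace API key")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

The result can be passed directly to a client call such as `requests.post(API_URL, headers=build_headers(), json=payload, timeout=30)`.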