Audio

This section covers three endpoints: transcribing audio to text, translating non‑English speech to English text, and synthesizing speech from text.


POST https://api.aifoundryhub.com/v1/audio/transcriptions

Transcribes audio into text using a speech‑to‑text model.

curl -X POST "https://api.aifoundryhub.com/v1/audio/transcriptions" \
-H "Authorization: Bearer $AI_FOUNDRY_HUB_API_KEY" \
-H "Content-Type: multipart/form-data" \
-F model=whisper-large-v3 \
-F file=@/path/to/audio.mp3

Returns a transcription object containing the transcribed text.

{ "text": "Hello world." }
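The same request can be made without curl. The following is a minimal Python sketch using only the standard library; it builds the multipart body by hand, which is roughly what `curl -F` does. The helper names (`encode_multipart`, `build_transcription_request`, `transcribe`) are hypothetical, and the part-level `application/octet-stream` type is an assumption, since the accepted file content types are not stated here.

```python
import json
import os
import uuid
import urllib.request

API_URL = "https://api.aifoundryhub.com/v1/audio/transcriptions"


def encode_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # Assumption: a generic binary type is acceptable for the file part.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
    )
    parts.append(file_bytes)
    parts.append(f'\r\n--{boundary}--\r\n'.encode("utf-8"))
    # The boundary must appear in the Content-Type header of the request.
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_transcription_request(audio_path, api_key, model="whisper-large-v3"):
    """Build (but do not send) the POST request for an audio file."""
    with open(audio_path, "rb") as f:
        audio = f.read()
    body, content_type = encode_multipart(
        {"model": model}, "file", os.path.basename(audio_path), audio
    )
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )


def transcribe(audio_path, api_key):
    """Send the request and return the "text" field of the response."""
    request = build_transcription_request(audio_path, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["text"]
```

A call might look like `transcribe("/path/to/audio.mp3", os.environ["AI_FOUNDRY_HUB_API_KEY"])`. Building the request separately from sending it keeps the encoding step easy to inspect without network access.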

POST https://api.aifoundryhub.com/v1/audio/translations

Translates non‑English speech to English text.

curl -X POST "https://api.aifoundryhub.com/v1/audio/translations" \
-H "Authorization: Bearer $AI_FOUNDRY_HUB_API_KEY" \
-H "Content-Type: multipart/form-data" \
-F model=whisper-large-v3 \
-F file=@/path/to/audio.m4a

Returns a translation object containing the English text.

{ "text": "Hello world." }
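On the client side, the translation object can be read with any JSON parser. Below is a small Python sketch of that step. It assumes only the `text` field shown above; the `extract_text` helper is hypothetical, and any additional fields or error payloads the API may return are not covered.

```python
import json


def extract_text(response_body):
    """Return the "text" field from a JSON response body (str or bytes)."""
    payload = json.loads(response_body)
    # Guard against bodies that are valid JSON but not the expected object.
    if not isinstance(payload, dict) or not isinstance(payload.get("text"), str):
        raise ValueError("response has no string 'text' field")
    return payload["text"]


print(extract_text('{ "text": "Hello world." }'))  # prints: Hello world.
```

Validating the shape before indexing gives a clear error if the body is not the object documented above, instead of a bare `KeyError` or `TypeError`.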

POST https://api.aifoundryhub.com/v1/audio/speech

Generates spoken audio from text using a text‑to‑speech model.

curl -X POST "https://api.aifoundryhub.com/v1/audio/speech" \
-H "Authorization: Bearer $AI_FOUNDRY_HUB_API_KEY" \
-H "Content-Type: application/json" \
-o speech.mp3 \
-d '{
"model": "tts-1",
"voice": "alloy",
"input": "Hello! This is a test.",
"format": "mp3"
}'

Returns binary audio in the requested format. The response Content-Type header matches the chosen format.
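As a rough Python equivalent of the curl call above, the sketch below posts the same JSON body and writes the binary response to disk, using only the standard library. The function names and the default output path are hypothetical; the parameter values simply mirror the curl example, and no other voices or formats are assumed.

```python
import json
import urllib.request

API_URL = "https://api.aifoundryhub.com/v1/audio/speech"


def build_speech_request(text, api_key, model="tts-1", voice="alloy",
                         audio_format="mp3"):
    """Build (but do not send) the JSON POST request for speech synthesis."""
    payload = {
        "model": model,
        "voice": voice,
        "input": text,
        "format": audio_format,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def synthesize_to_file(text, api_key, out_path="speech.mp3"):
    """Send the request, save the audio bytes, and return the Content-Type."""
    request = build_speech_request(text, api_key)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        content_type = response.headers.get("Content-Type")
        # The body is raw audio, so it is written in binary mode as-is.
        out.write(response.read())
    return content_type
```

Returning the response Content-Type lets the caller confirm that the saved file matches the requested format before choosing a file extension.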