Audio Transcription API
The Audio Transcription service transcribes audio files and can optionally translate the result into other languages.
Overview
The transcription API accepts audio files and returns text transcriptions, with support for multiple languages.
This endpoint is OpenAI-compatible.
Transcription Endpoint
POST /v1/audio/transcriptions
This endpoint transcribes audio files into text, with an option for translation into different languages.
Request Parameters
file (file, required): The audio file to transcribe (mp3, mp4, wav, m4a, webm)
model (string, optional): Model to use (default: "KrosMLingualSTT1.0.0")
language (string, optional): Optional language specification
prompt (string, optional): Text to guide the transcription
response_format (string, optional): Output format (default: "json")
temperature (float, optional): Model temperature (default: 0.0)
Request Body
file (file, required): The audio file to be transcribed. Supported formats: mp3, mp4, wav, m4a, webm.
Response
choices (array):
  text (string): The resulting transcribed and optionally translated text.
  index (integer): The array index of the transcription choice.
  finish_reason (string): Explanation of why the transcription process concluded.
model (string): Identifies the transcription/translation model used.
object (string): Specifies the type of response object.
Example Request
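A minimal sketch of calling the endpoint from Python using only the standard library. The path and parameter names come from this page. The host in BASE_URL, the Bearer-token Authorization header, and the helper names build_multipart and transcribe are assumptions made for illustration, so substitute the values your deployment uses.

```python
import json
import mimetypes
import os
import urllib.request
import uuid

# Assumption: hypothetical host, not taken from this documentation.
BASE_URL = "https://api.example.com"


def build_multipart(fields, file_name, file_bytes):
    """Encode text fields plus one part named "file" as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    mime = mimetypes.guess_type(file_name)[0] or "application/octet-stream"
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        ).encode()
    )
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, api_key, language=None):
    """POST an audio file to /v1/audio/transcriptions and return the parsed JSON."""
    fields = {"model": "KrosMLingualSTT1.0.0", "response_format": "json"}
    if language:
        fields["language"] = language
    with open(path, "rb") as audio:
        body, content_type = build_multipart(
            fields, os.path.basename(path), audio.read()
        )
    request = urllib.request.Request(
        BASE_URL + "/v1/audio/transcriptions",
        data=body,
        method="POST",
        headers={
            "Content-Type": content_type,
            # Assumption: the API key is sent as a Bearer token.
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call would then look like transcribe("meeting.wav", api_key="YOUR_API_KEY", language="en"). The file name and the "en" language code are placeholders, since the accepted language values are not listed on this page.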
Example Response
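A sketch of reading a response, assuming the field layout listed under Response. The payload below is made up for illustration (including the finish_reason value), and extract_text is a hypothetical helper name.

```python
def extract_text(response):
    """Join the text of every choice, ordered by its index."""
    choices = response.get("choices") or []
    ordered = sorted(choices, key=lambda choice: choice.get("index", 0))
    return " ".join(
        choice["text"].strip() for choice in ordered if choice.get("text")
    )


# Illustrative payload only: the values are invented, the keys follow the
# Response section of this page.
sample = {
    "choices": [
        {"text": "Hello, and welcome.", "index": 0, "finish_reason": "stop"}
    ],
    "model": "KrosMLingualSTT1.0.0",
}

print(extract_text(sample))  # prints: Hello, and welcome.
```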
Best Practices
Transcription
Use high-quality audio for better results
Specify the language when known for improved accuracy
Keep background noise to a minimum
Error Handling
All APIs return standard HTTP status codes:
200: Success
400: Bad request (check parameters)
401: Unauthorized (check API key)
429: Rate limit exceeded
500: Server error
Error responses include a detail field with more information about the error.
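One way to turn these status codes into readable messages is sketched below. The status hints and the detail field come from the list above. The helper name describe_error and the choice to treat 429 and 500 as retryable are assumptions, not documented behavior.

```python
import json

STATUS_HINTS = {
    400: "Bad request (check parameters)",
    401: "Unauthorized (check API key)",
    429: "Rate limit exceeded",
    500: "Server error",
}

# Assumption: rate-limit and server errors are worth retrying after a pause.
RETRYABLE = {429, 500}


def describe_error(status, body):
    """Return (message, retryable) for an error status and raw response body."""
    try:
        detail = json.loads(body).get("detail")
    except (ValueError, TypeError, AttributeError):
        # Body was empty, not JSON, or not a JSON object.
        detail = None
    hint = STATUS_HINTS.get(status, "Unexpected status")
    message = f"{status}: {hint}"
    if detail:
        message += f" - {detail}"
    return message, status in RETRYABLE
```

With urllib, this could be called from an except urllib.error.HTTPError as err block as describe_error(err.code, err.read()).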