Audio

Discover how to convert audio to text or text to audio. These endpoints are OpenAI-compatible.


POST http://localhost:33322/v1/audio/transcriptions

Create transcription

Transcribes speech into text.

Required attributes

  • Name
    file
    Type
    file
    Description

    The audio file to be transcribed. Supported file types: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.

  • Name
    model
    Type
    string
    Description

    The model used for transcription. WARNING: currently, this attribute is ignored and the default model is used.

Optional attributes

  • Name
    create_session
    Type
    bool
    Description

    If present and true, a new audio session is created and used for the transcription, and the session's UUID is returned in the response object. A session keeps track of past inferences, which can be useful for live transcription, where continuous audio is submitted across several requests.

  • Name
    session
    Type
    UUID
    Description

    The UUID of an existing session, which will be used for the transcription.

Request

POST /v1/audio/transcriptions
curl http://localhost:33322/v1/audio/transcriptions \
  -H "Authorization: Bearer no-key-required" \
  -H "Content-Type: multipart/form-data" \
  -F file="@/path/to/file/audio.mp3" \
  -F model="default"

Response

{
  "text": "The woods are lovely, dark and deep, but I have promises to keep and miles to go before I sleep, and miles to go before I sleep."
}
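
The request above, together with the optional session attributes, could be sent from Python roughly as follows. This is a minimal sketch using only the standard library, not an official client. The helper names (`build_multipart`, `transcription_fields`, `transcribe`) are hypothetical, and sending `create_session` as the form value `true` is an assumption about how the server parses booleans.

```python
import json
import urllib.request
import uuid
from pathlib import Path

BASE_URL = "http://localhost:33322"  # host and port as used in the curl example


def build_multipart(fields, filename, file_bytes):
    """Encode text fields plus one 'file' part as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode()
        )
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n".encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcription_fields(create_session=False, session=None):
    """Collect the non-file form fields for a transcription request."""
    fields = {"model": "default"}  # currently ignored by the server, but required
    if create_session:
        fields["create_session"] = "true"  # assumption: booleans are sent as "true"
    if session is not None:
        fields["session"] = str(session)  # UUID of an existing session
    return fields


def transcribe(path, create_session=False, session=None):
    """POST an audio file and return the decoded JSON response object."""
    audio = Path(path)
    body, content_type = build_multipart(
        transcription_fields(create_session, session), audio.name, audio.read_bytes()
    )
    request = urllib.request.Request(
        f"{BASE_URL}/v1/audio/transcriptions",
        data=body,
        headers={
            "Authorization": "Bearer no-key-required",
            "Content-Type": content_type,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For live transcription, one would call `transcribe(path, create_session=True)` for the first chunk and pass the returned UUID as `session=` on later calls. The name of the response key that carries the UUID is not stated here, so inspect the returned object to find it.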

GET http://localhost:33322/v1/audio/transcriptions/status

Transcription status

Shows the current status of the audio transcriptions endpoint, such as the active model, the progress of any model download, and recent errors.

Response attributes

  • Name
    active_model
    Type
    string
    Description

    The model that is currently active for this endpoint.

  • Name
    download_ongoing
    Type
    bool
    Description

    True while the model for this endpoint is being downloaded.

  • Name
    download_progress
    Type
    number
    Description

    The progress of the ongoing model download in percent.

  • Name
    last_errors
    Type
    string[]
    Description

    Errors that occurred recently on this endpoint.

Request

GET /v1/audio/transcriptions/status
curl http://localhost:33322/v1/audio/transcriptions/status \
  -H "Authorization: Bearer no-key-required"

Response

{
  "active_model": "ggml-distil-small.en.bin",
  "download_ongoing": false,
  "download_progress": 100,
  "last_errors": ["Custom { kind: PermissionDenied, error: \"verboten\" }"]
}
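
A client might check this endpoint before submitting audio, for example to wait until a model download has finished. The sketch below is one way to read the status object in Python with the standard library. The function names `fetch_status` and `describe_status` are hypothetical, and the summary wording is only illustrative.

```python
import json
import urllib.request

STATUS_URL = "http://localhost:33322/v1/audio/transcriptions/status"


def fetch_status(url=STATUS_URL):
    """GET the status endpoint and return the decoded JSON object."""
    request = urllib.request.Request(
        url, headers={"Authorization": "Bearer no-key-required"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def describe_status(status):
    """Turn a status object into a one-line, human-readable summary."""
    if status.get("download_ongoing"):
        line = f"downloading model: {status.get('download_progress', 0)}%"
    else:
        line = f"no download in progress, active model: {status.get('active_model')}"
    errors = status.get("last_errors") or []
    if errors:
        line += f" ({len(errors)} recent error(s))"
    return line
```

Calling `describe_status(fetch_status())` against a running server prints a summary of this kind. Keeping the formatting separate from the network call makes it easy to try on a saved response.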