OpenAI Transcription Delphi Client (3 / 5)

Transcribing audio to text (also known as speech to text) is very easy using the OpenAI API: just upload an audio file in one of the following formats: mp3, mp4, mpeg, mpga, m4a, wav, or webm. The API returns the transcription as a string.
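
Since the API only accepts the formats listed above, it can help to check the file extension before uploading. The following is a minimal sketch, not part of the sgcWebSockets library: the program name, the SupportedExtensions constant, and the IsSupportedAudioFile function are hypothetical names introduced for illustration. It only looks at the extension and does not inspect the file contents.

```pascal
program CheckAudioFormat;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

const
  // Extensions taken from the list of formats accepted for transcription.
  SupportedExtensions: array[0..6] of string =
    ('.mp3', '.mp4', '.mpeg', '.mpga', '.m4a', '.wav', '.webm');

// Hypothetical helper: returns True when the file extension
// (compared case-insensitively) is one of the supported formats.
function IsSupportedAudioFile(const aFilename: string): Boolean;
var
  vExt, vCandidate: string;
begin
  Result := False;
  // ExtractFileExt returns the extension including the leading dot.
  vExt := LowerCase(ExtractFileExt(aFilename));
  for vCandidate in SupportedExtensions do
    if vExt = vCandidate then
      Exit(True);
end;

begin
  if IsSupportedAudioFile('meeting.MP3') then
    Writeln('meeting.MP3 can be uploaded');
  if not IsSupportedAudioFile('notes.flac') then
    Writeln('notes.flac must be converted first');
end.
```

A check like this could be called before building the transcription request, so that unsupported files are rejected locally instead of after an upload.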

Transcription Delphi Example

OpenAI requires you to build a request in which you pass the audio file, the model, the temperature (to obtain a more or less random output), and so on. Here is the list of available parameters.

- Filename: (Required) The audio file to transcribe, in one of these formats: mp3, mp4, mpeg, mpga, m4a, wav, or webm.
- Model: (Required) ID of the model to use. Only whisper-1 is currently available.
- Prompt: An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.
- ResponseFormat: The format of the transcript output, in one of these options: json, text, srt, verbose_json, or vtt.
- Temperature: The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.
- Language: The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.
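
To illustrate how the optional parameters above might be combined, here is a hedged sketch. It assumes that the request class exposes Prompt, Language, and Temperature as properties with the same names as in the list, that Prompt and Language are strings, and that Temperature is a floating-point value; these types are assumptions and should be checked against the library's declarations. The procedure name DoFileTranscriptionWithOptions is hypothetical, and OpenAI and DoLog are expected to exist in the calling unit, as in the basic example.

```pascal
// Sketch only: property types for Prompt, Language and Temperature
// are assumed, not confirmed by the library documentation.
procedure DoFileTranscriptionWithOptions(const aFilename: string);
var
  oRequest: TsgcOpenAIClass_Request_Transcription;
  oResponse: TsgcOpenAIClass_Response_Transcription;
begin
  oRequest := TsgcOpenAIClass_Request_Transcription.Create;
  try
    // Required parameters.
    oRequest.Filename := aFilename;
    oRequest.Model := 'whisper-1';
    // Optional parameters (assumed to be string, string and float).
    oRequest.Language := 'en';      // ISO-639-1 code of the spoken language
    oRequest.Prompt := 'Weekly engineering meeting.'; // same language as the audio
    oRequest.Temperature := 0.2;    // lower values give more deterministic output
    oResponse := OpenAI.CreateTranscriptionFromFile(oRequest);
    try
      DoLog(oResponse.Text);
    finally
      oResponse.Free;
    end;
  finally
    oRequest.Free;
  end;
end;
```

ResponseFormat is left out on purpose, because the source does not say whether the property takes a string or an enumerated value.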


Here is a simple example that transcribes an audio file using whisper-1:

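// OpenAI (the library's client component) and DoLog (a logging helper)
// are expected to be available in the calling unit.
procedure DoFileTranscription(const aFilename: string);
var
  oRequest: TsgcOpenAIClass_Request_Transcription;
  oResponse: TsgcOpenAIClass_Response_Transcription;
begin
  oRequest := TsgcOpenAIClass_Request_Transcription.Create;
  try
    // Required parameters: the audio file and the model ID.
    oRequest.Filename := aFilename;
    oRequest.Model := 'whisper-1';
    // Upload the file and wait for the transcription.
    oResponse := OpenAI.CreateTranscriptionFromFile(oRequest);
    try
      // Text holds the transcribed audio.
      DoLog(oResponse.Text);
    finally
      oResponse.Free;
    end;
  finally
    oRequest.Free;
  end;
end;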

Here is the compiled demo for Windows, built with the sgcWebSockets OpenAI Delphi Library.