TsgcAIOpenAIChatBot
To build a ChatBot with voice commands, the following steps are required:
- The microphone audio must be captured and converted by a speech-to-text system into the text that will be sent to OpenAI.
- The microphone audio is captured using the TsgcAudioRecorderMCI component.
- Once the audio has been captured, it is sent to the OpenAI Whisper API, which converts the audio file to text.
- The transcribed text is then sent to OpenAI using the ChatCompletion API.
- The response from OpenAI is then converted to speech using one of the following components:
  - TsgcTextToSpeechSystem: (currently only for Windows) uses the text-to-speech engine of the Windows operating system.
  - TsgcTextToSpeechGoogle: sends the response from OpenAI to the Google Cloud servers, which return an mp3 file that is played by the TsgcAudioPlayerMCI.
  - TsgcTextToSpeechAmazon: sends the response from OpenAI to the Amazon AWS servers, which return an mp3 file that is played by the TsgcAudioPlayerMCI.
Properties
- OpenAIOptions: configures the OpenAI properties.
  - ApiKey: an API key is required to interact with the OpenAI APIs.
  - LogOptions
    - Enabled: if set to true, the API requests are logged to a text file.
    - FileName: the filename of the log.
  - Organization: an optional OpenAI API field.
- ChatBotOptions: configures the ChatBot properties.
  - Transcription: configures the OpenAI Transcription API settings.
    - Model: the transcription model, by default whisper-1.
    - Language: the language code of the speech (helps the model transcribe the speech more accurately).
  - ChatCompletion: configures the OpenAI ChatCompletion API settings.
    - Model: the chat model, by default gpt-3.5-turbo.
- AudioRecorder: assign a TsgcAudioRecorder component to capture the microphone audio.
- TextToSpeech: assign a TsgcTextToSpeech component to play the response from OpenAI as speech.
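The properties above could be set in code roughly as follows. This is a minimal configuration sketch, not a definitive reference: it assumes that LogOptions and Organization are nested under OpenAIOptions, that Model is a string property, and that the log file name and organization id shown are placeholders chosen only for illustration.

```pascal
// Sketch only: the property nesting below is inferred from the list above.
sgcChatBot := TsgcAIOpenAIChatBot.Create(nil);
// OpenAI account settings
sgcChatBot.OpenAIOptions.ApiKey := 'your_openai_api_key';
sgcChatBot.OpenAIOptions.Organization := 'your_organization_id'; // optional, placeholder value
// log the API requests to a text file (hypothetical file name)
sgcChatBot.OpenAIOptions.LogOptions.Enabled := True;
sgcChatBot.OpenAIOptions.LogOptions.FileName := 'chatbot.log';
// speech-to-text settings (whisper-1 is the documented default)
sgcChatBot.ChatBotOptions.Transcription.Model := 'whisper-1';
sgcChatBot.ChatBotOptions.Transcription.Language := 'en';
// chat settings (gpt-3.5-turbo is the documented default)
sgcChatBot.ChatBotOptions.ChatCompletion.Model := 'gpt-3.5-turbo';
```

Setting the two Model properties explicitly is only needed when a value other than the default is wanted.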
Events
- OnAudioStart: the event is called when the audio recording starts.
- OnAudioStop: the event is called after the audio recording stops.
- OnTranscription: the event is called when receiving a response from OpenAI Transcription API with the Speech-To-Text result.
- OnChatCompletion: the event is called when receiving a response from the OpenAI ChatCompletion API with the Content text.
Code Example
Create a new ChatBot that uses the default text-to-speech engine of Microsoft Windows. Call Start to begin recording the audio, and Stop to end the recording, send the audio to the OpenAI API and obtain the response from ChatGPT.
// ... create the chatbot component
sgcChatBot := TsgcAIOpenAIChatBot.Create(nil);
sgcChatBot.OpenAIOptions.ApiKey := 'your_openai_api_key';
sgcChatBot.ChatBotOptions.Transcription.Language := 'en';
// ... create the audio recorder and the text-to-speech component
sgcAudioRecorder := TsgcAudioRecorderMCI.Create(nil);
sgcTextToSpeech := TsgcTextToSpeechSystem.Create(nil);
// ... assign audio components to chatbot
sgcChatBot.AudioRecorder := sgcAudioRecorder;
sgcChatBot.TextToSpeech := sgcTextToSpeech;
// ... start the chatbot, speak into the microphone, then stop to process the captured audio
sgcChatBot.Start;
// ... speak
sgcChatBot.Stop;