TsgcAIOpenAIChatBot
To build a ChatBot with voice commands, the following steps are required:
- The microphone audio must be captured and passed through a speech-to-text system to obtain the text that is sent to OpenAI.
- The microphone audio is captured with the TsgcAudioRecorderMCI component.
- The captured audio is sent to the OpenAI Whisper API, which converts the audio file to text.
- The transcribed text is then sent to OpenAI using the ChatCompletion API.
- The response from OpenAI is converted to speech using one of the following components:
  - TsgcTextToSpeechSystem: (currently Windows only) uses the text-to-speech engine of the Windows operating system.
  - TsgcTextToSpeechGoogle: sends the response from OpenAI to the Google Cloud servers; an mp3 file is returned and played by TsgcAudioPlayerMCI.
  - TsgcTextToSpeechAmazon: sends the response from OpenAI to the Amazon AWS servers; an mp3 file is returned and played by TsgcAudioPlayerMCI.
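The steps above amount to a four-stage pipeline: record, transcribe, complete, speak. The following is a minimal, library-independent sketch of that data flow in standard C++. Every name in it (VoiceChatPipeline, recordAudio, transcribe, chatCompletion, speak, runDemoExchange) is a hypothetical stand-in chosen for illustration and is not part of the sgcWebSockets API; the stages are stubbed with canned strings instead of real audio or network calls.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-ins for the four stages described above.
// None of these names belong to sgcWebSockets.
struct VoiceChatPipeline {
    std::function<std::string()> recordAudio;                        // returns the captured audio file name
    std::function<std::string(const std::string&)> transcribe;       // audio file -> text (speech-to-text step)
    std::function<std::string(const std::string&)> chatCompletion;   // prompt text -> reply text
    std::function<void(const std::string&)> speak;                   // reply text -> audio output

    // Runs one voice exchange and returns the reply text.
    std::string runOnce() const {
        assert(recordAudio && transcribe && chatCompletion && speak);
        const std::string audioFile = recordAudio();
        const std::string prompt = transcribe(audioFile);
        const std::string reply = chatCompletion(prompt);
        speak(reply);
        return reply;
    }
};

// Wires the pipeline with canned stubs; spokenOut receives the text
// that was handed to the text-to-speech stage.
std::string runDemoExchange(std::string& spokenOut) {
    VoiceChatPipeline pipeline{
        [] { return std::string("question.wav"); },
        [](const std::string& file) { return "transcript of " + file; },
        [](const std::string& prompt) { return "answer to: " + prompt; },
        [&spokenOut](const std::string& reply) { spokenOut = reply; }
    };
    return pipeline.runOnce();
}
```

In the real component, these stages correspond to the audio recorder, the OpenAI transcription and ChatCompletion requests, and the assigned text-to-speech component; the sketch only illustrates the order in which data moves between them.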
Properties
- OpenAIOptions: configures the OpenAI properties.
  - ApiKey: an API key is required to interact with the OpenAI APIs.
  - LogOptions
    - Enabled: if set to true, the API requests are logged to a text file.
    - FileName: the filename of the log.
  - Organization: an optional OpenAI API field.
- ChatBotOptions: configures the ChatBot properties.
  - Transcription: configures the OpenAI Transcription API settings.
    - Model: by default whisper-1.
    - Language: the language code of the audio; specifying it helps the model transcribe the speech more accurately.
  - ChatCompletion: configures the OpenAI ChatCompletion API settings.
    - Model: by default gpt-3.5-turbo.
- AudioRecorder: assign a TsgcAudioRecorder component to capture the microphone audio.
- TextToSpeech: assign a TsgcTextToSpeech component to play the response from OpenAI as speech.
Events
- OnAudioStart: the event is called when the audio recording starts.
- OnAudioStop: the event is called after the audio recording stops.
- OnTranscription: the event is called when receiving a response from OpenAI Transcription API with the Speech-To-Text result.
- OnChatCompletion: the event is called when receiving a response from the OpenAI ChatCompletion API with the Content text.
Code Example
Create a new ChatBot that uses the default Text-To-Speech from Microsoft Windows. Call Start to begin recording audio, then call Stop to end the recording, send the audio to the OpenAI API and receive the response from ChatGPT.
// ... create the chatbot component
TsgcAIOpenAIChatBot *sgcChatBot = new TsgcAIOpenAIChatBot(NULL);
sgcChatBot->OpenAIOptions->ApiKey = "your_openai_api_key";
sgcChatBot->ChatBotOptions->Transcription->Language = "en";
// ... create audio recorder and text-to-speech
TsgcAudioRecorderMCI *sgcAudioRecorder = new TsgcAudioRecorderMCI(NULL);
TsgcTextToSpeechSystem *sgcTextToSpeech = new TsgcTextToSpeechSystem(NULL);
// ... assign audio components to chatbot
sgcChatBot->AudioRecorder = sgcAudioRecorder;
sgcChatBot->TextToSpeech = sgcTextToSpeech;
// ... start the chatbot, speak with a microphone to capture the audio, and stop to process the audio
sgcChatBot->Start();
// ... speak
sgcChatBot->Stop();