Build an AI Chatbot in Delphi: Conversation Memory and Streaming

Quick answer: a chatbot is a chat completion plus memory. sgcWebSockets ships a dedicated TsgcAIChat component that keeps the full message history for you, so every turn is sent with the conversation so far and the model can answer in context. Drop the component, set the API key and model, then call Chat for a reply or ChatStream for a live, token-by-token typing effect. Clearing the conversation is one call to ClearHistory.

If you have already called an LLM from Delphi, you have probably noticed something frustrating: ask a follow-up question and the model acts as if it has never spoken to you before. That is not a bug; it is how a one-shot completion works. What separates a toy from a real conversational assistant comes down to one idea: memory.

One-shot completion vs a chatbot

An LLM API is stateless. Each request is independent, so the model only knows what you put in that request. A one-shot call sends a single prompt:

// User: "What is the capital of France?"  ->  "Paris."
// User: "And its population?"             ->  "Population of what?"

The second question fails because the server kept nothing from the first. A chatbot fixes this by resending the whole exchange every time: the system message, every user turn, and every assistant reply, in order. The model reads the history, sees that "its" refers to Paris, and answers correctly. You do not need a database or a session server for this, just a message list that grows with the conversation. The only real work is building and maintaining that list, and that is exactly what the chatbot component does for you.
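
To make the "message list that grows" idea concrete, here is a minimal console sketch that uses only the Delphi RTL and no sgcWebSockets types. TChatMessage, MakeMessage and RenderContext are hypothetical names invented for this illustration, and the assistant reply is hard-coded instead of coming from a model. It assumes Delphi XE2 or later for the scoped unit names.

```pascal
program ChatHistorySketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Generics.Collections;

type
  // Hypothetical record for illustration; not a type from sgcWebSockets.
  TChatMessage = record
    Role: string;     // 'system', 'user' or 'assistant'
    Content: string;
  end;

function MakeMessage(const aRole, aContent: string): TChatMessage;
begin
  Result.Role := aRole;
  Result.Content := aContent;
end;

// Renders every stored message, i.e. the context that would go out with the next request.
function RenderContext(const aHistory: TList<TChatMessage>): string;
var
  Msg: TChatMessage;
begin
  Result := '';
  for Msg in aHistory do
    Result := Result + Format('[%s] %s', [Msg.Role, Msg.Content]) + sLineBreak;
end;

var
  History: TList<TChatMessage>;
begin
  History := TList<TChatMessage>.Create;
  try
    History.Add(MakeMessage('system', 'You are a concise assistant.'));

    // Turn 1: the request carries the system message and the first question.
    History.Add(MakeMessage('user', 'What is the capital of France?'));
    Writeln('--- sent on turn 1 ---');
    Write(RenderContext(History));
    History.Add(MakeMessage('assistant', 'Paris.'));  // hard-coded reply, stored for later turns

    // Turn 2: the same list, now longer, so "its" can be resolved to Paris.
    History.Add(MakeMessage('user', 'And its population?'));
    Writeln('--- sent on turn 2 ---');
    Write(RenderContext(History));
  finally
    History.Free;
  end;
end.
```

The second block of output contains all four messages, which is why the follow-up can be answered in context.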

The TsgcAIChat component: memory handled for you

Rather than wiring up a message array by hand, drop a TsgcAIChat onto your form. It owns the conversation history internally, appends each user message and each assistant reply automatically, and sends the accumulated context on every call. You set the provider, the API key and a model, then just call Chat.

uses
  sgcAI, sgcAI_Chat;  // ShowMessage also needs Vcl.Dialogs (FMX.Dialogs on FireMonkey)

var
  Bot: TsgcAIChat;
begin
  Bot := TsgcAIChat.Create(nil);
  try
    Bot.Provider := aicpOpenAI;
    Bot.ChatOptions.ApiKey := 'sk-...';      // placeholder; supply your own key
    Bot.ChatOptions.Model  := 'gpt-4o-mini';
    Bot.SystemMessage := 'You are a concise assistant for Delphi developers.';

    // Each call adds to the same conversation:
    ShowMessage(Bot.Chat('What is the capital of France?'));  // "Paris."
    ShowMessage(Bot.Chat('And its population?'));             // answers about Paris
  finally
    Bot.Free;  // created with a nil owner, so release it explicitly
  end;
end;

Because the component remembers the first turn, the follow-up just works. The SystemMessage sets the assistant's persona and is included with every request. When you want a fresh conversation, call Bot.ClearHistory; to inspect or persist what has been said, call Bot.GetHistory, which returns the message list. You can also cap memory with MaxHistoryMessages so that a long chat does not grow without bound: older turns are pruned automatically.
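
The article does not say how the component prunes, so the following is only a sketch of one plausible policy: drop the oldest entries until the list fits the cap. PruneOldest is a hypothetical helper, the messages are plain strings for brevity, and the system message is assumed to live outside the list (as SystemMessage does on the component), so it is never pruned.

```pascal
program PruneHistorySketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Hypothetical helper: deletes the oldest entries until at most aMaxMessages remain.
procedure PruneOldest(aTurns: TStrings; aMaxMessages: Integer);
begin
  if aMaxMessages <= 0 then
    Exit;  // this sketch treats 0 or less as "no limit" (an assumption)
  while aTurns.Count > aMaxMessages do
    aTurns.Delete(0);  // index 0 is always the oldest message
end;

var
  Turns: TStringList;
  i: Integer;
begin
  Turns := TStringList.Create;
  try
    for i := 1 to 6 do
      Turns.Add(Format('message %d', [i]));

    PruneOldest(Turns, 4);

    Writeln(Turns.Count);  // 4
    Writeln(Turns[0]);     // message 3
  finally
    Turns.Free;
  end;
end.
```

A cap like this trades recall for cost: the model can no longer see pruned turns, so pick a limit that still covers the references users are likely to make.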

The same component talks to every provider sgcWebSockets supports. Switch Provider to aicpAnthropic, aicpGemini, aicpDeepSeek, aicpOllama, aicpGrok or aicpMistral, change the model name, and the rest of your chatbot code stays identical. See the ChatBot component page and the AI & LLM components hub.

Streaming a live reply

Waiting several seconds for a full answer to appear feels slow. Real chatbots stream the reply so that words show up as they are generated, which produces the familiar typing effect. TsgcAIChat exposes this through ChatStream plus the OnChatStream event, which fires for each chunk of text as it arrives.

Bot.OnChatStream := BotChatStream;
Bot.ChatStream('Explain WebSockets in two sentences.');

procedure TForm1.BotChatStream(Sender: TObject; const aChunk: string;
  var Cancel: Boolean);
begin
  Memo1.Text := Memo1.Text + aChunk;  // append each chunk as it arrives
  // set Cancel := True to stop the response early
end;

The chunks are delivered incrementally over Server-Sent Events under the hood, but you never touch the SSE plumbing. When the stream finishes, the complete assistant reply is added to the history just like a non-streamed call, so the next turn still has full context. The Cancel parameter lets you implement a "stop generating" button. There is also OnChatMessage for the final assembled message and OnChatError to surface any API failures.
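
The interplay between chunk delivery, the Cancel flag and the assembled reply can be tried without a network connection. In this sketch SimulateStream is a hypothetical stand-in for the streaming layer, TChunkHandler is a simplified version of the event signature (no Sender), and the character limit plays the role of a "stop generating" button. None of these names come from sgcWebSockets.

```pascal
program StreamAccumulateSketch;

{$APPTYPE CONSOLE}

const
  MaxReplyChars = 24;  // hypothetical cut-off standing in for a "stop" button

type
  // Same idea as the OnChatStream handler, reduced to a plain procedure type.
  TChunkHandler = procedure(const aChunk: string; var Cancel: Boolean);

var
  Reply: string;  // global strings start out empty

// Appends the chunk to the partial reply and requests a stop once the limit is passed.
procedure HandleChunk(const aChunk: string; var Cancel: Boolean);
begin
  Reply := Reply + aChunk;
  Write(aChunk);  // show the text as soon as it arrives
  if Length(Reply) > MaxReplyChars then
    Cancel := True;
end;

// Hypothetical stand-in for the network layer: feeds chunks until the handler cancels.
procedure SimulateStream(const aChunks: array of string; aHandler: TChunkHandler);
var
  i: Integer;
  Cancel: Boolean;
begin
  Cancel := False;
  for i := Low(aChunks) to High(aChunks) do
  begin
    aHandler(aChunks[i], Cancel);
    if Cancel then
      Break;  // no further chunks are delivered after a cancel
  end;
end;

begin
  SimulateStream(['WebSockets ', 'keep ', 'one ', 'connection ', 'open ',
    'for ', 'two-way ', 'traffic.'], HandleChunk);
  Writeln;
  Writeln('Assembled reply: ', Reply);
end.
```

Whatever has accumulated in Reply when the loop ends is the text you would display and, in a hand-rolled chatbot, append to the history as the assistant turn.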

A voice chatbot, end to end

If you want the assistant to listen and talk, the TsgcAIOpenAIChatBot component wraps the whole loop: it captures microphone audio, transcribes it with Whisper, sends the text to Chat Completions, and speaks the answer back through a text-to-speech provider. Plug in an audio recorder and a text-to-speech engine, set the key, and call Start.

uses
  sgcAI, sgcAI_OpenAI_Audio_ChatBot,
  sgcAI_AudioRecorder_MCI, sgcAI_TextToSpeech_System;

var
  ChatBot: TsgcAIOpenAIChatBot;
begin
  ChatBot := TsgcAIOpenAIChatBot.Create(nil);
  ChatBot.OpenAIOptions.ApiKey := 'sk-...';
  ChatBot.AudioRecorder := TsgcAudioRecorderMCI.Create(nil);
  ChatBot.TextToSpeech  := TsgcTextToSpeechSystem.Create(nil);
  ChatBot.OnChatCompletion := ChatBotChatCompletion;

  ChatBot.Start;                          // begin listening; Stop ends it
  ChatBot.ChatAsUser('Tell me a joke');   // or push a turn programmatically
end;

The OnChatCompletion event gives you the role and content of each reply, and OnTranscription lets you inspect or edit what was heard before it is sent. It is the same conversational idea as TsgcAIChat, just with audio on both ends.

Prefer to manage the list yourself?

You do not have to use the chatbot component. If you want full control, keep your own list of { role, content } messages and send it on each call with TsgcHTTP_API_OpenAI._CreateChatCompletion. Append the user message, send the array, then append the assistant's reply back into the same list before the next turn. That is precisely the bookkeeping TsgcAIChat does internally, so most people let the component handle it. The lower-level API is covered in the OpenAI API in Delphi tutorial and on the OpenAI component page.
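
If you do manage the list yourself, the request body for OpenAI's Chat Completions endpoint is a JSON object with a model name and a messages array of role/content pairs. The sketch below builds that body with Delphi's System.JSON unit (assuming Delphi XE7 or later); AddMessage is a hypothetical helper, and the code deliberately stops short of calling _CreateChatCompletion because its parameters are not shown in this article.

```pascal
program ChatPayloadSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.JSON;

// Hypothetical helper: appends one { "role": ..., "content": ... } object to the array.
procedure AddMessage(aMessages: TJSONArray; const aRole, aContent: string);
var
  Msg: TJSONObject;
begin
  Msg := TJSONObject.Create;
  Msg.AddPair('role', aRole);
  Msg.AddPair('content', aContent);
  aMessages.AddElement(Msg);  // the array takes ownership of Msg
end;

var
  Body: TJSONObject;
  Messages: TJSONArray;
begin
  Body := TJSONObject.Create;
  try
    Messages := TJSONArray.Create;
    Body.AddPair('model', 'gpt-4o-mini');
    Body.AddPair('messages', Messages);  // Body now owns Messages

    // The whole conversation so far, oldest first.
    AddMessage(Messages, 'system', 'You are a concise assistant.');
    AddMessage(Messages, 'user', 'What is the capital of France?');
    AddMessage(Messages, 'assistant', 'Paris.');
    AddMessage(Messages, 'user', 'And its population?');

    Writeln(Body.ToJSON);  // the text you would send as the request body
  finally
    Body.Free;  // frees Messages and every message object with it
  end;
end.
```

After each response you would call AddMessage once more with the assistant role before adding the next user turn, which is the bookkeeping the paragraph above describes.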

Getting started

Everything here ships in sgcWebSockets. Grab the free trial, drop a TsgcAIChat on a form, set the API key and model, and you will have a context-aware chatbot answering follow-up questions in a few lines. Add ChatStream for the live typing effect when you are ready.

Questions, feedback or help wiring it into your app? Get in touch and you will get a reply from the people who wrote the code.