Ollama makes it easy to run large language models locally on your own hardware: no cloud dependencies, no API costs, and full data privacy. For Delphi developers who want to integrate local AI capabilities into their applications, sgcWebSockets provides TsgcHTTP_API_Ollama: a native component that wraps the entire Ollama API in clean, type-safe Delphi code.
Whether you need to keep sensitive data on-premises, build offline-capable AI features, manage your own model library, or generate embeddings for local vector search, this component gives you direct access to every Ollama feature. No cloud account. No recurring API charges. Drop in the component, point it at your Ollama instance, and start building.
Complete API Coverage
Every core feature of the Ollama API is supported out of the box.
- **Chat Completions** — Send messages with system prompts and receive responses synchronously or streamed. Full control over temperature, top-p, and stop sequences.
- **Real-Time Streaming** — Stream responses token by token using Server-Sent Events. Build responsive UIs with locally running models.
- **Model Management** — Pull, show details, list tags, and delete models programmatically. Full lifecycle management from Delphi code.
- **Embeddings** — Generate vector embeddings locally. Power semantic search, clustering, and classification without sending data to the cloud.
- **Self-Hosted / Configurable Host** — Connect to any Ollama instance via a configurable host URL. Run locally, on a LAN server, or in a private cloud.
- **Built-in Retry & Logging** — Automatic retry on transient failures with configurable attempts and wait intervals. Full request/response logging for debugging.
Getting Started
Integrate Ollama into your Delphi project in under a minute. Drop in the component, configure the host, and send your first message.
// Create the component and configure the host
var
  Ollama: TsgcHTTP_API_Ollama;
  vResponse: string;
begin
  Ollama := TsgcHTTP_API_Ollama.Create(nil);
  try
    Ollama.OllamaOptions.Host := 'http://localhost:11434';
    // Send a simple message to a local model
    vResponse := Ollama._CreateMessage('llama3', 'Hello, Ollama!');
    ShowMessage(vResponse);
  finally
    Ollama.Free;
  end;
end;
No API key required. When connecting to a local Ollama instance, no authentication is needed. For remote or secured deployments, you can optionally set an API key via OllamaOptions.ApiKey.
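For a remote or secured deployment, the setup differs only in the host URL and the key. A minimal sketch, using the OllamaOptions properties described in this article (the URL and key values are placeholders):

```pascal
// Point the component at a secured remote Ollama instance
Ollama.OllamaOptions.Host := 'https://ollama.example.com';
Ollama.OllamaOptions.ApiKey := 'your-api-key'; // placeholder value
```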
Chat Completions and Streaming
The Chat Completions API works with any model you have pulled into your Ollama instance. Send text with optional system prompts, and receive responses synchronously or streamed in real-time.
System Prompts
Control model behavior by providing a system prompt that sets the context, personality, or constraints for the conversation.
vResponse := Ollama._CreateMessageWithSystem(
  'llama3',
  'You are a helpful assistant that responds in Spanish.',
  'What is the capital of France?');
// Returns: "La capital de Francia es París."
Real-Time Streaming
For responsive user interfaces, stream the model's response token-by-token using Server-Sent Events.
// Enable streaming via SSE
Ollama.OnHTTPAPISSE := OnSSEEvent;
Ollama._CreateMessageStream('llama3',
  'Write a short poem about Delphi programming.');

procedure TForm1.OnSSEEvent(Sender: TObject;
  const aEvent, aData: string; var Cancel: Boolean);
begin
  // aData: JSON payload with the generated content
  // Set Cancel := True to stop the stream early
  Memo1.Lines.Add(aData);
end;
Advanced Typed API
For full control over request parameters — temperature, top-p, stop sequences, max tokens — use the typed request and response classes.
var
  oRequest: TsgcOllamaClass_Request_ChatCompletion;
  oMessage: TsgcOllamaClass_Request_Message;
  oResponse: TsgcOllamaClass_Response_ChatCompletion;
begin
  oRequest := TsgcOllamaClass_Request_ChatCompletion.Create;
  try
    oRequest.Model := 'llama3';
    oRequest.MaxTokens := 2048;
    oRequest.Temperature := 0.7;

    oMessage := TsgcOllamaClass_Request_Message.Create;
    oMessage.Role := 'user';
    oMessage.Content := 'Explain quantum computing in simple terms.';
    oRequest.Messages.Add(oMessage);

    oResponse := Ollama.CreateMessage(oRequest);
    try
      if Length(oResponse.Choices) > 0 then
        ShowMessage(oResponse.Choices[0].MessageContent);
    finally
      oResponse.Free;
    end;
  finally
    oRequest.Free;
  end;
end;
Model Management
Manage your entire local model library from Delphi code. Pull new models, inspect their details, list available tags, and delete models you no longer need — all programmatically.
Pull a Model
// Download a model from the Ollama registry
Ollama._PullModel('llama3');
Show Model Details
// Get detailed information about a model
vDetails := Ollama._ShowModel('llama3');
ShowMessage(vDetails);
List Models and Tags
// List all models via OpenAI-compatible endpoint
vModels := Ollama._GetModels;
// List model tags with detailed metadata (name, size, digest)
vTags := Ollama._GetTags;
// Typed API: access tag properties directly
var
  oTags: TsgcOllamaClass_Response_Tags;
  i: Integer;
begin
  oTags := Ollama.GetTags;
  try
    for i := 0 to Length(oTags.Models) - 1 do
      Memo1.Lines.Add(Format('%s (%d bytes)',
        [oTags.Models[i].Name, oTags.Models[i].Size]));
  finally
    oTags.Free;
  end;
end;
Delete a Model
// Remove a model from the local system
Ollama._DeleteModel('old-model:latest');
Embeddings
Generate vector embeddings locally using any embedding-capable model. Embeddings power semantic search, document clustering, and classification — all without sending data to external servers.
// Generate embeddings locally
var
  vEmbedding: string;
begin
  vEmbedding := Ollama._CreateEmbeddings(
    'nomic-embed-text',
    'Delphi is a powerful programming language.');
  ShowMessage(vEmbedding);
end;
For full control, use the typed API to access the raw embedding values.
var
  oResponse: TsgcOllamaClass_Response_Embeddings;
  i: Integer;
begin
  oResponse := Ollama.CreateEmbeddings(
    'nomic-embed-text',
    'Delphi is a powerful programming language.');
  try
    for i := 0 to oResponse.EmbeddingCount - 1 do
      Memo1.Lines.Add(FloatToStr(oResponse.GetEmbeddingValue(i)));
  finally
    oResponse.Free;
  end;
end;
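As a sketch of how those raw values power semantic search: documents are typically ranked by the cosine similarity between their embedding and the query's embedding. The helper below is not part of the component; it assumes the two embeddings have already been copied into plain TArray&lt;Double&gt; arrays:

```pascal
uses
  System.Math;

// Cosine similarity between two embedding vectors.
// Values near 1.0 mean semantically similar texts; near 0.0, unrelated.
function CosineSimilarity(const A, B: TArray<Double>): Double;
var
  i: Integer;
  Dot, NormA, NormB: Double;
begin
  Assert(Length(A) = Length(B), 'Vectors must have the same dimension');
  Dot := 0; NormA := 0; NormB := 0;
  for i := 0 to Length(A) - 1 do
  begin
    Dot := Dot + A[i] * B[i];
    NormA := NormA + A[i] * A[i];
    NormB := NormB + B[i] * B[i];
  end;
  Result := Dot / (Sqrt(NormA) * Sqrt(NormB));
end;
```

A local semantic search then reduces to embedding each document once, embedding the query at search time, and returning the documents with the highest similarity scores.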
Data privacy. With Ollama, your data never leaves your network. This makes it ideal for regulated industries (healthcare, finance, government) where data residency and privacy are critical requirements.
Configuration and Options
Fine-tune the component's behavior with comprehensive configuration options.
| Property | Description |
|---|---|
| `OllamaOptions.Host` | Ollama server URL (e.g., http://localhost:11434) |
| `OllamaOptions.ApiKey` | Optional API key for secured deployments |
| `HttpOptions.ReadTimeout` | HTTP read timeout in milliseconds (default: 60000) |
| `LogOptions.Enabled` | Enable request/response logging |
| `RetryOptions.Enabled` | Automatic retry on transient failures |
| `RetryOptions.Retries` | Maximum number of retry attempts (default: 3) |
| `RetryOptions.Wait` | Wait time between retries in milliseconds (default: 3000) |
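A minimal sketch of wiring these options together, using only the properties listed in the table above (the specific values are illustrative, not recommendations):

```pascal
// Enable logging and automatic retry before sending requests
Ollama.LogOptions.Enabled := True;         // log every request/response
Ollama.RetryOptions.Enabled := True;
Ollama.RetryOptions.Retries := 3;          // give up after 3 failed attempts
Ollama.RetryOptions.Wait := 3000;          // wait 3 seconds between attempts
Ollama.HttpOptions.ReadTimeout := 120000;  // allow slow local models up to 2 minutes
```

Raising ReadTimeout is often useful with large local models, whose first token can take noticeably longer than a cloud API's.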
Supported Models
Ollama supports hundreds of open-source models. Here are some popular choices:
| Model | Parameters | Best For |
|---|---|---|
| `llama3` | 8B / 70B | General-purpose chat, reasoning |
| `mistral` | 7B | Fast, efficient text generation |
| `codellama` | 7B / 13B / 34B | Code generation and analysis |
| `nomic-embed-text` | 137M | Text embeddings, semantic search |
Zero cost, full control. Run AI models on your own hardware with no per-token charges. Combined with sgcWebSockets' built-in retry logic and logging, you get production-ready local AI integration for Delphi.
