Vector Databases

Connect to vector databases from Delphi for semantic search, RAG, and AI-powered applications. Supports a local file-based store and Pinecone.

Semantic Search & RAG in Delphi

Store text as high-dimensional embeddings and retrieve the most relevant passages by meaning, not keywords.

A vector database stores the numeric embeddings produced by an embedding model and lets you find the entries closest to a query vector. This is the foundation of semantic search and Retrieval-Augmented Generation (RAG), where you ground a large language model with passages pulled from your own documents instead of relying only on what the model memorised during training.

sgcWebSockets ships two interchangeable vector-store backends that share the same base component, TsgcAIDatabaseVector, so you can swap one for the other without changing your ingest or query code. Pair either backend with the TsgcAIOpenAIEmbeddings component to turn raw text into vectors and push it straight into the store.

  • TsgcAIDatabaseVectorFile — a local, file-based store. No external service, ideal for desktop apps, offline use, and smaller corpora.
  • TsgcAIDatabaseVectorPinecone — the managed Pinecone cloud service via its REST API, for large, shared, or horizontally scaled indexes.

When to use which: reach for the file backend when you want zero infrastructure and the data fits comfortably on the machine. Choose Pinecone when the index is large, must be shared across processes or users, or needs to scale beyond a single host.

  • Store and query high-dimensional vector embeddings
  • Semantic similarity search for RAG applications
  • Multiple vector database backend support
  • Metadata filtering and hybrid search
  • Batch upsert and query operations

How it works

Assign a vector store to the Database property of a TsgcAIOpenAIEmbeddings component, then call CreateEmbeddingsFromFile to embed and ingest a whole document in a single batch. Internally each chunk is added through the BeginAddData / AddData / EndAddData sequence the store inherits from TsgcAIDatabaseVector, so the file and Pinecone backends behave identically from your code's point of view.
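To make the idea of "chunks" more concrete, here is a minimal sketch of one way a document could be split before embedding. This page does not describe how sgcWebSockets chunks a file internally, so everything here is an assumption for illustration: the SplitIntoChunks helper is hypothetical, and the fixed-size character window with an overlap is just one common strategy (real pipelines often split on sentences or tokens instead).

```pascal
program ChunkSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper, NOT part of sgcWebSockets.
// Splits ASource into chunks of at most MaxChars characters, where each
// chunk repeats the last Overlap characters of the previous one.
procedure SplitIntoChunks(const ASource: string; MaxChars, Overlap: Integer;
  Chunks: TStrings);
var
  Start, Step: Integer;
begin
  if (MaxChars <= 0) or (Overlap < 0) or (Overlap >= MaxChars) then
    raise Exception.Create('Require 0 <= Overlap < MaxChars');
  Chunks.Clear;
  Step := MaxChars - Overlap;
  Start := 1; // Delphi strings are 1-based
  while Start <= Length(ASource) do
  begin
    Chunks.Add(Copy(ASource, Start, MaxChars));
    // Stop once the current chunk reaches the end of the source text
    if Start + MaxChars > Length(ASource) then
      Break;
    Inc(Start, Step);
  end;
end;

var
  Chunks: TStringList;
  I: Integer;
begin
  Chunks := TStringList.Create;
  try
    // Window of 4 characters, 1 character shared between neighbours
    SplitIntoChunks('abcdefghij', 4, 1, Chunks);
    for I := 0 to Chunks.Count - 1 do
      WriteLn(I, ': ', Chunks[I]);
  finally
    Chunks.Free;
  end;
end.
```

With the sample input the program prints three chunks: abcd, defg and ghij. The overlap keeps a sentence that straddles a boundary retrievable from either side, at the cost of storing a few characters twice.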

At query time you embed the user's question with GetEmbedding and pass the resulting vector to QueryData, which returns the nearest matches ranked by cosine similarity. Feed those passages back to a chat model as context and you have a working RAG pipeline: answers grounded in your own data, with citations you control. The same approach powers semantic search over knowledge bases, deduplication, recommendation, and clustering, all without leaving Delphi or C++ Builder.
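The retrieval step described above can be sketched in plain Delphi without any library calls. This is a conceptual illustration only, not how sgcWebSockets implements QueryData: the TPassage record, CosineSimilarity, NearestPassage and BuildPrompt are hypothetical names, the three-dimensional vectors are toy values (real embeddings have hundreds or thousands of dimensions), and the prompt wording is an assumption.

```pascal
program RagRankSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  TVector = array of Double;

  // Hypothetical record for this sketch: a stored passage and its embedding
  TPassage = record
    Content: string;
    Embedding: TVector;
  end;

// Cosine similarity: dot(A, B) / (|A| * |B|). Returns 0 if either vector is zero.
function CosineSimilarity(const A, B: TVector): Double;
var
  I: Integer;
  Dot, NormA, NormB: Double;
begin
  if Length(A) <> Length(B) then
    raise Exception.Create('Vectors must have the same dimension');
  Dot := 0;
  NormA := 0;
  NormB := 0;
  for I := 0 to High(A) do
  begin
    Dot := Dot + A[I] * B[I];
    NormA := NormA + A[I] * A[I];
    NormB := NormB + B[I] * B[I];
  end;
  if (NormA = 0) or (NormB = 0) then
    Result := 0
  else
    Result := Dot / (Sqrt(NormA) * Sqrt(NormB));
end;

// Returns the index of the passage most similar to Query, or -1 if there are none.
function NearestPassage(const Query: TVector;
  const Passages: array of TPassage): Integer;
var
  I: Integer;
  Score, Best: Double;
begin
  Result := -1;
  Best := -2; // below the minimum possible cosine value of -1
  for I := 0 to High(Passages) do
  begin
    Score := CosineSimilarity(Query, Passages[I].Embedding);
    if Score > Best then
    begin
      Best := Score;
      Result := I;
    end;
  end;
end;

// Builds a grounded prompt from the retrieved passage and the user's question.
function BuildPrompt(const Context, Question: string): string;
begin
  Result := 'Answer using only the context below.' + sLineBreak +
    'Context: ' + Context + sLineBreak +
    'Question: ' + Question;
end;

var
  Store: array[0..1] of TPassage;
  Query: TVector;
  Idx: Integer;
begin
  Store[0].Content := 'Passage about WebSocket clients.';
  Store[0].Embedding := TVector.Create(0.9, 0.1, 0.0);
  Store[1].Content := 'Passage about invoice templates.';
  Store[1].Embedding := TVector.Create(0.0, 0.2, 0.9);

  // Stand-in for the embedded question; points mostly along the first axis
  Query := TVector.Create(1.0, 0.0, 0.0);

  Idx := NearestPassage(Query, Store);
  if Idx >= 0 then
    WriteLn(BuildPrompt(Store[Idx].Content, 'How do I open a WebSocket?'));
end.
```

Here the query vector is closest in direction to the first passage, so that passage becomes the context handed to the chat model. A production pipeline would typically keep the top few matches rather than a single one and leave the ranking to the vector store.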

Delphi Example

Ingest a corpus and query the nearest neighbour with either backend.

uses
  sgcAI, sgcAI_OpenAI_Embeddings,
  sgcAI_DB_Vector, sgcAI_DB_Vector_File, sgcAI_DB_Vector_Pinecone;

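var
  Embeddings: TsgcAIOpenAIEmbeddings;
  DBFile: TsgcAIDatabaseVectorFile;
  DBPinecone: TsgcAIDatabaseVectorPinecone;
  // Results is not declared in this snippet: give it the type that
  // QueryData returns in your sgcWebSockets version.
begin
  Embeddings := TsgcAIOpenAIEmbeddings.Create(nil);
  DBFile := TsgcAIDatabaseVectorFile.Create(nil);
  DBPinecone := TsgcAIDatabaseVectorPinecone.Create(nil);
  try
    Embeddings.OpenAIOptions.ApiKey := 'sk-...';

    // Local file-based vector store: embed docs.txt and ingest it
    DBFile.VectorFileOptions.InputFilename  := 'corpus.sgcif';
    DBFile.VectorFileOptions.VectorFilename := 'corpus.sgcvf';
    Embeddings.Database := DBFile;
    Embeddings.CreateEmbeddingsFromFile('docs.txt');

    // Or switch to the Pinecone cloud index; call CreateEmbeddingsFromFile
    // again after this assignment to ingest into Pinecone instead
    DBPinecone.PineconeOptions.ApiKey         := 'pc-...';
    DBPinecone.PineconeIndexOptions.IndexName := 'sgc-embeddings';
    Embeddings.Database := DBPinecone;

    // Query the nearest neighbour for an arbitrary text
    // (runs against whichever store is currently assigned to Database)
    Results := Embeddings.Database.QueryData(
      Embeddings.GetEmbedding('what is sgcWebSockets?', ''));
  finally
    // The components were created without an owner, so free them explicitly
    Embeddings.Free;
    DBPinecone.Free;
    DBFile.Free;
  end;
end;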
