Vector Databases
Connect to vector databases from Delphi for semantic search, RAG, and AI-powered applications. Support for Pinecone and more.
Store text as high-dimensional embeddings and retrieve the most relevant passages by meaning, not keywords.
A vector database stores the numeric embeddings produced by an embedding model and lets you find the entries closest to a query vector. This is the foundation of semantic search and Retrieval-Augmented Generation (RAG), where you ground a large language model with passages pulled from your own documents instead of relying only on what the model memorised during training.
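To make "closest to a query vector" concrete, here is a minimal sketch of a brute-force nearest-neighbour search that scores every stored vector by cosine similarity. The names TVector, TVectorList, CosineSimilarity and NearestIndex are illustrative only and are not part of sgcWebSockets, and the three-dimensional vectors are toy data; a real store may use a specialised index rather than a linear scan.

```pascal
program NearestVectorSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  TVector = array of Double;      // hypothetical type, not an sgcWebSockets type
  TVectorList = array of TVector; // hypothetical in-memory "store"

// Cosine similarity: dot(A, B) / (|A| * |B|).
// Returns 0 when either vector has zero length.
function CosineSimilarity(const A, B: TVector): Double;
var
  I: Integer;
  Dot, NormA, NormB: Double;
begin
  if Length(A) <> Length(B) then
    raise EArgumentException.Create('Vectors must have the same dimension');
  Dot := 0;
  NormA := 0;
  NormB := 0;
  for I := 0 to High(A) do
  begin
    Dot := Dot + A[I] * B[I];
    NormA := NormA + A[I] * A[I];
    NormB := NormB + B[I] * B[I];
  end;
  if (NormA = 0) or (NormB = 0) then
    Result := 0
  else
    Result := Dot / (Sqrt(NormA) * Sqrt(NormB));
end;

// Linear scan: returns the index of the stored vector most similar
// to Query, or -1 when the store is empty.
function NearestIndex(const Query: TVector; const Store: TVectorList): Integer;
var
  I: Integer;
  Score, Best: Double;
begin
  Result := -1;
  Best := -2; // below the minimum possible cosine similarity of -1
  for I := 0 to High(Store) do
  begin
    Score := CosineSimilarity(Query, Store[I]);
    if Score > Best then
    begin
      Best := Score;
      Result := I;
    end;
  end;
end;

var
  Store: TVectorList;
  Query: TVector;
begin
  // Toy 3-dimensional "embeddings"; real embeddings have many more dimensions.
  SetLength(Store, 3);
  Store[0] := TVector.Create(1.0, 0.0, 0.0);
  Store[1] := TVector.Create(0.6, 0.8, 0.0);
  Store[2] := TVector.Create(0.0, 0.0, 1.0);
  Query := TVector.Create(0.5, 0.9, 0.1);
  Writeln('Nearest entry: ', NearestIndex(Query, Store)); // prints "Nearest entry: 1"
end.
```

The query points mostly along the second axis, so the second stored vector (index 1) scores highest.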
sgcWebSockets ships two interchangeable vector-store backends: a local file store (TsgcAIDatabaseVectorFile) and a Pinecone cloud index (TsgcAIDatabaseVectorPinecone). Both descend from the same base component, TsgcAIDatabaseVector, so you can swap one for the other without changing your ingest or query code. Pair either backend with the TsgcAIOpenAIEmbeddings component to turn raw text into vectors and push them straight into the store.
When to use which: reach for the file backend when you want zero infrastructure and the data fits comfortably on the machine. Choose Pinecone when the index is large, must be shared across processes or users, or needs to scale beyond a single host.
Assign a vector store to the Database property of a TsgcAIOpenAIEmbeddings component, then call CreateEmbeddingsFromFile to embed and ingest a whole document in a single batch. Internally each chunk is added through the BeginAddData / AddData / EndAddData sequence the store inherits from TsgcAIDatabaseVector, so the file and Pinecone backends behave identically from your code's point of view.
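As a rough illustration of what "chunk" means here, the sketch below splits a text into fixed-size pieces that overlap by a few characters, so that a sentence cut at a boundary still appears intact in one of the chunks. This is an assumption for illustration only: the chunking strategy CreateEmbeddingsFromFile actually applies is not described in this section, and SplitIntoChunks, ChunkSize and Overlap are hypothetical names.

```pascal
program ChunkSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Appends to Chunks pieces of Text that are at most ChunkSize characters long.
// Consecutive pieces share Overlap characters.
procedure SplitIntoChunks(const Text: string; ChunkSize, Overlap: Integer;
  Chunks: TStrings);
var
  Start, Step: Integer;
begin
  if (ChunkSize <= 0) or (Overlap < 0) or (Overlap >= ChunkSize) then
    raise EArgumentException.Create('Require 0 <= Overlap < ChunkSize');
  Step := ChunkSize - Overlap;
  Start := 1; // Delphi strings are 1-based
  while Start <= Length(Text) do
  begin
    Chunks.Add(Copy(Text, Start, ChunkSize));
    if Start + ChunkSize > Length(Text) then
      Break; // this chunk already reached the end of the text
    Inc(Start, Step);
  end;
end;

var
  Chunks: TStringList;
  I: Integer;
begin
  Chunks := TStringList.Create;
  try
    SplitIntoChunks('abcdefghij', 4, 1, Chunks);
    // Prints:
    //   0: abcd
    //   1: defg
    //   2: ghij
    for I := 0 to Chunks.Count - 1 do
      Writeln(I, ': ', Chunks[I]);
  finally
    Chunks.Free;
  end;
end.
```

Each chunk would then be embedded and stored as its own entry, which is why a query can return a specific passage rather than a whole document.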
At query time you embed the user's question with GetEmbedding and pass the resulting vector to QueryData, which returns the nearest matches ranked by cosine similarity. Feed those passages back to a chat model as context and you have a working RAG pipeline: answers grounded in your own data, with citations you control. The same approach powers semantic search over knowledge bases, deduplication, recommendation, and clustering, all without leaving Delphi or C++ Builder.
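The "feed those passages back to a chat model" step can be sketched as plain string assembly. The example below assumes you have already extracted the text of the nearest matches into an array of strings; how to read the text out of the value returned by QueryData is not shown in this section, so the passages here are placeholders and BuildRagPrompt is a hypothetical helper, not a library function.

```pascal
program RagPromptSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Builds a prompt that numbers each retrieved passage so the model
// can cite its sources as [1], [2], ...
function BuildRagPrompt(const Question: string;
  const Passages: array of string): string;
var
  I: Integer;
begin
  Result := 'Answer the question using only the numbered passages below. ' +
    'Cite passages by number.' + sLineBreak + sLineBreak;
  for I := 0 to High(Passages) do
    Result := Result + Format('[%d] %s', [I + 1, Passages[I]]) + sLineBreak;
  Result := Result + sLineBreak + 'Question: ' + Question;
end;

begin
  // Placeholder passages standing in for the text of the nearest matches.
  Writeln(BuildRagPrompt('what is sgcWebSockets?',
    ['(text of the best match)', '(text of the second-best match)']));
end.
```

Numbering the passages is one simple way to get the citations mentioned above: the model can refer to a passage by its index and your application can map that index back to the source document.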
Ingest a corpus and query the nearest neighbour with either backend.
uses
  sgcAI, sgcAI_OpenAI_Embeddings,
  sgcAI_DB_Vector, sgcAI_DB_Vector_File, sgcAI_DB_Vector_Pinecone;

procedure IngestAndQuery;
var
  Embeddings: TsgcAIOpenAIEmbeddings;
  DBFile: TsgcAIDatabaseVectorFile;
  DBPinecone: TsgcAIDatabaseVectorPinecone;
begin
  Embeddings := nil;
  DBFile := nil;
  DBPinecone := nil;
  try
    Embeddings := TsgcAIOpenAIEmbeddings.Create(nil);
    Embeddings.OpenAIOptions.ApiKey := 'sk-...';

    // Local file-based vector store
    DBFile := TsgcAIDatabaseVectorFile.Create(nil);
    DBFile.VectorFileOptions.InputFilename := 'corpus.sgcif';
    DBFile.VectorFileOptions.VectorFilename := 'corpus.sgcvf';
    Embeddings.Database := DBFile;
    // Embed and ingest the whole document into the file store
    Embeddings.CreateEmbeddingsFromFile('docs.txt');

    // Or push to the Pinecone cloud index
    DBPinecone := TsgcAIDatabaseVectorPinecone.Create(nil);
    DBPinecone.PineconeOptions.ApiKey := 'pc-...';
    DBPinecone.PineconeIndexOptions.IndexName := 'sgc-embeddings';
    Embeddings.Database := DBPinecone;

    // Query the nearest neighbour for an arbitrary text.
    // Inline variable (Delphi 10.3+): the compiler infers the type
    // returned by QueryData.
    var Results := Embeddings.Database.QueryData(
      Embeddings.GetEmbedding('what is sgcWebSockets?', ''));
    // ... use Results here ...
  finally
    // Components created with a nil owner must be freed explicitly
    Embeddings.Free;
    DBPinecone.Free;
    DBFile.Free;
  end;
end;