Gemini

Google Gemini is a family of multimodal AI models developed by Google DeepMind. Gemini models support text generation, vision, structured outputs, embeddings, and tool use.

The sgcWebSockets library provides a Delphi component, TsgcHTTP_API_Gemini, for interacting with the Gemini API.

Gemini API

The Gemini API provides access to Google Gemini models. It supports content generation, vision (image understanding), structured JSON outputs, streaming, token counting, embeddings, tool use (function calling), and model listing.

Features

Messages
- Gemini Messages Examples

Vision
- Gemini Vision Examples

Models
- Gemini Models Examples

Structured Outputs
- Gemini Structured Outputs Examples

Token Counting
- Gemini Token Counting Examples

Embeddings
- Gemini Embeddings Examples

Tool Use
- Gemini Tool Use Examples

Configuration

The Gemini API uses API keys for authentication. Visit your API Keys page in Google AI Studio to retrieve the API key you'll use in your requests.

Remember that your API key is a secret! Do not share it with others or expose it in any client-side code.

Set the API key in the GeminiOptions.ApiKey property of the component.


// Create the component and assign the API key from Google AI Studio.
Gemini := TsgcHTTP_API_Gemini.Create(nil);
Gemini.GeminiOptions.ApiKey := 'YOUR_API_KEY';
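
Because the key must stay secret, one option is to read it from the environment at startup rather than hard-coding it in source. The following is a minimal sketch under stated assumptions: the variable name GEMINI_API_KEY and the helper LoadGeminiApiKey are illustrative and are not part of sgcWebSockets.

```pascal
program GeminiKeyFromEnv;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: reads the key from an environment variable.
// The name GEMINI_API_KEY is an assumption chosen for this example.
function LoadGeminiApiKey: string;
begin
  Result := GetEnvironmentVariable('GEMINI_API_KEY');
  if Result = '' then
    raise Exception.Create('GEMINI_API_KEY is not set');
end;

begin
  try
    // Only the length is printed, so the key itself never reaches the console.
    Writeln('API key loaded, length: ', Length(LoadGeminiApiKey));
    // In an application the value would be assigned to the component:
    //   Gemini.GeminiOptions.ApiKey := LoadGeminiApiKey;
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
```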

Messages

Send content to a Gemini model and receive generated responses. The model generates the next message based on the provided input.

_CreateContent: Creates content with the specified model and user prompt.
- Model: The model to use (e.g. gemini-2.0-flash).
- Message: The user message content.
- MaxOutputTokens: Maximum number of tokens to generate.
_CreateContentWithSystem: Creates content with a system instruction.
- Model: The model to use.
- System: System instruction that sets the behavior of the model.
- Message: The user message content.
- MaxOutputTokens: Maximum number of tokens to generate.
_CreateContentStream: Creates content with streaming enabled, using Server-Sent Events (SSE). Streamed events are delivered through the OnHTTPAPISSE event handler.
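
As an illustration of the first two methods, here is a hedged sketch. It assumes that both methods return the response body as a string and take their parameters in the order listed above; neither detail is confirmed here, so check the component's declarations before relying on it. The procedure name AskGemini and the prompts are made up for the example, and the unit that declares TsgcHTTP_API_Gemini must be added to the uses clause.

```pascal
// Sketch only. Assumes string return values and the parameter order
// documented above (Model, [System,] Message, MaxOutputTokens).
procedure AskGemini;
var
  Gemini: TsgcHTTP_API_Gemini;
  vResponse: string;
begin
  Gemini := TsgcHTTP_API_Gemini.Create(nil);
  try
    Gemini.GeminiOptions.ApiKey := 'YOUR_API_KEY';

    // Plain prompt: model, user message, output token limit.
    vResponse := Gemini._CreateContent('gemini-2.0-flash',
      'Explain WebSockets in one sentence.', 256);
    Writeln(vResponse);

    // Same request, preceded by a system instruction.
    vResponse := Gemini._CreateContentWithSystem('gemini-2.0-flash',
      'You are a concise technical writer.',
      'Explain WebSockets in one sentence.', 256);
    Writeln(vResponse);
  finally
    Gemini.Free;
  end;
end;
```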

Vision

Gemini models can understand images passed as base64-encoded content along with text prompts.

_CreateVisionContent: Sends an image with a text prompt.
- Model: The model to use.
- Prompt: The text prompt to accompany the image.
- Base64: The base64-encoded image data.
- MediaType: The MIME type of the image (image/jpeg, image/png, image/gif, or image/webp).
- MaxOutputTokens: Maximum number of tokens to generate.
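
The Base64 and MediaType arguments can be prepared with the Delphi RTL alone. The sketch below is one possible approach, not the library's own helper: ImageToBase64 and MediaTypeFromFileName are hypothetical names, and the extension-to-MIME mapping covers only the four types listed above.

```pascal
program GeminiVisionPrep;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.IOUtils, System.NetEncoding;

// Returns the file contents as a single-line base64 string.
function ImageToBase64(const aFileName: string): string;
var
  vEncoder: TBase64Encoding;
begin
  // CharsPerLine = 0 disables the line breaks the default encoder inserts.
  vEncoder := TBase64Encoding.Create(0);
  try
    Result := vEncoder.EncodeBytesToString(TFile.ReadAllBytes(aFileName));
  finally
    vEncoder.Free;
  end;
end;

// Maps a file extension to one of the MIME types listed above.
function MediaTypeFromFileName(const aFileName: string): string;
var
  vExt: string;
begin
  vExt := LowerCase(ExtractFileExt(aFileName));
  if (vExt = '.jpg') or (vExt = '.jpeg') then
    Result := 'image/jpeg'
  else if vExt = '.png' then
    Result := 'image/png'
  else if vExt = '.gif' then
    Result := 'image/gif'
  else if vExt = '.webp' then
    Result := 'image/webp'
  else
    raise Exception.CreateFmt('Unsupported image type: %s', [vExt]);
end;

begin
  try
    if ParamCount < 1 then
      raise Exception.Create('Usage: GeminiVisionPrep <image-file>');
    Writeln('MediaType: ', MediaTypeFromFileName(ParamStr(1)));
    Writeln('Base64 length: ', Length(ImageToBase64(ParamStr(1))));
    // These two values would then be passed as the MediaType and Base64
    // arguments of _CreateVisionContent.
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
```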

Structured Outputs

Generate structured JSON output from a Gemini model by providing a JSON schema that defines the expected response format.

_CreateContentJSON: Creates content with structured JSON output.
- Model: The model to use.
- Message: The user message content.
- Schema: A JSON schema defining the expected output structure.
- MaxOutputTokens: Maximum number of tokens to generate.
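
A schema can be assembled with System.JSON instead of being written as a string literal. This sketch makes two assumptions: that the Schema parameter accepts the schema as JSON text, and that the type names follow the uppercase enum of the Gemini REST API (OBJECT, STRING, INTEGER). The function name BuildCitySchema and the fields city and population are invented for the example.

```pascal
program GeminiSchemaSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.JSON;

// Builds a schema for an object with a required "city" string
// and an optional "population" integer.
function BuildCitySchema: string;
var
  vSchema, vProps: TJSONObject;
  vRequired: TJSONArray;
begin
  vSchema := TJSONObject.Create;
  try
    vSchema.AddPair('type', 'OBJECT');

    // Once added, child values are owned and freed by vSchema.
    vProps := TJSONObject.Create;
    vSchema.AddPair('properties', vProps);
    vProps.AddPair('city', TJSONObject.Create.AddPair('type', 'STRING'));
    vProps.AddPair('population', TJSONObject.Create.AddPair('type', 'INTEGER'));

    vRequired := TJSONArray.Create;
    vSchema.AddPair('required', vRequired);
    vRequired.Add('city');

    Result := vSchema.ToJSON;
  finally
    vSchema.Free;
  end;
end;

begin
  // The resulting text would be passed as the Schema argument
  // of _CreateContentJSON.
  Writeln(BuildCitySchema);
end.
```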

Models

List and retrieve details about available Gemini models.

_GetModels: Lists all available models.
_GetModel: Gets details for a specific model.
- ModelId: The identifier of the model to retrieve.

Token Counting

Count the number of tokens in a message before sending it to a model.

_CountTokens: Counts tokens for a message.
- Model: The model to use for tokenization.
- Message: The text content to count tokens for.
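
The Gemini REST API reports the count in a totalTokens field. Assuming _CountTokens returns that raw JSON response as a string (not confirmed here), the number could be extracted as in the sketch below. ExtractTotalTokens is a hypothetical helper, and the sample response is hand-written for the example.

```pascal
program GeminiTokenCountSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.JSON;

// Hypothetical helper: pulls totalTokens out of a countTokens response.
function ExtractTotalTokens(const aResponse: string): Integer;
var
  vJSON: TJSONValue;
begin
  vJSON := TJSONObject.ParseJSONValue(aResponse);
  if vJSON = nil then
    raise Exception.Create('Response is not valid JSON');
  try
    // GetValue raises an exception if the field is missing.
    Result := vJSON.GetValue<Integer>('totalTokens');
  finally
    vJSON.Free;
  end;
end;

begin
  try
    // Hand-written sample; a real response would come from _CountTokens.
    Writeln('Tokens: ', ExtractTotalTokens('{"totalTokens": 12}'));
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
```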

Embeddings

Generate vector embeddings for text content using Gemini models.

_EmbedContent: Generates embeddings for text.
- Model: The model to use for embedding generation.
- Text: The text content to generate embeddings for.