Google Vertex AI

Typed interface to call the Google Vertex AI gRPC API to run generative models such as Gemini.

Introduction

Google Vertex AI runs Google's foundation and generative models, including the Gemini family. The gRPC API is exposed through the google.cloud.aiplatform.v1.PredictionService service, with the main methods GenerateContent and Predict.

Vertex AI uses a regional endpoint: the host is <region>-aiplatform.googleapis.com (for example us-central1-aiplatform.googleapis.com) on port 443 over TLS, and the JWT audience must match that same host (for example https://us-central1-aiplatform.googleapis.com/).
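The host and audience rule above is plain string composition, so it can be sketched as a small console program. This is only an illustration: VertexHost and VertexAudience are hypothetical helper names and are not part of the library.

```pascal
program VertexEndpoint;

{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical helper (not a library function):
// builds the regional host <region>-aiplatform.googleapis.com.
function VertexHost(const ARegion: string): string;
begin
  Result := ARegion + '-aiplatform.googleapis.com';
end;

// Hypothetical helper (not a library function):
// the JWT audience is the same host wrapped as https://<host>/.
function VertexAudience(const ARegion: string): string;
begin
  Result := 'https://' + VertexHost(ARegion) + '/';
end;

begin
  WriteLn(VertexHost('us-central1'));     // us-central1-aiplatform.googleapis.com
  WriteLn(VertexAudience('us-central1')); // https://us-central1-aiplatform.googleapis.com/
end.
```

Deriving both values from one region string keeps the host and the audience from drifting apart when the region changes.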

Requests are built with TsgcGRPCVertexAIGenerateContentRequest, which takes the fully qualified Model resource name (projects/<id>/locations/<region>/publishers/google/models/<model>, where <model> is for example gemini-1.5-flash) together with the prompt content.
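As a minimal sketch, the fully qualified model name can be assembled from its three variable parts. VertexModelPath is a hypothetical helper written for this illustration and is not part of the library. The project id, region and model values are the placeholders used elsewhere on this page.

```pascal
program VertexModelName;

{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

// Hypothetical helper (not a library function): fills the template
// projects/<id>/locations/<region>/publishers/google/models/<model>.
function VertexModelPath(const AProjectId, ARegion, AModel: string): string;
begin
  Result := Format('projects/%s/locations/%s/publishers/google/models/%s',
    [AProjectId, ARegion, AModel]);
end;

begin
  // Prints:
  // projects/my-project-id/locations/us-central1/publishers/google/models/gemini-1.5-flash
  WriteLn(VertexModelPath('my-project-id', 'us-central1', 'gemini-1.5-flash'));
end.
```

The resulting string is the kind of value assigned to the Model property of the request. In the example on this page the region in the model path is the same as the region of the host (us-central1).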

The example below authenticates with a service-account JWT, wires a TsgcGRPCClient over a TsgcHTTP2Client to the regional Vertex AI host, sets the authorization Bearer metadata and calls GenerateContent with a prompt against a Gemini model in us-central1:


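    // HTTP/2 transport pointed at the regional Vertex AI host over TLS
    oHTTP2 := TsgcHTTP2Client.Create(nil);
    oHTTP2.Host := 'us-central1-aiplatform.googleapis.com';
    oHTTP2.Port := 443;
    oHTTP2.TLS := True;

    // gRPC client layered on top of the HTTP/2 transport
    oGRPC := TsgcGRPCClient.Create(nil);
    oGRPC.Client := oHTTP2;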

    // service-account JWT authentication (audience must match the regional host)
    oGRPC.GoogleCloudOptions.JWT.KeyFile := 'service-account.json';
    oGRPC.GoogleCloudOptions.JWT.API_Endpoint := 'https://us-central1-aiplatform.googleapis.com/';
    oGRPC.DefaultMetadata.AddValue('authorization', 'Bearer ' + oGRPC.GoogleCloudOptions.JWT.Token);

    // build the typed request and call the method
    oRequest := TsgcGRPCVertexAIGenerateContentRequest.Create;
    try
      oRequest.Model := 'projects/my-project-id/locations/us-central1/publishers/google/models/gemini-1.5-flash';
      oRequest.AddText('Write a haiku about the sea.');
      oResponse := oGRPC.Call('google.cloud.aiplatform.v1.PredictionService', 'GenerateContent', oRequest.ToBytes);
      ShowMessage(oResponse.DataString);
    finally
      oRequest.Free;
    end;

Methods

Name             Description
GenerateContent  Generates content from a generative model (such as Gemini) given a prompt.
Predict          Runs a prediction against a deployed model.

Demo

A working sample is available in the demo folder Demos/21.GRPC/17.Vertex_AI, which shows how to authenticate against a regional endpoint and generate content from a Gemini model with the GenerateContent method.
