Google Cloud Vision analyzes images and tells you what is in them: it returns descriptive labels, reads printed and handwritten text (OCR), detects faces, recognizes famous landmarks and logos, and more. sgcWebSockets Enterprise ships a typed Vision gRPC client that sits on top of TsgcGRPCClient, so you can send an image and read back its annotations from Delphi and C++Builder without any external runtime or hand-written protobufs.
How it works
Cloud Vision exposes an ImageAnnotator gRPC service. A request is a batch of one or more images, each paired with the list of features you want detected, and the response is a matching batch of annotations. The whole exchange is Protocol Buffers messages framed over HTTP/2.
sgcWebSockets already ships a complete HTTP/2 stack, so the transport is a TsgcHTTP2Client pointed at vision.googleapis.com:443 over TLS. TsgcGRPCClient sits on top of it and does the gRPC framing, headers, timeouts and trailer parsing. The Vision message classes in sgcGRPC_Google_Vision are typed protobuf helpers: you fill in the request object, call ToBytes to serialize it, send it with the generic client, and load the reply bytes back into a typed response object with LoadFromBytes.
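To picture the framing step that TsgcGRPCClient handles for you, here is a minimal sketch of the standard gRPC length prefix: one compression flag byte, then a 4-byte big-endian message length, then the serialized protobuf. FrameGrpcMessage is a hypothetical helper written for illustration only. It is not part of sgcWebSockets, and you never need to call anything like it yourself.

```pascal
program GrpcFrameSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper (not an sgcWebSockets API): wraps a serialized protobuf
// message in the 5-byte gRPC prefix, which is a 1-byte compressed flag
// followed by the message length as a 4-byte big-endian integer.
function FrameGrpcMessage(const aPayload: TBytes;
  aCompressed: Boolean = False): TBytes;
var
  vLen: Integer;
begin
  vLen := Length(aPayload);
  SetLength(Result, 5 + vLen);
  if aCompressed then
    Result[0] := 1
  else
    Result[0] := 0;
  // most significant byte first
  Result[1] := Byte((vLen shr 24) and $FF);
  Result[2] := Byte((vLen shr 16) and $FF);
  Result[3] := Byte((vLen shr 8) and $FF);
  Result[4] := Byte(vLen and $FF);
  if vLen > 0 then
    Move(aPayload[0], Result[5], vLen);
end;

var
  vFrame: TBytes;
begin
  // four arbitrary payload bytes standing in for a serialized request
  vFrame := FrameGrpcMessage(TBytes.Create($0A, $02, $08, $01));
  Writeln('frame length: ', Length(vFrame));
  Writeln('length field (low byte): ', vFrame[4]);
end.
```

In the real client the payload is whatever ToBytes returns, and the prefix is added and stripped inside the gRPC layer.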
Authentication is the standard Google service-account flow. A self-signed JWT is exchanged for a bearer token and sent as gRPC metadata on every call. Because the JWT is audience-bound, the token is targeted at the Vision endpoint so vision.googleapis.com accepts it.
Setting up the clients
Create the HTTP/2 transport, attach the gRPC client to it, and tell the channel to use the binary protobuf content type. Vision speaks application/grpc+proto.
uses
  sgcHTTP2_Client, sgcGRPC_Types, sgcGRPC_Client,
  sgcGRPC_Google_Vision;
var
  HTTP2: TsgcHTTP2Client;
  GRPC: TsgcGRPCClient;
begin
  HTTP2 := TsgcHTTP2Client.Create(nil);
  HTTP2.Host := 'vision.googleapis.com';
  HTTP2.Port := 443;
  HTTP2.TLS := True;
  GRPC := TsgcGRPCClient.Create(nil);
  GRPC.Client := HTTP2;
  GRPC.ChannelOptions.ContentType := grpcProto;
  GRPC.ChannelOptions.Compression := grpcNoCompression;
end;
Authentication
The bearer token comes from your service-account credentials. Configure the JWT options with the values from the downloaded JSON key, point the audience at the Vision endpoint, and add the resulting token to DefaultMetadata so it travels on every gRPC call.
// configure the service-account JWT (values from the JSON key file)
CloudClient.GoogleCloudOptions.Authentication := gcaJWT;
CloudClient.GoogleCloudOptions.JWT.ClientEmail := ClientEmail;
CloudClient.GoogleCloudOptions.JWT.PrivateKeyId := PrivateKeyId;
CloudClient.GoogleCloudOptions.JWT.PrivateKey.Text := PrivateKey;
CloudClient.GoogleCloudOptions.JWT.ProjectId := ProjectId;
// the self-signed JWT is audience-bound to the Vision endpoint
CloudClient.GoogleCloudOptions.JWT.API_Endpoint :=
  'https://vision.googleapis.com/';
// once the token is acquired, send it as gRPC metadata
GRPC.DefaultMetadata.Clear;
GRPC.DefaultMetadata.Add('authorization', 'Bearer ' + Token);
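To show what "audience-bound" means in practice, the sketch below builds the claim set that a Google service-account self-signed JWT typically carries: the client email as issuer and subject, the API endpoint as audience, and an issue and expiry time. BuildVisionJwtClaims, the constant names and the one-hour lifetime are assumptions made for illustration. The library builds and signs the real token internally, and that code is not shown here.

```pascal
program VisionJwtClaimsSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.JSON;

const
  // audience taken from the configuration above
  VISION_AUDIENCE = 'https://vision.googleapis.com/';
  // assumption: one hour, the usual upper bound for these tokens
  TOKEN_LIFETIME_SECONDS = 3600;

// Hypothetical helper (not an sgcWebSockets API): returns the unsigned claim
// set as JSON. aIssuedAt is a Unix timestamp in seconds (UTC).
function BuildVisionJwtClaims(const aClientEmail: string;
  aIssuedAt: Int64): string;
var
  oClaims: TJSONObject;
begin
  oClaims := TJSONObject.Create;
  try
    oClaims.AddPair('iss', aClientEmail);
    oClaims.AddPair('sub', aClientEmail);
    oClaims.AddPair('aud', VISION_AUDIENCE);
    oClaims.AddPair('iat', TJSONNumber.Create(aIssuedAt));
    oClaims.AddPair('exp',
      TJSONNumber.Create(aIssuedAt + TOKEN_LIFETIME_SECONDS));
    Result := oClaims.ToJSON;
  finally
    oClaims.Free;
  end;
end;

begin
  // placeholder email and timestamp, not real credentials
  Writeln(BuildVisionJwtClaims(
    'demo@example-project.iam.gserviceaccount.com', 1700000000));
end.
```

A token whose audience names a different Google API would be rejected by vision.googleapis.com, which is why API_Endpoint has to match the service you call.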
Annotating an image
To analyze an image you build a TsgcGRPCVisionBatchAnnotateImagesRequest, add one image request, point it at the image (a Google Cloud Storage URI here, but you can also send raw bytes), and add one or more features. Each feature has a FeatureType and an optional MaxResults. Serialize with ToBytes and call the BatchAnnotateImages method of the ImageAnnotator service.
var
  oRequest: TsgcGRPCVisionBatchAnnotateImagesRequest;
  oImgReq: TsgcGRPCVisionAnnotateImageRequest;
  oFeature: TsgcGRPCVisionFeature;
  oResponse: TsgcGRPCResponse;
begin
  oRequest := TsgcGRPCVisionBatchAnnotateImagesRequest.Create;
  try
    // add one image to the batch and point it at an object in Cloud Storage
    oImgReq := oRequest.AddRequest;
    oImgReq.Image.Source.GcsImageUri :=
      'gs://cloud-samples-data/vision/demo-image.jpg';
    // ask for up to 10 labels
    oFeature := oImgReq.AddFeature;
    oFeature.FeatureType := 4; // LABEL_DETECTION
    oFeature.MaxResults := 10;
    // blocking call; the serialized reply arrives in oResponse.Data
    oResponse := GRPC.Call('google.cloud.vision.v1.ImageAnnotator',
      'BatchAnnotateImages', oRequest.ToBytes);
  finally
    oRequest.Free;
  end;
end;
The FeatureType values follow the Vision API enum: 1 FACE_DETECTION, 2 LANDMARK_DETECTION, 3 LOGO_DETECTION, 4 LABEL_DETECTION, 5 TEXT_DETECTION, 6 SAFE_SEARCH_DETECTION, 7 IMAGE_PROPERTIES, 11 DOCUMENT_TEXT_DETECTION, and the rest. Note that the numbering is not contiguous. To run several detections on the same image in one round trip, add more than one feature.
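Because FeatureType is a plain number, named constants make a request easier to read than magic values. The sketch below lists the values of the Vision API Feature.Type enum and maps them back to their names for logging. The constant names and the VisionFeatureTypeName helper are hypothetical and are not declared by sgcGRPC_Google_Vision as far as this article shows.

```pascal
program VisionFeatureTypesSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

const
  // values of google.cloud.vision.v1.Feature.Type (names are hypothetical)
  VISION_TYPE_UNSPECIFIED        = 0;
  VISION_FACE_DETECTION          = 1;
  VISION_LANDMARK_DETECTION      = 2;
  VISION_LOGO_DETECTION          = 3;
  VISION_LABEL_DETECTION         = 4;
  VISION_TEXT_DETECTION          = 5;
  VISION_SAFE_SEARCH_DETECTION   = 6;
  VISION_IMAGE_PROPERTIES        = 7;
  VISION_CROP_HINTS              = 9;
  VISION_WEB_DETECTION           = 10;
  VISION_DOCUMENT_TEXT_DETECTION = 11;
  VISION_PRODUCT_SEARCH          = 12;
  VISION_OBJECT_LOCALIZATION     = 19;

// Maps a numeric feature type to its enum name, for logs and diagnostics.
function VisionFeatureTypeName(aType: Integer): string;
begin
  case aType of
    VISION_TYPE_UNSPECIFIED:        Result := 'TYPE_UNSPECIFIED';
    VISION_FACE_DETECTION:          Result := 'FACE_DETECTION';
    VISION_LANDMARK_DETECTION:      Result := 'LANDMARK_DETECTION';
    VISION_LOGO_DETECTION:          Result := 'LOGO_DETECTION';
    VISION_LABEL_DETECTION:         Result := 'LABEL_DETECTION';
    VISION_TEXT_DETECTION:          Result := 'TEXT_DETECTION';
    VISION_SAFE_SEARCH_DETECTION:   Result := 'SAFE_SEARCH_DETECTION';
    VISION_IMAGE_PROPERTIES:        Result := 'IMAGE_PROPERTIES';
    VISION_CROP_HINTS:              Result := 'CROP_HINTS';
    VISION_WEB_DETECTION:           Result := 'WEB_DETECTION';
    VISION_DOCUMENT_TEXT_DETECTION: Result := 'DOCUMENT_TEXT_DETECTION';
    VISION_PRODUCT_SEARCH:          Result := 'PRODUCT_SEARCH';
    VISION_OBJECT_LOCALIZATION:     Result := 'OBJECT_LOCALIZATION';
  else
    Result := Format('UNKNOWN(%d)', [aType]);
  end;
end;

begin
  Writeln(VisionFeatureTypeName(VISION_LABEL_DETECTION));
  Writeln(VisionFeatureTypeName(VISION_DOCUMENT_TEXT_DETECTION));
end.
```

With constants like these, the assignment in the request becomes oFeature.FeatureType := VISION_LABEL_DETECTION, and a second AddFeature call with VISION_TEXT_DETECTION adds OCR to the same round trip.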
Reading the annotations
The reply bytes load straight into a typed batch response. Each image in the batch carries separate lists for label, landmark, logo and text annotations. Every entry is a TsgcGRPCVisionEntityAnnotation with a Description and, for labels, a confidence Score.
var
  oResponse: TsgcGRPCVisionBatchAnnotateImagesResponse;
  oImgResp: TsgcGRPCVisionAnnotateImageResponse;
  i, j: Integer;
begin
  oResponse := TsgcGRPCVisionBatchAnnotateImagesResponse.Create;
  try
    // aData holds the raw reply bytes (the Data of the TsgcGRPCResponse)
    oResponse.LoadFromBytes(aData);
    // one response per image in the batch
    for i := 0 to oResponse.ResponseCount - 1 do
    begin
      oImgResp := oResponse.Response(i);
      for j := 0 to oImgResp.LabelAnnotationCount - 1 do
        Memo1.Lines.Add('Label: ' + oImgResp.LabelAnnotation(j).Description +
          ' (score: ' + FloatToStr(oImgResp.LabelAnnotation(j).Score) + ')');
      for j := 0 to oImgResp.TextAnnotationCount - 1 do
        Memo1.Lines.Add('Text: ' + oImgResp.TextAnnotation(j).Description);
      for j := 0 to oImgResp.LandmarkAnnotationCount - 1 do
        Memo1.Lines.Add('Landmark: ' + oImgResp.LandmarkAnnotation(j).Description);
    end;
  finally
    oResponse.Free;
  end;
end;
Images from storage or from bytes
The image source is flexible. Set Image.Source.GcsImageUri for an object in Google Cloud Storage, or Image.Source.ImageUri for a public HTTP(S) URL. To annotate a local file, read it into a TBytes and assign it to Image.Content instead, so the picture travels inline in the request. The same request and response classes handle every source.
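For the local-file case, here is a minimal sketch of the "read it into a TBytes" step using the standard System.IOUtils unit. LoadImageBytes is a hypothetical helper, and the demo writes a few placeholder bytes to a temporary file instead of using a real picture. Only the final assignment to Image.Content, shown as a comment, involves the Vision classes.

```pascal
program LoadImageBytesSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.IOUtils;

// Hypothetical helper: reads a local image into memory and fails early
// with a clear message if the file is missing or empty.
function LoadImageBytes(const aFileName: string): TBytes;
begin
  if not TFile.Exists(aFileName) then
    raise EFileNotFoundException.CreateFmt('Image file not found: %s',
      [aFileName]);
  Result := TFile.ReadAllBytes(aFileName);
  if Length(Result) = 0 then
    raise Exception.CreateFmt('Image file is empty: %s', [aFileName]);
end;

var
  vFile: string;
  vBytes: TBytes;
begin
  // stand-in for a real image: a temp file with four placeholder bytes
  vFile := TPath.GetTempFileName;
  try
    TFile.WriteAllBytes(vFile, TBytes.Create($FF, $D8, $FF, $E0));
    vBytes := LoadImageBytes(vFile);
    Writeln('bytes read: ', Length(vBytes));
    // with the Vision request classes this would continue as:
    //   oImgReq.Image.Content := vBytes;
  finally
    TFile.Delete(vFile);
  end;
end.
```

Inline content makes the request as large as the file, so Cloud Storage URIs remain the lighter choice for big images that are already in a bucket.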
Synchronous or asynchronous
The example above uses the blocking Call, which returns a TsgcGRPCResponse with the StatusCode, the raw Data bytes and the trailers. To keep the UI responsive, use CallAsync and handle the reply in the OnGRPCResponse event, where you parse aResponse.Data exactly the same way. A non-OK status surfaces through OnGRPCError, and a transport failure through OnGRPCException.
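When a call fails, a readable status name in the log is more useful than a bare number. The sketch below maps the standard gRPC status codes to their names. It assumes the StatusCode can be read as an integer, which this article does not confirm, and GrpcStatusName is a hypothetical helper, not an sgcWebSockets function.

```pascal
program GrpcStatusNameSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

const
  // the 17 status codes defined by gRPC, indexed by numeric value
  GRPC_STATUS_NAMES: array[0..16] of string = (
    'OK', 'CANCELLED', 'UNKNOWN', 'INVALID_ARGUMENT', 'DEADLINE_EXCEEDED',
    'NOT_FOUND', 'ALREADY_EXISTS', 'PERMISSION_DENIED', 'RESOURCE_EXHAUSTED',
    'FAILED_PRECONDITION', 'ABORTED', 'OUT_OF_RANGE', 'UNIMPLEMENTED',
    'INTERNAL', 'UNAVAILABLE', 'DATA_LOSS', 'UNAUTHENTICATED');

// Hypothetical helper: returns the symbolic name of a gRPC status code.
function GrpcStatusName(aCode: Integer): string;
begin
  if (aCode >= Low(GRPC_STATUS_NAMES)) and
     (aCode <= High(GRPC_STATUS_NAMES)) then
    Result := GRPC_STATUS_NAMES[aCode]
  else
    Result := Format('UNRECOGNIZED(%d)', [aCode]);
end;

begin
  Writeln(GrpcStatusName(0));
  Writeln(GrpcStatusName(7));
  Writeln(GrpcStatusName(16));
end.
```

For Vision, UNAUTHENTICATED usually points at the token or its audience, and PERMISSION_DENIED at the project configuration, for example the Vision API not being enabled for the service account's project.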
Availability
The typed Vision gRPC client is part of the sgcWebSockets Enterprise edition and runs on Windows, macOS, Linux, iOS and Android. A ready-to-run sample is in Demos\21.GRPC\13.Vision, and the full reference is on the gRPC Client product page.
Questions or feedback? Get in touch. You will get a reply from the people who wrote the code.
