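Prompt caching lets you cache frequently used context between API calls, reducing cost by up to 90% on cache reads and improving latency. Cached content is marked with the cache_control parameter and is reused across requests within the cache TTL window.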
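To make the "up to 90%" figure concrete, here is a minimal arithmetic sketch. The function name EffectiveInputTokens is hypothetical and not part of the library. It treats 0.9 as the best-case read discount quoted above and ignores any cost of writing the cache, so the result is an estimate of the billed input volume and not an exact price.

```delphi
program CacheSavingsSketch;

{$APPTYPE CONSOLE}

// Hypothetical helper (not part of the library): converts a request's token
// counts into the equivalent number of full-price input tokens.
// aReadDiscount is the fraction saved on cache reads; 0.9 is the best-case
// figure quoted in the text. Cache write costs are deliberately ignored.
function EffectiveInputTokens(aUncached, aCacheRead: Integer;
  aReadDiscount: Double): Double;
begin
  Result := aUncached + aCacheRead * (1.0 - aReadDiscount);
end;

begin
  // A 10,000-token cached prefix plus 500 uncached tokens.
  WriteLn('Without cache: ', 10000 + 500);
  WriteLn('With cache hit (approx.): ',
    EffectiveInputTokens(500, 10000, 0.9):0:0);
end.
```

The larger the shared prefix is relative to the per-request part, the closer the saving gets to the quoted maximum.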
Cache long system prompts to avoid reprocessing them on every request. Set SystemCacheControl to True to automatically wrap the system prompt with cache_control.
Anthropic := TsgcHTTP_API_Anthropic.Create(nil);
Anthropic.AnthropicOptions.ApiKey := 'API_KEY';
oRequest := TsgcAnthropicClass_Request_Messages.Create;
Try
  oRequest.Model := 'claude-sonnet-4-20250514';
  oRequest.MaxTokens := 4096;
  oRequest.System := 'You are an expert legal assistant... (long system prompt)';
  oRequest.SystemCacheControl := True; // Enable caching for system prompt
  oMessage := TsgcAnthropicClass_Request_Message.Create;
  oMessage.Role := 'user';
  oMessage.Content := 'Analyze this contract clause.';
  oMessages := oRequest.Messages;
  SetLength(oMessages, 1);
  oMessages[0] := oMessage;
  oRequest.Messages := oMessages;
  oResponse := Anthropic.CreateMessage(oRequest);
  Try
    // Check cache usage in response
    WriteLn('Cache created: ' +
      IntToStr(oResponse.Usage.CacheCreationInputTokens));
    WriteLn('Cache read: ' +
      IntToStr(oResponse.Usage.CacheReadInputTokens));
    WriteLn(oResponse.Content[0].Text);
  Finally
    oResponse.Free;
  End;
Finally
  oMessage.Free;
  oRequest.Free;
End;
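The two usage counters printed above indicate whether a request wrote to the cache or read from it. The following is a minimal sketch of how they might be interpreted. DescribeCacheUsage is a hypothetical helper and not part of the library, and it takes plain integers so that it makes no assumptions about the response class.

```delphi
program CacheUsageSketch;

{$APPTYPE CONSOLE}

// Hypothetical helper (not part of the library): classifies one response
// from the values of Usage.CacheCreationInputTokens (aCreated) and
// Usage.CacheReadInputTokens (aRead).
function DescribeCacheUsage(aCreated, aRead: Integer): string;
begin
  if aRead > 0 then
    Result := 'hit'     // at least part of the prompt was served from the cache
  else if aCreated > 0 then
    Result := 'write'   // the marked content was written to the cache on this call
  else
    Result := 'none';   // nothing was written to or read from the cache
end;

begin
  WriteLn(DescribeCacheUsage(2048, 0)); // typical first request
  WriteLn(DescribeCacheUsage(0, 2048)); // typical repeat within the TTL window
  WriteLn(DescribeCacheUsage(0, 0));    // caching did not apply
end.
```

A 'none' result on a request that was expected to use the cache usually deserves a closer look, for example at whether the marked content is long enough to be cacheable for the chosen model.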
To cache a specific content block (for example, a large document or other context), set CacheControl to 'ephemeral' on that individual block.
oBlock := TsgcAnthropicClass_Request_Content_Block.Create;
oBlock.ContentType := 'text';
oBlock.Text := '(large reference text to cache)';
oBlock.CacheControl := 'ephemeral'; // Mark for caching
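Cache tool definitions to avoid reprocessing them on every request. This is useful when you have many tool definitions that stay the same across requests.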
oTool := TsgcAnthropicClass_Request_Tool.Create;
oTool.Name := 'search_database';
oTool.Description := 'Search the database for records.';
oTool.InputSchema := '{"type":"object","properties":{"query":{"type":"string"}}}';
oTool.CacheControl := 'ephemeral'; // Cache this tool definition