sgcWebSockets 2026.6 — Bot Detection in the Firewall

22 May 2026 · Components

“Can I identify bots by IP?” is one of the most common questions we get about server-side filtering. The honest answer is: the IP alone is a starting point, not a verdict — but combined with a few well-chosen signals it becomes a reliable one. sgcWebSockets 2026.6 adds a new Bot Detection module to the TsgcWebSocketFirewall component that does exactly that, classifying every connecting IP using four independent signals.

Bot Detection sits next to the firewall features you already know — blacklist / whitelist, GeoIP, brute-force, rate limiting, threat scoring — and follows the same component-property design. This post walks through how it works and how to wire it up.

Four signals, one classification

The module evaluates each IP against four independent signals and resolves a single TsgcBotClassification value:

Known bot IP ranges — match the IP against published crawler ranges (Googlebot, Bingbot, GPTBot, ClaudeBot…) loaded from a CIDR file. A match means a genuine, declared crawler.
Datacenter / hosting detection — flag IPs that belong to cloud and VPS provider ranges (AWS, Hetzner, OVH…). Real users rarely connect from datacenter ASNs; automated traffic very often does.
Forward-confirmed reverse DNS (FCrDNS) — do a PTR lookup on the IP, then forward-confirm the resulting hostname back to the same IP. This is the only reliable way to tell a real Googlebot from anything that merely claims to be one in a header.
DNSBL reputation — query DNS blocklists (Spamhaus and others) for IPs with a known-abusive history.
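
To make the forward-confirmation step concrete, here is a minimal Delphi sketch of the check itself. It is not the library's implementation: TPtrLookup, TAddrLookup, HostInDomain and IsForwardConfirmed are hypothetical names, and the two stub resolvers return made-up records for documentation-only addresses. A real version would call an actual DNS client, strip trailing dots from PTR names and normalise IPv6 text forms before comparing.

```pascal
program FcrdnsSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Hypothetical resolver callbacks; plug in whatever DNS client you use
  TPtrLookup = function(const aIP: string): string;   // '' when there is no PTR record
  TAddrLookup = procedure(const aHost: string; aAddresses: TStrings);

// True when aHost equals aDomain or is a subdomain of it
function HostInDomain(const aHost, aDomain: string): Boolean;
var
  vHost, vDomain: string;
begin
  vHost := LowerCase(aHost);
  vDomain := LowerCase(aDomain);
  Result := (vHost = vDomain) or
    ((Length(vHost) > Length(vDomain)) and
     (Copy(vHost, Length(vHost) - Length(vDomain), MaxInt) = '.' + vDomain));
end;

function IsForwardConfirmed(const aIP, aDomain: string;
  aPtr: TPtrLookup; aAddr: TAddrLookup): Boolean;
var
  vHost: string;
  oAddresses: TStringList;
begin
  Result := False;
  vHost := aPtr(aIP);                        // 1. reverse (PTR) lookup
  if (vHost = '') or not HostInDomain(vHost, aDomain) then
    Exit;                                    // 2. name must sit in the crawler's domain
  oAddresses := TStringList.Create;
  try
    aAddr(vHost, oAddresses);                // 3. forward lookup of that name
    Result := oAddresses.IndexOf(aIP) >= 0;  // 4. it must point back to the same IP
  finally
    oAddresses.Free;
  end;
end;

// Stub resolvers with made-up records, for illustration only
function StubPtr(const aIP: string): string;
begin
  if aIP = '192.0.2.10' then
    Result := 'crawl-1.bot.example'
  else if aIP = '192.0.2.66' then
    Result := 'crawl-9.bot.example'          // PTR claims the domain...
  else
    Result := '';
end;

procedure StubAddr(const aHost: string; aAddresses: TStrings);
begin
  if aHost = 'crawl-1.bot.example' then
    aAddresses.Add('192.0.2.10');
  // ...but crawl-9.bot.example has no forward record pointing back
end;

begin
  Writeln(IsForwardConfirmed('192.0.2.10', 'bot.example', StubPtr, StubAddr));
  Writeln(IsForwardConfirmed('192.0.2.66', 'bot.example', StubPtr, StubAddr));
end.
```

The second address shows why the forward step matters: whoever controls the reverse zone for an IP can publish any PTR name they like, but only the owner of the crawler's domain can make that name resolve back to the IP.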

The result is one of six values:

TsgcBotClassification = (bcUnknown, bcVerifiedCrawler, bcDatacenter,
  bcSuspectedBot, bcBlocklisted, bcHuman);

IP-only detection has real limits: addresses are shared behind CGNAT, they rotate, and sophisticated bots hide inside residential proxy pools. Bot Detection is therefore built to give you a good first-pass score, not an absolute truth, which is why, by design, it never blocks a connection on its own.

Enabling Bot Detection

Everything lives under the new BotDetection property. Turn on the signals you want and point the file-based ones at their data:

uses
  sgcWebSocket_Server_Firewall;

var
  oFirewall: TsgcWebSocketFirewall;
begin
  oFirewall := TsgcWebSocketFirewall.Create(nil);

  oFirewall.BotDetection.Enabled := True;

  // 1. Verified crawler IP ranges  (file lines: CIDR,botname)
  oFirewall.BotDetection.KnownBotsFile := 'C:\firewall\known-bots.csv';

  // 2. Datacenter / hosting ranges  (file lines: CIDR,asn-name)
  oFirewall.BotDetection.Datacenter.Enabled := True;
  oFirewall.BotDetection.Datacenter.ASNFile := 'C:\firewall\datacenter-ranges.csv';

  // 3. Forward-confirmed reverse DNS verification
  oFirewall.BotDetection.ReverseDNS.Enabled := True;

  // 4. DNSBL reputation lookups
  oFirewall.BotDetection.DNSBL.Enabled := True;
  oFirewall.BotDetection.DNSBL.Zones.Add('zen.spamhaus.org');

  // Resolution tuning
  oFirewall.BotDetection.CacheDurationSec := 3600;   // cache a verdict for 1h
  oFirewall.BotDetection.DNSTimeoutMS     := 2000;

  // oServer is your existing sgcWebSockets server component, created elsewhere
  oServer.Firewall := oFirewall;
end;

The two file-based signals reuse the same fast, binary-searched CIDR engine that powers the firewall’s GeoIP database, so even large range lists resolve in microseconds.
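
For readers curious about what a binary-searched CIDR lookup involves in general, the sketch below shows the idea. It is an illustration under stated assumptions, not the firewall's actual engine: TIPRange, TryIPv4ToCardinal, TryParseCidr and FindRange are hypothetical names, it handles IPv4 only, and it assumes the ranges are sorted by first address and do not overlap.

```pascal
program CidrLookupSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical record: one CIDR block kept as an inclusive integer range
  TIPRange = record
    First, Last: Cardinal;
    Name: string;
  end;

// Parses a dotted-quad IPv4 address into a 32-bit value
function TryIPv4ToCardinal(const aIP: string; out aValue: Cardinal): Boolean;
var
  i, vOctet, vDigits, vDots: Integer;
begin
  Result := False;
  aValue := 0;
  vOctet := 0; vDigits := 0; vDots := 0;
  for i := 1 to Length(aIP) do
    if (aIP[i] >= '0') and (aIP[i] <= '9') then
    begin
      vOctet := vOctet * 10 + Ord(aIP[i]) - Ord('0');
      Inc(vDigits);
      if (vDigits > 3) or (vOctet > 255) then
        Exit;
    end
    else if (aIP[i] = '.') and (vDigits > 0) and (vDots < 3) then
    begin
      aValue := (aValue shl 8) or Cardinal(vOctet);
      Inc(vDots);
      vOctet := 0; vDigits := 0;
    end
    else
      Exit;
  if (vDots <> 3) or (vDigits = 0) then
    Exit;
  aValue := (aValue shl 8) or Cardinal(vOctet);
  Result := True;
end;

// Converts 'a.b.c.d/n' into the first and last address of the block
function TryParseCidr(const aCidr, aName: string; out aRange: TIPRange): Boolean;
var
  vSlash, vBits: Integer;
  vBase, vMask: Cardinal;
begin
  Result := False;
  vSlash := Pos('/', aCidr);
  if vSlash = 0 then Exit;
  if not TryStrToInt(Copy(aCidr, vSlash + 1, MaxInt), vBits) then Exit;
  if (vBits < 0) or (vBits > 32) then Exit;
  if not TryIPv4ToCardinal(Copy(aCidr, 1, vSlash - 1), vBase) then Exit;
  vMask := High(Cardinal);
  if vBits < 32 then
    vMask := not (vMask shr vBits);   // e.g. /24 gives $FFFFFF00
  aRange.First := vBase and vMask;
  aRange.Last := aRange.First or not vMask;
  aRange.Name := aName;
  Result := True;
end;

// Binary search over ranges sorted by First and assumed not to overlap
function FindRange(const aRanges: array of TIPRange; aIP: Cardinal): Integer;
var
  vLow, vHigh, vMid: Integer;
begin
  Result := -1;
  vLow := 0;
  vHigh := High(aRanges);
  while (Result < 0) and (vLow <= vHigh) do
  begin
    vMid := vLow + (vHigh - vLow) div 2;
    if aIP < aRanges[vMid].First then
      vHigh := vMid - 1
    else if aIP > aRanges[vMid].Last then
      vLow := vMid + 1
    else
      Result := vMid;
  end;
end;

var
  vRanges: array of TIPRange;
  vIP: Cardinal;
begin
  // Documentation-only blocks, already sorted by first address
  SetLength(vRanges, 3);
  TryParseCidr('192.0.2.0/24', 'example-range-a', vRanges[0]);
  TryParseCidr('198.51.100.0/24', 'example-range-b', vRanges[1]);
  TryParseCidr('203.0.113.0/24', 'example-range-c', vRanges[2]);
  if TryIPv4ToCardinal('198.51.100.7', vIP) then
    Writeln('matched range index: ', FindRange(vRanges, vIP));
end.
```

Because every block is reduced to two integers when the file is loaded, each lookup costs a handful of comparisons (about log2 of the number of ranges), which is why list size has so little effect on lookup time.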

Reverse DNS and DNSBL never block the accept thread

Reverse DNS and DNSBL are live DNS queries — running them inline on the connection-accept thread would be a self-inflicted denial of service. Instead, Bot Detection resolves them on a background worker thread and serves every classification from a cache. On a cache miss the IP is queued for resolution and the current call returns bcUnknown immediately; when the lookup finishes the cache is updated and the OnBotDetected event fires. The CacheDurationSec property controls how long a verdict is reused before the IP is re-checked.
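
The miss-then-resolve behaviour described above can be pictured with a small cache sketch. This is only a model of the idea, assuming Delphi 2009 or later for generics: TVerdictCache, Lookup, Store and Pending are hypothetical names, the enum is a local copy of the one shown earlier so the program compiles without the library, and the worker thread, locking and the OnBotDetected call are left out for brevity.

```pascal
program VerdictCacheSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, DateUtils, Generics.Collections;

type
  // Local copy of the enum from the post, so the sketch stands alone
  TsgcBotClassification = (bcUnknown, bcVerifiedCrawler, bcDatacenter,
    bcSuspectedBot, bcBlocklisted, bcHuman);

  TCachedVerdict = record
    Classification: TsgcBotClassification;
    ExpiresAt: TDateTime;
  end;

  // Hypothetical cache; a real one would guard these fields with a lock
  TVerdictCache = class
  private
    FItems: TDictionary<string, TCachedVerdict>;
    FPending: TStringList;   // IPs waiting for a background lookup
    FTtlSec: Integer;
  public
    constructor Create(aTtlSec: Integer);
    destructor Destroy; override;
    function Lookup(const aIP: string; aNow: TDateTime): TsgcBotClassification;
    procedure Store(const aIP: string; aValue: TsgcBotClassification;
      aNow: TDateTime);
    property Pending: TStringList read FPending;
  end;

constructor TVerdictCache.Create(aTtlSec: Integer);
begin
  inherited Create;
  FItems := TDictionary<string, TCachedVerdict>.Create;
  FPending := TStringList.Create;
  FTtlSec := aTtlSec;
end;

destructor TVerdictCache.Destroy;
begin
  FPending.Free;
  FItems.Free;
  inherited;
end;

function TVerdictCache.Lookup(const aIP: string;
  aNow: TDateTime): TsgcBotClassification;
var
  vEntry: TCachedVerdict;
begin
  if FItems.TryGetValue(aIP, vEntry) and (aNow < vEntry.ExpiresAt) then
  begin
    Result := vEntry.Classification;   // fresh hit: no DNS work at all
    Exit;
  end;
  // Miss or expired: queue the IP once and answer immediately
  if FPending.IndexOf(aIP) < 0 then
    FPending.Add(aIP);
  Result := bcUnknown;
end;

procedure TVerdictCache.Store(const aIP: string;
  aValue: TsgcBotClassification; aNow: TDateTime);
var
  vEntry: TCachedVerdict;
  vIndex: Integer;
begin
  // Called when a background lookup finishes
  vEntry.Classification := aValue;
  vEntry.ExpiresAt := IncSecond(aNow, FTtlSec);
  FItems.AddOrSetValue(aIP, vEntry);
  vIndex := FPending.IndexOf(aIP);
  if vIndex >= 0 then
    FPending.Delete(vIndex);
end;

var
  oCache: TVerdictCache;
  vNow: TDateTime;
begin
  oCache := TVerdictCache.Create(3600);
  try
    vNow := Now;
    // First sight of the IP: cache miss, so the answer is bcUnknown
    Writeln(oCache.Lookup('203.0.113.9', vNow) = bcUnknown);
    // The (simulated) worker finishes and stores its verdict
    oCache.Store('203.0.113.9', bcDatacenter, vNow);
    Writeln(oCache.Lookup('203.0.113.9', IncSecond(vNow, 60)) = bcDatacenter);
    // Past the TTL the entry is stale and the IP is queued again
    Writeln(oCache.Lookup('203.0.113.9', IncSecond(vNow, 3601)) = bcUnknown);
  finally
    oCache.Free;
  end;
end.
```

Passing the current time in as a parameter is a choice made for this sketch only; it keeps the expiry logic easy to exercise without waiting an hour.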

Classify-only: OnBotDetected

Bot Detection is classify-only. It never denies a connection by itself: it fires an event, updates the firewall statistics (a new fvBot counter), and leaves the policy decision to you. That way, a legitimate crawler that gets misclassified is not silently dropped.

procedure TForm1.FirewallBotDetected(Sender: TObject; const aIP: string;
  aClassification: TsgcBotClassification; const aBotName: string);
begin
  case aClassification of
    bcVerifiedCrawler:
      DoLog(aIP + ' is a verified crawler: ' + aBotName);
    bcDatacenter:
      DoLog(aIP + ' originates from a datacenter / hosting range');
    bcBlocklisted:
      DoLog(aIP + ' is listed on a DNSBL');
  end;
end;

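You can also ask for a verdict at any time with GetBotClassification, which is handy for tailoring a response rather than filtering a connection. Because verdicts are served from the cache, a call for an IP that has not been resolved yet returns bcUnknown, so treat that value as "not classified yet" rather than as a final answer: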

if oFirewall.GetBotClassification(vClientIP) = bcVerifiedCrawler then
  // serve a lightweight, cache-friendly variant to the crawler
  ServeStaticSnapshot
else
  ServeFullApp;

Acting on a verdict: the Custom Rules engine

When you do want to block, you opt in explicitly through the firewall’s existing Custom Rules engine. Every TsgcFirewallRuleItem gains a new BotType property: set it, and the rule matches connections whose classification equals that value. You can then deny, ban or log them — reusing all the rule machinery you already have.

var
  oRule: TsgcFirewallRuleItem;
begin
  oFirewall.CustomRules.Enabled := True;

  // Ban any IP listed on a DNSBL for one hour
  oRule := oFirewall.CustomRules.Rules.Add as TsgcFirewallRuleItem;
  oRule.Name           := 'ban-blocklisted-bots';
  oRule.BotType        := bcBlocklisted;
  oRule.ActionType     := raBan;
  oRule.BanDurationSec := 3600;
end;

This split — passive classification in the Bot Detection module, active enforcement only through a rule you write — means the feature can never surprise you by blocking traffic you wanted to keep.

Bring your own verdict: OnResolveBot

If you already run a threat-intelligence service, an internal reputation cache, or a commercial bot-mitigation feed, assign the OnResolveBot event. When it is set, your handler supplies the verdict and the built-in resolver steps aside — the same override pattern the firewall already uses for GeoIP’s OnResolveCountry.

procedure TForm1.FirewallResolveBot(Sender: TObject; const aIP: string;
  var aClassification: TsgcBotClassification; var aBotName: string);
begin
  // Consult your own threat feed instead of the built-in resolver
  if MyThreatFeed.IsMaliciousBot(aIP) then
  begin
    aClassification := bcBlocklisted;
    aBotName        := 'internal-threat-feed';
  end;
end;

Demo

The Firewall demo in Demos\04.WebSocket_Other_Samples\13.Firewall has a new Bot Detection tab where you can switch each signal on, load the known-bot and datacenter range files, configure DNSBL zones, and use a live “Classify IP” tester to see the verdict for any address — with every classification streamed to the event log.

Availability

Bot Detection ships in sgcWebSockets 2026.6 and is a drop-in addition: the BotDetection property defaults to disabled, so existing projects are unaffected until you turn it on. The feature is available for Delphi (7 through 13), C++Builder and the .NET edition.

Customers with an active subscription can download the new build from the customer area. Trial users can grab the updated installer at esegece.com/products/websockets/download.

Questions, feedback or migration help? Get in touch — you will get a reply from the people who wrote the code.