Faster WebSocket Compression

WebSocket compression is essential for reducing bandwidth and improving responsiveness, especially when transmitting repetitive data like JSON payloads. The permessage-deflate extension compresses WebSocket message payloads on the fly — but the speed of that compression directly impacts your application's throughput.

Starting with sgcWebSockets 2026.4.0, the permessage-deflate implementation has been completely rewritten for significantly better performance. In our benchmarks, small messages compress and decompress up to 15x faster, with measurable gains across all payload sizes.

What Changed?

The previous implementation initialized and destroyed the compression engine on every single WebSocket frame. This meant that even a tiny 1 KB message paid the full cost of setting up the compressor, compressing the data, and then tearing everything down — only to repeat the entire process for the next message.

The new implementation keeps the compression engine alive across frames. It is initialized once when the first frame arrives and reused for the lifetime of the connection. This eliminates the per-frame setup overhead and also allows the engine to learn from previous messages, resulting in faster compression of repetitive data patterns.
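The idea of a per-connection context can be sketched with Python's zlib (illustrative only — sgcWebSockets is a Delphi library, and names like `send_frame` are invented for this example). permessage-deflate uses raw DEFLATE, hence `wbits=-15`:

```python
import zlib

RAW_DEFLATE = -15  # raw DEFLATE, no zlib header, as permessage-deflate requires

# One persistent compressor/decompressor pair per connection,
# created once and reused for every frame.
comp = zlib.compressobj(wbits=RAW_DEFLATE)
decomp = zlib.decompressobj(wbits=RAW_DEFLATE)

def send_frame(payload: bytes) -> bytes:
    # Z_SYNC_FLUSH emits a complete, decodable block while keeping the
    # sliding-window history alive for the next message.
    return comp.compress(payload) + comp.flush(zlib.Z_SYNC_FLUSH)

msg = b'{"ticker":"ACME","price":123.45}'
first = send_frame(msg)
second = send_frame(msg)  # history from `first` makes the repeat tiny
out1 = decomp.decompress(first)
out2 = decomp.decompress(second)
```

Because the window persists, the second identical message compresses to little more than a back-reference into the history, whereas a setup/teardown-per-frame design would compress every message from scratch.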

In addition to the persistent compression context, the new implementation includes several other optimizations:

  • Pre-allocated memory buffers — Buffers are allocated once and reused, avoiding repeated memory allocation on every frame.
  • Direct memory access — When the input is already in memory, the engine reads it directly without copying it into intermediate buffers first.
  • Reused temporary streams — Internal working streams are created once in the constructor instead of being created and destroyed on every compress/decompress call.
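In Python terms, the buffer-reuse and zero-copy ideas look roughly like this (a hedged sketch; the 64 KB size and the `compress_frame` helper are invented for illustration):

```python
import zlib

comp = zlib.compressobj(wbits=-15)   # raw DEFLATE, as permessage-deflate uses
buf = bytearray(64 * 1024)           # pre-allocated once, reused for every frame
view = memoryview(buf)               # zero-copy window into the buffer

def compress_frame(length: int) -> bytes:
    # The compressor reads the payload directly from the shared buffer
    # via the buffer protocol -- no intermediate copy is made.
    return comp.compress(view[:length]) + comp.flush(zlib.Z_SYNC_FLUSH)

payload = b'{"k":1}' * 100
buf[:len(payload)] = payload          # payload already sits in memory
frame = compress_frame(len(payload))
restored = zlib.decompressobj(wbits=-15).decompress(frame)
```

The same frame-sized buffer serves every call, so steady-state operation performs no per-frame allocations at all.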

Benchmark Results

We ran 10,000 compress + decompress round-trips for each message size. Every round-trip compresses a JSON payload and then decompresses it back, verifying the output matches the original. The test was performed on a Windows 64-bit machine compiled with Delphi 12 Athens.
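A comparable harness in Python conveys the methodology (this is a sketch of the measurement loop, not the actual Delphi benchmark code):

```python
import time
import zlib

def roundtrip_ms(payload: bytes, iterations: int = 10_000) -> float:
    """Run `iterations` compress+decompress round-trips with a persistent
    raw-deflate context, verifying each result, and return elapsed ms."""
    comp = zlib.compressobj(wbits=-15)
    decomp = zlib.decompressobj(wbits=-15)
    start = time.perf_counter()
    for _ in range(iterations):
        frame = comp.compress(payload) + comp.flush(zlib.Z_SYNC_FLUSH)
        if decomp.decompress(frame) != payload:  # verify the round-trip
            raise ValueError("round-trip mismatch")
    return (time.perf_counter() - start) * 1000.0

# roughly a 1 KB JSON-like payload
elapsed = roundtrip_ms(b'{"sensor":42,"value":3.14}' * 40)
```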

Default Configuration (persistent context)

This is the default mode where the compression context is maintained across frames — the most common real-world scenario:

Message Size   Previous (ms)   New (ms)   Speedup
1 KB           437             28         15.6x faster
4 KB           480             88         5.5x faster
16 KB          546             431        1.3x faster
64 KB          1,994           1,725      1.2x faster

With NoContextTakeOver (independent frames)

When NoContextTakeOver is enabled, each frame is compressed independently. Even in this mode, the buffer reuse and direct memory access optimizations provide a solid improvement:

Message Size   Previous (ms)   New (ms)   Speedup
1 KB           149             75         2.0x faster
4 KB           173             100        1.7x faster
16 KB          302             228        1.3x faster
64 KB          1,216           1,094      1.1x faster
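In zlib terms, NoContextTakeOver amounts to resetting (or recreating) the compression context for every message, so the sliding window never carries over between frames. A Python sketch under that assumption (`compress_independent` is an invented name):

```python
import zlib

def compress_independent(payload: bytes) -> bytes:
    # no_context_takeover: a fresh raw-deflate context per message,
    # so no compression history is shared between frames.
    c = zlib.compressobj(wbits=-15)
    return c.compress(payload) + c.flush(zlib.Z_SYNC_FLUSH)

msg = b'{"id":1,"status":"ok"}'
a = compress_independent(msg)
b = compress_independent(msg)  # identical output: no history to exploit
restored = zlib.decompressobj(wbits=-15).decompress(a)
```

Every identical message now produces byte-identical output of the same size, which is why this mode gains less from the rewrite: the buffer and copy overhead is saved, but the per-message context work remains by design.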

Who Benefits Most?

The improvement is most dramatic for applications that exchange many small messages — which is exactly the typical WebSocket use case:

Chat & Messaging
Short text messages (typically under 4 KB) see the biggest gains: 5–15x faster compression.
Real-time Data Feeds
JSON updates for dashboards, stock tickers, and IoT sensors benefit from both speed and the persistent context learning repetitive patterns.
Gaming & Multiplayer
Frequent small state updates benefit from the low per-frame overhead.
High-Concurrency Servers
Less CPU time per frame means the server can handle more simultaneous connections.

Fully Compatible

The optimization is fully transparent — no code changes are needed in your application. The compressed data on the wire is identical to the previous version, so upgraded servers work seamlessly with existing clients and vice versa.

The new implementation supports all platforms and compilers:

  • Delphi 7 through Delphi 13 (including C++Builder)
  • Windows, macOS, Linux, Android, iOS
  • 32-bit and 64-bit targets

Upgrade to 2026.4.0

The permessage-deflate optimization is available in sgcWebSockets 2026.4.0. Simply update to the latest version and your WebSocket connections will automatically benefit from faster compression. Download at esegece.com.

Special thanks to Michael for contributing the initial optimized implementation that inspired this work. His research into persistent zlib contexts and direct memory access laid the foundation for these performance improvements.
