Using ClickHouse for Real-Time L7 DDoS & Bot Traffic Analytics with Tempesta FW

Submitted by /u/krizhanovsky

Most open-source L7 DDoS mitigation and bot-protection approaches rely on challenges (e.g., CAPTCHA or JavaScript proof-of-work) or on static rules based on the User-Agent, Referer, or client geolocation. These techniques are increasingly ineffective because modern open-source impersonation libraries and paid cloud proxy networks bypass them easily.

We explore a different approach: classifying HTTP client requests in near real time using ClickHouse as the primary analytics backend. We collect access logs directly from Tempesta FW, a high-performance open-source hybrid of an HTTP reverse proxy and a firewall. Tempesta FW implements zero-copy per-CPU log shipping into ClickHouse, so the dataset growth rate is limited only by ClickHouse bulk ingestion performance, which is very high.

WebShield, a small open-source Python daemon:

- periodically executes analytic queries to detect spikes in traffic (requests or bytes per second), response delays, surges in HTTP error codes, and other anomalies;
- upon detecting a spike, classifies the clients and validates the current model;
- if the model is validated, automatically blocks malicious clients by IP, TLS fingerprints, or HTTP fingerprints.

To simplify and accelerate classification, whether automatic or manual, we introduced a new TLS fingerprinting method. WebShield is a small and simple daemon, yet it is effective against multi-thousand-IP botnets.

The full article includes configuration examples, ClickHouse schemas, and queries.
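
The detect-then-classify loop described above can be illustrated with a minimal Python sketch. This is not WebShield's code: the real daemon runs its analytics as ClickHouse queries, whereas this example works on in-memory data. The function names, the `tls_fp` field, the z-score threshold, and the share cutoff are all hypothetical choices made for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def detect_spike(baseline, current, z_threshold=3.0):
    """Return True if `current` is more than `z_threshold` standard
    deviations above the mean of `baseline` (e.g. requests per second).

    Hypothetical helper: a simple z-score test, not WebShield's method.
    """
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # Flat baseline: any increase counts as a spike.
        return current > mu
    return (current - mu) / sigma > z_threshold


def top_offenders(requests, key, min_share=0.2):
    """Return the values of `key` (e.g. an IP or a TLS fingerprint) that
    account for at least `min_share` of all requests, sorted.

    `requests` is a list of dicts standing in for access-log rows;
    the field names are assumptions, not the Tempesta FW schema.
    """
    counts = Counter(r[key] for r in requests)
    total = sum(counts.values())
    return sorted(v for v, n in counts.items() if n / total >= min_share)


if __name__ == "__main__":
    # Synthetic requests-per-second samples for a quiet period.
    baseline_rps = [100, 104, 98, 101, 97]

    # Synthetic access-log rows captured during a suspected attack.
    window = [{"ip": f"10.0.0.{i}", "tls_fp": "fp_bot"} for i in range(8)]
    window += [
        {"ip": "192.0.2.1", "tls_fp": "fp_a"},
        {"ip": "192.0.2.2", "tls_fp": "fp_b"},
    ]

    if detect_spike(baseline_rps, current=500):
        # Many IPs share one fingerprint, so block by fingerprint.
        print("candidates to block:", top_offenders(window, "tls_fp"))
```

Grouping by fingerprint rather than by IP reflects the point made above: a botnet spread over thousands of addresses may still share a small number of TLS or HTTP fingerprints, so a single rule can cover many clients.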