Security | June 4, 2026 | 7 min read

Your HTTP/2 Server Can Be OOM-Killed In Under A Minute: What To Do Now

A newly published exploit combines two HTTP/2 weaknesses to exhaust 32 GB of server RAM in as little as 10 seconds, no authentication required. nginx, Apache httpd, IIS, Envoy, and Cloudflare Pingora are all listed as affected. Here is what the attack does, who is exposed, and the patch or config change that applies to each server.

A Single Laptop Can Crash Your Web Server In Less Than A Minute

Security researcher Quang Luong published a denial-of-service exploit in June 2026 that combines two long-known HTTP/2 weaknesses in a way that had not been weaponised before. The result is a technique that can exhaust 32 GB of server RAM in roughly 10 to 45 seconds depending on the target, with no credentials, no prior access, and no sustained bandwidth required. The attack works against nginx, Apache httpd, Microsoft IIS, Envoy, and Cloudflare Pingora - five of the most widely deployed HTTP/2 stacks in existence.

The practical framing matters more than the CVE number: if your server speaks HTTP/2, which most modern web stacks enable by default, this is a remote kill switch an attacker can pull from a laptop with no special tooling.

How The Exploit Actually Works

The attack chains two components that individually were well known but separately considered manageable.

Component 1: HPACK header table amplification. HTTP/2 compresses headers using a shared dynamic table on both sides of the connection. The exploit first seeds that table with large header values (for example, a multi-kilobyte cookie), then issues hundreds of requests that reference the same table entries using single-byte index codes. Each reference forces the server to materialise a full in-memory copy of the referenced header. The amplification ratios are severe: Envoy 1.37.2 reaches roughly 5,700 server-side bytes allocated per byte sent by the attacker; Apache httpd 2.4.67 reaches roughly 4,000 to 1.
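To make the arithmetic concrete, here is a minimal Python model of the amplification described above. It is not exploit code and does not speak HTTP/2; it only counts bytes under simplifying assumptions: each indexed reference costs the attacker one byte on the wire (frame overhead is ignored), and the server allocates one full copy of the referenced value, plus a hypothetical fixed per-copy overhead, for every reference. The function name, the overhead parameter, and the sample numbers are illustrative and are not taken from the disclosure.

```python
def amplification_ratio(seed_bytes: int, references: int, per_copy_overhead: int = 0) -> float:
    """Server bytes allocated per attacker byte sent, under a toy model."""
    if seed_bytes <= 0 or references <= 0:
        raise ValueError("seed_bytes and references must be positive")
    # Attacker cost: send the large value once, then one index byte per reference.
    attacker_bytes = seed_bytes + references
    # Server cost (assumed): one full copy of the value per reference.
    server_bytes = references * (seed_bytes + per_copy_overhead)
    return server_bytes / attacker_bytes


if __name__ == "__main__":
    # Hypothetical 4 KiB seeded value, referenced an increasing number of times.
    for refs in (10, 100, 1000):
        print(refs, round(amplification_ratio(4096, refs), 1))
```

In this model the ratio climbs with the number of references and levels off near the size of the seeded value plus any per-copy overhead, which is why ratios in the thousands imply multi-kilobyte values, allocator overhead, or both.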

Component 2: Flow-control window hold. HTTP/2 has a flow-control mechanism where the receiver can advertise a zero-byte window, telling the sender to pause. The exploit sets the client receive window to zero and keeps it there, issuing periodic WINDOW_UPDATE frames timed just frequently enough to reset the server's idle timeout. The server cannot complete its responses, so it holds all of the allocated request memory indefinitely rather than freeing it after the response cycle.

Cookie header splitting (Apache and Envoy bypass). Both servers cap the number of header fields per request. The exploit works around this by splitting a single large cookie into dozens of individual Cookie headers, which is explicitly permitted by RFC 9113 and therefore cannot simply be rejected without breaking legitimate traffic.
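One way to reason about a defence here is to limit aggregate Cookie bytes rather than the number of fields. RFC 9113 (section 8.2.3) says split Cookie fields are to be concatenated with "; " before being handed to a non-HTTP/2 context, so a size check can run on the reassembled value. The sketch below assumes headers arrive as a list of (name, value) pairs; the function name, the exception, and the 8192-byte default are hypothetical, and none of the affected servers is claimed to work this way. It also does nothing about the HPACK amplification itself.

```python
from typing import Iterable, List, Tuple

Header = Tuple[str, str]


class CookieTooLarge(ValueError):
    """Raised when the combined Cookie value exceeds the configured cap."""


def merge_cookie_fields(headers: Iterable[Header], max_cookie_bytes: int = 8192) -> List[Header]:
    """Join all Cookie fields with "; " and enforce one aggregate size cap."""
    others: List[Header] = []
    crumbs: List[str] = []
    for name, value in headers:
        if name.lower() == "cookie":
            crumbs.append(value)
        else:
            others.append((name, value))
    if not crumbs:
        return others
    merged = "; ".join(crumbs)
    size = len(merged.encode("utf-8"))
    # The cap applies to the reassembled value, so splitting the cookie
    # into many small fields does not help it slip under the limit.
    if size > max_cookie_bytes:
        raise CookieTooLarge(f"cookie is {size} bytes, cap is {max_cookie_bytes}")
    # Note: the merged cookie is appended after the other fields.
    return others + [("cookie", merged)]
```

A check like this keeps legitimate split cookies working while removing the incentive to split purely to dodge a per-field limit.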

The combination turns a low-volume trickle of requests into runaway memory allocation the server cannot shed until it crashes or is restarted.

The Numbers By Server

Server               | Amplification ratio | Time to exhaust 32 GB
Envoy 1.37.2         | ~5,700:1            | ~10 seconds
Apache httpd 2.4.67  | ~4,000:1            | ~18 seconds
nginx 1.29.7         | ~70:1               | ~45 seconds
Microsoft IIS        | ~68:1               | 45 seconds (64 GB)

Even the lowest amplification ratios in the table (roughly 70:1 for nginx and 68:1 for IIS) are enough to exhaust a node's memory in under a minute from a single client connection.
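The figures above also imply how much data the client has to send. The following back-of-the-envelope calculation is our own arithmetic, not a number from the disclosure: it takes the published ratios and times at face value, assumes the ratio stays constant for the whole run, and reads "GB" as GiB. The function name is illustrative.

```python
GIB = 1024 ** 3

# (server, amplification ratio, memory exhausted in GiB, seconds), as listed in the table above.
ROWS = [
    ("Envoy 1.37.2", 5700, 32, 10),
    ("Apache httpd 2.4.67", 4000, 32, 18),
    ("nginx 1.29.7", 70, 32, 45),
    ("Microsoft IIS", 68, 64, 45),
]


def implied_attacker_rate(ratio: float, memory_gib: float, seconds: float) -> float:
    """Bytes per second the client must send if the ratio holds constant."""
    return memory_gib * GIB / ratio / seconds


if __name__ == "__main__":
    for server, ratio, mem, secs in ROWS:
        rate = implied_attacker_rate(ratio, mem, secs)
        print(f"{server:22s} ~{rate / 1e6:.1f} MB/s")
```

Under these assumptions the high-ratio targets need well under 1 MB/s from the client, while the lower-ratio ones need on the order of 10 to 25 MB/s, which is still within reach of a single well-connected machine.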

Who Is At Risk

Any internet-facing server that accepts HTTP/2 connections is exposed. The attack requires no credentials and no prior knowledge of the application. The population of publicly reachable HTTP/2 servers is estimated at over 880,000, though a significant share is behind CDN proxies that may absorb the impact (Cloudflare Pingora itself is listed as affected, so "behind a CDN" is not an unconditional safety guarantee).

The attack is particularly dangerous for:

  • VPS and bare-metal hosts without cgroup or ulimit constraints where a single OOM event crashes the entire server, not just the web process.
  • Kubernetes deployments without per-pod memory limits where the node can be taken down rather than just the affected pod.
  • High-availability setups where the attacker restarts the attack on each recovered instance, cycling the service indefinitely.
  • Servers handling mixed workloads (web, API, internal services) where crashing the web process also kills co-located services.

Servers already sitting near their memory ceiling during peak hours are at particular risk: the attack can push a marginal server over the edge with far fewer requests than the worst-case figures suggest.

Root Cause: A Specification Defect

This is not a simple implementation bug that one vendor missed. The underlying cause is a design defect in RFC 7541, the HPACK spec. The specification provides SETTINGS_HEADER_TABLE_SIZE as the control knob for the dynamic table, but it does not account for per-entry allocator overhead or the cost of materialising referenced entries on the receiver side. Each implementation added its own limits, but none of those limits were calibrated against the amplification paths the exploit uses.

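The flow-control hold is also specified behaviour: RFC 7540 (since obsoleted by RFC 9113, which keeps the same flow-control model) permits a receiver to advertise a zero window and keep it there indefinitely. Flow control was designed to protect a cooperative receiver from being overwhelmed, not to protect a server from a client that deliberately refuses to read.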

What To Do: Patches And Mitigations

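nginx: Upgrade to 1.29.8 or later, which adds per-connection header-table-size enforcement. If an immediate upgrade is not possible, cap header sizes and shorten the header-read timeout. The http2_max_field_size, http2_max_header_size, and http2_recv_timeout directives cover this on older builds, but they have been obsolete since nginx 1.19.7; on current versions use large_client_header_buffers with conservative limits and set client_header_timeout to a low value (5-10 seconds) instead.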

Apache httpd: Upgrade to at least 2.4.64. That release fixed CVE-2025-53020 (CVSS 7.5, CWE-401, "late release of memory after effective lifetime"), a closely related HTTP/2 memory-handling flaw affecting versions 2.4.17 through 2.4.63, and is the minimum baseline you want to be on. It does not close this chain on its own: the amplification figures above were measured against 2.4.67. A further CVE has reportedly been assigned for this specific amplification chain but was still in reserved status (not yet published in the NVD) at the time of writing, so confirm the exact fixed version against the official Apache advisory before relying on it. If patching is delayed, H2MaxDataFrameLen and H2MaxHeaderListSize combined with a short Timeout directive reduce exposure; check each directive against the mod_http2 documentation for your version before deploying.

Envoy: No upstream patch was available at initial disclosure. The recommended mitigation is to enforce request header size limits via max_request_headers_kb on the HTTP connection manager, to constrain initial_connection_window_size and initial_stream_window_size under http2_protocol_options in the listener config, and to set a connection-level idle timeout that cannot be reset by WINDOW_UPDATE frames alone.
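The idea of a timeout that WINDOW_UPDATE frames alone cannot reset is independent of Envoy and can be sketched in a few lines. The Python class below is a conceptual illustration only, not Envoy configuration or code: it pushes the deadline back only when a minimum amount of response data has actually been written, and ignores inbound frames entirely. The class name, method names, and thresholds are hypothetical, and the sketch assumes the deadline is armed only while a response is pending.

```python
import time
from typing import Callable


class ProgressDeadline:
    """Expires unless enough response bytes are actually delivered in time."""

    def __init__(self, timeout_s: float, min_progress_bytes: int = 1,
                 clock: Callable[[], float] = time.monotonic):
        self._timeout_s = timeout_s
        self._min_progress_bytes = min_progress_bytes
        self._clock = clock
        self._pending = 0
        self._last_progress = clock()

    def on_frame_received(self, frame_type: str) -> None:
        # Deliberately a no-op: inbound frames such as WINDOW_UPDATE or PING
        # show the peer is alive, not that it is consuming responses.
        pass

    def on_bytes_sent(self, n: int) -> None:
        # Only real outbound progress pushes the deadline back, and only
        # once it adds up to the configured minimum.
        self._pending += n
        if self._pending >= self._min_progress_bytes:
            self._pending = 0
            self._last_progress = self._clock()

    def expired(self) -> bool:
        return self._clock() - self._last_progress > self._timeout_s
```

Requiring a minimum number of bytes per interval, rather than any progress at all, matters because a client could otherwise open the window by a single byte at a time and keep the connection alive just as cheaply.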

Microsoft IIS: No patch at initial disclosure. IIS does not expose the relevant tuning knobs at the configuration level. Placing a WAF or reverse proxy in front that enforces header count and size limits is the practical mitigation until Microsoft ships a fix.

All servers - container and OS-level memory caps: Regardless of the above, enforce memory limits at the process level so that an OOM kills the web worker and triggers an automatic restart rather than taking the entire node down. On Linux this means cgroups (for containers, set resources.limits.memory in the pod spec; for bare-metal, use a systemd MemoryMax= directive). On Windows, Application Pool recycling on private bytes threshold achieves a similar result.
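A quick way to check whether a Linux process is actually running under a memory cap is to read its cgroup limit. The sketch below assumes cgroup v2, where the memory.max file holds either a byte count or the literal string "max" for no limit. The default path is an assumption that typically holds inside a container; for a systemd service on the host the file sits under that unit's own cgroup directory instead. The function names are illustrative.

```python
from pathlib import Path
from typing import Optional


def parse_memory_max(text: str) -> Optional[int]:
    """Return the cgroup v2 memory cap in bytes, or None if uncapped."""
    value = text.strip()
    if value == "max":
        return None
    return int(value)


def current_memory_cap(path: str = "/sys/fs/cgroup/memory.max") -> Optional[int]:
    """Read the cap for the cgroup visible at `path` (assumed layout)."""
    return parse_memory_max(Path(path).read_text())


if __name__ == "__main__":
    try:
        cap = current_memory_cap()
    except OSError as exc:
        print(f"could not read cgroup limit: {exc}")
    else:
        print("no memory cap set" if cap is None else f"memory cap: {cap} bytes")
```

Run from inside the web server's container or service context, a result of "no memory cap set" means an out-of-memory event there is not contained to that workload.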

The Broader Lesson

This exploit illustrates a recurring pattern in internet infrastructure vulnerabilities: a specification provides a feature for legitimate use (dynamic compression, flow control), implementations do not fully account for adversarial use of that feature, and years later someone maps the amplification surface carefully enough to weaponise it. The HTTP/2 dynamic header table has existed since 2015. The flow-control hold technique is not new. What is new is the precise combination and the published proof-of-concept.

The operational implication is not that HTTP/2 should be disabled (the performance benefits are real and HTTP/3 introduces similar surfaces). It is that memory limits at the process, container, and OS level are not optional hardening - they are the backstop that converts a node-wide outage into a recoverable restart. If your web workers run without memory caps, this exploit (and any future variant) can cascade across your entire node fleet. If they run with caps, the blast radius is one pod restart.

For teams that want to verify their current exposure, check the memory limit configuration on your web processes first - that is the single highest-leverage change regardless of which server software you run. For a full audit of your HTTP/2 configuration and server hardening posture, our server management and hardening service covers exactly this class of configuration risk, including flow-control tuning, resource limit enforcement, and patch cadence management.

Sources

Technical details (amplification ratios, time-to-exhaustion figures, and the cookie-splitting bypass) are drawn from the original researcher disclosure and have not been independently reproduced by us. The Apache memory-handling baseline references CVE-2025-53020 as published in the NVD (Apache HTTP Server 2.4.17 to 2.4.63, fixed in 2.4.64, CVSS 7.5). The root-cause analysis references RFC 7541 (HPACK) and RFC 7540 (HTTP/2 flow control). Verify exact patched versions against each vendor's official advisory before deploying. For context on the wider infrastructure security landscape this month, see our note on the Spectra Gutenberg Blocks RCE and how low-privilege vulnerabilities escalate when process-level constraints are absent.
