Following the release of the slowhttptest tool with Slow Read DoS attack support, I helped several users test their setups. One of the emails that I received asked me to take a look at test results of the slowhttptest tool. According to the report, the tool brought a service to its knees after only several seconds. This was quite surprising because the system in question was designed to handle requests from several thousands of clients from all across the world, whereas the tool was only opening 1000 concurrent connections.
To put things in context it’s worth mentioning that the organization has around 60,000 registered users with a client application installed on their end. That application communicates periodically to the central service, receiving instructions and reporting the results in short bursts (no, it’s not a botnet as you might have thought!). There are around 2,000 running clients at any given time, and several hundreds of them are normally connected concurrently.
Initially, I was pretty skeptical about being able to DoS the central service, so I asked for details of the system architecture. The system design followed best practices – the components of the system were being monitored, and all patches were applied in a timely manner. Thus, it was reasonable to expect that the system should have been able to tolerate the load produced by the slowhttptest tool.
Here is the schematic view of the system below:
A round-robin DNS acting as load balancer that resolves the hostname to one of the Squid’s IP addresses
Two firewalled reverse proxy servers (Squid) running on powerful dedicated servers
A cluster of web servers
A cluster of application servers
I ran dig, a DNS lookup utility, to get the list of IP addresses the hostname can be resolved to. After picking one, I added a line to my /etc/hosts mapping hostname to that particular address and concentrated on attacking a single proxy, rather than spreading the load among all of them. I then configured the slowhttptest tool in an aggressive manner to create the “Slow Read” request of a photo of one of company executives on “Management Team” section of the website.
Guess what? I got a true denial-of-service by rendering the front-end Squid machine inoperable. This effectively meant that I knocked out tens of backend HTTP servers from a single laptop. After further investigation, the DoS turned out to be a simple misconfiguration. Since the operating system was configured to limit a process to 1024 open file descriptors, the Squid simply ran out of the file descriptors.
It was assumed that if a server can handle 60,000 connections per minute, it’s good to go. However, the aspect of connection duration was overlooked, and the system was able to accept, serve and close the connection so fast that the concurrent connections pool was never filled up. Thus, system thresholds were never reached.
The moral of the story, aside from “Good beats Evil,” is that the security level of a system is defined by the weakest link in the chain. This is another example of security being more than just installing patches. In this particular case, the mistake could have been prevented with proper penetration and stress testing. You should not rely on “secure” architecture, and you should test for potential problems like this slow read DoS.
Imagine a line at a fast food restaurant that serves two types of burgers, and a customer at the cashier is stuck for a while deciding what he wants to order, making the rest of the line anxious, slowing down the business. Now imagine a line at the same restaurant, but with a sign saying “think ahead of your order,” which is supposed to speed things up. But now the customer orders hundreds of burgers, pays, and the line is stuck again, because he can take only 5 burgers at time to his car, making signs ineffective.
While developing the slowhttptest tool, I thought about this burger scenario, and became curious about how HTTP servers react to slow consumption of their responses. There are so many conversations about slowing down requests, but none of them cover slow responses. After spending a couple of evenings implementing proof-of-concept code, I pointed it to my so-many-times-tortured Apache server and, surprisingly, got a denial of service as easily as I got it with slowloris and slow POST.
Let me remind you what slowloris and slow POST are aiming to do: A Web server keeps its active connections in a relatively small concurrent connection pool, and the above-mentioned attacks try to tie up all the connections in that pool with slow requests, thus causing the server to reject legitimate requests, as in first reastaurnt scenario.
The idea of the attack I implemented is pretty simple: Bypass policies that filter slow-deciding customers, send a legitimate HTTP request and read the response slowly, aiming to keep as many connections as possible active. Sounds too easy to be true, right?
Crafting a Slow Read
Let’s start with a simple case, and send a legitimate HTTP request for a resource without reading the server’s response from the kernel receive buffer.
We craft a request like the following:
GET /img/delivery.png HTTP/1.1
User-Agent: Opera/9.80 (Macintosh; Intel Mac OS X 10.7.0; U; Edition MacAppStore; en) Presto/2.9.168 Version/11.50
And the server replies with something like this:
HTTP/1.1 200 OK
Date: Mon, 19 Dec 2011 00:12:28 GMT
Last-Modified: Thu, 08 Dec 2011 15:29:54 GMT
For those who don’t feel like reading tcpdump’s output: We established a connection; sent the request; received the response through several TCP packets sized 1448 bytes because of Maximum Segment Size that the underlying communication channel supports; and finally, 5 seconds later, we received the TCP packet with the FIN flag.
Everything seems normal and expected. The server handed the data to its kernel level send buffer, and the TCP/IP stack took care of the rest. At the client, even while the application had not read yet from its kernel level receive buffer, all the transactions were completed on the network layer.
What if we try to make the client’s receive buffer very small?
We sent the same HTTP request and server produced the same HTTP response, but tcpdump produced much more interesting results:
In the initial SYN packet, the client advertised its receive window size as 28 bytes. The server sends the first 28 bytes to the client and that’s it! The server keeps polling the client for space available at progressive intervals until it reaches a 2-minute interval, and then keeps polling at that interval, but keeps receiving win 0.
This is already promising: if we can prolong the connection lifetime for several minutes, it’s not that bad. And we can have more fun with thousands of connections! But fun did not happen. Let’s see why: Once the server received the request and generated the response, it sends the data to the socket, which is supposed to deliver it to the end user. If the data can fit into the server socket’s send buffer, the server hands the entire data to the kernel and forgets about it. That’s what happened with our last test.
What if we make the server keep polling the socket for write readiness? We get exactly what we wanted: Denial of Service.
Let’s summarize the prerequisites for the DoS:
We need to know the server’s send buffer size and then define a smaller-sized client receive buffer. TCP doesn’t advertise the server’s send buffer size, but we can assume that it is the default value, which is usually between 65Kb and 128Kb. There’s normally no need to have a send buffer larger than that.
We need to make the server generate a response that is larger than the send buffer. With reports indicating the Average Web Page Approaches 1MB, that should be fairly easy. Load the main page of the victim’s Web site in your favorite WebKit-based browser like Chrome or Safari and pick the largest resource in Web Inspector.
If there are no sufficiently large resources on the server, but it supports HTTP pipelining, which many Web servers do, then we can multiply the size of the response to fill up the server’s send buffer as much as we need by re-requesting same resource several times using the same connection.
For example, here’s a screenshot of mod_status on Apache under attack:
As you can see, all connections are in the WRITE state with 0 idle workers.
Here is the chart generated by the latest release of slowhttptest with Slow Read attack support:
Even though the TCP/IP stack shouldn’t make decisions on resetting alive and responsive connections, and it is the “userland” application’s responsibility to do so, I assume that some TCP/IP implementations or firewalls might have timers to track connections that cannot send data for some time. To avoid triggering such decisions, slowhttptest can read data from the local receive buffer very slowly to make the TCP/IP stack reply with ACKs with window size other than 0, thus ensuring some physical data flow from server to client.
While I was implementing the attack, I contacted Ivan Ristic to get his opinion and suggestions. I was suspecting there would be attacks exploiting zero/small window, but I thought I am the first one to apply it to tie up an HTTP server. I was surprised when Ivan replied with links to sockstress by Outpost24 and Nkiller2 exploiting Persist Timer infiniteness that are already covering most aspects I wanted to describe. However, the above mentioned techniques are crafting TCP packets and use raw sockets, whereas slowhttptest uses only the TCP sockets API to achieve almost the same functionality.
We still think it’s worthwhile to have a configurable tool to help people focus and design defense mechanisms, since this vulnerability still exists on many systems three years after it was first discovered, and I consider Slow Read DoS attacks are even lower profile and harder to detect than slowloris and slow POST attacks.
All servers I observed (Apache, nginx, lighttpd, IIS 7.5) are vulnerable in their default configuration.
The fundamental problem here is how servers are handling write readiness for active sockets.
The best protection would be:
Do not accept connections with abnormally small advertised window sizes
Do not enable persistent connections and HTTP pipelining unless performance really benefits from it
Limit the absolute connection lifetime to some reasonable value
Some servers have built-in protection, which is turned off by default. For example, lighttpd has the server.max-write-idle option to specify maximum number of seconds until a waiting write call times out and closes the connection.
Apache is vulnerable in its default configuration, but MPM Event, for example, handles slow requests and responses significantly better than other modules, but falls back to worker MPM behavior for SSL connections. ModSecurity supports attributes to control how long a socket can remain in read or write state.
There is a handy script called “Flying frog” available, written by Christian Folini, an expert in Application Layer DoS attacks detection. Flying frog is a monitoring agent that hovers over the incoming traffic and the application log. It picks individual attackers, like a frog eats a mosquito.