Welcome to a new series of resources to accompany the release of our Web Application Scanning (WAS) 3.0 solution. We’ll cover common questions about the solution and, more importantly, we’ll show how scanning ties into the broader challenge of securing web apps. Web apps are usually the most exposed part of an organization’s network. They serve as everything from a fancy brochure with static data, to interactive services where we connect with others, to apps that collect personal data, to apps that generate millions of dollars in revenue. All of these apps carry risk for an organization. Not only is it important to know how secure an app is, it’s important to know where all of your apps are in the first place.
And there’s a huge number of web sites to secure. Netcraft’s May 2013 internet survey shows that we’re getting close to the 700 million mark of active web sites. Sure, a lot of those sites have spam links, phishing, malware, and stale content; but far more have apps that need to be secure in order to protect our data, our money, and their owners’ networks.
In the last 10 years hackers have routinely taken advantage of web application vulnerabilities to successfully breach organizations. During that time there has also been an increase in the use of malware hosted on websites to infect unsuspecting users. Many times a vulnerability in a website is exploited to set up the malware that a legitimate website may then deliver. The relationship between website vulnerabilities and malware is one that is likely to continue to grow. The Verizon 2012 Data Breach Investigations Report notes: “Out of the 855 incidents this year, 81% leveraged hacking, 69% included malware, and an impressive 61% of all breaches featured a combination of hacking techniques and malware.” The combination of threats presents organizations with a new challenge when trying to identify both web application vulnerabilities and malware that may exist on their sites.
WAS 3.0 Now Includes Malware Detection
With the announcement of Qualys WAS 3.0, organizations now have a way to detect not only web application vulnerabilities, but also malware that may infect these same web properties. Qualys has offered the Malware Detection Service as both a freemium solution for individuals and as an Enterprise Edition for larger organizations that need to protect a large number of websites. With WAS 3.0, Qualys has integrated this malware detection capability into the WAS solution to make it easy for organizations to configure both types of scans from WAS. Although the services are integrated, the way the scanning is executed and the threats that are identified by vulnerability scanning and malware detection are very different. Now that we’ve outlined the integration at a high level, let’s go into a little more detail about how each type of scanning is performed, and how they are integrated in WAS 3.0.
Malware Detection Dashboard
Advanced Detection Methods
Qualys WAS works by interacting with a web application or website just as a user would, but does so in a fully automated way. The WAS scanner will request pages, log in, and follow links to new pages, posting forms as it goes. While interacting with the site, WAS injects hostile payloads and then observes how the web application responds. If the web application responds in a given way, WAS can determine that the application is vulnerable to the specific attack used. This is the same type of action that a penetration tester or malicious hacker would use to discover these vulnerabilities, but it is done in an automated way that takes a fraction of the time a human would need. This makes automated scanning far more cost effective.
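The inject-and-observe idea can be illustrated with a very small Python sketch. This is not how WAS is implemented; it only shows the general principle for one class of issue (reflected input), and the function name, the probe string, and the two simulated responses are all hypothetical.

```python
import html


def reflects_payload_unescaped(payload, response_body):
    """Return True when the injected payload comes back verbatim in the response."""
    return payload in response_body


# Hypothetical probe string and two simulated application responses.
probe = '<script>alert("probe-7391")</script>'
escaped_response = "<p>You searched for: " + html.escape(probe) + "</p>"
raw_response = "<p>You searched for: " + probe + "</p>"

print(reflects_payload_unescaped(probe, escaped_response))
print(reflects_payload_unescaped(probe, raw_response))
```

A real scanner uses many payloads and far more careful response analysis; the sketch only shows why observing the response is enough to infer that an input is handled unsafely.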
The Malware Detection Service (MDS) takes a different approach to detection. While MDS also automatically interacts with the site as a user would, navigating and following links, it uses a much different technique to detect malware. MDS primarily uses Behavioral Analysis to identify when a website is infected with malware. MDS actually interacts with websites using an instrumented vulnerable browser hosted on a virtual machine. The virtual machine is created on the fly when the scan is requested. The service then navigates the site automatically, and by observing the behavior of the system the web browser is hosted on via the instrumentation, the service can detect when activities associated with malware take place. So instead of looking for something specific in the content itself, it is instead looking for what the browser and host do when pages are loaded. Behavioral analysis is superior in many ways to traditional malware detection methods because it can identify even zero-day malware that has not yet been analyzed to create a detection signature.
Malware Scan Report
Look for Malware Close to Home
Now that we have discussed how each service works and the differences in their detection techniques, we can discuss some other differences in web application vulnerability testing and malware detection. While web application vulnerabilities can be expected in both internal and external web applications, website malware is typically found only on Internet facing web applications. This is because malware distributors want to infect as many users as possible, so hosting malware on sites that have the most exposure is the most effective approach. Malware is also unlikely to be distributed evenly across a website. Unlike vulnerabilities, which are usually distributed throughout a web application because they are unintentional, malware is usually found on pages that are closer to the home page for the site. This is also to take advantage of the traffic patterns of users, which will typically be heavier in the pages that are the fewest clicks away from the default for the site. So while WAS will scan a large number of pages to ensure it is testing all the relevant functions that may be vulnerable, MDS will seek to test just the unauthenticated pages that are closest to the default page for a site, where malware is most likely to be found.
Another difference between the services that is important to note is the impact. While WAS is actively testing websites by injecting payloads and therefore usually requires advance notice or scheduling in a maintenance window for production sites, the MDS service is purely observational and does not have any more impact on a site than a normal user would. This allows MDS to be run more often and without the notification that is typically required for running a vulnerability scan.
Malware Threat Detail
To bring these two services together, Qualys has added the ability to configure malware scanning for external web applications that are licensed in Qualys WAS. When an Internet accessible web application is being set up in WAS, the user will now have the option to indicate that they would like it to be scanned for malware. This will set up a daily scheduled malware scan for the site. If any malware is discovered, the MDS service notifies the subscription owner with an email outlining the issue. Users can log in to Qualys to see more details if needed. So now organizations have an easy way to combine Web Application Scanning and Malware Detection to ensure that their Internet facing websites are free from web application vulnerabilities and malware.
Burp Suite Professional Integration
Last but not least, Qualys WAS 3.0 also introduces an integration with an attack proxy tool used primarily to conduct targeted, manual application penetration and validation testing. While WAS is designed to be fully automated and scalable and is appropriate for the security testing requirements of the majority of applications, there are some web applications that require higher levels of assurance. These applications typically require both automated scanning as well as manual penetration testing activities. In most cases, web application penetration testing is primarily conducted with the use of attack proxy tools. Attack proxies offer a high level of control for skilled users and help to facilitate deep testing when it is required. Recognizing that there is a place for both automated scanning and manual testing, Qualys is moving to combine the best of both approaches by integrating attack proxy tools, enabling customers to benefit from highly automated and scalable scanning while at the same time having access to manual tools when additional exploitation or risk evaluation is required.
Qualys WAS 3.0 takes the first step in this evolution by integrating the scan results from Burp Suite Professional (BSP). BSP is a tool that combines interactive testing capabilities with scanning. Testers who use BSP can scan individual pages as they navigate a web application and discover vulnerabilities as they do so. But BSP is primarily intended for use by a single user. There is no centralized storage for results that would allow them to be shared by multiple users, or to be tracked and trended over time. Qualys seeks to overcome this limitation by providing a way to import the BSP scanner findings. This gives organizations a way to store the findings discovered in the BSP scanner with those discovered by Qualys WAS.
UPDATE: Qualys has released a Burp extension for WAS to make the integration even more seamless and easy to use.
As you can tell, Qualys WAS 3.0 is a major release with a lot of new capabilities that will help organizations better combat the growing threats to their web applications. We’re excited by the new WAS 3.0 and we look forward to getting it into your hands.
On May 26th 2011, a new EU directive was adopted that requires web sites to gain consent from visitors before they can store cookies or other information used to track a user’s actions. While the EU Cookie legislation went into effect last year, the UK’s Information Commissioner’s Office (ICO) set May 26th of 2012 as the enforcement date. The ICO is the body responsible for enforcing the UK regulation, with authority to levy fines on web site owners up to £500,000.
A Better Way to Identify Cookies
In order to comply with the EU regulations and avoid the UK ICO fines, organizations need to understand what cookies their web sites are issuing and the conditions in which they are issued. Most web application scanning solutions will report the cookies that a web site is issuing. This includes cookies that may be issued by third-party sites that have embedded content commonly used to track users for marketing purposes. QualysGuard WAS has provided this information for some time via the Information Gathered (IG) QID 150028 (Cookies Collected). However, the way that the cookies are collected for QID 150028, as well as the way other web application scanners gather cookies, may lead to the inclusion of cookies that were issued after the scanner automatically triggered the explicit user consent action. This is because web application scanners typically follow all links, including those that are most commonly used to obtain user consent. What organizations that wish to gain a user’s explicit consent really need is a way to identify only the cookies that are issued without automatically triggering any user consent actions. In order to address this use case, QualysGuard WAS has implemented a new test (QID 150099), which avoids the most common user consent techniques while gathering cookies from the web site. In doing so, QualysGuard WAS provides organizations with information about the cookies they are issuing without the user’s consent.
Steps to Identify Cookies Issued Without User Consent
The best thing about the new test is that WAS includes it as a standard check during all scans run on or after 26 May, 2012. So organizations using WAS do not need to alter their scan configuration to take advantage of the new test.
To view the cookies issued without user consent in QualysGuard WAS, follow these steps:
1. Log into QualysGuard WAS v2 and navigate to the “Scans” tab.
2. Use the filter panel date selection on the lower left to filter scans to those run on or after 5/26/2012.
3. Select the scan and, using the quick action menu, choose “View Report”.
4. Once the report is open, select the “Results” tab.
5. In the filter panel on the left, check the box next to the ‘1’ in the “Information Gathered Levels” section.
6. You should see a number of results that are level 1 Information Gathered (IG) items – click on the one with QID 150099. This will show you the cookies that were identified as being issued without any user consent.
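For a quick cross-check outside of WAS, the same idea can be sketched in a few lines of Python: take the Set-Cookie headers from the very first response, before any consent link or button has been triggered, and list what they set. This is only an illustration and not how QID 150099 is implemented; the function name and the sample header values are hypothetical.

```python
from http.cookies import SimpleCookie


def cookies_set_without_consent(set_cookie_headers):
    """Return {cookie name: attributes} parsed from raw Set-Cookie header values.

    The headers are assumed to come from the first page load only, before
    any consent action was triggered.
    """
    found = {}
    for raw in set_cookie_headers:
        jar = SimpleCookie()
        jar.load(raw)
        for name, morsel in jar.items():
            found[name] = {
                "value": morsel.value,
                "domain": morsel["domain"],
                # A cookie with Expires or Max-Age outlives the browser session.
                "persistent": bool(morsel["expires"] or morsel["max-age"]),
            }
    return found


# Hypothetical headers captured from a first, consent-free request.
sample = [
    "sessionid=abc123; Path=/; HttpOnly",
    "_tracker=xyz; Domain=.ads.example.com; Max-Age=31536000; Path=/",
]
for name, info in cookies_set_without_consent(sample).items():
    print(name, info)
```

Persistent cookies scoped to a third-party domain are usually the first ones worth reviewing against the regulation.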
Compliance with the Regulation
If you identify any cookies that are subject to the regulation that require user consent – the next step is to set up specific user consent prompts within the web site such that the user can make an informed decision whether to accept or reject the cookies that are being set. The implementation of this will usually take the form of a pop up dialog or prompt, but the actual implementation details will vary based on the organization.
Merchants are getting ready for the upcoming changes to the internal scanning requirements for PCI compliance. This blog post provides a checklist on what you should have ready and will review some of the tools Qualys provides for these requirements.
There are four core areas to focus on in preparation for your compliance to PCI 11.2, taking into account the changes from PCI 6.2 regarding risk ranking of vulnerabilities.
Your documented PCI scope (cardholder data environment)
Your documented risk ranking process
Your scanning tools
Your scan reports
Merchants will need to complete each of these elements to be prepared to pass PCI compliance.
1. Your documented PCI scope (cardholder data environment)
All PCI requirements revolve around a cross-section of assets in your IT infrastructure that is directly involved in storage, processing, or transmitting payment card information. These IT assets are known as the cardholder data environment (CDE), and are the focus areas of the PCI DSS requirements.
These assets can exist in internal or external (public) networks and may be subject to different requirements based on what role they play in payment processing. These assets can be servers, routers, switches, workstations, databases, virtual machines or web applications; PCI refers to these assets as system components.
QualysGuard provides a capability to tag assets under management. The screenshot below shows an example of PCI scope being defined within the QualysGuard Asset Tagging module. It provides the ability to group internal assets (for 11.2.1), external assets (for 11.2.2), and both internal and external assets together (for 11.2.3).
This allows you to maintain documentation of your CDE directly, and to drive your scanning directly from your scope definition.
2. Your documented risk ranking process
This is the primary requirement associated with the June 30th deadline; this is the reference that should allow someone to reproduce your risk rankings for specific vulnerabilities.
The requirement references industry best practices, among other details, to consider in developing your risk ranking. It may help you to quickly adopt a common industry best practice and adapt it to your own environment. Two examples are the Qualys severity rating system, which is the default rating as per the security research team at Qualys; or, the PCI ASV Program Guide, which includes a rating system used by scanning vendors to complete external scanning. QualysGuard is used by 50 of the Forbes Global 100, and spans all market verticals; it qualifies as an industry best practice. Additionally, the QualysGuard platform is used by the majority of PCI Approved Scanning Vendors and already delivers rankings within the PCI ASV Program Guide practices.
The core rules of your risk rankings should take into account CVSS Base Scores, available from nearly all security intelligence feeds. These scores are also the base system used within the PCI ASV Program Guide. Your process should also account for system components in your cardholder data environment and vendor-provided criticality rankings, such as the Microsoft patch ranking system if your CDE includes Windows-based system components.
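A documented ranking rule is easiest to reproduce when it can be written down as a small function. The Python sketch below assumes CVSS base score bands modeled on the PCI ASV Program Guide (7.0 and above is high, 4.0 to 6.9 is medium, below 4.0 is low) and a hypothetical vendor_rating input that escalates vendor-critical issues; both are assumptions to adapt to your own process, not a prescribed method.

```python
def rank_vulnerability(cvss_base, vendor_rating=None):
    """Map a CVSS base score (0.0-10.0) to a 'high'/'medium'/'low' ranking.

    The bands are an assumption modeled on the PCI ASV Program Guide; a
    vendor rating of 'critical' (hypothetical input) escalates to 'high'.
    """
    if not 0.0 <= cvss_base <= 10.0:
        raise ValueError("CVSS base score must be between 0.0 and 10.0")
    if cvss_base >= 7.0:
        rank = "high"
    elif cvss_base >= 4.0:
        rank = "medium"
    else:
        rank = "low"
    if vendor_rating == "critical":
        rank = "high"
    return rank


print(rank_vulnerability(9.3))
print(rank_vulnerability(5.0))
print(rank_vulnerability(5.0, vendor_rating="critical"))
```

Keeping the rule this explicit is what allows someone else to reproduce your rankings for a specific vulnerability.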
The process should include documentation that details the sources of security information you follow, how frequently you review the feeds, and how you respond to new information in the feeds. QualysGuard provides daily updates to the vulnerability knowledgebase and now offers a Zero-Day Analyzer service, which leverages data from the iDefense security intelligence feed.
3. Your scanning tools
After you have your scope clearly defined and you have your process for ranking vulnerabilities documented, you will need to be able to run vulnerability scans. This includes internal VM scans, external VM scans, PCI ASV scans (external), internal web application scans and external web application scans. It is the findings in these scans that will map against your risk ranking process and allow you to produce the necessary scan reports.
You will need to be able to configure your scanning tools to check for “high” vulnerabilities, which will allow you to allocate resources to fix and resolve these issues as part of the normal vulnerability management program and workflow within your environment.
QualysGuard VM, QualysGuard WAS and QualysGuard PCI all work together seamlessly to provide each of these scan capabilities against the same group of assets that represent your PCI scope or CDE.
4. Your scan reports
You will want to produce reports for your internal PCI scope, as defined in #1 of this checklist, both quarterly and after any significant changes. If you have regular releases or updates to your IT infrastructure, you will want to have scan reports from those updates and upgrades. Quarterly scan reports need to be spaced apart by 90 days. In all cases, these reports need to show that there are no “high” vulnerabilities detected by your scanning tools.
Each report for the significant change events will also need to include external PCI scope. QualysGuard VM makes it easy to include both internal and external assets in the same report. QualysGuard VM also provides a direct link to your QualysGuard PCI merchant account for automation of your PCI ASV scan requirements.
QualysGuard WAS allows you to quickly meet your production web application scanning requirement (PCI 6.6) as well as internal web application scanning as part of your software development lifecycle (SDLC), by scanning your applications in development and in test.
If you follow these guidelines you will be well prepared to perform and maintain the required controls for PCI 11.2.
Imagine a line at a fast food restaurant that serves two types of burgers, and a customer at the cashier is stuck for a while deciding what he wants to order, making the rest of the line anxious and slowing down the business. Now imagine a line at the same restaurant, but with a sign saying “think ahead of your order,” which is supposed to speed things up. But now the customer orders hundreds of burgers, pays, and the line is stuck again, because he can take only 5 burgers at a time to his car, making the sign ineffective.
While developing the slowhttptest tool, I thought about this burger scenario, and became curious about how HTTP servers react to slow consumption of their responses. There are so many conversations about slowing down requests, but none of them cover slow responses. After spending a couple of evenings implementing proof-of-concept code, I pointed it to my so-many-times-tortured Apache server and, surprisingly, got a denial of service as easily as I got it with slowloris and slow POST.
Let me remind you what slowloris and slow POST are aiming to do: A Web server keeps its active connections in a relatively small concurrent connection pool, and the above-mentioned attacks try to tie up all the connections in that pool with slow requests, thus causing the server to reject legitimate requests, as in the first restaurant scenario.
The idea of the attack I implemented is pretty simple: Bypass policies that filter slow-deciding customers, send a legitimate HTTP request and read the response slowly, aiming to keep as many connections as possible active. Sounds too easy to be true, right?
Crafting a Slow Read
Let’s start with a simple case, and send a legitimate HTTP request for a resource without reading the server’s response from the kernel receive buffer.
We craft a request like the following:
GET /img/delivery.png HTTP/1.1
User-Agent: Opera/9.80 (Macintosh; Intel Mac OS X 10.7.0; U; Edition MacAppStore; en) Presto/2.9.168 Version/11.50
And the server replies with something like this:
HTTP/1.1 200 OK
Date: Mon, 19 Dec 2011 00:12:28 GMT
Last-Modified: Thu, 08 Dec 2011 15:29:54 GMT
For those who don’t feel like reading tcpdump’s output: We established a connection; sent the request; received the response through several TCP packets sized 1448 bytes because of Maximum Segment Size that the underlying communication channel supports; and finally, 5 seconds later, we received the TCP packet with the FIN flag.
Everything seems normal and expected. The server handed the data to its kernel level send buffer, and the TCP/IP stack took care of the rest. At the client, even while the application had not read yet from its kernel level receive buffer, all the transactions were completed on the network layer.
What if we try to make the client’s receive buffer very small?
We sent the same HTTP request and server produced the same HTTP response, but tcpdump produced much more interesting results:
In the initial SYN packet, the client advertised its receive window size as 28 bytes. The server sends the first 28 bytes to the client and that’s it! The server keeps polling the client for space available at progressive intervals until it reaches a 2-minute interval, and then keeps polling at that interval, but keeps receiving win 0.
This is already promising: if we can prolong the connection lifetime for several minutes, it’s not that bad. And we can have more fun with thousands of connections! But fun did not happen. Let’s see why: Once the server received the request and generated the response, it sends the data to the socket, which is supposed to deliver it to the end user. If the data can fit into the server socket’s send buffer, the server hands the entire data to the kernel and forgets about it. That’s what happened with our last test.
What if we make the server keep polling the socket for write readiness? We get exactly what we wanted: Denial of Service.
Let’s summarize the prerequisites for the DoS:
We need to know the server’s send buffer size and then define a smaller-sized client receive buffer. TCP doesn’t advertise the server’s send buffer size, but we can assume that it is the default value, which is usually between 65Kb and 128Kb. There’s normally no need to have a send buffer larger than that.
We need to make the server generate a response that is larger than the send buffer. With reports indicating the Average Web Page Approaches 1MB, that should be fairly easy. Load the main page of the victim’s Web site in your favorite WebKit-based browser like Chrome or Safari and pick the largest resource in Web Inspector.
If there are no sufficiently large resources on the server, but it supports HTTP pipelining, which many Web servers do, then we can multiply the size of the response to fill up the server’s send buffer as much as we need by re-requesting same resource several times using the same connection.
For example, here’s a screenshot of mod_status on Apache under attack:
As you can see, all connections are in the WRITE state with 0 idle workers.
Here is the chart generated by the latest release of slowhttptest with Slow Read attack support:
Even though the TCP/IP stack shouldn’t make decisions on resetting alive and responsive connections, and it is the “userland” application’s responsibility to do so, I assume that some TCP/IP implementations or firewalls might have timers to track connections that cannot send data for some time. To avoid triggering such decisions, slowhttptest can read data from the local receive buffer very slowly to make the TCP/IP stack reply with ACKs with window size other than 0, thus ensuring some physical data flow from server to client.
While I was implementing the attack, I contacted Ivan Ristic to get his opinion and suggestions. I suspected there would be attacks exploiting zero or small windows, but I thought I was the first one to apply the idea to tie up an HTTP server. I was surprised when Ivan replied with links to sockstress by Outpost24 and Nkiller2, which exploit the infiniteness of the Persist Timer and already cover most of the aspects I wanted to describe. However, the above-mentioned techniques craft TCP packets and use raw sockets, whereas slowhttptest uses only the TCP sockets API to achieve almost the same functionality.
We still think it’s worthwhile to have a configurable tool to help people focus and design defense mechanisms, since this vulnerability still exists on many systems three years after it was first discovered, and I consider Slow Read DoS attacks to be even lower profile and harder to detect than slowloris and slow POST attacks.
All servers I observed (Apache, nginx, lighttpd, IIS 7.5) are vulnerable in their default configuration.
The fundamental problem here is how servers are handling write readiness for active sockets.
The best protection would be:
Do not accept connections with abnormally small advertised window sizes
Do not enable persistent connections and HTTP pipelining unless performance really benefits from it
Limit the absolute connection lifetime to some reasonable value
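The first recommendation can be sketched in Python as a check on the 16-bit window field of a raw TCP header. This is a minimal illustration of the test itself, not a filter you can deploy as is: capturing packets is left out, and the min_window threshold and the sample header are hypothetical.

```python
import struct


def advertised_window(tcp_header):
    """Return the 16-bit window field from a raw TCP header (bytes 14-15)."""
    if len(tcp_header) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    (window,) = struct.unpack("!H", tcp_header[14:16])
    return window


def is_abnormally_small(tcp_header, min_window=512):
    """`min_window` is a hypothetical threshold; pick one that fits your traffic."""
    return advertised_window(tcp_header) < min_window


# Hypothetical SYN header: ports 54321 -> 80, seq 1, ack 0, data offset 5,
# SYN flag set, advertised window of 28 bytes, zero checksum and urgent pointer.
syn = struct.pack("!HHIIBBHHH", 54321, 80, 1, 0, 5 << 4, 0x02, 28, 0, 0)
print(advertised_window(syn), is_abnormally_small(syn))
```

In practice this kind of check belongs in a firewall or load balancer in front of the Web server rather than in the application.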
Some servers have built-in protection, which is turned off by default. For example, lighttpd has the server.max-write-idle option to specify maximum number of seconds until a waiting write call times out and closes the connection.
Apache is vulnerable in its default configuration. MPM Event, for example, handles slow requests and responses significantly better than other modules, but it falls back to worker MPM behavior for SSL connections. ModSecurity supports attributes to control how long a socket can remain in read or write state.
There is a handy script called “Flying frog” available, written by Christian Folini, an expert in Application Layer DoS attacks detection. Flying frog is a monitoring agent that hovers over the incoming traffic and the application log. It picks individual attackers, like a frog eats a mosquito.
Slow HTTP attacks are denial-of-service (DoS) attacks in which the attacker sends HTTP requests in pieces slowly, one at a time to a Web server. If an HTTP request is not complete, or if the transfer rate is very low, the server keeps its resources busy waiting for the rest of the data. When the server’s concurrent connection pool reaches its maximum, this creates a DoS. Slow HTTP attacks are easy to execute because they require only minimal resources from the attacker.
In this article, I describe several simple steps to protect against slow HTTP attacks and to make the attacks more difficult to execute.
To protect your Web server against slow HTTP attacks, I recommend the following:
Reject / drop connections with HTTP methods (verbs) not supported by the URL.
Limit the header and message body to a minimal reasonable length. Set tighter URL-specific limits as appropriate for every resource that accepts a message body.
Set an absolute connection timeout, if possible. Of course, if the timeout is too short, you risk dropping legitimate slow connections; and if it’s too long, you don’t get any protection from attacks. I recommend a timeout value based on your connection length statistics, e.g. a timeout slightly greater than median lifetime of connections should satisfy most of the legitimate clients.
The backlog of pending connections allows the server to hold connections it’s not ready to accept, and this allows it to withstand a larger slow HTTP attack, as well as gives legitimate users a chance to be served under high load. However, a large backlog also prolongs the attack, since it backlogs all connection requests regardless of whether they’re legitimate. If the server supports a backlog, I recommend making it reasonably large so your HTTP server can handle a small attack.
Define the minimum incoming data rate, and drop connections that are slower than that rate. Care must be taken not to set the minimum too high, or you risk dropping legitimate connections.
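Two of the recommendations above, the timeout derived from connection statistics and the minimum data rate, can be sketched in Python as follows. The function names, the margin factor, and the sample numbers are hypothetical and only illustrate the calculations.

```python
import statistics


def suggest_timeout(lifetimes_seconds, margin=1.2):
    """Suggest an absolute connection timeout slightly above the median lifetime.

    `margin` is a hypothetical tuning factor, not a recommended value.
    """
    return statistics.median(lifetimes_seconds) * margin


def is_too_slow(bytes_received, elapsed_seconds, min_bytes_per_second):
    """Return True when a connection's average incoming rate is below the minimum."""
    if elapsed_seconds <= 0:
        return False  # nothing to measure yet
    return bytes_received / elapsed_seconds < min_bytes_per_second


# Hypothetical connection lifetimes (seconds) taken from server statistics.
lifetimes = [0.4, 0.9, 1.2, 2.5, 30.0]
print(suggest_timeout(lifetimes))
print(is_too_slow(bytes_received=200, elapsed_seconds=10, min_bytes_per_second=500))
```

Using the median rather than the mean keeps a handful of very long connections from inflating the suggested timeout.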
Applying the above steps to the HTTP servers tested in the previous article indicates the following server-specific settings:
Using the <Limit> and <LimitExcept> directives to drop requests with methods not supported by the URL alone won’t help, because Apache waits for the entire request to complete before applying these directives. Therefore, use these parameters in conjunction with the LimitRequestFields, LimitRequestFieldSize, LimitRequestBody, LimitRequestLine, LimitXMLRequestBody directives as appropriate. For example, it is unlikely that your web app requires an 8190 byte header, or an unlimited body size, or 100 headers per request, as most default configurations have.
Set reasonable TimeOut and KeepAliveTimeout directive values. The default value of 300 seconds for TimeOut is overkill for most situations.
ListenBackLog’s default value of 511 could be increased, which is helpful when the server can’t accept connections fast enough.
Increase the MaxRequestWorkers directive to allow the server to handle the maximum number of simultaneous connections.
Adjust the AcceptFilter directive, which is supported on FreeBSD and Linux, and enables operating system specific optimizations for a listening socket by protocol type. For example, the httpready Accept Filter buffers entire HTTP requests at the kernel level.
A number of Apache modules are available to minimize the threat of slow HTTP attacks. For example, mod_reqtimeout’s RequestReadTimeout directive helps to control slow connections by setting timeout and minimum data rate for receiving requests.
I also recommend switching apache2 to experimental Event MPM mode where available. This uses a dedicated thread to handle the listening sockets and all sockets that are in a Keep Alive state, which means incomplete connections use fewer resources while being polled.
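Put together, the Apache recommendations above might look like the following configuration fragment. Every value here is an illustrative assumption rather than a recommendation; derive real limits from what your own application needs.

```apache
# Illustrative values only; tune them to your own traffic.

# Tighter request limits than the defaults.
LimitRequestFields 50
LimitRequestFieldSize 4094
LimitRequestLine 4094
LimitRequestBody 1048576

# Shorter timeouts than the 300-second default.
TimeOut 60
KeepAliveTimeout 5

# Larger backlog of pending connections.
ListenBackLog 1024

# mod_reqtimeout: allow 20 seconds for headers, extended up to 40 seconds
# while data arrives at 500 bytes/second or faster; allow 20 seconds for
# the body, extended for as long as data keeps arriving at that rate.
<IfModule reqtimeout_module>
    RequestReadTimeout header=20-40,MinRate=500 body=20,MinRate=500
</IfModule>
```

Per-URL limits can be tightened further by placing the Limit* directives inside the relevant Location or Directory sections.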
In IIS, limit request attributes through the <requestLimits> element, specifically the maxAllowedContentLength, maxQueryString, and maxUrl attributes.
Set <headerLimits> to configure the type and size of header your web server will accept.
Tune the connectionTimeout, headerWaitTimeout, and minBytesPerSecond attributes of the <limits> and <webLimits> elements to minimize the impact of slow HTTP attacks.
The above are the simplest and most generic countermeasures to minimize the threat. Tuning the Web server configuration is effective to an extent, although there is always a tradeoff between limiting slow HTTP attacks and dropping legitimately slow requests. This means you can never prevent attacks simply using the above techniques.
Beyond configuring the web server, it’s possible to implement other layers of protection like event-driven software load balancers, hardware load balancers to perform delayed binding, and intrusion detection/prevention systems to drop connections with suspicious patterns.
However, today, it probably makes more sense to defend against specific tools rather than slow HTTP attacks in general. Tools have weaknesses that can be identified and exploited when tailoring your protection. For example, slowhttptest doesn’t change the user-agent string once the test has begun, and it requests the same URL in every HTTP request. If a web server receives thousands of connections from the same IP with the same user-agent requesting the same resource within a short period of time, it is a strong hint that something is not legitimate. These kinds of patterns can be gleaned from the log files; therefore, monitoring log files to detect the attack still remains the most effective countermeasure.
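The pattern described above, many requests from one IP with the same user-agent for the same URL in a short time, can be detected with a simple sliding window. The Python sketch below assumes the log has already been parsed into tuples; the function name, the window and threshold defaults, and the sample data are hypothetical.

```python
from collections import defaultdict, deque


def find_suspicious_clients(entries, window_seconds=10, threshold=100):
    """Flag (ip, user_agent, url) triples seen more than `threshold` times
    within any `window_seconds` span.

    `entries` is an iterable of (timestamp_seconds, ip, user_agent, url)
    tuples sorted by timestamp; parsing the log file itself is left out.
    """
    recent = defaultdict(deque)
    flagged = set()
    for ts, ip, user_agent, url in entries:
        key = (ip, user_agent, url)
        times = recent[key]
        times.append(ts)
        # Drop timestamps that fell out of the sliding window.
        while times and ts - times[0] > window_seconds:
            times.popleft()
        if len(times) > threshold:
            flagged.add(key)
    return flagged


# Hypothetical log: one client repeats the same request 500 times in 5 seconds.
log = [(i * 0.01, "203.0.113.7", "TestAgent/1.0", "/index.html") for i in range(500)]
log.append((6.0, "198.51.100.2", "Mozilla/5.0", "/about.html"))
print(find_suspicious_clients(log))
```

The flagged triples can then feed whatever blocking mechanism you already have, such as a firewall rule or a deny list.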
Following the release of the slowhttptest tool, I ran benchmark tests of some popular Web servers. My testing shows that all of the observed Web servers (and probably others) are vulnerable to slow http attacks in their default configurations. Reports generated by the slowhttptest tool illustrate the differences in how the various Web servers handle slow http attacks.
Tests were run against the default, out-of-the-box configurations of the Web servers, which is the best level playing field for comparison. And while most deployments will customize their configuration, they will likely do it for reasons other than improving protection against slow http attacks. Therefore it’s useful to understand how the default configurations relate to slow http attack vulnerability.
In addition to noting that all Web servers tested are vulnerable to slow http attacks, I drew some other generalizations about how different Web servers handle slow http attacks:
Apache, nginx and lighttpd wait for complete headers on any URL for requests without a message body (GET, for example) before issuing a response, even for requests with the verb FAKESLOWVERB.
Apache also waits for the entire body of requests with fake verbs before issuing a response with an error message.
For Apache, nginx and lighttpd, slow requests sent with fake verbs consume resources with the same success rate as requests sent with valid verbs, so the hacker doesn’t even need to bother finding a vulnerable URL.
Server administrators’ scripts typically query for particular expected values like method, or URL, or referer header, etc., but not for fake verbs. That means it is likely that slow http attacks using fake verbs or URLs can go unnoticed by the server administrator.
Each server except IIS is vulnerable to both slow header and slow message body attacks. IIS is vulnerable only to slow message body attacks.
However, there are some interesting differences in the results as well. The screenshots below, which show the graphical output of the slowhttptest tool, demonstrate how connection state changed during the tests, and illustrate how the various Web servers handle slow http attacks.
Apache MPM prefork:
Apache is generally the most vulnerable, and denial of service can be achieved with 355 connections on the system tested. Apache documentation indicates a 300-second timeout for connections, but my testing indicates this is not enforced.
In the test, the server accepted 483 connections and started processing 355 of them. The 355 corresponds to RLIMIT_NPROC (max user processes), a machine-dependent value that is 709 on the machine tested, times MaxClients, whose default value in httpd.conf is 50%: 355 = 709 * 50%.
The rest of the connections were accepted and backlogged. The limit for the backlog is set by the ListenBackLog directive and is 512 in the default httpd.conf, but it is often limited to a smaller number by the operating system (128 in the case of Mac OS X). The 483 connected connections shown in the graph correspond to the 355 being processed plus the 128 in the backlog.
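The arithmetic behind these two numbers can be checked in a few lines. The values come from the test described above; the variable names are mine, and rounding 354.5 up to 355 is an assumption made to match the observed figure.

```python
import math

rlimit_nproc = 709        # max user processes on the machine tested
max_clients_ratio = 0.5   # default MaxClients, expressed as a fraction of RLIMIT_NPROC
os_backlog = 128          # backlog cap imposed by Mac OS X

# Connections Apache actually starts processing (rounded up; an assumption).
processing = math.ceil(rlimit_nproc * max_clients_ratio)

# Connections that appear "connected" from the client side.
connected = processing + os_backlog

print(processing, connected)  # prints: 355 483
```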
One interesting side note: The Apache documentation says that ListenBackLog is the “maximum length of the queue of pending connections”, but a simple test shows that backlogged connections are being accepted, i.e. the server sends SYN-ACK back, and backlogged connections are ready for write operations on the client side, which is not normal behavior for pending connections. Apache documentation is probably using “pending” to mean the internal connection state in Apache, but I rather expect “pending” to refer to the state in the TCP stack, where “pending” means “waiting to be accepted”.
I terminated the test after 240 seconds, but the picture would be the same for a longer test. A properly configured client can keep connections open for hours, until the limit for header count or length is met. With such settings, DoS is achieved with N+1 connections, where N is the number specified by MaxClients.
While testing, I noticed that my httpd.conf had a TimeOut directive set to 300, and Apache 2.0 documentation says that:
The TimeOut directive currently defines the amount of time Apache will wait for three things:
The total amount of time it takes to receive a GET request.
The amount of time between receipt of TCP packets on a POST or PUT request.
The amount of time between ACKs on transmissions of TCP packets in responses.
I understand from the above paragraph that the opened connections should be closed after the TimeOut interval, i.e. 300 seconds in the default configuration. However, I observed that the connection is closed only if there is no data arriving for 300 seconds, which means this is not an effective preventative measure against DoS. This behavior also indicates that if there are some rules defined to throttle down too many connections with a similar pattern in a shorter period of time, they are also ineffective: An attacker can initiate connections with very low connection rate and get the same results, as the connection can be prolonged virtually forever.
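To make the consequence concrete, here is a small model of an idle timer that resets whenever data arrives, which is the behavior I observed. It is a simplified sketch with hypothetical names, not Apache's actual logic: it assumes the client sends one chunk every fixed interval and stops after a given number of chunks.

```python
def connection_lifetime(timeout_s, send_interval_s, max_chunks):
    """Seconds a connection stays open when the idle timer resets on every chunk."""
    if send_interval_s >= timeout_s:
        # The first gap already reaches the idle timeout, so the server closes.
        return timeout_s
    # Every chunk arrives before the timer fires; after the last chunk
    # the timer finally runs out.
    return max_chunks * send_interval_s + timeout_s

# With TimeOut 300 and one chunk every 299 seconds, 100 chunks keep the
# connection open for 100 * 299 + 300 = 30200 seconds (over 8 hours).
print(connection_lifetime(300, 299, 100))
```

The number of chunks is bounded in practice only by the server's limits on header count or length.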
nginx is also vulnerable to slow http attacks, but it offers more controls than Apache.
Surprisingly, the number of initially accepted connections was 377, even with the default settings of worker_connections = 1024, and worker_processes = 1. But worker_rlimit_nofile, the maximum number of open file descriptors per worker that governs the maximum number of connections the worker can accept, has a default value of 377 on Mac OS X.
As shown in the graph above, the server accepts the connections it can accept, and leaves the rest of the connections pending. Due to some hardcoded timeout values, connections are closed after 70 seconds no matter how slowly the data is arriving. Nginx is therefore safer than default Apache, but it still gives an attacker a chance to achieve DoS for 70 seconds.
The default TCP timeout (75), which closes pending connections, is longer than the nginx timeout (65), which closes accepted connections. This means that nginx moves some pending connections to the accepted state after it times out the first set of accepted connections. This extends the length of time that a batch of slow connection requests can tie up the server.
In any case, a client can always re-establish connections every 65 seconds to keep the server under DoS conditions.
Lighttpd with the default configuration is vulnerable to both slow header and slow message body attacks, which are fairly easy to carry out.
The default configuration allows a maximum of 480 connections to be accepted, as can be seen in the graph above, with the rest pending for 200 seconds, and then closed by a timeout. Lighttpd has a useful attribute called server.max-read-idle with default value 60, which closes a connection if no data is received before the timeout interval, but sending something to the socket every 59 seconds would reset it, allowing the attacker to keep connections open for a long time.
Although the lighttpd forums indicate that a fixed issue protects against slow HTTP request handling, it only fixes a waste-of-memory issue, and hitting the limit of concurrently processed connections is still pretty easy.
IIS 7 offers good protection against slow headers, but this protection does not extend to slow message bodies. Because IIS is architected differently from the other Web servers tested, the behavior it displays is also different.
IIS 7 accepts all connections, but does not consider them writeable until headers sections are received in full. Such connections require fewer resources from IIS, and therefore IIS can maintain a relatively larger pool of these connections. In the default configuration, it is not possible to exhaust the pool with a single slowhttptest run, which is limited to 1024 connections on systems I tested on. However, it would be possible to launch multiple instances of slowhttptest to get around this limitation.
For requests with a slow message body, IIS’ protection is useless, as it’s possible to send complete headers sections but then slow down the message body section. In this case, the connections are transferred to IIS’ internal processing queue, which is limited in size to, don’t be surprised, 100 connections. Even though the screenshot shows 1000 connections, I experimentally figured out that 100 requests with slow message body are enough to get DoS.
Software configuration is all about tradeoffs, and it is normal to sacrifice one aspect for another. We see from the test results above that all default configuration files of the Web servers tested are sacrificing protection against slow HTTP DoS attacks in exchange for better handling of connections that are legitimately slow.
Because a lot of people are not aware of slow http attacks, they will tend to trust the default configuration files distributed with the Web servers. It would be great if the vendors creating distribution packages for Web servers would pay attention to handling and minimizing the impact of slow attacks, as much as the Web servers’ configuration allows it. Meanwhile, if you are running a Web server, be careful and always test your setup before relying on it for production use.
Slow HTTP attacks are denial-of-service (DoS) attacks that rely on the fact that the HTTP protocol, by design, requires a request to be completely received by the server before it is processed. If an HTTP request is not complete, or if the transfer rate is very low, the server keeps its resources busy waiting for the rest of the data. When the server’s concurrent connection pool reaches its maximum, this creates a denial of service. These attacks are problematic because they are easy to execute, i.e. they can be executed with minimal resources from the attacking machine.
Inspired by Robert “Rsnake” Hansen’s Slowloris and Tom Brennan’s OWASP slow post tools, I started developing another open-source tool, called slowhttptest, available with full documentation at https://github.com/shekyan/slowhttptest. Slowhttptest opens and maintains customizable slow connections to a target server, giving you a picture of the server’s limitations and weaknesses. It includes features of both of the above tools, plus some additional configurable parameters and nicely formatted output.
Slowhttptest is configurable to allow users to test different types of slow http scenarios. Supported features are:
slowing down either the header or the body section of the request
any HTTP verb can be used in the request
configurable Content-Length header
random size of follow-up chunks, limited by optional value
random header names and values
random message body data
configurable interval between follow-up data chunks
support for SSL
support for hosts names resolved to IPv6
verbosity levels in reporting
connection state change tracking
variable connection rate
detailed statistics available in CSV format and as a chart generated as HTML file using Google Chart Tools
How to Use
The tool works out of the box with default parameters, which are harmless and most likely will not cause a denial of service.
and the test begins with the default parameters.
Depending on which test mode you choose, the tool will send either slow headers:
GET / HTTP/1.1CRLF
Host: localhost:80 CRLF
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; SLCC2)CRLF
. n seconds
. n seconds
. n seconds
or slow message bodies:
POST / HTTP/1.1CRLF
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:5.0.1) Gecko/20100101
. n seconds
. n seconds
Repeated until the server closes the connection or the test hits the specified time limit.
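For readers who want to see the shape of this traffic in code, the sketch below builds the byte strings for both modes without opening any sockets. It is an illustration only and not how slowhttptest itself is written; the function names, the "X-" header prefix, and the placeholder user agent are all hypothetical.

```python
import random
import string

CRLF = "\r\n"

def _random_token(rng, length=8):
    """Random lowercase string used for header names, values, and body data."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))

def slow_header_opening(host):
    """Request line and headers WITHOUT the empty line that ends the headers section."""
    return (f"GET / HTTP/1.1{CRLF}"
            f"Host: {host}{CRLF}"
            f"User-Agent: example-agent{CRLF}").encode("ascii")

def follow_up_header(rng):
    """One more random header line, sent every n seconds to keep the request unfinished."""
    return f"X-{_random_token(rng)}: {_random_token(rng)}{CRLF}".encode("ascii")

def slow_body_opening(host, content_length):
    """Complete headers for a POST that promises `content_length` bytes of body."""
    return (f"POST / HTTP/1.1{CRLF}"
            f"Host: {host}{CRLF}"
            f"Content-Length: {content_length}{CRLF}"
            f"Content-Type: application/x-www-form-urlencoded{CRLF}"
            f"{CRLF}").encode("ascii")

def follow_up_body(rng, size=1):
    """A small piece of the promised body, sent every n seconds."""
    return _random_token(rng, size).encode("ascii")
```

In slow headers mode the server never sees the blank line that ends the headers; in slow body mode it sees complete headers but waits for the rest of the declared Content-Length.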
Depending on the verbosity level selected, the slowhttptest tool logs anything from heartbeat messages every 5 seconds to a full traffic dump. Output is available either in CSV format or in HTML for interactive use with Google Chart Tools.
Note: Care should be taken when using this tool to avoid inadvertently causing denial of service against your servers. For production servers, QualysGuard Web Application Scanner will perform passive (non-intrusive) automated tests that will indicate susceptibility to slow http attacks without the risk of causing denial of service.
Example Test and Results
The HTML screenshot below shows the results of running slowhttptest against a test server in a test lab. In this scenario, the tool opened 1000 connections at a rate of 200 connections per second, and the server was able to concurrently process only 377 connections, leaving the remaining 617 connections pending. Denial of service was achieved within the first 5 seconds of the test, and lasted 60 seconds, until the server timed out some of the active connections. At this point, the server transferred another set of connections from pending state to active state, thus causing DoS again, until the server timed out those connections.
Figure 1: Sample HTML output of slowhttptest results.
As is shown in the above test, the slowhttptest tool can be used to test a variety of different slow http attacks and to understand the effects they will have on specific server configurations. By having a visual representation of the server’s state, it is easy to understand how the server reacts to slow HTTP requests. It is then possible to adjust server configurations as appropriate. In follow-up posts, I will describe some detailed analysis of different HTTP servers’ behavior on slow attacks and mitigation techniques.
Any comments are highly appreciated, and I will review all feature requests posted on the project page at https://github.com/shekyan/slowhttptest. Many thanks to those who are contributing to this project.
Interest in the QualysGuard Web Application Scanning (WAS) module has been growing since its new UI was demonstrated last week at BlackHat. Along with such interest come questions about how the scanner works. The ultimate goal for WAS is to provide accurate, scalable testing for the most common, highest profile vulnerabilities (think of SQL injection and XSS) so that manual testing can skip the tedious and time-consuming aspects of an app review and focus on complex vulns that require brains rather than RAM.
One complex vuln in particular is CSRF. Automated, universal CSRF detection is a tough challenge, which is why we try to solve the problem in pieces rather than all at once. It’s the type of challenge that keeps web scanning interesting. Here’s a brief look at the approach we’ve taken to start bringing CSRF detection into the realm of automation.
First, the test assumes an authenticated scan. If the scan is not given credentials, then the tests won’t be performed. Also, tests are targeted to specific manifestations of CSRF rather than the broad set of attacks possible from our friendly sleeping giant.
Tests roughly follow these steps. Fundamentally, we’re trying to model an attack rather than make inferences based on pattern matching:
1. Identify forms with a "session context". This is a weaker version of (but hopefully a subset of) a "security context", because lots of times security requires knowledge about the boundaries within an app and the authorized actions of a user. This knowledge is hard to come by automatically. Nevertheless, some utility can be had by looking at forms with the following attributes:
Only available to an authenticated user.
Are not "trivial" such as search forms or logout buttons.
Have an observable effect, either on the session or the HTTP response. (Hint: Here’s where the automated scan becomes narrow, meaning prone to false negatives.)
2. Set up two separate sessions for the user (i.e. login twice). Keep their cookie jars apart. We’ll refer to the sessions as Aardvark and Bobcat (or A & B or Alpha & Bravo, etc.). Remember, this is for a single user.
3. Obtain a form for session Aardvark.
4. Obtain a form for session Bobcat.
5. Swap the forms between the two sessions and submit. (Crossing the streams, like Egon told you not to do.)
The assumption is that any CSRF tokens in Aardvark’s form are tied to the session cookie(s) used by Aardvark and Bobcat’s belong to Bobcat. Things should blow up if the tokens and session don’t match.
6. Examine the "swapped" responses.
If the response has a clear indication of error, then the app is more likely to be protected from CSRF. The obvious error is something like, "Invalid CSRF token". Sadly, the world is not unicorns and rainbows for automated scanning and errors may not be so obvious or point so directly to CSRF.
If the response is similar to the one received from the original request, then it appears that the form is not coupled to a user’s session. This is an indicator that the form is more probably vulnerable to CSRF.
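The steps above can be sketched against a toy in-memory app. Everything here is hypothetical (the ToyApp class, its methods, and the response strings stand in for a real site and real HTTP traffic), and comparing responses for exact equality is a simplification of the "similar response" check; it is meant only to show the shape of the swap test, not how WAS implements it.

```python
import secrets

class ToyApp:
    """Hypothetical in-memory app standing in for the site under test."""

    def __init__(self, validate_token):
        self.validate_token = validate_token
        self.tokens = {}  # session id -> CSRF token issued for that session

    def login(self):
        session_id = secrets.token_hex(8)
        self.tokens[session_id] = secrets.token_hex(8)
        return session_id

    def get_form(self, session_id):
        return {"csrf_token": self.tokens[session_id], "email": "user@example.com"}

    def submit(self, session_id, form):
        if self.validate_token and form["csrf_token"] != self.tokens[session_id]:
            return "error: invalid CSRF token"
        return "ok: profile updated"

def swap_test(app):
    """Return True when swapped forms are accepted, i.e. the form looks vulnerable."""
    # Step 2: two separate sessions for the same user.
    aardvark, bobcat = app.login(), app.login()
    # Steps 3 and 4: obtain a form in each session.
    form_a, form_b = app.get_form(aardvark), app.get_form(bobcat)
    baseline = app.submit(aardvark, form_a)
    # Step 5: cross the streams.
    swapped_a = app.submit(aardvark, form_b)
    swapped_b = app.submit(bobcat, form_a)
    # Step 6: responses matching the baseline suggest the form is not tied to a session.
    return swapped_a == baseline and swapped_b == baseline

print(swap_test(ToyApp(validate_token=True)))   # False: swapped tokens are rejected
print(swap_test(ToyApp(validate_token=False)))  # True: the token is never checked
```

Note that the second app still emits a token in its form, which is exactly why looking for token-shaped fields is not enough.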
What it won’t do, because these techniques are noisy and unreliable (as opposed to subtle and quick to anger):
Look for hidden form fields with names or values that match CSRF tokens. If an obvious token is present, that doesn’t mean the app is actually validating it.
Use static inspection of the form, DOM, or HTML to look for any examples of CSRF tokens. Why look for text patterns when you’re trying to determine a behavior? Not everything is solved by regexes. (Which really is unfortunate, by the way.)
Attempt to evaluate the predictability of anything that looks like a CSRF token.
Nor will it demonstrate the compounding factor of CSRF on other vulnerabilities like XSS. That’s something that manual pen-testing should do. In other words, WAS is focused on identifying vulns (it should find an XSS vuln, but it won’t tie the vuln to a CSRF attack to demonstrate a threat). Manual pen-testing more often focuses on how deeply an app can be compromised, and on the real risks associated with it.
What it’ll miss:
Situations where session cookie(s) are static or relatively static for a user. This impairs the "swap" test.
CSRF that can affect unauthenticated users in a meaningful way. This is vague, but as you read more about CSRF you’ll find that some people consider any forgeable action a vuln. This speaks more to the issue of evaluating risk. You should be relying on people to analyze risk, not tools.
CSRF that affects the user’s privacy. This requires knowledge of the app’s policy and the impact of the attack.
Forms whose effect on a user’s security context manifests in a different response, or in a manner that isn’t immediately evident.
CSRF tokens in the header, which might lead to false positives.
CSRF vulns that manifest via links rather than forms. Apps put all kinds of functionality in hrefs rather than explicit form tags.
Other situations where we play games of anecdotes and what-ifs.
What we are trying to do:
Reduce noise. Don’t report vulns for the sake of reporting a vuln if no clear security context or actionable data can be provided.
Provide a discussion point so we can explain the benefits of automated web scanning and point out where manual follow-up will always be necessary.
Learn how real-world web sites implement CSRF in order to find common behaviors that might be detectable via automation. You’d be surprised (maybe) at how often apps have security countermeasures that look nothing like OWASP recommendations and, consequently, fare rather poorly.
Experiment with pushing the bounds of what automation can do, while avoiding hyperbolic claims that automation solves everything.
The current state of CSRF testing in WAS should be relied on as a positive indicator (vuln found, vuln exists) more so than a negative indicator (no vuln found, no vulns exist). That’s supposed to mean that a CSRF vuln reported by WAS should not be a false positive and should be something that the app’s devs need to fix. It also means that if WAS doesn’t find a vuln then the app may still have CSRF vulns. For this particular test a clean report doesn’t mean a clean app; there’re simply too many ways of looking at the CSRF problem to tackle it all at once. We’re trying to break the problem down into manageable parts in order to understand what approaches work. We want to hear your thoughts and feedback on this.
Slow HTTP attacks rely on the fact that the HTTP protocol, by design, requires requests to be completely received by the server before they are processed. If an http request is not complete, or if the transfer rate is very low, the server keeps its resources busy waiting for the rest of the data. If the server keeps too many resources busy, this creates a denial of service.
These types of attack are easy to execute because a single machine is able to establish thousands of connections to a server and generate thousands of unfinished HTTP requests in a very short period of time using minimal bandwidth.
Due to implementation differences among various HTTP servers, two main attack vectors exist:
Slowloris: Slowing down HTTP headers, making the server wait for the final CRLF, which indicates the end of the headers section;
Slow POST: Slowing down the HTTP message body, making the server wait until all content arrives according to the Content-Length header; or until the final CRLF arrives, if HTTP 1.1 is being used and no Content-Length was declared.
The scary part is that these attacks can just look like requests that are taking a long time, so it’s hard to detect and prevent them by using traditional anti-DoS tools. Recent rumors indicate these attacks are happening right now: CIA.gov attacked using slowloris.
QualysGuard Web Application Scanner (WAS) uses a number of approaches to detect vulnerability to these attacks.
To detect a slow headers (a.k.a. Slowloris) attack vulnerability (Qualys ID 150079), WAS opens two connections to the server and requests the base URL provided in the scan configuration.
The request sent to the first connection consists of a request line and one single header line but without the final CRLF, similar to the following:
GET / HTTP/1.1 CRLF
Connection: keep-alive CRLF
The request sent to the second connection looks identical to the first one, but WAS sends a follow-up header line some interval later to make the HTTP server think the peer is still alive:
Currently that interval is approximately 10 seconds plus the average response time during the crawl phase.
WAS considers the server platform vulnerable to a slowloris attack if the server closes the second connection more than 10 seconds later than the first one. In that case, the server prolonged its internal timeout value because it perceived the connection to be slow. Using a similar approach, an attacker could occupy a resource (thread or socket) on that server for virtually forever by sending a byte per T – 1 (or any random value less than T), where T is the timeout after which the server would drop the connection.
WAS does not report the server to be vulnerable if it keeps both connections open for the same long period of time (more than 2 minutes, for example), as that would be a false positive if the target server were IIS (which has protection against slow header attacks, but is less tolerant of real slow connections).
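The decision rule described above can be expressed in a few lines. This is a sketch of the logic as the text states it, not WAS source code; the function name and the representation of results as "seconds until the server closed the connection" are assumptions.

```python
def slowloris_verdict(first_closed_s, second_closed_s, margin_s=10):
    """True when the server kept the refreshed connection open noticeably longer.

    first_closed_s:  seconds until the server closed the untouched connection
    second_closed_s: seconds until it closed the connection that received
                     a follow-up header line
    """
    return (second_closed_s - first_closed_s) > margin_s

print(slowloris_verdict(30, 45))    # True: the follow-up header prolonged the timeout
print(slowloris_verdict(130, 130))  # False: both held equally long, as with IIS
```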
Slow POST Detection
To detect a slow POST (a.k.a. Are-You-Dead-Yet) attack vulnerability (QID 150085), WAS opens two other connections, and uses an action URL of a form it discovered during the crawl phase that doesn’t require authentication.
The request sent to the first connection looks like the following:
POST /url_that_accepts_post HTTP/1.1 CRLF
Host: host_to_test:port_if_not_default CRLF
User-Agent: Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 6.0;) CRLF
Similar to the slow headers approach, WAS sends an identical request to the second connection, and then 10 seconds later sends the following (again without the final CRLF):
WAS considers the target vulnerable if any of the following conditions are met:
The server keeps the second connection open 10 seconds longer than the first one, or
The server keeps both connections open for more than 120 seconds, or
The server doesn’t close both connections within a 5 minute period (as WAS limits slow tests to 5 minutes only).
WAS assumes that if it is possible to either keep the connection open with an unfinished request for longer than 120 seconds or, even better, prolong the unfinished connection by sending a byte per T – 1 (or any random value less than T), then it’s possible to acquire all server sockets or threads within that interval.
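The three conditions listed above combine into a simple check. Again, this is a sketch under assumptions rather than the scanner's code: the function name is hypothetical, and a close time of None is used to mean "still open when the 5 minute test limit was reached".

```python
def slow_post_verdict(first_closed_s, second_closed_s, margin_s=10, long_open_s=120):
    """True when any of the three slow POST conditions from the text is met."""
    # Condition 3: at least one connection was still open at the test limit.
    if first_closed_s is None or second_closed_s is None:
        return True
    # Condition 1: the refreshed connection outlived the first by the margin.
    if second_closed_s - first_closed_s >= margin_s:
        return True
    # Condition 2: both connections were held open for a long time.
    return first_closed_s > long_open_s and second_closed_s > long_open_s

print(slow_post_verdict(20, 35))      # True: second connection was prolonged
print(slow_post_verdict(150, 150))    # True: both open for more than 120 seconds
print(slow_post_verdict(20, 22))      # False: both closed quickly
```

Unlike the slow headers test, holding both connections open for a long time counts as vulnerable here.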
WAS also performs a supplemental test to determine unhealthy behavior in handling POST requests, by sending a simple POST request to the base URI with a relatively large message body (65543 Kbytes). The content of the body is a random set of ASCII characters, and the content type is set to application/x-www-form-urlencoded. WAS assumes that if the server blindly accepts that request, e.g. responds with 200, then it gives an attacker more opportunity to prolong the slow connection by sending one byte per T – 1. Multiplying 65543 by the T – 1 would give you the length of time an attacker could keep that connection open. QID 150086 is reported on detection of that behavior.
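As a worked example of that multiplication, the snippet below follows the text literally and multiplies 65543 by T – 1. The timeout of 300 seconds is an assumption chosen for illustration (it matches the Apache TimeOut value mentioned elsewhere), and the function name is hypothetical. If each unit sent were a single byte of a body measured in Kbytes, the real figure would be larger still.

```python
def max_hold_seconds(body_units, timeout_s):
    """Time an attacker could hold a connection by sending one unit every T - 1 seconds."""
    return body_units * (timeout_s - 1)

seconds = max_hold_seconds(65543, 300)
print(seconds)                    # 19597357
print(round(seconds / 86400, 1))  # about 226.8 days
```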
Tests performed by WAS are passive and as non-intrusive as possible, which minimizes the risk of taking down the server. But because of the possibility of false positives, care should be taken, especially if the HTTP server or IPS (Intrusion Prevention System) is configured to change data processing behavior if a certain number of suspicious requests are detected. If you are interested in active testing, which might take your server down, you can try some active testing using one of these available tools:
Mitigation of slow HTTP attacks is platform specific, so it’d be nice for the community to share mitigation techniques in the comments below. I’ll post an update with information on some of those platforms, as well as general recommendations that can be extrapolated to particular platforms.