Qualys today announced that QualysGuard WAS won the 2013 Information Security™ Magazine (ISM) and SearchSecurity.com™ Reader’s Choice Award for Application Security. The eighth annual Readers’ Choice Awards surveyed Information Security readers, asking for their votes and ratings on top security products in 19 categories, with participants asked to only vote on the products in use in their environments. The QualysGuard WAS cloud service was rated the highest in the Application Security category for helping enterprises identify and remediate issues before attackers can find and exploit them.
Text and strings form the building blocks of web apps. Developers and content creators mix text with other media, code, and HTML to produce all kinds of apps for our browsers. However, when developers mix text with code or they carelessly place strings inside of HTML they expose the app to one of the most common web-related vulns: HTML Injection, a.k.a. Cross-Site Scripting (XSS). One way this happens is when developers use string concatenation to piece together a web page with static HTML and user-supplied data. For example, think of a site’s search function. When you submit a search request, the site responds with something like, "Here are the results for XYZ," and lists whatever might have matched. HTML injection occurs when the search term contains markup instead of simple text, and the app treats it like this:
<span>Here are the results for "<script>alert(9)</script>"</span>
Welcome to a new series of resources to accompany the release of our Web Application Scanning (WAS) 3.0 solution. We’ll cover common questions about the solution and, more importantly, we’ll show how scanning ties into the broader challenge of securing web apps. Web apps are usually the most exposed part of an organization’s network. They serve as everything from a fancy brochure with static data, to interactive services where we connect with others, to apps that collect personal data, to apps that generate millions of dollars in revenue. All of these apps carry risk for an organization. Not only is it important to know how secure an app is, it’s important to know where all of your apps are in the first place.
And there’s a huge number of web sites to secure. Netcraft’s May 2013 internet survey shows that we’re getting close to the 700 million mark of active web sites. Sure, a lot of those sites have spam links, phishing, malware, and stale content; but far more have apps that need to be secure in order to protect our data, our money, and their owner’s networks.
In the last 10 years hackers have routinely taken advantage of web application vulnerabilities to successfully breach organizations. During that time there has also been an increase in the use of malware hosted on websites to infect unsuspecting users. Many times a vulnerability in a website is exploited to setup the malware that a legitimate website may then deliver. The relationship between website vulnerabilities and malware is one that is likely to continue to increase. The Verizon 2012 Data Breach Investigation Report notes: “Out of the 855 incidents this year, 81% leveraged hacking, 69% included malware, and an impressive 61% of all breaches featured a combination of hacking techniques and malware.“ The combination of threats presents organizations with a new challenge when trying to identify both web application vulnerabilities and malware that may exist on their sites.
WAS 3.0 Now Includes Malware Detection
In announcing Qualys WAS 3.0, organizations now have away to address the need to detect not only web application vulnerabilities, but also malware that may infect these same web properties. Qualys has offered the Malware Detection Service as both a freemium solution for individuals and as an Enterprise Edition for larger organizations that need to protect a large number of websites. With WAS 3.0, Qualys has integrated this malware detection capability into the WAS solution to make it easy for organizations to configure both types of scans from WAS. Although the services are integrated, the way the scanning is executed and the threats that are identified by vulnerability scanning and malware detection are very different. Now that we’ve outlined the integration at a high level, lets go into a little more detail about how each type of scanning is performed, and how they are integrated in WAS 3.0.
Advanced Detection Methods
Qualys WAS works by interacting with a web application or website just as a user would, but does so in a fully automated way. The WAS scanner will request pages, login and follow links to new pages, posting forms as it goes. While interacting with the site, WAS injects hostile payloads and then observes how the web application responds. If the web application responds in a given way, WAS can determine that the application is vulnerable to the specific attack used. This is the same type of action that a penetration tester or malicious hacker would use to discover these vulnerabilities, but it is done in an automated way that takes a fraction of the time that it does for a human. This makes the automated scanning far more cost effective.
The Malware Detection Service (MDS) takes a different approach to detection. While MDS also automatically interacts with the site as a user would, navigating and following links, it uses a much different technique to detect malware. MDS primarily uses Behavioral Analysis to identify when a website is infected with malware. MDS actually interacts with websites using an instrumented vulnerable browser hosted on a virtual machine. The virtual machine is created on the fly when the scan is requested. The service then navigates the site automatically, and by observing the behavior of the system the web browser is hosted on via the instrumentation, the service can detect when activities associated with malware take place. So instead of looking for something specific in the content itself, it is instead looking for what the browser and host do when pages are loaded. Behavioral analysis is superior in many ways to traditional malware detection methods because it can identify even zero-day malware that has not yet been analyzed to create a detection signature.
Look for Malware Close to Home
Now that we have discussed how each service works and the differences in their detection techniques, we can discuss some other differences in web application vulnerability testing and malware detection. While web application vulnerabilities can be expected in both internal and external web applications, website malware is typically found only on Internet facing web applications. This is because malware distributors want to infect as many users as possible, so hosting malware on sites that have the most exposure is the most effective approach. Malware is also typically not likely to be distributed evenly on websites. Unlike vulnerabilities, which are usually distributed throughout a web application because they are unintentional, malware is usually found on pages that are closer to the home page for the site. This is also to take advantage of the traffic patterns of users that will typically be heavier in the pages that are the fewest clicks away from the default for the site. So while WAS will scan a large number of pages to ensure it is testing all the relevant functions thatmay be vulnerable, MDS will seek to test just the unauthenticated pages that are closest to the default page for a site, where malware is most likely to be found.
Another difference between the services that is important to note is the impact. While WAS is actively testing websites by injecting payloads and therefore usually requires advanced notice or scheduling in a maintenance window for production sites, the MDS service is purely observational and does not have any more impact on a site than a normal user would. This allows MDS to be run more often and without the notification that is typically required for running a vulnerability scan.
To bring these two services together, Qualys has added the ability to configure malware scanning for external web applications that are licensed in Qualys WAS. When an Internet accessible web application is being setup in WAS, the user will now have the option to indicate that they would like it to be scanned for malware. This will setup a daily scheduled malware scan for the site. If any malware is discovered the MDS service notifies the subscription owner with an email outlining the issue. Users can login to Qualys to see more details if needed. So now organizations have an easy way to combine Web Application Scanning and Malware Detection to ensure that their Internet facing websites are free from web application vulnerabilities and malware.
Burp Suite Professional Integration
Last but not least, Qualys WAS 3.0 also introduces an integration with an attack proxy tool used primarily to conduct targeted, manual application penetration and validation testing. While WAS is designed to be fully automated and scalable and is appropriate for the security testing requirements of the majority of applications, there are some web applications that require higher levels of assurance. These applications typically require both automated scanning as well as manual penetration testing activities. In most cases, web application penetration testing is primarily conducted with the use of attack proxy tools. Attack proxies offer a high level of control for skilled users and help to facilitate deep testing when it is required. Recognizing that there is a place for both automated scanning and manual testing, Qualys is moving to combine the best of both approaches by integrating attack proxy tools, enabling customers to benefit from highly automated and scalable scanning while at the same time having access to manual tools when additional exploitation or risk evaluation is required.
Qualys WAS 3.0 takes the first step in this evolution by integrating the scan results from Burp Suite Professional (BSP). BSP is a tool that combines interactive testing capabilities with scanning. Testers who use BSP can scan individual pages as they navigate a web application and discover vulnerabilities as they do so. But BSP is primarily intended for use by a single user. There is no centralized storage for results that would allow them to be shared by multiple users, or to be tracked and trended over time. Qualys seeks to overcome this limitation by providing a way to import the BSP scanner findings. This gives organizations a way to store the findings discovered in the BSP scanner with those discovered by Qualys WAS.
UPDATE: Qualys has released a Burp extension for WAS to make the integration even more seamless and easy to use.
As you can tell, Qualys WAS 3.0 is a major release with a lot of new capabilities that will help organizations better combat the growing threats to their web applications. We’re excited by the new WAS 3.0 and we look forward to getting it into your hands.
On May 26th 2011, a new EU directive was adopted that requires web sites to gain consent from visitors before they can store cookies or other information used to track a user’s actions. While the EU Cookie legislation went into effect last year, the UK’s Information Commissioner’s Office (ICO) set May 26th of 2012 as the enforcement date. The ICO is the body responsible for enforcing the UK regulation, with authority to levy fines on web site owners up to £500,000.
A Better Way to Identify Cookies
In order to comply with the EU regulations and avoid the UK ICO fines, organizations need to understand what cookies their web sites are issuing and the conditions in which they are issued. Most web application scanning solutions will report the cookies that a web site is issuing. This includes cookies that may be issued by 3rd party sites that have embedded content commonly used to track users for marketing purposes. QualysGuard WAS has provided this information for some time via the Information Gathered (IG) QID 150028 (Cookies Collected). However, the way that the cookies are collected for QID 150028 as well as the way other web application scanners gather cookies may lead to the inclusion of cookies that were issued after the scanner automatically triggered the explicit user consent action. This is because web application scanners typically follow all links, including those that are most commonly used to obtain user consent. What organizations that wish to to gain a user’s explicit consent really need is a way to identify only the cookies that are issued without automatically issuing any user consent actions. In order to address this use case, QualysGuard WAS has implemented a new test (QID 150099), which avoids the most common user consent techniques while gathering cookies from the web site. In doing so, QualysGuard WAS provides organizations with information about the cookies they are issuing without the user’s consent.
Steps to Identify Cookies Issued Without User Consent
The best thing about the new test is that WAS includes it as a standard check during all scans run on or after 26 May, 2012. So organizations using WAS do not need to alter their scan configuration to take advantage of the new test.
To view the cookies issued without user consent in QualysGuard WAS, follow these steps:
1. Log into QualysGuard WAS v2 and navigate to the “Scans” tab.
2. Use the filter panel date selection on the lower left to filter scans to those run on or after 5/26/2012
3. Select the scan and using the quick action menu, choose “View Report”
4. Once the report is open, select the “Results” tab
5. In the filter panel on the left,check the box next to the ‘1’ in the “Information Gathered Levels” section
6. You should see a number of results that are level 1 Information Gathered (IG) items – click on the one with QID 150099. This will show you the cookies that were identified as being issued without any user consent.
Compliance with the Regulation
If you identify any cookies that are subject to the regulation that require user consent – the next step is to set up specific user consent prompts within the web site such that the user can make an informed decision whether to accept or reject the cookies that are being set. The implementation of this will usually take the form of a pop up dialog or prompt, but the actual implementation details will vary based on the organization.
See more information on the new EU Cookie Law and the UK ISO enforcement.
Your PCI 11.2 Checklist and Toolbox
Merchants are getting ready for the upcoming changes to the internal scanning requirements for PCI compliance. This blog post provides a checklist on what you should have ready and will review some of the tools Qualys provides for these requirements.
There are four core areas to focus on in preparation for your compliance to PCI 11.2, taking into account the changes from PCI 6.2 regarding risk ranking of vulnerabilities.
- Your documented PCI scope (cardholder dataenvironment)
- Your documented risk ranking process
- Your scanning tools
- Your scan reports
Merchants will need to complete each of these elements to be prepared to pass PCI compliance.
1. Your documented PCI scope (cardholder data environment)
All PCI requirements revolve around a cross-section of assets in your IT infrastructure that is directly involved in storage, processing, or transmitting payment card information. These IT assets are known as the cardholder data environment (CDE), and are the focus areas of the PCI DSS requirements.
These assets can exist in internal or external (public) networks and may be subject to different requirements based on what role they play in payment processing. These assets can be servers, routers, switches, workstations, databases, virtual machines or web applications; PCI refers to these assets as system components.
QualysGuard provides a capability to tag assets under management. The screenshot below shows an example of PCI scope being defined within the QualysGuard Asset Tagging module. It provides the ability to group internal assets (for 11.2.1), external assets (for 11.2.2), and both internal and external assets together (for 11.2.3).
This allows you to maintain documentation of your CDE directly, and to drive your scanning directly from your scope definition.
2. Your documented risk ranking process
This is the primary requirement associated with the June 30th deadline; this is the reference that should allow someone to reproduce your risk rankings for specific vulnerabilities.
The requirement references industry best practices, among other details, to consider in developing your risk ranking. It may help you to quickly adopt a common industry best practice and adapt it to your own environment. Two examples are the Qualys severity rating system, which is the default rating as per the security research team at Qualys; or, the PCI ASV Program Guide, which includes a rating system used by scanning vendors to complete external scanning. QualysGuard is used by 50 of the Forbes Global 100, and spans all market verticals; it qualifies as an industry best practice. Additionally, the QualysGuard platform is used by the majority of PCI Approved Scanning Vendors and already delivers rankings within the PCI ASV Program Guide practices.
The core rules of your risk rankings should take into account CVSS Base Scores, available from nearly all security intelligence feeds. These scores are also the base system used within the PCI ASV Program Guide. Your process should also account for system components in your cardholder data environment and vendor-provided criticality rankings, such as the Microsoft patch ranking system if your CDE includes Windows-based system components.
The process should include documentation that details the sources of security information you follow, how frequently you review the feeds, and how you respond to new information in the feeds. QualysGuard provides daily updates to the vulnerability knowledgebase and now offers a Zero-Day Analyzer service, which leverages data from the iDefense security intelligence feed.
3. Your scanning tools
After you have your scope clearly defined and you have your process for ranking vulnerabilities documented, you will need to be able to run vulnerability scans. This includes internal VM scans, external VM scans, PCI ASV scans (external), internal web application scans and external web application scans. It is thefindings in these scans that will map against your risk ranking process and allow you to produce the necessary scan reports.
You will need to be able to configure your scanning tools to check for “high” vulnerabilities, which will allow you to allocate resources to fix and resolve these issues as part of the normal vulnerability management program and workflow within your environment.
QualysGuard VM, QualysGuard WAS and QualysGuard PCI all work together seamlessly to provide each of these scans capabilities against the same group of assets that represent your PCI scope or CDE.
4. Your scan reports
You will want to produce reports for your internal PCI scope, as defined in #1 of this checklist, both quarterly and after any significant changes. If you have regular releases or updates to your IT infrastructure, you will want to have scan reports from those updates and upgrades. Quarterly scan reports need to be spaced apart by 90 days. In all cases, these reports need to show that there are no “high” vulnerabilities detected by your scanning tools.
Each report for the significant change events will also need to include external PCI scope. QualysGuard VM makes it easy to include both internal and external assets in the same report. QualysGuard VM also provides a direct link to your QualysGuard PCI merchant account for automation of your PCI ASV scan requirements.
QualysGuard WAS allows you to quickly meet your production web application scanning requirement (PCI 6.6) as well as internal web application scanning as part of your software development lifecycle (SDLC), by scanning your applications in development and in test.
If you follow these guidelines you will be well prepared to perform and maintain the required controls for PCI 11.2.
Imagine a line at a fast food restaurant that serves two types of burgers, and a customer at the cashier is stuck for a while deciding what he wants to order, making the rest of the line anxious, slowing down the business. Now imagine a line at the same restaurant, but with a sign saying “think ahead of your order,” which is supposed to speed things up. But now the customer orders hundreds of burgers, pays, and the line is stuck again, because he can take only 5 burgers at time to his car, making signs ineffective.
While developing the slowhttptest tool, I thought about this burger scenario, and became curious about how HTTP servers react to slow consumption of their responses. There are so many conversations about slowing down requests, but none of them cover slow responses. After spending a couple of evenings implementing proof-of-concept code, I pointed it to my so-many-times-tortured Apache server and, surprisingly, got a denial of service as easily as I got it with slowloris and slow POST.
Let me remind you what slowloris and slow POST are aiming to do: A Web server keeps its active connections in a relatively small concurrent connection pool, and the above-mentioned attacks try to tie up all the connections in that pool with slow requests, thus causing the server to reject legitimate requests, as in first reastaurnt scenario.
The idea of the attack I implemented is pretty simple: Bypass policies that filter slow-deciding customers, send a legitimate HTTP request and read the response slowly, aiming to keep as many connections as possible active. Sounds too easy to be true, right?
Crafting a Slow Read
Let’s start with a simple case, and send a legitimate HTTP request for a resource without reading the server’s response from the kernel receive buffer.
We craft a request like the following:
GET /img/delivery.png HTTP/1.1 Host: victim User-Agent: Opera/9.80 (Macintosh; Intel Mac OS X 10.7.0; U; Edition MacAppStore; en) Presto/2.9.168 Version/11.50 Referer: http://code.google.com/p/slowhttptest/
And the server replies with something like this:
HTTP/1.1 200 OK Date: Mon, 19 Dec 2011 00:12:28 GMT Server: Apache Last-Modified: Thu, 08 Dec 2011 15:29:54 GMT Accept-Ranges: bytes Content-Length: 24523 Content-Type: image/png ?PNG
The simplified tcpdump output looks like this:
09:06:02.088947 IP attacker.63643 > victim.http: Flags [S], seq 3550589098, win 65535, options [mss 1460,nop,wscale 1,nop,nop,TS val 796586772 ecr 0,sackOK,eol], length 0 09:06:02.460622 IP victim.http > attacker.63643: Flags [S.], seq 1257718537, ack 3550589099, win 5792, options [mss 1460,sackOK,TS val 595199695 ecr 796586772,nop,wscale 6], length 0 09:06:02.460682 IP attacker.63643 > victim.http: Flags [.], ack 1, win 33304, length 0 09:06:02.460705 IP attacker.63643 > victim.http: Flags [P.], seq 1:219, ack 1, win 33304, length 218 09:06:02.750771 IP victim.http > attacker.63643: Flags [.], ack 219, win 108, length 0 09:06:02.762162 IP victim.http > attacker.63643: Flags [.], seq 1:1449, ack 219, win 108, length 1448 09:06:02.762766 IP victim.http > attacker.63643: Flags [.], seq 1449:2897, ack 219, win 108, length 1448 09:06:02.762799 IP attacker.63643 > victim.http: Flags [.], ack 2897, win 31856, length 0 ... ... 09:06:03.611022 IP victim.http > attacker.63643: Flags [P.], seq 24617:24738, ack 219, win 108, length 121 09:06:03.611072 IP attacker.63643 > victim.http: Flags [.], ack 24738, win 20935, length 0 09:06:07.757014 IP victim.http > attacker.63643: Flags [F.], seq 24738, ack 219, win 108, length 0 09:06:07.757085 IP attacker.63643 > victim.http: Flags [.], ack 24739, win 20935, length 0 09:09:54.891068 IP attacker.63864 > victim.http: Flags [S], seq 2051163643, win 65535, length 0
For those who don’t feel like reading tcpdump’s output: We established a connection; sent the request; received the response through several TCP packets sized 1448 bytes because of Maximum Segment Size that the underlying communication channel supports; and finally, 5 seconds later, we received the TCP packet with the FIN flag.
Everything seems normal and expected. The server handed the data to its kernel level send buffer, and the TCP/IP stack took care of the rest. At the client, even while the application had not read yet from its kernel level receive buffer, all the transactions were completed on the network layer.
What if we try to make the client’s receive buffer very small?
We sent the same HTTP request and server produced the same HTTP response, but tcpdump produced much more interesting results:
13:37:48.371939 IP attacker.64939 > victim.http: Flags [S], seq 1545687125, win 28, options [mss 1460,nop,wscale 0,nop,nop,TS val 803763521 ecr 0,sackOK,eol], length 0 13:37:48.597488 IP victim.http > attacker.64939: Flags [S.], seq 3546812065, ack 1545687126, win 5792, options [mss 1460,sackOK,TS val 611508957 ecr 803763521,nop,wscale 6], length 0 13:37:48.597542 IP attacker.64939 > victim.http: Flags [.], ack 1, win 28, options [nop,nop,TS val 803763742 ecr 611508957], length 0 13:37:48.597574 IP attacker.64939 > victim.http: Flags [P.], seq 1:236, ack 1, win 28, length 235 13:37:48.820346 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0 13:37:49.896830 IP victim.http > attacker.64939: Flags [P.], seq 1:29, ack 236, win 98, length 28 13:37:49.896901 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0 13:37:51.119826 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0 13:37:51.119889 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0 13:37:55.221629 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0 13:37:55.221649 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0 13:37:59.529502 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0 13:37:59.529573 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0 13:38:07.799075 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0 13:38:07.799142 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0 13:38:24.122070 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0 13:38:24.122133 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0 13:38:56.867099 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0 13:38:56.867157 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0 13:40:01.518180 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0 13:40:01.518222 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0 13:42:01.708150 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0 13:42:01.708210 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0 13:44:01.891431 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0 13:44:01.891502 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0 13:46:02.071285 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0 13:46:02.071347 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0 13:48:02.252999 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0 13:48:02.253074 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0 13:50:02.436965 IP victim.http > attacker.64939: Flags [.], ack 236, win 98, length 0 13:50:02.437010 IP attacker.64939 > victim.http: Flags [.], ack 29, win 0, length 0
In the initial SYN packet, the client advertised its receive window size as 28 bytes. The server sends the first 28 bytes to the client and that’s it! The server keeps polling the client for space available at progressive intervals until it reaches a 2-minute interval, and then keeps polling at that interval, but keeps receiving win 0.
This is already promising: if we can prolong the connection lifetime for several minutes, it’s not that bad. And we can have more fun with thousands of connections! But fun did not happen. Let’s see why: Once the server received the request and generated the response, it sends the data to the socket, which is supposed to deliver it to the end user. If the data can fit into the server socket’s send buffer, the server hands the entire data to the kernel and forgets about it. That’s what happened with our last test.
What if we make the server keep polling the socket for write readiness? We get exactly what we wanted: Denial of Service.
Let’s summarize the prerequisites for the DoS:
- We need to know the server’s send buffer size and then define a smaller-sized client receive buffer. TCP doesn’t advertise the server’s send buffer size, but we can assume that it is the default value, which is usually between 65Kb and 128Kb. There’s normally no need to have a send buffer larger than that.
- We need to make the server generate a response that is larger than the send buffer. With reports indicating the Average Web Page Approaches 1MB, that should be fairly easy. Load the main page of the victim’s Web site in your favorite WebKit-based browser like Chrome or Safari and pick the largest resource in Web Inspector.
If there are no sufficiently large resources on the server, but it supports HTTP pipelining, which many Web servers do, then we can multiply the size of the response to fill up the server’s send buffer as much as we need by re-requesting same resource several times using the same connection.
For example, here’s a screenshot of mod_status on Apache under attack:
As you can see, all connections are in the WRITE state with 0 idle workers.
Here is the chart generated by the latest release of slowhttptest with Slow Read attack support:
Even though the TCP/IP stack shouldn’t make decisions on resetting alive and responsive connections, and it is the “userland” application’s responsibility to do so, I assume that some TCP/IP implementations or firewalls might have timers to track connections that cannot send data for some time. To avoid triggering such decisions, slowhttptest can read data from the local receive buffer very slowly to make the TCP/IP stack reply with ACKs with window size other than 0, thus ensuring some physical data flow from server to client.
While I was implementing the attack, I contacted Ivan Ristic to get his opinion and suggestions. I was suspecting there would be attacks exploiting zero/small window, but I thought I am the first one to apply it to tie up an HTTP server. I was surprised when Ivan replied with links to sockstress by Outpost24 and Nkiller2 exploiting Persist Timer infiniteness that are already covering most aspects I wanted to describe. However, the above mentioned techniques are crafting TCP packets and use raw sockets, whereas slowhttptest uses only the TCP sockets API to achieve almost the same functionality.
We still think it’s worthwhile to have a configurable tool to help people focus and design defense mechanisms, since this vulnerability still exists on many systems three years after it was first discovered, and I consider Slow Read DoS attacks are even lower profile and harder to detect than slowloris and slow POST attacks.
For passive (non-intrusive) detection of vulnerability, the presence of several conditions could be checked:
- The server accepts initial SYN packets with an abnormally small advertised window
- The server doesn’t send RST or FIN for some time (30 seconds should be more than enough), if recipient cannot accept the data
- Persistent connections (keep-alive) and HTTP pipelining are enabled
If all three conditions are met, we can assume server is vulnerable to Slow Read DoS attack. QualysGuard Web Application Scanner (WAS) uses similar approach to discover the vulnerability.
For active detection, I would recommend using slowhttptest version 1.3 and up. See installation and usage examples.
All servers I observed (Apache, nginx, lighttpd, IIS 7.5) are vulnerable in their default configuration.
The fundamental problem here is how servers are handling write readiness for active sockets.
The best protection would be:
- Do not accept connections with abnormally small advertised window sizes
- Do not enable persistent connections and HTTP pipelining unless performance really benefits from it
- Limit the absolute connection lifetime to some reasonable value
Some servers have built-in protection, which is turned off by default. For example, lighttpd has the server.max-write-idle option to specify maximum number of seconds until a waiting write call times out and closes the connection.
Apache is vulnerable in its default configuration, but MPM Event, for example, handles slow requests and responses significantly better than other modules, but falls back to worker MPM behavior for SSL connections. ModSecurity supports attributes to control how long a socket can remain in read or write state.
There is a handy script called “Flying frog” available, written by Christian Folini, an expert in Application Layer DoS attacks detection. Flying frog is a monitoring agent that hovers over the incoming traffic and the application log. It picks individual attackers, like a frog eats a mosquito.
Update, January 6, 2012:
SpiderLabs came up with details on mitigation:
Slow HTTP attacks are denial-of-service (DoS) attacks in which the attacker sends HTTP requests in pieces slowly, one at a time to a Web server. If an HTTP request is not complete, or if the transfer rate is very low, the server keeps its resources busy waiting for the rest of the data. When the server’s concurrent connection pool reaches its maximum, this creates a DoS. Slow HTTP attacks are easy to execute because they require only minimal resources from the attacker.
In this article, I describe several simple steps to protect against slow HTTP attacks and to make the attacks more difficult to execute.
Previous articles in the series cover:
- Identifying Slow HTTP Attack Vulnerabilities on Web Application
- New Open-Source Tool for Slow HTTP DoS Attack Vulnerabilities
- Testing Web Servers for Slow HTTP Attacks
To protect your Web server against slow HTTP attacks, I recommend the following:
- Reject / drop connections with HTTP methods (verbs) not supported by the URL.
- Limit the header and message body to a minimal reasonable length. Set tighter URL-specific limits as appropriate for every resource that accepts a message body.
- Set an absolute connection timeout, if possible. Of course, if the timeout is too short, you risk dropping legitimate slow connections; and if it’s too long, you don’t get any protection from attacks. I recommend a timeout value based on your connection length statistics, e.g. a timeout slightly greater than median lifetime of connections should satisfy most of the legitimate clients.
- The backlog of pending connections allows the server to hold connections it’s not ready to accept, and this allows it to withstand a larger slow HTTP attack, as well as gives legitimate users a chance to be served under high load. However, a large backlog also prolongs the attack, since it backlogs all connection requests regardless of whether they’re legitimate. If the server supports a backlog, I recommend making it reasonably large to so your HTTP server can handle a small attack.
- Define the minimum incoming data rate, and drop connections that are slower than that rate. Care must be taken not to set the minimum too low, or you risk dropping legitimate connections.
Applying the above steps to the HTTP servers tested in the previous article indicates the following server-specific settings:
- Using the <Limit> and <LimitExcept> directives to drop requests with methods not supported by the URL alone won’t help, because Apache waits for the entire request to complete before applying these directives. Therefore, use these parameters in conjunction with the LimitRequestFields, LimitRequestFieldSize, LimitRequestBody, LimitRequestLine, LimitXMLRequestBody directives as appropriate. For example, it is unlikely that your web app requires an 8190 byte header, or an unlimited body size, or 100 headers per request, as most default configurations have.
- Set reasonable TimeOut and KeepAliveTimeOut directive values. The default value of 300 seconds for TimeOut is overkill for most situations.
- ListenBackLog’s default value of 511 could be increased, which is helpful when the server can’t accept connections fast enough.
- Increase the MaxRequestWorkers directive to allow the server to handle the maximum number of simultaneous connections.
- Adjust the AcceptFilter directive, which is supported on FreeBSD and Linux, and enables operating system specific optimizations for a listening socket by protocol type. For example, the httpready Accept Filter buffers entire HTTP requests at the kernel level.
A number of Apache modules are available to minimize the threat of slow HTTP attacks. For example, mod_reqtimeout’s RequestReadTimeout directive helps to control slow connections by setting timeout and minimum data rate for receiving requests.
I also recommend switching apache2 to experimental Event MPM mode where available. This uses a dedicated thread to handle the listening sockets and all sockets that are in a Keep Alive state, which means incomplete connections use fewer resources while being polled.
- Limit accepted verbs by checking the $request_method variable.
- Set reasonably small client_max_body_size, client_body_buffer_size, client_header_buffer_size, large_client_header_buffers, and increase where necessary.
- Set client_body_timeout, client_header_timeout to reasonably low values.
- Consider using HttpLimitReqModule and HttpLimitZoneModule to limit the number of requests or the number of simultaneous connections for a given session, or as a special case, with same address.
- Configure worker_processes and worker_connections based on the number of CPU / cores, content and load. The formula is max_clients = worker_processes * worker_connections.
- Restrict request verbs using the $HTTP["request-method"] field in the configuration file for the core module (available since version 1.4.19).
- Use server.max_request-size to limit the size of the entire request including headers.
- Set server.max-read-idle to a reasonable minimum so that the server closes slow connections. No absolute connection timeout option was found.
- Set connectionTimeout, HeaderWaitTimeout and MaxConnections properties in Metabase to minimize the impact of slow HTTP attacks. Working with Metabase can be complicated, so I recommend Microsoft’s Working with the Metabase reference guide.
- Limit request attributes is through the <RequestLimits> element, specifically the maxAllowedContentLength, maxQueryString, and maxUrl attributes.
- Set <headerLimits> to configure the type and size of header your web server will accept.
- Tune the connectionTimeout, headerWaitTimeout, and minBytesPerSecond attributes of the <limits> and <WebLimits> elements to minimize the impact of slow HTTP attacks.
The above are the simplest and most generic countermeasures to minimize the threat. Tuning the Web server configuration is effective to an extent, although there is always a tradeoff between limiting slow HTTP attacks and dropping legitimately slow requests. This means you can never prevent attacks simply using the above techniques.
Beyond configuring the web server, it’s possible to implement other layers of protection like event-driven software load balancers, hardware load balancers to perform delayed binding, and intrusion detection/prevention systems to drop connections with suspicious patterns.
However, today, it probably makes more sense to defend against specific tools rather than slow HTTP attacks in general. Tools have weaknesses that can be identified and and exploited when tailoring your protection. For example, slowhttptest doesn’t change the user-agent string once the test has begun, and it requests the same URL in every HTTP request. If a web server receives thousands of connections from the same IP with the same user-agent requesting the same resource within short period of time, it obviously hints that something is not legitimate. These kinds of patterns can be gleaned from the log files, therefore monitoring log files to detect the attack still remains the most effective countermeasure.
Following the release of the slowhttptest tool, I ran benchmark tests of some popular Web servers. My testing shows that all of the observed Web servers (and probably others) are vulnerable to slow http attacks in their default configurations. Reports generated by the slowhttptest tool illustrate the differences in how the various Web servers handle slow http attacks.
Tests were run against the default, out-of-the-box configurations of the Web servers, which is the best level playing field for comparison. And while most deployments will customize their configuration, they will likely do it for reasons other than improving protection against slow http attacks. Therefore it’s useful to understand how the default configurations relate to slow http attack vulnerability.
The following HTTP servers were observed:
|Web Server||Operating System|
|Apache 2.2.19 MPM prefork||Mac OS X 10.7 Lion Server|
|nginx 1.0.0||Mac OS X 10.7 Lion Server from MacPorts|
|lighttpd 1.4.28||Ubuntu 11.04|
|IIS 7||Windows 2008 Server|
In addition to noting that all Web servers tested are vulnerable to slow http attacks, I drew some other generalizations about how different Web servers handle slow http attacks:
- Apache, nginx and lighttpd wait for complete headers on any URL for requests without a message body (GET, for example) before issuing a response, even for requests with the verb FAKESLOWVERB.
- Apache also waits for the entire body of requests with fake verbs before issuing a response with an error message.
- For Apache, nginx and lighttpd, slow requests sent with fake verbs consume resources with the same success rate as requests sent with valid verbs, so the hacker doesn’t even need to bother finding a vulnerable URL.
- Server administrators’ scripts typically query for particular expected values like method, or URL, or referer header, etc., but not for fake verbs. That means it is likely that slow http attacks using fake verbs or URLs can go unnoticed by the server administrator.
- Each server except IIS is vulnerable to both slow header and slow message body attacks. IIS is vulnerable only to slow message body attacks.
However, there are some interesting differences in the results as well. The screenshots below, which show the graphical output of the slowhttptest tool, demonstrate how connection state changed during the tests, and illustrate how the various Web servers handle slow http attacks.
Apache MPM prefork:
Apache is generally the most vulnerable, and denial of service can be achieved with 355 connections on the system tested. Apache documentation indicates a 300-second timeout for connections, but my testing indicates this is not enforced.
In the test, the server accepted 483 connections and started processing 355 of them. The 355 corresponds to RLIMIT_NPROC (max user processes), a machine-dependent value that is 709 on the machine tested, times MaxClients, whose default value in httpd.conf is 50%: 355 = 709 * 50%.
The rest of the connections were accepted and backlogged. The limit for backlog is set by the ListenBackLog directive and is 512 in default httpd.conf, but is often limited to a smaller number by operating system, 128 in case of Mac OS X. The 483 connected connections shown in the graph correspond to the 355 being processed plus the 128 in the backlog.
One interesting side note: The Apache documentation says that ListenBackLog is the “maximum length of the queue of pending connections”, but a simple test shows that backlogged connections are being accepted, e.g. server sends SYN-ACK back, and backlogged connections are ready for write operations on the client side, which is not a normal behavior for pending connections. Apache documentation is probably using “pending” to mean the internal connection state in Apache, but I rather expect “pending” to refer to the state in the TCP stack, where “pending” means “waiting to be accepted”.
I terminated the test after 240 seconds, but the picture would be the same for a longer test. A properly configured client can keep connections open for hours, until the limit for headers count or length is met. With such settings, DoS is achieved with N+1 connections, where N is number specified by MaxClients.
While testing, I noticed that my httpd.conf had a TimeOut directive set to 300, and Apache 2.0 documentation says that:
The TimeOut directive currently defines the amount of time Apache will wait for three things:
- The total amount of time it takes to receive a GET request.
- The amount of time between receipt of TCP packets on a POST or PUT request.
- The amount of time between ACKs on transmissions of TCP packets in responses.
I understand from the above paragraph that the opened connections should be closed after the TimeOut interval, i.e. 300 seconds in the default configuration. However, I observed that the connection is closed only if there is no data arriving for 300 seconds, which means this is not an effective preventative measure against DoS. This behavior also indicates that if there are some rules defined to throttle down too many connections with a similar pattern in a shorter period of time, they are also ineffective: An attacker can initiate connections with very low connection rate and get the same results, as the connection can be prolonged virtually forever.
nginx is also vulnerable to slow http attacks, but it offers more controls than Apache.
Surprisingly, the number of initially accepted connections was 377, even with the default settings of worker_connections = 1024, and worker_processes = 1. But worker_rlimit_nofile, the maximum number of open file descriptors per worker that governs the maximum number of connections the worker can accept, has a default value of 377 on Mac OS X.
As shown in the graph above, the server accepts the connections it can accept, and leaves the rest of the connections pending. Due to some hardcoded timeout values, connections are closed after 70 seconds no matter how slow the data is arriving. Nginx is therefore safer than default Apache, but it still gives attacker a chance to achieve DoS for 70 seconds.
The default TCP timeout (75), which closes pending connections, is longer than the nginx timeout (65), which closes accepted connections. This means that nginx moves some pending connections to the accepted state after it times out the first set of accepted connections. This extends the length of time that a batch of slow connection requests can tie up the server.
In any case, a client can always re-establish connections every 65 seconds to keep the server under DoS conditions.
Lighttpd with default configuration is vulnerable to both http attacks, which are fairly easy to carry out.
The default configuration allows a maximum of 480 connections to be accepted, as can be seen in the graph above, with the rest pending for 200 seconds, and then closed by a timeout. Lighttpd has a useful attribute called server.max-read-idle with default value 60, which closes a connection if no data is received before the timeout interval, but sending something to the socket every 59 seconds would reset it, allowing the attacker to keep connections open for a long time.
The lighttpd forums indicate that a fixed issue protects against slow HTTP request handling, it only fixes a waste of memory issue, and hitting the limit of concurrently processing connections is still pretty easy.
IIS 7 offers good protection against slow headers, but this protection does not extend to slow message bodies. Because IIS is architected differently from the other Web servers tested, the behavior it displays is also different.
IIS 7 accepts all connections, but does not consider them writeable until headers sections are received in full. Such connections require fewer resources from IIS, and therefore IIS can maintain a relatively larger pool of these connections. In the default configuration, it is not possible to exhaust the pool with a single slowhttptest run, which is limited to 1024 connections on systems I tested on. However, it would be possible to launch multiple instances of slowhttptest to get around this limitation.
For requests with a slow message body, IIS’ protection is useless, as it’s possible to send complete headers sections but then slow down the message body section. In this case, the connections are transferred to IIS’ internal processing queue, which is limited in size to, don’t be surprised, 100 connections. Even though the screenshot shows 1000 connections, I experimentally figured out that 100 requests with slow message body are enough to get DoS.
Software configuration is all about tradeoffs, and it is normal to sacrifice one aspect for another. We see from the test results above that all default configuration files of the Web servers tested are sacrificing protection against slow HTTP DoS attacks in exchange for better handling of connections that are legitimately slow.
Because a lot of people are not aware of slow http attacks, they will tend to trust the default configuration files distributed with the Web servers. It would be great if the vendors creating distribution packages for Web servers would pay attention to handling and minimizing the impact of slow attacks, as much as the Web servers’ configuration allows it. Meanwhile, if you are running a Web server, be careful and always test your setup before relying on it for production use.
Slow HTTP attacks are denial-of-service (DoS) attacks that rely on the fact that the HTTP protocol, by design, requires a request to be completely received by the server before it is processed. If an HTTP request is not complete, or if the transfer rate is very low, the server keeps its resources busy waiting for the rest of the data. When the server’s concurrent connection pool reaches its maximum, this creates a denial of service. These attacks are problematic because they are easy to execute, i.e. they can be executed with minimal resources from the attacking machine.
Inspired by Robert “Rsnake” Hansen’s Slowloris and Tom Brennan’s OWASP slow post tools, I started developing another open-source tool, called slowhttptest, available with full documentation at https://github.com/shekyan/slowhttptest. Slowhttptest opens and maintains customizable slow connections to a target server, giving you a picture of the server’s limitations and weaknesses. It includes features of both of the above tools, plus some additional configurable parameters and nicely formatted output.
Slowhttptest is configurable to allow users to test different types of slow http scenarios. Supported features are:
- slowing down either the header or the body section of the request
- any HTTP verb can be used in the request
- configurable Content-Length header
- random size of follow-up chunks, limited by optional value
- random header names and values
- random message body data
- configurable interval between follow-up data chunks
- support for SSL
- support for hosts names resolved to IPv6
- verbosity levels in reporting
- connection state change tracking
- variable connection rate
- detailed statistics available in CSV format and as a chart generated as HTML file using Google Chart Tools
How to Use
The tool works out of the box with default parameters, which are harmless and most likely will not cause a denial of service.
and the test begins with the default parameters.
Depending on which test mode you choose, the tool will send either slow headers:
GET / HTTP/1.1CRLF
Host: localhost:80 CRLF
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; SLCC2)CRLF
. n seconds
. n seconds
. n seconds
or slow message bodies:
POST / HTTP/1.1CRLF
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:5.0.1) Gecko/20100101
. n seconds
. n seconds
Repeated until the server closes the connection or the test hits the specified time limit.
Depending on the verbosity level selected, the slowhttptest tool logs anything from heartbeat messages every 5 seconds to a full traffic dump. Output is available either in CSV format or in HTML for interactive use with Google Chart Tools.
Note: Care should be taken when using this tool to avoid inadvertently causing denial of service against your servers. For production servers, QualysGuard Web Application Scanner will perform passive (non-intrusive) automated tests that will indicate susceptibility to slow http attacks without the risk of causing denial of service.
Example Test and Results
The HTML screenshot below shows the results of running slowhttptest against a test server in a test lab. In this scenario, the tool opens 1000 connections with rate of 200 connections per second, and the server was able to concurrently process only 377 connections, leaving the remaining 617 connections pending. Denial of service was achieved within the first 5 seconds of the test, and lasted 60 seconds, until the server timed out some of the active connections. At this point, the server transferred another set of connections from pending state to active state, thus causing DoS again, until the server timed out those connections.
Figure 1: Sample HTML output of slowhttptest results.
As is shown in the above test, the slowhttptest tool can be used to test a variety of different slow http attacks and to understand the effects they will have on specific server configurations. By having a visual representation of the server’s state, it is easy to understand how the server reacts to slow HTTP requests. It is then possible to adjust server configurations as appropriate. In follow-up posts, I will describe some detailed analysis of different HTTP servers’ behavior on slow attacks and mitigation techniques.
Any comments are highly appreciated, and I will review all feature requests posted on the project page at https://github.com/shekyan/slowhttptest. Many thanks to those who are contributing to this project.
Update, August 26, 2011:
Version 1.1. of slowhttptest includes a new test for the Apache range header handling vulnerability, also known as the “Apache Killer” attack. The usage example can be found at https://github.com/shekyan/slowhttptest/wiki/ApacheRangeTest.
Update, January 5, 2012:
Version 1.3 of slowhttptest includes a new test for the Slow Read DoS attack.