Text and strings form the building blocks of web apps. Developers and content creators mix text with other media, code, and HTML to produce all kinds of apps for our browsers. However, when developers mix text with code or carelessly place strings inside HTML, they expose the app to one of the most common web-related vulns: HTML injection, a.k.a. cross-site scripting (XSS). One way this happens is when developers use string concatenation to piece together a web page from static HTML and user-supplied data. For example, think of a site’s search function. When you submit a search request, the site responds with something like, "Here are the results for XYZ," and lists whatever might have matched. HTML injection occurs when the search term contains markup instead of simple text, and the app writes it into the page unchanged, like this:
<span>Here are the results for "<script>alert(9)</script>"</span>
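The difference between concatenating the raw search term and encoding it first can be sketched in a few lines of Python. This is a minimal illustration rather than any particular site's code: the function names `render_unsafe` and `render_escaped` are made up for the example, and `html.escape` from the standard library only covers the case where the string lands in HTML text or a quoted attribute value.

```python
from html import escape

def render_unsafe(term: str) -> str:
    # Plain string concatenation: markup in the search term becomes markup in the page.
    return '<span>Here are the results for "' + term + '"</span>'

def render_escaped(term: str) -> str:
    # html.escape turns &, <, > and quotes into character references,
    # so the browser displays the term as text instead of parsing it.
    return '<span>Here are the results for "' + escape(term, quote=True) + '"</span>'

payload = "<script>alert(9)</script>"
print(render_unsafe(payload))
print(render_escaped(payload))
```

The first line of output reproduces the injected script element; the second shows the same term rendered harmlessly as `&lt;script&gt;alert(9)&lt;/script&gt;`. Strings placed in other contexts (JavaScript, URLs, CSS) need encoding suited to that context.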
Welcome to a new series of resources to accompany the release of our Web Application Scanning (WAS) 3.0 solution. We’ll cover common questions about the solution and, more importantly, we’ll show how scanning ties into the broader challenge of securing web apps. Web apps are usually the most exposed part of an organization’s network. They serve as everything from a fancy brochure with static data, to interactive services where we connect with others, to apps that collect personal data, to apps that generate millions of dollars in revenue. All of these apps carry risk for an organization. Not only is it important to know how secure an app is, it’s important to know where all of your apps are in the first place.
And there’s a huge number of web sites to secure. Netcraft’s May 2013 internet survey shows that we’re getting close to the 700 million mark of active web sites. Sure, a lot of those sites have spam links, phishing, malware, and stale content; but far more have apps that need to be secure in order to protect our data, our money, and their owners’ networks.
In between writing lines of code, I try to twist my fingers into typing more human-friendly output. In this case, it’s a new book on web security called, simply enough, Hacking Web Apps. It explains several web security weaknesses and vulnerabilities, from HTML injection to protecting passwords, to design issues that lead to CSRF, clickjacking and more.
Most well-known web compromises tend to stem from HTML injection (cross-site scripting, or XSS) or SQL injection. After all, those vulns tend to be easy to find and deliver high-impact results like site defacement or stealing millions of passwords. By now, we have tools like sqlmap and BeEF that strip most of the mystery from how these vulns are exploited. Ask someone experienced in web security how easy it is to find XSS and they’ll probably call it child’s play. Check out the OWASP Top 10 and you’ll see XSS detectability rated as easy.
But HTML injection continues to infest sites regardless of their size or sophistication, which seems to imply that its detectability might not be so easy after all. Maybe XSS remains unknown to the huge population of developers building web sites, or maybe the increasing complexity of sites makes security exponentially harder to maintain. Maybe it’s hard to evaluate the tens of millions of sites on the web when there might not even be tens of thousands of people capable of doing it well. At the very least, more education should help.
The book explains how XSS shows up in unexpected places, giving you hints on what to look for in your own site as well as things to consider when coding countermeasures (hint: regular expressions are tough to get right). Even sites with well-informed developers and experienced security teams have this problem.
And those unexpected places? The HTML injection chapter hacked Amazon right from the printed page. A Gutenberg Press Injection attack, if you will.
Then there are hacks like cross-site request forgery (CSRF) and clickjacking that blur the line between tools and manual testing. The search pages for Bing, Google and Yahoo are all, strictly speaking, vulnerable to CSRF. Manual analysis is required to assess the relative risks in such cases and consider whether certain threats are worth addressing. This is the engineering side of security: weighing trade-offs between performance, complexity, threats and risks. Learning about the kinds of design problems that lead to insecure sites helps you avoid them in the future.
Different design problems are covered in the book, as well as the mistakes that happen when good design is betrayed by poor implementation. What if a site lets you apply a discount code multiple times? What if it lets you modify the email recipient for password reset instructions? What if it encourages you to create a long passphrase, but only uses the first eight characters? These sorts of problems are harder, if not impossible, to find with any automated tool. This is why it’s good to stay informed about web security beyond the simple XSS and SQL injection vulns we hear about so often.
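The passphrase question is easy to show in miniature. The sketch below assumes a hypothetical `truncating_digest` function that, by mistake, considers only the first eight characters; SHA-256 is used purely as a stand-in to make the collision visible and is not a recommendation for storing passwords.

```python
import hashlib

def truncating_digest(passphrase: str) -> str:
    # Flawed on purpose: everything after the eighth character is ignored.
    return hashlib.sha256(passphrase[:8].encode("utf-8")).hexdigest()

def full_digest(passphrase: str) -> str:
    # Uses the whole passphrase, so longer inputs actually add strength.
    return hashlib.sha256(passphrase.encode("utf-8")).hexdigest()

a = "correct horse battery staple"
b = "correct wrong guess"

# Both passphrases share the first eight characters ("correct ").
print(truncating_digest(a) == truncating_digest(b))
print(full_digest(a) == full_digest(b))
```

With truncation, the two passphrases are interchangeable at login even though the user was encouraged to pick a long one. Nothing in the page's HTML reveals this; it only shows up by trying it.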
And if you’re still unconvinced about the importance of web security, consider this paragraph from the introduction:
On the web information equals money. Credit cards clearly have value to hackers; underground "carder" sites have popped up that deal in stolen cards; complete with forums, user feedback, and seller ratings. Yet our personal information, passwords, email accounts, on-line game accounts, and so forth all have value to the right buyer, let alone the value we personally place in keeping such things private. Consider the murky realms of economic espionage and state-sponsored network attacks that have popular attention and grand claims, but a scarcity of reliable public information. (Not that it matters to web security that "cyberwar" exists or not; on that topic we care more about WarGames and Wintermute for this book.) It’s possible to map just about any scam, cheat, trick, ruse, and other synonyms from real-world conflict between people, companies, and countries to an analogous attack executed on the web. There’s no lack of motivation for trying to gain illicit access to the wealth of information on the web, whether for glory, country, money, or sheer curiosity.
Hacking Web Apps aims to give you a feeling for how hackers exploit web sites along with examples and details about each vuln’s inner workings. Whether you’re developing a web application, or are just curious how hackers take apart web sites, there should be something in there for you.
Interest in the QualysGuard Web Application Scanning (WAS) module has been growing since its new UI was demonstrated last week at Black Hat. Along with such interest come questions about how the scanner works. The ultimate goal for WAS is to provide accurate, scalable testing for the most common, highest profile vulnerabilities (think of SQL injection and XSS) so that manual testing can skip the tedious and time-consuming aspects of an app review and focus on complex vulns that require brains rather than RAM.
One complex vuln in particular is CSRF. Automated, universal CSRF detection is a tough challenge, which is why we try to solve the problem in pieces rather than all at once. It’s the type of challenge that keeps web scanning interesting. Here’s a brief look at the approach we’ve taken to start bringing CSRF detection into the realm of automation.
First, the test assumes an authenticated scan. If the scan is not given credentials, then the tests won’t be performed. Also, tests are targeted to specific manifestations of CSRF rather than the broad set of attacks possible from our friendly sleeping giant.
Tests roughly follow these steps. Fundamentally, we’re trying to model an attack rather than make inferences based on pattern matching:
1. Identify forms with a "session context". This is a weaker version of (but hopefully a subset of) a "security context", because lots of times security requires knowledge about the boundaries within an app and the authorized actions of a user. This knowledge is hard to come by automatically. Nevertheless, some utility can be had by looking at forms with the following attributes:
Only available to an authenticated user.
Are not "trivial" such as search forms or logout buttons.
Have an observable effect, either on the session or the HTTP response. (Hint: Here’s where the automated scan becomes narrow, meaning prone to false negatives.)
2. Set up two separate sessions for the user (i.e., log in twice). Keep their cookie jars apart. We’ll refer to the sessions as Aardvark and Bobcat (or A & B or Alpha & Bravo, etc.). Remember, this is for a single user.
3. Obtain a form for session Aardvark.
4. Obtain a form for session Bobcat.
5. Swap the forms between the two sessions and submit. (Crossing the streams, like Egon told you not to do.)
The assumption is that any CSRF tokens in Aardvark’s form are tied to the session cookie(s) used by Aardvark, and that any tokens in Bobcat’s form are tied to Bobcat’s. Things should blow up if the tokens and session don’t match.
6. Examine the "swapped" responses.
If the response has a clear indication of error, then the app is more likely to be protected from CSRF. The obvious error is something like, "Invalid CSRF token". Sadly, the world is not unicorns and rainbows for automated scanning and errors may not be so obvious or point so directly to CSRF.
If the response is similar to the one received from the original request, then it appears that the form is not coupled to a user’s session. This is an indicator that the form is more likely to be vulnerable to CSRF.
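The swap test in steps 2 through 6 can be modeled with a toy, in-memory app. The sketch below is not how WAS is implemented: `ToyApp`, `swap_test`, and the response strings are all hypothetical, and a real scanner would compare HTTP responses for similarity rather than exact equality.

```python
import secrets

class ToyApp:
    """Hypothetical in-memory app standing in for a real site."""

    def __init__(self, validates_token: bool):
        self.validates_token = validates_token
        self.tokens = {}  # session cookie -> CSRF token issued with its form

    def login(self) -> str:
        # Each login gives the same user a distinct session cookie.
        return secrets.token_hex(8)

    def get_form(self, session: str) -> str:
        # The "form" is reduced to the CSRF token embedded in it.
        token = secrets.token_hex(8)
        self.tokens[session] = token
        return token

    def submit(self, session: str, token: str) -> str:
        if self.validates_token and self.tokens.get(session) != token:
            return "error: invalid CSRF token"
        return "ok: profile updated"

def swap_test(app: ToyApp) -> bool:
    """Return True if the form looks vulnerable (the swap went unnoticed)."""
    aardvark, bobcat = app.login(), app.login()  # step 2: two sessions, one user
    form_a = app.get_form(aardvark)              # step 3
    form_b = app.get_form(bobcat)                # step 4
    baseline = app.submit(aardvark, form_a)      # the original, unswapped request
    swapped_a = app.submit(aardvark, form_b)     # step 5: cross the streams
    swapped_b = app.submit(bobcat, form_a)
    # Step 6: a swapped response that matches the baseline suggests the
    # token is not coupled to the session.
    return swapped_a == baseline and swapped_b == baseline

print(swap_test(ToyApp(validates_token=False)))
print(swap_test(ToyApp(validates_token=True)))
```

The app that ignores its token reports as likely vulnerable, while the app that ties the token to the session produces an error on the swap. The point is that the verdict comes from observed behavior, not from spotting something token-shaped in the form.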
What it won’t do, because these techniques are noisy and unreliable (as opposed to subtle and quick to anger):
Look for hidden form fields with names or values that match CSRF tokens. If an obvious token is present, that doesn’t mean the app is actually validating it.
Use static inspection of the form, DOM, or HTML to look for any examples of CSRF tokens. Why look for text patterns when you’re trying to determine a behavior? Not everything is solved by regexes. (Which really is unfortunate, by the way.)
Attempt to evaluate the predictability of anything that looks like a CSRF token.
Nor will it demonstrate the compounding factor of CSRF on other vulnerabilities like XSS. That’s something that manual pen-testing should do. In other words, WAS is focused on identifying vulns (it should find an XSS vuln, but it won’t tie the vuln to a CSRF attack to demonstrate a threat). Manual pen-testing more often focuses on how deeply an app can be compromised, and the real risks associated with it.
What it’ll miss:
Situations where session cookie(s) are static or relatively static for a user. This impairs the "swap" test.
CSRF that can affect unauthenticated users in a meaningful way. This is vague, but as you read more about CSRF you’ll find that some people think any forgeable action should be considered a vuln. This speaks more to the issue of evaluating risk. You should be relying on people to analyze risk, not tools.
CSRF that affects the user’s privacy. This requires knowledge of the app’s policy and the impact of the attack.
Forms whose effect on a user’s security context manifests in a different response, or in a manner that isn’t immediately evident.
CSRF tokens in the header, which might lead to false positives.
CSRF vulns that manifest via links rather than forms. Apps put all kinds of functionality in hrefs rather than explicit form tags.
Other situations where we play games of anecdotes and what-ifs.
What we are trying to do:
Reduce noise. Don’t report vulns for the sake of reporting a vuln if no clear security context or actionable data can be provided.
Provide a discussion point so we can explain the benefits of automated web scanning and point out where manual follow-up will always be necessary.
Learn how real-world web sites implement CSRF countermeasures in order to find common behaviors that might be detectable via automation. You’d be surprised (maybe) at how often apps have security countermeasures that look nothing like OWASP recommendations and, consequently, fare rather poorly.
Experiment with pushing the bounds of what automation can do, while avoiding hyperbolic claims that automation solves everything.
The current state of CSRF testing in WAS should be relied on as a positive indicator (vuln found, vuln exists) more so than a negative indicator (no vuln found, no vulns exist). In other words, a CSRF vuln reported by WAS should not be a false positive and should be something that the app’s devs need to fix. It also means that if WAS doesn’t find a vuln then the app may still have CSRF vulns. For this particular test a clean report doesn’t mean a clean app; there are simply too many ways of looking at the CSRF problem to tackle it all at once. We’re trying to break the problem down into manageable parts in order to understand what approaches work. We want to hear your thoughts and feedback on this.
By now it’s common practice for web sites to serve login pages over HTTPS in order to send passwords over an encrypted channel. Yet if the site unleashes the authenticated user back onto HTTP links (no "S"), then protecting the password may be a moot point.
From a web application’s point of view, your initial identity is proved by submitting valid credentials, but your identity in subsequent requests is tied to one or more "session tokens" — basically temporary cookies that are supposed to be unique to your browser. The following video demonstrates what happens when your browser’s unencrypted traffic is intercepted by a sniffer (like using a Wi-Fi connection in a cafe, library, airport, or even at home).
You can find a longer explanation of this problem (without getting tripped up in technical details) in one of my articles on Mashable.
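To make the interception concrete, here is a small sketch of what a sniffer can pull out of a single unencrypted request. The captured request, the host `app.example`, the cookie names, and the helper `cookies_from_request` are all invented for illustration; only `http.cookies.SimpleCookie` is from Python's standard library.

```python
from http.cookies import SimpleCookie

# A made-up plaintext HTTP request, as it would appear on the wire.
captured = (
    "GET /inbox HTTP/1.1\r\n"
    "Host: app.example\r\n"
    "Cookie: sessionid=3f9a1c; theme=dark\r\n"
    "\r\n"
)

def cookies_from_request(raw: str) -> dict:
    # Skip the request line, then look for the Cookie header.
    for line in raw.split("\r\n")[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "cookie":
            jar = SimpleCookie()
            jar.load(value.strip())
            return {key: morsel.value for key, morsel in jar.items()}
    return {}

print(cookies_from_request(captured))
```

Anyone who can read that request can read the session token in it, and replaying the token is enough to be treated as the logged-in user. The password never has to be seen.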