Path Fuzzing Challenges
Last updated on: September 7, 2020
Web application scanners often struggle to scan applications that incorporate parameters into their URL paths, specifically web apps that use URL-rewrite techniques or web apps with REST APIs that take URL parameters. One key approach is to fuzz the application’s URL parameter inputs in order to identify possible injection points for malicious code. But without knowledge of the URL structure, it’s difficult for scanners to fuzz those parameters efficiently and with full coverage, which is required for an effective scan.
In this blog post, we’ll describe how Qualys WAS uses the URI template interface to get knowledge of the URL structure of web applications so that you can scan your web apps quickly and with high accuracy.
Pretty URLs Trip Web App Scanners
To improve usability and search-friendliness, web applications may use various URL rewriting techniques. To demonstrate these techniques, consider the following URL with a query string:
http://www.example.com/articles/display.php?issue=17&section=sports&article=28
This type of URL is difficult for people to memorize and communicate, and lacks significant information for search engines. Therefore, web apps tend to use URLs like the two shown below that are much more people- and search engine-friendly:
http://www.example.com/articles/issue/17/section/sports/article/baseball
http://www.example.com/17/sports/baseball
To generate this kind of “pretty” URL, web applications may use technologies either embedded in their web app frameworks or implemented at the server level such as with Apache mod_rewrite.
REST APIs also incorporate parameters and parameter values into URL paths.
From a scanning perspective, URL rewriting and REST APIs pose the following problems:
- There are many URLs with distinct paths that are essentially identical. If web application scanners know the parameters, they’re able to skip testing redundant pages and reduce scanning time.
- Application scanners are good at fuzzing query parameters and form fields but they don’t necessarily fuzz parameters that are embedded in URL paths. Blindly fuzzing all the path components is not efficient. To provide better fuzzing coverage while retaining efficiency of scanning, the scanners should only tamper with the appropriate path components.
In our first URL example above, it’s clear that “issue”, “section”, and “article” are query parameters and “17”, “sports”, and “28” the parameter values.
In URLs similar to our other two examples, this information is not explicit and not trivial to discern. Therefore, when scanning web applications that utilize “pretty” links, additional information is needed to test the parameter values embedded in the URL paths.
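The difference can be illustrated with a short Python sketch. It uses only the standard library; the helper names `explicit_params` and `path_segments` are hypothetical and chosen for this illustration. The query-string URL yields named parameters directly, while the “pretty” URL yields only an ordered list of path segments with no names attached.

```python
from urllib.parse import urlsplit, parse_qs

query_url = "http://www.example.com/articles/display.php?issue=17&section=sports&article=28"
pretty_url = "http://www.example.com/17/sports/baseball"

def explicit_params(url):
    """Return the named query parameters of a URL (first value of each)."""
    query = urlsplit(url).query
    return {name: values[0] for name, values in parse_qs(query).items()}

def path_segments(url):
    """Return the non-empty path components of a URL, in order."""
    return [segment for segment in urlsplit(url).path.split("/") if segment]

print(explicit_params(query_url))   # {'issue': '17', 'section': 'sports', 'article': '28'}
print(explicit_params(pretty_url))  # {} : no parameter names are visible
print(path_segments(pretty_url))    # ['17', 'sports', 'baseball']
```

Nothing in the second URL says which segments are parameter values or what they are called, which is the information a scanner has to obtain some other way.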
URI Templates Make Qualys WAS Sharper and Faster
One way of communicating the additional information about parameter values is by using URI templates as described in IETF RFC 6570.
While URI templates may serve different purposes not necessarily dealing with web application security, we find them very convenient for compact representation of embedded parameters in the URL paths.
In the case of our second example,
http://www.example.com/articles/issue/17/section/sports/article/baseball,
one could write the URI template as follows:
http://www.example.com/articles/issue/{issue}/section/{section}/article/{article}
The above describes not a single URI but a range of URIs, of which the example URL is one particular instantiation.
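To make the idea of instantiation concrete, here is a minimal Python sketch of simple `{name}` expansion. It covers only the plain variable form used in this post, not the full RFC 6570 operator set, and the function name `expand` is an assumption made for illustration.

```python
import re
from urllib.parse import quote

TEMPLATE = "http://www.example.com/articles/issue/{issue}/section/{section}/article/{article}"

def expand(template, values):
    """Replace each {name} in the template with its percent-encoded value."""
    def substitute(match):
        # safe="" so that reserved characters such as "/" are encoded too
        return quote(str(values[match.group(1)]), safe="")
    return re.sub(r"\{(\w+)\}", substitute, template)

url = expand(TEMPLATE, {"issue": 17, "section": "sports", "article": "baseball"})
print(url)
# http://www.example.com/articles/issue/17/section/sports/article/baseball
```

Supplying different values for the three variables produces other members of the same range of URIs.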
Conversely, the URI template matching the URL would be: http://www.example.com/articles/issue/{issue}/section/{section}/article/{article}
This relation can be used to derive parameter values in the URI from the template variables. If the URL (http://www.example.com/articles/issue/17/section/sports/article/baseball) matches a particular URI template (http://www.example.com/articles/issue/{issue}/section/{section}/article/{article}), then the path components of the original URI that correspond to the template’s variables are considered to be parameter values.
In our examples above, we’ve shown the URL as an instance of the template and as a match of the template. In other words, the path components “17”, “sports” and “baseball” match the variables “issue”, “section” and “article” in the template. WAS will interpret the path components “17”, “sports” and “baseball” as values for parameters “issue”, “section” and “article” in the template.
Qualys WAS uses this approach for handling parameters embedded in path components. It takes an ordered set of URI templates as an input. Each URL is matched against the URI templates provided by the user. The parameter names and values are extracted from the URL. These parameter names and values are considered to be injection points for the scanner for testing.
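The matching step described above can be sketched in Python as follows. This is not the WAS implementation, only an illustration under stated assumptions: each `{name}` variable matches exactly one non-empty path component, the first matching template in the ordered list wins, and a trailing slash is tolerated. The names `compile_template` and `match_url` are hypothetical.

```python
import re

TEMPLATES = [
    "http://www.example.com/articles/issue/{issue}/section/{section}/article/{article}",
    "http://www.example.com/search/keywords/{weather}/{redwood}/{shores}",
]

def compile_template(template):
    """Turn a simple {name} template into a regex with one named group per variable."""
    # re.split with a capture group alternates literal text and variable names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        re.escape(part) if index % 2 == 0 else f"(?P<{part}>[^/]+)"
        for index, part in enumerate(parts)
    )
    return re.compile(pattern + "/?")

def match_url(url, templates):
    """Return (template, parameters) for the first matching template, else (None, {})."""
    for template in templates:
        match = compile_template(template).fullmatch(url)
        if match:
            return template, match.groupdict()
    return None, {}

template, params = match_url(
    "http://www.example.com/articles/issue/17/section/sports/article/baseball",
    TEMPLATES,
)
print(params)  # {'issue': '17', 'section': 'sports', 'article': 'baseball'}
```

Each entry in the returned dictionary is a candidate injection point: the key is the parameter name taken from the template and the value is the path component found in the URL.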
How Qualys WAS Uses URI Templates
Now let’s demonstrate how we can solve the two problems using these templates:
http://www.example.com/articles/issue/{issue}/section/{section}/article/{article}
http://www.example.com/search/keywords/{weather}/{redwood}/{shores}
In the discovery/crawling phase or from the REST endpoints explicitly provided by the user, WAS will have a set of links such as the following.
- http://www.example.com/articles/issue/17/section/sports/article/soccer
- http://www.example.com/articles/issue/17/section/sports/article/baseball
- http://www.example.com/articles/issue/17/section/sports/article/tennis
- http://www.example.com/articles/issue/17/section/technology/article/appsec
- http://www.example.com/videos/nature/pandas/action/watch
- http://www.example.com/search/keywords/weather/redwood/shores
- http://www.example.com/preferences/temperature/celsius/
- http://www.example.com/articles/issue/19/section/technology/article/hacking
- http://www.example.com/articles/issue/21/section/business/article/qualys
- http://www.example.com/articles/issue/24/section/business/article/google
Links 1-4 and 8-10 above match the first template: http://www.example.com/articles/issue/{issue}/section/{section}/article/{article}
Meanwhile, link 6 matches the second template: http://www.example.com/search/keywords/{weather}/{redwood}/{shores}
Based on a discovery that multiple links match the same template, WAS may optimize the process by testing only some of those links, and not all of them. For example, WAS can be configured to test only three links when it encounters this type of scenario, which in this case would mean testing links 1-3 and skipping the other four.
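The following Python sketch shows one way such a per-template limit could work on the ten links listed above. It is an illustration only: the function name `select_links`, the `max_per_template` parameter, and the decision to always keep links that match no template are assumptions, not documented WAS behavior.

```python
import re
from collections import defaultdict

TEMPLATES = [
    "http://www.example.com/articles/issue/{issue}/section/{section}/article/{article}",
    "http://www.example.com/search/keywords/{weather}/{redwood}/{shores}",
]

BASE = "http://www.example.com"
LINKS = [
    BASE + "/articles/issue/17/section/sports/article/soccer",
    BASE + "/articles/issue/17/section/sports/article/baseball",
    BASE + "/articles/issue/17/section/sports/article/tennis",
    BASE + "/articles/issue/17/section/technology/article/appsec",
    BASE + "/videos/nature/pandas/action/watch",
    BASE + "/search/keywords/weather/redwood/shores",
    BASE + "/preferences/temperature/celsius/",
    BASE + "/articles/issue/19/section/technology/article/hacking",
    BASE + "/articles/issue/21/section/business/article/qualys",
    BASE + "/articles/issue/24/section/business/article/google",
]

def template_regex(template):
    """Regex in which every {name} variable matches one non-empty path component."""
    parts = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        re.escape(part) if index % 2 == 0 else "[^/]+"
        for index, part in enumerate(parts)
    )
    return re.compile(pattern + "/?")

def select_links(links, templates, max_per_template=3):
    """Split links into (selected, skipped), keeping at most N links per template."""
    regexes = [(template, template_regex(template)) for template in templates]
    kept_per_template = defaultdict(int)
    selected, skipped = [], []
    for link in links:
        template = next((t for t, rx in regexes if rx.fullmatch(link)), None)
        if template is None:
            selected.append(link)  # assumption: unmatched links are always tested
        elif kept_per_template[template] < max_per_template:
            kept_per_template[template] += 1
            selected.append(link)
        else:
            skipped.append(link)
    return selected, skipped

selected, skipped = select_links(LINKS, TEMPLATES)
print(len(selected), len(skipped))  # 6 4
```

With a limit of three, links 1-3 are kept for the first template, links 4 and 8-10 are skipped, and links 5-7 are kept because they either match the second template or match no template at all.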
At the testing phase for links 1-3, WAS will identify “issue”, “section” and “article” to be parameter names and “17”, “sports” and “soccer” as the corresponding parameter values for link 1, and will fuzz those values.
Now let’s assume we want to test link 1 for a SQL injection vulnerability with the payload “17 or 1=1” or, equivalently, with URL encoding, 17%20or%201%3D1.
Knowing “17” to be the injection point, WAS will generate a test with the following URL:
http://www.example.com/articles/issue/17%20or%201%3D1/section/sports/article/soccer
With this approach, the “issue” parameter will be effectively tested for SQL injection vulnerabilities.
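As a rough Python sketch of how such a test URL can be built, the code below matches the URL against the template, locates the span of the chosen parameter, and substitutes the percent-encoded payload. The function name `fuzz_path_param` is hypothetical, and the sketch assumes each template variable covers exactly one path component.

```python
import re
from urllib.parse import quote

TEMPLATE = "http://www.example.com/articles/issue/{issue}/section/{section}/article/{article}"
LINK_1 = "http://www.example.com/articles/issue/17/section/sports/article/soccer"

def fuzz_path_param(url, template, name, payload):
    """Return url with the path component bound to {name} replaced by the encoded payload."""
    parts = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        re.escape(part) if index % 2 == 0 else f"(?P<{part}>[^/]+)"
        for index, part in enumerate(parts)
    )
    match = re.fullmatch(pattern, url)
    if match is None:
        raise ValueError("URL does not match the template")
    start, end = match.span(name)
    # safe="" so the payload cannot introduce extra path separators
    return url[:start] + quote(payload, safe="") + url[end:]

test_url = fuzz_path_param(LINK_1, TEMPLATE, "issue", "17 or 1=1")
print(test_url)
# http://www.example.com/articles/issue/17%20or%201%3D1/section/sports/article/soccer
```

Only the component bound to “issue” changes; the literal segments and the other parameter values are left untouched, so each parameter can be tested in isolation.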
Summary
In summary, finding the injection points within the “pretty” links in web applications is not trivial. However, with proper user input, the scanner can be efficient and produce accurate results. For an input format, WAS uses URI templates, which help WAS to formally define parameters to be tested in a concise manner, and to recognize and efficiently test them with high accuracy.