Web application vulnerability scanners are big business. A quick search of alternatives will show you that there are literally hundreds of open source and commercial scanners, and all of them offer varying coverage of the vuln space as well as functions that extend into different phases of the Pen Test Kill Chain. As is the case with any trend in security, this explosion in the market is a symptom of something else entirely – web applications are by their very nature easy to access and popular for hackers to exploit. The payoff for a successful breach or compromise is massive.

Most major companies involved in cyber security solutions publish annual reports that summarize the past year’s events and predict the trends that will shape the business in the years to come. Verizon, Dell, Cisco, FireEye, Symantec, and HP publish some of the more anticipated releases each year. While these reports do not usually deal in deep technical details or the Tactics, Techniques and Procedures (TTPs) of specific threats, they do shine a light on prevailing threat areas, high-level delivery modes, and the perceived motivations behind these changes. Many recent findings have noted a shift of attackers toward exploiting user browsers and plugins, as well as the advent of “malvertising.”

These trends point to a couple of things. First, users, and by extension their end-user devices (mobile, laptop, home), are the weakest link in our security story. Second, as those same users demand anytime, anywhere access, there is no longer a fixed perimeter to defend, only a very fluid environment that is difficult to secure. Hackers are naturally going to exploit the trust we all place in these new behaviors and the weaknesses that come with them. Hacking the enterprise itself is more involved and delicate at this point in time, so depending on their motivation, the get-rich-quick set will shift to the path of least resistance: web applications and the application-client paradigm.

So what are our customers doing in response? They are starting to purchase tools that can help, but only if wielded correctly. In Kali Linux, we can run some fantastic and well-regarded scanners that help our customers better understand both their exposure to vulnerabilities and their architecture. While we have some great options, it is helpful to mix tools that complement each other so that we cover each vector with more than one scanner and reduce our blind spots. One of the best such tools, and one I am sure you have used if you have conducted web application vulnerability scans, is Arachni. It can easily be added to your Kali instance by downloading it from the Arachni project’s website.

Walking on Spider Webs

Arachni is an open source scanner that approaches the recon phase of our penetration testing differently than any other tool out there. If you’ve used Arachni without paying attention to what makes it different (like me), then you may find that changing your workflow will greatly improve results. The creator of the tool, Tasos Laskos, developed it to address a couple of competing concerns. First, scans can take an excessive amount of time (many hours or even weeks), which makes them less than helpful; that time is lost and testing becomes a drawn-out process. Second, more data and coverage are a good thing because they enhance accuracy, but every additional test vector adds more transactions and therefore more time to the process.

Laskos designed Arachni to reduce scan time while scaling to process more test vectors efficiently. The timing was improved by employing asynchronous HTTP requests to do the tool’s bidding, and by allowing Arachni to scale across a cluster of systems so that they all process in parallel. Accuracy and coverage were enhanced by open-sourcing the tool and building it on a Ruby framework that allows anyone to add new or better-optimized tests. Both timing and accuracy are further improved by machine learning techniques that let the scanner hone the vectors it uses based on the results of earlier vectors in the test battery. Together, these enhancements make Arachni a formidable scan tool that we should all dive deeper into for improved testing efficacy and timing.

Optimal Arachni Deployment Tips

When we deploy Arachni for practice use, we will likely invoke a single server and client hosted on the same Kali box. The Web UI client executable should be running so that Arachni is operated in the same manner as it would be in a real testing scenario. A command-line option is available, but it tends to be limited in scale and works best in single-server deployments. The Arachni high-level architecture can be seen below:

B03918_04_01

Arachni’s architecture is its secret sauce – it helps with scale and speed alike.

The brain of the operation is the Web UI Client (arachni_web). It provides the single point of contact that sends initial scan requests to its grid of Dispatch Servers (Dispatchers). Each Dispatcher then spawns a new Instance on its server. At this point, the Web UI Client communicates directly with the spawned Instance to configure it for the appropriate scanning jobs. When an Instance is done with its task, the Web UI Client pulls the data and stores it, while the Instance simply goes away, returning the resources it consumed to the operating system for future Instances or other software tasks altogether.
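In a lab, the pieces can be brought up straight from the command line. The following is only a sketch: the option names reflect my recollection of Arachni 1.x, the 172.16.30.134 neighbour is a hypothetical second server, and 9292/7331 are the ports I recall as defaults, so confirm everything with arachni_web --help and arachni_rpcd --help before relying on it.

# Start the Web UI Client; by default it typically listens on http://localhost:9292
arachni_web

# Start a Dispatcher on this host (7331 is the customary Arachni RPC port)
arachni_rpcd --address=172.16.30.133 --port=7331

# A Dispatcher on another server (hypothetical address) can join the grid by pointing at an existing one
arachni_rpcd --address=172.16.30.134 --port=7331 --neighbour=172.16.30.133:7331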

As we move into production testing, we can scale the deployment across additional servers in our grid. This not only allows us to run multiple scans, but also accommodates teams of users while consolidating the gathered data in a central database behind the Web UI Client. SQLite3 is the default, but if you will ever turn this installation into a grid or large-scale installation, it is highly recommended that you start with PostgreSQL instead.

Once you have selected one database or the other, you are locked in, and changing your mind means losing all of your prior data. It is worth the effort to build out the PostgreSQL environment up front if you think it will go that way.

Arachni differs from other scanners in that it makes heavy use of asynchronous HTTP to conduct its scans. This allows each Instance to send HTTP requests to targets in parallel. Because you can have multiple scans in progress at the same time, this greatly improves the speed with which we can go from initiation to completion. Most other scanners spend considerable time waiting on scan threads to finish, which is made more egregious by the fact that those scanners often run a fixed array of tests and do not tailor their scan list on the fly like Arachni does.

The number of Instances that can be supported really depends on the number of servers available, their resource availability, and the outbound bandwidth available for the scans. If you are worried about bandwidth, the generated traffic is probably already non-trivial and should be throttled down to avoid impairing the application’s performance for real users or alerting the security operations folks. Nothing says “I’m here to damage your application” like a distributed denial of service (DDoS) attack from your application scanner. As they say, with great power comes great responsibility!
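If you ever drive a scan from the command line rather than the Web UI, the request load can be capped there as well. This is a sketch only; I am assuming the Arachni 1.x option name, so verify it against arachni --help on your build, and substitute your own target URL for the placeholder.

# Limit the scanner to 5 concurrent HTTP requests against the target
arachni http://<target application URL> --http-request-concurrency=5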

An Encore for Stacks and Frameworks

Our first step in any active recon could very well be to attempt a scan, but before taking on any job or task it is a good idea to simply browse to the site’s main page first. Using browser plugins like Wappalyzer, we can easily get an initial footprint of the website and discover the platform or framework the web application is built on. We’ll start our detailed Arachni best practices using the Damn Vulnerable Web Application (DVWA). Let’s see what the browser and Wappalyzer can tell us before we dive into a scan!

As seen below, DVWA is apparently running on a Linux operating system, employs Apache as the web server and MySQL as the database, and uses a mix of scripting languages (Python, Perl, Ruby, PHP). This very much looks like your typical LAMP stack, and we’ll craft a scan that takes these details into account and helps narrow the scan time. Traditional stacks hosted on Windows and Linux/Unix alike, whether using PHP or ASP.NET, are giving way to newer stacks based on Ruby, Java, JavaScript, and Python. Why? These newer stacks cater to mobile services, super-scalable architectures, and other modern concerns that have sprung up. To add to the fun, the terms web stack and framework mean something different to everybody. Being able to quickly ascertain what is running, however, can point us in the right direction for Common Vulnerabilities and Exposures (CVEs) and other characteristics we can leverage in our testing.

B03918_04_02

Scans are more productive when we let our browser help with a quick footprint of the target.
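If you prefer to confirm the footprint from the terminal, a simple HTTP HEAD request often echoes the same information in the Server and X-Powered-By headers. A quick sketch using standard curl and grep (the placeholder target is yours to fill in):

curl -skI http://<target application URL> | grep -iE '^(Server|X-Powered-By)'

Keep in mind that well-hardened servers may strip or spoof these headers, so treat them as one hint among several.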

DVWA also uses OpenSSL. After the last few years this is well worth a look, as OpenSSL has been subject to several high-profile vulnerabilities like Heartbleed that dominated news outlets worldwide. It is always pretty impressive for something so technical to capture headlines, but we’re after even the less glamorous or obscure flaws. Less famous exploits often go unaddressed while attention goes to the higher-profile vulnerabilities, yet they are easy enough to take advantage of.

For some good insight into the stacks running on popular sites, you can learn more about them here: https://stackshare.io. If you want to see just how many stacks and perspectives there are, this Reddit thread is educational and impressive: https://www.reddit.com/r/webdev/comments/2les4x/what_frameworks_do_you_use_and_why_are_they/

Arachni Test Scenario

The DVWA we’re using is included in the OWASP BWA image, which in this lab is located at https://172.16.30.131/dvwa/. Our Kali box is on the same subnet (172.16.30.133), and we’re interested in shortening the scan time relative to the default profile. We’ll use this very simple topology to show off some of the advanced moves Arachni can make with a little additional effort and input beyond a base scan.
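Before kicking off any web scans, it is worth a quick sanity check that the target is reachable and the expected services are listening. A standard nmap probe does the job:

nmap -p 80,443 -sV 172.16.30.131

The -sV service detection also gives us a second opinion on the web server banner we saw through the browser.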

Profiles of Efficiency

Most pen testers’ early experience with Arachni involves running scans with default settings, which run a comprehensive list of threat vectors against the target environment. This is a great way to see what Arachni can do, but the OSINT gathered in the recon phase, or even through browsing as we’ve just seen, gives us great information that can narrow the search field. We can leverage this information to craft a custom profile, which as far as Arachni is concerned is where most of the bells and whistles are located.

Why should we do this? Eliminating vectors we know not to be of interest (unused database tests, operating system vectors, and so on) can both reduce scan times and avoid crushing the target with excessive requests. That information can also steer us toward deeper investigation at lower speeds. We’ll walk through this process together and identify some options worth considering.

Creating the New Profile

Let’s build a profile for systems running one of the most common stacks. The LAMP stack we see running on the DVWA site still runs a majority of web applications, with Windows/IIS/SQL/ASP.NET a close second, so this is a good profile to have on hand. We’ll want to click on the “+” button in the Profile Menu at the top:

B03918_04_03

Most of Arachni is enabled by default – we can tailor or focus based on Recon & OSINT using Profiles

This will start us at the top of the Profile Navigation menu, located on the left of the browser window (1). We’ll want to decide on a profile name (2), enter a description to help track the intent (a good idea for teams and large production grids), and choose the users (4) who can access it; I selected “Global.” It is important to note that these profiles are local to the Web UI Client, so your teammates will need to be able to reach the portal from their location:

B03918_04_04

Not the most exciting part, but the naming conventions & documentation make life easier

Scoping and Auditing Options

The Scope Section customizes things a little too finely to be used in a profile that is meant to fit multiple applications. This section includes options to scan HTTPS only, set path and subdomain limits, exclude paths based on strings, or limit the DOM tree depth. If you are conducting a white box test or doing internal penetration testing of the same application as a part of the SDLC, these options are a fantastic way to help focus the testing on the impacted areas rather than cast so large a net that the scan is prolonged and contains extraneous results. So keeping that in mind, let’s move on.

We’ll next set our high-level scanning strategy with the Audit options. When we are testing a server, our scanning can be made much more efficient if we omit resource-intensive elements that delay our progress unnecessarily. If we have a site we know to be using HTTP GETs and POSTs equally, we can test with just one of those methods. Testing processes are different for all of us; we may overlap our Arachni scans with Burp Suite or OWASP ZAP scans and no longer need to scan headers or process cookies, and as we’ll be covering those tools in the next chapter, we’ll uncheck those boxes here as well.

B03918_04_05

The Audit Section is where we decide how many aspects of the web server get tested
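For reference, when a scan is driven from the command line instead of the Web UI, the audit scope can be narrowed with flags. This is only a sketch: the --audit-links and --audit-forms option names are my recollection of Arachni 1.x, and whether unlisted element types are skipped depends on your version’s defaults, so check arachni --help first.

# Explicitly audit links and forms for this pass
arachni http://<target application URL> --audit-links --audit-forms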

Converting Social Engineering into User Input

Depending on our success in the Recon phase, some of our OSINT may include credentials for a web user. These can often be scraped in a MITM attack similar to what the Social Engineering Toolkit (SET) can help you do. While it is certainly acceptable to scan sites without credentials, the revelations that come with a scan as a logged-in user never disappoint. Without the credentials, you’ll likely see mostly static content and much less sensitive information exposed in black-box tests. White-box tests can better show how data would fare in the event of compromised credentials, so the input strings here are just as valuable if they are available. The HTTP section below can likewise help tailor the HTTP requests to better mimic a legitimate client, work through any proxy in the path, and throttle request rates and file sizes to ensure that neither the scanner nor the target is overwhelmed, or, in the target’s case, alerted to our presence. As we’ll see, the simple fields entered here result in a much greater scanned area, because we can dive deeper into the application than we could without them.

There are a host of other fields we may need to be aware of as well. Addresses, names, and other static information are popular, but more dynamic input processes like captcha fields are built specifically to prevent non-human form fills. This can limit the scanner’s reach, but rest assured that there are many other ways to overcome this obstacle that we can test to ensure the target is truly protected.

B03918_04_06

The Input section is very handy when we have a user’s potential credentials to leverage

Fingerprinting and Determining Platforms

The Fingerprinting section is one of the most impactful ways we can focus our scans and win back time in our schedule. Fingerprinting itself is a fantastic tool if you don’t already know your target’s stack. Since we discovered with our browser earlier that we’re running LAMP, we can omit the fingerprinting and select only the tests that pertain to that framework. Note that I am letting it test against all four detected languages (Perl, PHP, Python, and Ruby):

B03918_04_07

Fingerprinting allows us to explicitly select tests – no sense using them all if we don’t have to
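Command-line scans can express the same choice. The sketch below assumes the Arachni 1.x option names (--no-fingerprinting and --platforms); the platform shortnames shown are guesses, so list the valid ones with --platforms-list and verify against your build.

# Skip fingerprinting and declare the LAMP platforms we already identified
arachni http://<target application URL> --no-fingerprinting --platforms=linux,apache,mysql,php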

Checks Please

The Checks section is where the nitty-gritty details of which vulnerabilities we’ll test for are offered. Each check is either active or passive, and we can enable all or some of them depending on the need. In this case we have plenty of information on the DVWA’s framework, but no clues as to what it will be vulnerable to. If this were truly a black-box scan and we were trying to stay covert, I might omit all active checks in earlier scans and wait until I am able to conduct them from a less-suspicious host closer to the web server.

B03918_04_08

Active checks dictate which vulnerabilities Arachni will proactively interact with the target to discover

B03918_04_09

Passive checks are collected without interacting with input fields on the application
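The same selection can be made from the command line with the --checks option, which accepts globs, while --checks-list shows everything available. These flag names are from Arachni 1.x as I recall them, so confirm them on your installation before scripting around them.

# See which checks this build of Arachni ships with
arachni --checks-list

# Run only the XSS and SQL injection families of checks
arachni http://<target application URL> --checks='xss*,sql_injection*'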

Plugging in to Arachni Extensions and Third-Party Add-ons

Arachni’s Ruby-based framework (because, surprise, Arachni is a web application too!) makes it an easy tool for anyone to build additional capabilities for. Widely accepted plugins, when they meet the developer’s high standards, even make it into the Web UI Client in the Plugins section. In this version of Arachni, we have several plugins that can tune our scan to be a better citizen (AutoThrottle or Rate Limiter), scan for additional quirks of interest in timing and responses, and even allow modification of fields and test parameters on the fly. Some of these are simply checked; others require additional configuration after expanding them (click the hyperlinked name of each plugin for details). I selected AutoThrottle and WAF Detector for this profile:

B03918_04_10

Plugins modify any profile to allow on-the-fly parameter fuzzing, rate limiting, or other special tests
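From the command line, plugins are enabled with the --plugin option (repeat it once per plugin) and enumerated with --plugins-list. The component names below are my best recollection of how AutoThrottle and WAF Detector are registered; confirm them with the listing first.

# Show available plugins and their exact component names
arachni --plugins-list

# Enable the throttling and WAF-detection plugins for a scan
arachni http://<target application URL> --plugin=autothrottle --plugin=waf_detector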

An alternative way to conduct WAF detection is to use our trusty nmap tool, which includes a couple of scripts for the purpose. One simply detects the presence of a WAF, while the other, if a WAF is found, will typically tell us what brand or version is in use. Confirming findings through multiple scripts and tools often increases our confidence in overall test fidelity. To use these scripts, enter the following commands:

nmap -p 80,443 --script http-waf-detect <hostname or IP address>
nmap --script http-waf-fingerprint <hostname or IP address>

This is Browser Control to Major Tom

Who doesn’t love a good David Bowie reference? Okay, back to the work at hand. All of these scans and tests are conducted as if the Instances were actual browser sessions interacting with the web application and its users. We can modify the emulated browser’s behavior, time limits, size, and even the number of independent browsers in use. For quick scans, I advise checking the box that allows the scanner to omit images.

Session checking is also a potentially valuable capability. There is nothing worse than scanning an application only to find out that the logic behind it has shunted your sessions, scans, or traffic to a linked domain outside the target’s span of control. We can enter confirmation pages here so that the scanner periodically checks back in with an anchor page to determine its state and ensure we’re still in the right application’s structure.

B03918_04_11

Browser and Session settings ensure we present our best side to the right site
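Equivalent knobs exist on the command line. The options below are from my memory of Arachni 1.x (browser cluster and session check settings), and the check page and pattern are hypothetical values for a logged-in DVWA session; verify the exact flag names with arachni --help.

# Use a smaller browser pool, skip image loading, and re-verify the session
# against a known page and pattern as the scan progresses
arachni http://<target application URL> \
    --browser-cluster-pool-size=4 --browser-cluster-ignore-images \
    --session-check-url=http://<target application URL>/index.php --session-check-pattern='Logout'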

After we’ve made all of our profile changes, we can select the Create Profile button at the bottom and we should be good to go! If you run into issues, the salmon-colored error box will steer you right to where the issues lie and tell you what must be revised before the profile can be accepted.

Kicking Off Our Custom Scan

Once we’ve made our custom profile, we can invoke it, schedule it, or put it away for safekeeping. To run the scan, simply initiate a scan and select the LAMP profile (or whatever name you used) in the “Configuration Profile to use” field. We can control the number of Instances, how the workload is provisioned (direct to the local computer running one or more Instances, to a remote server, or using a Grid of computers), and schedule it if so desired. Here we’ll just run it with two Instances and see how it performs.

B03918_04_12

Running a scan with a custom profile is straightforward

B03918_04_13

The LAMP profile scan is slower, but it reveals more and places a lower burden on the target server

Reviewing the Results

The results of our tailored scan are that we see 34 issues versus the 29 identified in the wide-open scan before, but our scan took roughly 38 minutes versus 36 seconds. We also managed to unearth 90 pages versus 27 (some as a result of our user input) and did so at a much lower request rate than the default, thus protecting the DVWA from a mini Denial of Service (DoS) and staying under the radar of any defenses that may have been in the way. Arachni archives scans until we run out of space or tell it to delete them, so we’ll always have both to compare and contrast. In practice, I would recommend running a single targeted scan when the risk is acceptable and running smaller sub-scans for more covert phases of your work.

Here is a summary of these results:

                     LAMP Profile     Default Profile
Time to Complete     38 minutes       36 seconds
Pages Found          90               27
High Severity        3                0
Medium Severity      8                7
Low Severity         6                6
Informational        17               16
Total Issues         34               29

The reports can be exported in a variety of useful formats, with HTML, JSON, XML, Marshal, YAML, and AFR all available. Of these, the HTML reports are data-rich and nearly presentation-ready:

B03918_04_14

Arachni’s report formats are production-ready and a great start on building your deliverables to the customer

B03918_04_15

Detailed Vulnerability information can help quickly educate the customer or set up the next phases
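If you need to convert a saved scan archive outside the Web UI, Arachni also ships a reporter utility. The invocation below is a sketch based on the arachni_reporter syntax I recall from the 1.x framework packages, and dvwa_scan.afr is a hypothetical file name; check arachni_reporter --help for the exact reporter names and options.

# Convert a saved Arachni Framework Report (.afr) into an HTML report bundle
arachni_reporter dvwa_scan.afr --reporter=html:outfile=dvwa_report.html.zip

# The same archive can feed other formats, for example JSON
arachni_reporter dvwa_scan.afr --reporter=json:outfile=dvwa_report.json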

Summary

Effective penetration testing is contingent on many things, but the best results come from a well-educated set of scans that both educate us and set the table for the later phases of the test. As current events have revealed, the sheer number of vectors available to attackers demands comprehensive test suites that automate scans and help us quickly ascertain the risk exposure of a target. Almost as important is the ability to process the copious amounts of raw data and turn them into actionable intelligence.

Arachni is a fantastic tool for providing these scans, and as an added bonus it can form the basis of our detailed reports or deliverables. Because Arachni is an open source and extensible product, it is well supported by the community and offers a Ruby framework onto which anyone can graft their own extensions or plugins. There are literally thousands of options within Arachni, but hopefully this dive into the profile builder offers some insight into how to better employ Arachni, along with complementary tools like your browser or nmap, to scope out target systems efficiently.

In the next couple of chapters, we’ll take a look at two more tools that overlap with Arachni but take the pen test further into the Kill Chain: Burp Suite and OWASP ZAP. These tools replicate some of Arachni’s functions, but they can also leverage what we have learned here to begin actually exploiting these holes and confirming their impact. After seeing what the DVWA has to worry about, I am sure we’ll see some interesting results!