Sync Your VM Data Fast

Parag Baxi

Last updated on: September 6, 2020

Make your Qualys data your own by synchronizing it locally. Report templates are an easy way to set up and distribute that data, but they are typically not flexible enough to meet the one-off requests that crop up from different teams over time. Synchronizing your Qualys data locally, and enabling every team in your organization to query it, will give you the most scalable access to your data.

This will enable the ultimate flexibility in reporting: SQL queries, correlation with other security tools, and finally the ability to build those reports with Comic Sans headers that Jim from the audit team so badly wants. The benefits don’t stop at security use cases, either. They can bring even greater value to your organization: trust in your reporting.

I’m Sold, How Do I Do This?

Qualys offers a multitude of ways to extract your vulnerability management data. By far the leanest, most flexible, and fastest method is the Host List Detection API, which was created precisely for the use case of downloading all your VM data. Customers already use it today to download vulnerability data from millions of hosts that are scanned monthly.

The best way to download your VM data is to download delta sets continuously, which you can do quite easily. But before that happens, we’ll have to download an initial seed of all your data.

Download Your VM Data Seed

Downloading all vulnerability data (and information gathered data) will result in an enormous data set, even with the Host List Detection API. Three strategies ensure a quick download:

  1. Download assets in multiple chunks instead of one large call. This results in quicker spin-ups and spin-downs on the Qualys platform, and faster downloads.
  2. Download by host ID. Downloading by asset group or IPs will result in additional lookups.
  3. Download your data with multiple threads. Your network tubes can hold more than one connection, so use them!
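
The three strategies above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the author's script: the function names (chunk_ids, build_params, download_all) are hypothetical, fetch_chunk is a caller-supplied function that performs the actual HTTP request, and the request parameter names (action, ids, output_format) are assumptions that should be confirmed against the Qualys API guide.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_ids(host_ids, chunk_size):
    """Split a list of host IDs into consecutive chunks of at most chunk_size."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [host_ids[i:i + chunk_size] for i in range(0, len(host_ids), chunk_size)]


def build_params(chunk):
    """Build request parameters for one chunk, scoped by host ID.

    The parameter names are assumptions; check them against the API guide.
    """
    return {
        "action": "list",
        "ids": ",".join(str(host_id) for host_id in chunk),
        "output_format": "XML",
    }


def download_all(host_ids, fetch_chunk, chunk_size=1000, workers=4):
    """Fetch every chunk on a thread pool; results come back in chunk order.

    fetch_chunk is a hypothetical callable that takes the parameter dict,
    sends the request, and returns the response body.
    """
    chunks = chunk_ids(host_ids, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda chunk: fetch_chunk(build_params(chunk)), chunks))
```

The chunk size and worker count here are placeholders. Sensible values depend on your subscription's API limits, so treat them as knobs to tune rather than recommendations.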

Implementing the above is challenging. Lucky for us, I created a free & open source Python script that does exactly this:

paragbaxi/qualysguard_host_list_detection · GitHub

Download speed comparisons of the host list detection API for about 93,000 hosts:

[Chart: QualysGuard Host List Detection API XML Speed]

[Chart: QualysGuard Host List Detection API CSV Speed]

Download your VM Data Deltas Continuously

Now that you have an initial seed of all your data, the most efficient way to keep it current is to scope your downloads by time. The Host List Detection API has a parameter, vm_scan_since, that lets you download only the hosts that have been scanned and processed since a certain time, measured down to the second. Let’s step back a little to understand when a host is processed.
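
As a rough sketch, a vm_scan_since value could be built from a Python datetime as shown here. The helper names are hypothetical, and the UTC timestamp layout (YYYY-MM-DDTHH:MM:SSZ) is an assumption to verify against the Qualys API guide before relying on it.

```python
from datetime import datetime, timedelta, timezone


def format_vm_scan_since(moment):
    """Render a timezone-aware datetime as a UTC timestamp with second precision.

    The exact layout is an assumption; confirm it in the API guide.
    """
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime to avoid ambiguous local times")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def delta_params(since):
    """Request parameters that scope the download to recently processed hosts."""
    return {"action": "list", "vm_scan_since": format_vm_scan_since(since)}


# Example: ask for hosts processed during the last hour.
params = delta_params(datetime.now(timezone.utc) - timedelta(hours=1))
```

Rejecting naive datetimes is a deliberate choice: a local time silently sent as UTC would shift the window and could skip hosts.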

First, let’s go over how an individual host is scanned. The Host List Detection API’s vm_scan_since parameter scopes hosts by their processed date. Below are the stages of a host getting scanned:

[Diagram: QualysGuard scan stages]

Now that we understand when a host is in scope, it’s important to note that the entire host is in scope: the vm_scan_since parameter scopes at the host level, not at the vulnerability finding level. All vulnerabilities from previous scans are included by default, since the Host List Detection API leverages Host Based Findings.

In the following example, both Monday’s QID 90086 and Tuesday’s QID 90252 vulnerabilities (as well as any previous findings) are included in the response:


Because each response carries the host’s complete set of findings, you can atomically replace your database entry for the entire host, which is quite simple to implement.
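
One way to picture the atomic replacement is a delete and re-insert inside a single transaction. The sketch below uses SQLite from the Python standard library; the table layout, column names, and status values are hypothetical, and only the two QIDs come from the example above.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the local store and create a (hypothetical) detections table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS detections ("
        "host_id INTEGER NOT NULL, qid INTEGER NOT NULL, status TEXT, "
        "PRIMARY KEY (host_id, qid))"
    )
    return conn


def replace_host(conn, host_id, detections):
    """Swap out all rows for one host in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("DELETE FROM detections WHERE host_id = ?", (host_id,))
        conn.executemany(
            "INSERT INTO detections (host_id, qid, status) VALUES (?, ?, ?)",
            [(host_id, qid, status) for qid, status in detections],
        )


conn = open_store()
# Monday's response for a host, then Tuesday's response for the same host.
replace_host(conn, 42, [(90086, "Active")])
replace_host(conn, 42, [(90086, "Active"), (90252, "New")])
```

If the insert fails partway through, the rollback restores the host's previous rows, so readers of the table never see a half-updated host.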

How do I do this for all scans?

The Host List Detection API call best serves your needs when requested continuously. This will provide visibility on hosts regardless of when scans complete.

In the example below, the Host List Detection API is called at the top of every hour, with the vm_scan_since parameter dynamically set to one hour earlier.

[Diagram: hourly Host List Detection calls]

The response to the 2pm host list detection call will include:

  • Host A’s first scan
  • Host B

The response to the 3pm host list detection call will include:

  • Host A’s first & second scan
  • Host C
  • Host D
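
The hourly schedule above can be modeled locally to see why each host lands in each response. This is a toy simulation under stated assumptions: the real filtering happens on the Qualys platform, the function names are hypothetical, the processed times are invented to match the example, and the "processed at or after vm_scan_since" comparison is an assumption about the platform's behavior.

```python
from datetime import datetime, timedelta, timezone


def since_for_run(run_at, lookback=timedelta(hours=1)):
    """Floor the run time to the top of the hour, then step back by lookback."""
    top_of_hour = run_at.replace(minute=0, second=0, microsecond=0)
    return top_of_hour - lookback


def hosts_in_scope(events, run_at):
    """Return hosts whose latest processed time falls inside the run's window.

    events is a list of (host, processed_at) pairs. Only events that have
    already happened by run_at are considered.
    """
    since = since_for_run(run_at)
    latest = {}
    for host, processed_at in events:
        if processed_at <= run_at:
            latest[host] = max(processed_at, latest.get(host, processed_at))
    return sorted(host for host, processed_at in latest.items() if processed_at >= since)


def at(hour, minute):
    """Hypothetical timestamps on an arbitrary day, in UTC."""
    return datetime(2020, 9, 7, hour, minute, tzinfo=timezone.utc)


# Invented processed times that reproduce the example above.
events = [
    ("Host A", at(13, 20)),  # first scan
    ("Host B", at(13, 45)),
    ("Host A", at(14, 10)),  # second scan
    ("Host C", at(14, 30)),
    ("Host D", at(14, 50)),
]
two_pm = hosts_in_scope(events, at(14, 0))
three_pm = hosts_in_scope(events, at(15, 0))
```

Flooring to the top of the hour means a job that starts a few seconds late still requests the full window. Any overlap between runs is harmless, because each host's entry is replaced whole.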

All set up and automated, what now?

With your VM data now synced locally, the use cases are only limited by the teams you provide this data to. I have seen a customer identify millions of dollars of unnecessary software licensing by leveraging Qualys’s ability to identify software installations. It just so happened the customer had thousands of Office & Photoshop licenses on older, unused workstations that their procurement team had lost track of, which resulted in procuring new software licenses for new employees when they could have reused existing licenses. With Qualys able to identify these applications, their procurement team now performs weekly audits on what software is actually installed in their environment.

Synced Qualys data can also be used to check the accuracy of CMDBs. Because Qualys is an agentless tool, it brings to light the unknown hosts, appliances, and other such devices that you provision. Even better is bringing visibility to the unknown unknowns. Perhaps the previous IT administrator provisioned devices from a third-party vendor that don’t comply with your standards, or perhaps you discover a 4G-enabled laptop sitting in a box in your mailroom that has managed to connect to your internal wireless network. Agents will not find these devices, and programmatically syncing your CMDB will enable your security ops team to take action with business context.
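
A CMDB accuracy check can start as a simple set comparison between what was scanned and what the CMDB lists. The sketch below assumes both sides can be reduced to a list of IP addresses; the function name and the sample addresses are hypothetical.

```python
def reconcile(scanned_ips, cmdb_ips):
    """Compare scanned hosts against CMDB records, in both directions."""
    scanned, cmdb = set(scanned_ips), set(cmdb_ips)
    return {
        # Seen on the network but missing from the CMDB.
        "unknown_to_cmdb": sorted(scanned - cmdb),
        # Recorded in the CMDB but absent from scan data.
        "never_scanned": sorted(cmdb - scanned),
    }


report = reconcile(
    scanned_ips=["192.0.2.10", "192.0.2.11"],
    cmdb_ips=["192.0.2.11", "192.0.2.12"],
)
```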

It’s your VM data, and it always has been. The Host List Detection API is just the fastest and easiest way to get it.

Comments (2)


  1. Parag

    Great article and thanks for putting it together.

    I may have missed something but if a customer exports the VM data to their own site – what would they store this in?

    Would they need an Oracle database structured similar to QG?



    1. Hi Tony Acharia, and thank you for the kind words. I have seen one customer push the data into an Oracle database, which may be overkill considering the total data size is relatively small. This was for the sake of consistency, as the customer was an Oracle shop.

      I have personally pushed Qualys data into a SQLite db, which surprisingly scaled well. So either extreme works.

      Host List Detection API output can be downloaded in XML or CSV format. Search for existing libraries to import either of these formats into your database.

      Curious, what use cases do you see yourself solving by syncing your VM data?