Using Python and XPath to Extract Report Data

VectorCAST reports contain all of the information you need to see how your test cases have run. In some cases, however, you may want to pull specific portions of a report into external tools. For example, Jenkins can gather and report specific data metrics from job runs. While VectorCAST does not currently have an API to customize or gather individual report metrics, its HTML reports have a consistent and stable structure that external scripts can use to gather output data.

Here is an example of such a script, written in Python and designed to pull the "Overall Results" data from a VectorCAST/C++ Management Report:

---- Begin: ----

# This example uses the lxml library for its XPath implementation
# Other XPath tools can probably be used, but may not be able to automatically
# clean up and create a DOM from imperfect HTML
from lxml import etree


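# Placeholder settings (assumed values; adjust both for your own report)
REPORT = "management_report.html"  # path to the HTML report
OVERALL_RESULTS_TABLE = 1          # 1-based index of the "Overall Results" table under <body>

tree = None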

# Open and read report file (can be converted to use an argument from sys.argv)
with open(REPORT, "r") as f:
    text = f.read()
    tree = etree.HTML(text)

# Gather raw results. This relies on the structure of the reports being 
# fairly static, which has been the case for a very long time. 
raw_results = tree.xpath("/html/body/table[%d]//table//td//text()" % OVERALL_RESULTS_TABLE)

# Drop first two results (column headers) and convert non-breaking space
# (&nbsp;) characters to "normal" spaces
results = [s.replace(u'\xa0', ' ') for s in raw_results[2:]]

# Convert results to a dictionary (for example)
r_dict = dict(zip(results[0::2], results[1::2]))

# Output values in whatever format you like
for k, v in r_dict.items():
    print("%s%s%s" % (k, ' ' * (24 - len(k)), v))
---- End: ----
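
The post-processing steps in the script (dropping the header cells, converting non-breaking spaces, and pairing labels with values) can also be wrapped in small helper functions so the result is easy to hand to another tool. The following is only a minimal sketch using the standard library; the function names cells_to_dict and metrics_to_json are hypothetical, and the sample values are made up for illustration rather than taken from a real report.

```python
import json


def cells_to_dict(raw_cells, header_cells=2):
    """Pair flat table-cell text into {label: value} (hypothetical helper)."""
    # Skip the column headers and turn non-breaking spaces into plain spaces
    cells = [s.replace("\xa0", " ") for s in raw_cells[header_cells:]]
    # Even positions hold the labels, odd positions hold the matching values
    return dict(zip(cells[0::2], cells[1::2]))


def metrics_to_json(metrics):
    """Serialize the metrics so an external tool can read them."""
    return json.dumps(metrics, indent=2, sort_keys=True)


# Made-up sample input in the same shape as raw_results from the script
sample = ["Header A", "Header B", "Example\xa0label", "3 / 4"]
print(metrics_to_json(cells_to_dict(sample)))
```

In the script above, raw_results could be passed to cells_to_dict in place of the sample list.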

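The lxml package is not part of the Python standard library, so you will need to install it (for example with "pip install lxml") or use another package with equivalent functionality. Other XPath tools should give similar results with the query above, although, as the script comments note, they may not be able to clean up imperfect HTML automatically.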
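
If lxml cannot be installed, one possible fallback is Python's built-in html.parser module. It has no XPath support, so the sketch below only approximates the query used in the script: it treats the Nth outermost table as table[N] under body and collects the non-blank text found in td cells of any table nested inside it. The class name TableCellCollector and its target_table parameter are hypothetical, and the sketch assumes that td tags are properly closed.

```python
from html.parser import HTMLParser


class TableCellCollector(HTMLParser):
    """Collect <td> text from tables nested inside one outermost table."""

    def __init__(self, target_table):
        # The default convert_charrefs=True delivers &nbsp; as "\xa0"
        super().__init__()
        self.target_table = target_table  # 1-based index among outermost tables
        self.table_depth = 0              # current nesting level of <table>
        self.outer_tables_seen = 0        # outermost tables opened so far
        self.td_depth = 0                 # open <td> tags inside nested tables
        self.cells = []

    def _in_nested_target(self):
        # True while inside a table nested within the chosen outermost table
        return (self.outer_tables_seen == self.target_table
                and self.table_depth >= 2)

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self.table_depth == 0:
                self.outer_tables_seen += 1
            self.table_depth += 1
        elif tag == "td" and self._in_nested_target():
            self.td_depth += 1

    def handle_endtag(self, tag):
        if tag == "table" and self.table_depth > 0:
            self.table_depth -= 1
        elif tag == "td" and self.td_depth > 0:
            self.td_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that sits inside a collected cell
        if self.td_depth > 0 and self._in_nested_target() and data.strip():
            self.cells.append(data)
```

To use it with the script above, create TableCellCollector(OVERALL_RESULTS_TABLE), call its feed() method with the report text, and use the resulting cells list in place of raw_results. Because blank text is skipped and the table matching is approximate, the output may not be identical to what the XPath query returns.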

