CSVDiff Tool
The supplied CSVDiff tool (csvdiff.py) provides the TestHarness with the capability to perform differentiations on comma-separated value (CSV) files.
Basic Usage
In its simplest form, performing a differentiation on two CSV files (a and b) requires the following syntax:
If the two files are the same, the program will say so and exit with return code 0. Example:
If a difference is detected, the program will report it and exit with a non-zero return code. Example:
Extended Usage
The CSVDiff tool can be used to test specific fields with specific error tolerances. It can also be made to detect when not to perform a differentiation if the value being tested is below a certain threshold (floor, or zero). These features can be used as direct arguments to csvdiff.py, or through the use of a comparison file.
Syntax
Always specify the two files you wish to perform a differentiation on before any other options.
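When the tool is driven from a script, the ordering rule above and the return-code convention can be captured in a small helper. This is a minimal sketch and not part of CSVDiff itself: the names build_csvdiff_command and files_match are hypothetical, and the sketch assumes csvdiff.py is reachable at the given path and runs under the current Python interpreter.

```python
import subprocess
import sys


def build_csvdiff_command(file_a, file_b, *options, script="csvdiff.py"):
    """Return an argument list with the two CSV files ahead of any options."""
    return [sys.executable, script, file_a, file_b, *options]


def files_match(file_a, file_b, *options, script="csvdiff.py"):
    """Run csvdiff.py (assumed path) and report whether it exited with code 0."""
    command = build_csvdiff_command(file_a, file_b, *options, script=script)
    result = subprocess.run(command, capture_output=True, text=True)
    # Return code 0 means the files were considered the same; a non-zero
    # code means a difference (or an error) was reported.
    return result.returncode == 0
```

A call such as files_match("a", "b", "--relative-tolerance", "1e-5") would then place both files first and append the option afterwards.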
Arguments | Value | Help |
---|---|---|
--summary or -s | csv_file | Create a comparison file based on csv file |
--comparison-file or -c | comparison file | Use specified comparison file while performing differentiations |
--ignore-fields | str | A list of space separated fields to ignore when performing differentiations |
--diff-fields | str | A list of space separated fields to include when performing differentiations |
--abs-zero | str float | A scientific notation or float value representing zero (the floor). Any values lower than this amount will be considered zero. (default: 1e-11) |
--relative-tolerance | str float | A float or scientific notation value representing an acceptable degree of tolerance between two opposing values. Any float comparison which falls within this tolerance will be considered the same number. (default: 5.5e-6) |
--custom-columns | str | Space separated list of custom field IDs to compare |
--custom-abs-zero | str float | Space separated list of scientific notations or floats for absolute zero, corresponding to the values in --custom-columns |
--custom-rel-err | str float | Space separated list of scientific notations or floats for relative tolerance, corresponding to the values in --custom-columns |
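To make the interaction between the floor (--abs-zero) and the relative tolerance more concrete, here is a minimal sketch of one way such a comparison could work. It is an illustration under stated assumptions, not the code in csvdiff.py: the function names are hypothetical, the relative difference is assumed to be scaled by the larger magnitude, and the custom lists are assumed to be of equal length.

```python
def values_match(a, b, abs_zero=1e-11, relative_tolerance=5.5e-6):
    """Compare two floats using a floor and a relative tolerance (assumed formula)."""
    # Values whose magnitude is below the floor are treated as exactly zero.
    if abs(a) < abs_zero:
        a = 0.0
    if abs(b) < abs_zero:
        b = 0.0
    if a == b:
        return True
    # Assumption: the difference is scaled by the larger of the two magnitudes.
    relative_difference = abs(a - b) / max(abs(a), abs(b))
    return relative_difference <= relative_tolerance


def pair_custom_tolerances(columns, abs_zeros, rel_errs):
    """Map each custom column to its (floor, relative tolerance) pair."""
    # Assumption: the three space separated lists line up one-to-one.
    if not (len(columns) == len(abs_zeros) == len(rel_errs)):
        raise ValueError("custom lists must have the same length")
    return {
        name: (float(floor), float(rel))
        for name, floor, rel in zip(columns, abs_zeros, rel_errs)
    }
```

With the defaults, values_match(1.0, 1.000001) is accepted while values_match(1.0, 1.00001) is not; passing relative_tolerance=1e-4 would accept the latter.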
Comparison File
Using a comparison file is ideal when you need to adjust a complex set of fields and tolerances, which would otherwise make for a very long and confusing command line. The CSVDiff tool can generate this comparison file, which can then be used to set the above arguments quickly.
To generate a comparison file, run the CSVDiff tool with the appropriate --summary csv_file argument. In the following example, we use echo to create a simple CSV file. We then instruct csvdiff.py to create a comparison file from our CSV file, and redirect the output to a file named a.cmp:
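The same steps can also be scripted. The following is a rough Python sketch rather than the documented procedure: the helper names are hypothetical, the sample values are made up, and it assumes that --summary takes the CSV file as its only argument and that csvdiff.py writes the comparison file contents to standard output.

```python
import subprocess
import sys
from pathlib import Path


def write_sample_csv(path, header="x", values=(1, 2, 3)):
    """Write a one-column CSV file, much like the echo step would."""
    lines = [header, *(str(value) for value in values)]
    Path(path).write_text("\n".join(lines) + "\n")


def summary_command(csv_file, script="csvdiff.py"):
    """Build the argument list for generating a comparison file (assumed form)."""
    return [sys.executable, script, "--summary", csv_file]


def write_comparison_file(csv_file, cmp_file, script="csvdiff.py"):
    """Run the summary command and redirect its standard output to cmp_file."""
    with open(cmp_file, "w") as handle:
        subprocess.run(summary_command(csv_file, script), stdout=handle, check=True)
```

Calling write_sample_csv("a") followed by write_comparison_file("a", "a.cmp") would mirror the echo and redirect steps, provided csvdiff.py is in the working directory.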
You can then edit this file and modify key sections to control tolerances, or instruct csvdiff to ignore an entire field altogether.
The 'TIME STEPS' field is a special header, which currently is not used and is present for future capabilities yet to be added to the CSVDiff tool.
The 'GLOBAL VARIABLES' field allows you to change the tolerance for every field present in the CSV file. There are two key parameters: relative and floor. You can modify one or both of the values immediately following the parameter to suit your needs. You can also modify the tolerances for each individual field. In the case of our example, 'x' is the only field in our CSV file. To adjust only that field's tolerance values, we can add a parameter directly following the 'x' label:
The above change does nothing, as the relative error value we added is the same as the global relative error value. You can also add both relative and floor tolerances to this line, as well as comments and other logical statements:
Here we added comments and loosened both the floor and relative error tolerances for field 'x'. Field 'y' will be ignored entirely. We left the 'z' field alone, so it will use the global values set by the 'GLOBAL VARIABLES' header line.
A Real Example
Consider the following two CSV files:
File a:
File b:
We purposely altered the field header to demonstrate CSVDiff's capability of correctly mapping the field labels between two files.
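Matching fields by label rather than by column position can be illustrated with a short sketch. This is not how csvdiff.py is implemented; it is a minimal example using Python's csv module, with hypothetical data and a hypothetical read_columns helper, and it assumes the alteration is a reordering of the header columns.

```python
import csv
import io


def read_columns(text):
    """Return {field label: [float values]} for CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(text))
    columns = {name.strip(): [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name.strip()].append(float(value))
    return columns


# Hypothetical files: same data, but the header order differs.
file_a = "x,y\n1,10\n2,20\n"
file_b = "y,x\n10,1\n20,2\n"

columns_a = read_columns(file_a)
columns_b = read_columns(file_b)

# Compare only the labels both files share, looked up by name.
shared = sorted(set(columns_a) & set(columns_b))
same = all(columns_a[name] == columns_b[name] for name in shared)
```

Because each value is looked up by its field label, the swapped column order in the second file does not produce a spurious difference.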
If we run csvdiff.py on a and b, we see there is a small difference of 5.501e-06 for field 'y' at time step 1 (or simply put, row 1), which is just a bit more than our default global tolerance allows:
We can create a comparison file to set forth new tolerances which will allow the two files to be considered identical. Start off by creating a comparison file using file 'a':
The following example changes would allow both files to be considered identical:
Loosen the error tolerances for 'y':
Raise the floor tolerance:
Ignore field 'y' by including a not '!' statement:
Remove the offending field from the comparison file:
Any one of the above example comparison files would allow a and b to be considered identical:
The summary report follows the same output style as another popular tool: exodiff -summary. By design, the two summary reports are interchangeable.