Hello, first time poster here so apologies in advance, but here goes…
I’m using the ExpectMulticolumnSumToEqual expectation (with v1.5.6) to check that the sum of some pandas DataFrame columns is equal to 100, for example:
gxe.ExpectMulticolumnSumToEqual(
    column_list=["sand_content [%]", "silt_content [%]", "clay_content [%]"],
    sum_total=100,
    ignore_row_if="any_value_is_missing",
)
Where:
sand_content [%] = 83.45
silt_content [%] = 16.26
clay_content [%] = 0.39
sum = 100.1 %
However, due to compounded rounding errors, the column sum is not exactly 100, but is usually very close (i.e. within +/-0.1). This yields a failure on validation as 100.1 obviously does not equal 100. Is there any way to include an “error tolerance” to allow for some deviation from the exact value?
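To make it concrete, here is a rough sketch in plain Python of the kind of check I have in mind. This is not GX code: the helper name row_sums_within_tolerance, the tolerance parameter, and the 0.15 value are all made up for illustration. I set the tolerance slightly above 0.1 on purpose, since a sum that is nominally 100.1 may not be stored as exactly 100.1 and could land just outside a tolerance of exactly 0.1.

```python
import math

COLUMNS = ["sand_content [%]", "silt_content [%]", "clay_content [%]"]


def row_sums_within_tolerance(rows, columns, target, tolerance):
    """Hypothetical helper: one result per row.

    True/False says whether the row sum is within `tolerance` of `target`;
    None means the row was skipped because a value was missing.
    """
    results = []
    for row in rows:
        values = [row.get(col) for col in columns]
        # Skip rows with any missing value, similar in spirit to
        # ignore_row_if="any_value_is_missing".
        if any(v is None or (isinstance(v, float) and math.isnan(v)) for v in values):
            results.append(None)
            continue
        total = math.fsum(values)  # fsum reduces accumulated rounding error
        results.append(abs(total - target) <= tolerance)
    return results


rows = [
    {"sand_content [%]": 83.45, "silt_content [%]": 16.26, "clay_content [%]": 0.39},
    {"sand_content [%]": 84.7, "silt_content [%]": 7.5, "clay_content [%]": 7.8},
    {"sand_content [%]": 50.0, "silt_content [%]": 20.0, "clay_content [%]": 10.0},
    {"sand_content [%]": 60.0, "silt_content [%]": None, "clay_content [%]": 40.0},
]

# tolerance=0.15 is an assumed value, deliberately a bit larger than 0.1
results = row_sums_within_tolerance(rows, COLUMNS, target=100, tolerance=0.15)
print(results)
```

With a check like this, the first two rows would pass, the third (sum of 80) would fail, and the fourth would be skipped because of the missing value.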
Similarly, due to binary floating-point arithmetic issues (see "15. Floating-Point Arithmetic: Issues and Limitations" in the Python 3.13.6 documentation), even when my columns appear to sum to exactly 100 (e.g. sand_content = 84.7, silt_content = 7.5, clay_content = 7.8), Python calculates the sum as 99.99999999999999, and hence the expectation fails.
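For anyone not familiar with the floating-point issue, a minimal illustration in plain Python (again, not GX code) is below. The 0.1 + 0.2 case is the standard textbook example; the 1e-9 absolute tolerance is just an assumed value, and the exact digits printed for my own columns may depend on how the values are summed.

```python
import math

# Classic example: the binary sum of 0.1 and 0.2 is not exactly 0.3.
print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True

# The values from my data: an exact == 100 comparison is fragile,
# while a comparison with a small absolute tolerance is not.
total = 84.7 + 7.5 + 7.8
print(repr(total))

# abs_tol=1e-9 is an assumed tolerance, chosen only for illustration
sums_to_100 = math.isclose(total, 100.0, rel_tol=0.0, abs_tol=1e-9)
print(sums_to_100)
```

This is the behaviour I would like from the expectation: a comparison against 100 that tolerates tiny representation errors rather than requiring exact equality.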
There also used to be an “expect_multicolumn_sum_values_to_be_between” expectation, which would be better suited here. Is it possible to use these old/legacy expectations with the newer versions of GX?