In GX 1.5.7, my ExpectTableRowCountToBeBetween expectations seem to be ignoring their row_conditions. Inspecting the results, the observed_value in each case is the row count of the entire table, not just the rows satisfying the condition.
Is this expected? expect_table_row_count_to_be_between is not included in the list of limitations for using row conditions.
I have tried setting the condition_parser to both "spark" (which worked perfectly in GX 0.18) and "great_expectations" (as recommended by the newer documentation), but in both cases the condition is not applied.
Here are some example expectations:
{
  "type": "expect_table_row_count_to_be_between",
  "kwargs": {
    "min_value": 99,
    "max_value": 101,
    "row_condition": "col(\"Cohort\") == \"MyCohort\"",
    "condition_parser": "great_expectations"
  },
  "meta": {
    "notes": {
      "format": "markdown",
      "content": [
        "Expect row count for **Cohort == MyCohort** to be within **1%** of the row count seen in the most recent successful validation run"
      ]
    }
  }
},
{
  "type": "expect_table_row_count_to_be_between",
  "kwargs": {
    "min_value": 198,
    "max_value": 202,
    "row_condition": "col(\"Cohort\") == \"MyCohort2\"",
    "condition_parser": "great_expectations"
  },
  "meta": {
    "notes": {
      "format": "markdown",
      "content": [
        "Expect row count for **Cohort == MyCohort2** to be within **1%** of the row count seen in the most recent successful validation run"
      ]
    }
  }
},
If my data file has 100 rows for MyCohort and 200 for MyCohort2, both expectations fail with an observed value of 300.
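To make the arithmetic concrete, here is a minimal sketch in plain Python of what I would expect versus what I am seeing. This is not GX code: the rows list is a made-up stand-in for my data file, and row_count and in_range are hypothetical helpers that only mimic a conditional row count and the min/max check.

```python
# Made-up stand-in for the data file: 100 rows for MyCohort, 200 for MyCohort2.
rows = [{"Cohort": "MyCohort"}] * 100 + [{"Cohort": "MyCohort2"}] * 200


def row_count(rows, cohort=None):
    """Count rows, optionally keeping only those whose Cohort matches (hypothetical helper)."""
    if cohort is None:
        return len(rows)
    return sum(1 for row in rows if row["Cohort"] == cohort)


def in_range(count, min_value, max_value):
    """Mimic the min_value/max_value check of the expectation (hypothetical helper)."""
    return min_value <= count <= max_value


# What I would expect observed_value to be when the row condition is applied.
expected_cohort1 = row_count(rows, "MyCohort")
expected_cohort2 = row_count(rows, "MyCohort2")

# What is actually reported in both cases: the unconditional count.
observed = row_count(rows)

print(expected_cohort1, in_range(expected_cohort1, 99, 101))
print(expected_cohort2, in_range(expected_cohort2, 198, 202))
print(observed, in_range(observed, 99, 101), in_range(observed, 198, 202))
```

With the condition applied, both counts fall inside their ranges; with the unconditional count, neither check passes, which matches the failures I am getting.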