Hi everyone. I am having trouble validating a column. I am looking for an expectation that only allows values from a specific value_set in a column.
For example, allowed_values_on_column_x = ["A", "B", "C", "D"], and in my dataframe the column columnX contains the values "A", "B", "C", and "NOT_ALLOWED_VALUE". I would like an expectation that returns only this:
"result": {
"observed_value": [
"NOT_ALLOWED_VALUE"
]
even if the value "D" is missing from the column.
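To make the desired behaviour concrete, here is a small plain-Python sketch of the check I am after. This is not Great Expectations code; the function name `unexpected_distinct_values` and the variable `column_x` are made up for illustration only.

```python
def unexpected_distinct_values(column_values, allowed_values):
    """Return the distinct column values that are not in the allowed set."""
    allowed = set(allowed_values)
    # Only values outside the allowed set are reported. Allowed values that
    # are present ("A", "B", "C") or absent ("D") are never flagged.
    return sorted({value for value in column_values if value not in allowed})


allowed_values_on_column_x = ["A", "B", "C", "D"]
column_x = ["A", "B", "C", "NOT_ALLOWED_VALUE"]  # hypothetical contents of columnX

print(unexpected_distinct_values(column_x, allowed_values_on_column_x))
# prints ['NOT_ALLOWED_VALUE']
```

In other words, the result should be the set difference between the observed distinct values and the allowed values, and nothing else.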
I tried ExpectColumnDistinctValuesToContainSet, ExpectColumnDistinctValuesToEqualSet, and ExpectColumnDistinctValuesToBeInSet.
The expectation does catch "NOT_ALLOWED_VALUE", but it also reports some allowed values, for example "A" or "B", because they are present in the dataframe, even though they are allowed:
"result": {
"observed_value": [
"A",
"B",
"NOT_ALLOWED_VALUE"
],
Any ideas what could be going on? Do I need to write a custom expectation?
Python version: 3.12
great-expectations version: 1.4.6