Unable to run 'row_condition' with spark

When I run the GX code below, I get the error shown. I am using GX with PySpark on Databricks. When I comment out the row condition arguments in the expectation, the error goes away. Can you please help me resolve this?

Code:
expectation_with_condition = gx.expectations.ExpectColumnValuesToNotBeNull(
    column="colA",
    condition_parser="spark",
    row_condition='`colB`="val"',
)

Error: [CANNOT_RESOLVE_DATAFRAME_COLUMN] Cannot resolve dataframe column "colA". It's probably because of illegal references like df1.select(df2.col("a")). SQLSTATE: 42704

When I remove the row_condition as below, the error on colA does not appear:
Code:
expectation_with_condition = gx.expectations.ExpectColumnValuesToNotBeNull(
    column="colA",
)

Hello! The same happens to me, but with a different expectation:
gxe.ExpectColumnMinToBeBetween(
    column="len_not_already_purchased_recommendations",
    min_value=50,
    max_value=50,
    condition_parser="spark",
    row_condition='col("parameters.algorithm").notNull()',
),

ERROR:
"exception_message": "[CANNOT_RESOLVE_DATAFRAME_COLUMN] Cannot resolve dataframe column \"len_family_activation_recommendations\". It's probably because of illegal references like df1.select(df2.col(\"a\")). SQLSTATE: 42704",

Thank you for your help!
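
One thing worth checking in a report like the one above: the column named in the error is not the column passed to the expectation. Here is a minimal sketch, using only the standard library, that pulls the column name out of a CANNOT_RESOLVE_DATAFRAME_COLUMN message so it can be compared with the expectation's column. The helper name unresolved_column and the regular expression are my own illustration, not part of GX or Spark, and the message below is abbreviated from the one posted.

```python
import re

# Matches the column name quoted in a CANNOT_RESOLVE_DATAFRAME_COLUMN message.
# Accepts straight, backslash-escaped, or curly double quotes around the name.
_COLUMN_RE = re.compile(
    r'Cannot resolve dataframe column \\?["“](?P<name>[^"”\\]+)\\?["”]'
)


def unresolved_column(message):
    """Return the column name reported in the message, or None if absent."""
    match = _COLUMN_RE.search(message)
    return match.group("name") if match else None


# Abbreviated copy of the error text from the post above.
message = (
    "[CANNOT_RESOLVE_DATAFRAME_COLUMN] Cannot resolve dataframe column "
    '"len_family_activation_recommendations". SQLSTATE: 42704'
)

# Column actually passed to the expectation in the post above.
expected = "len_not_already_purchased_recommendations"

reported = unresolved_column(message)
if reported != expected:
    print(f"Error names {reported!r}, but the expectation uses {expected!r}")
```

If the two names differ, the failure may be coming from another expectation in the same suite rather than from the one being inspected, though that is only a guess from the posted output.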

I can't find GX 1.0 documentation on this, but in the older 0.18.x version the condition_parser for Spark should be "great_expectations__experimental__". I'm wondering whether this has changed in GX 1.0?

I’m trying to use this: Apply Expectation conditions to specific rows within a Batch | Great Expectations
Should I use great_expectations__experimental__? Or will this feature perhaps be fixed in GX soon?

Yes, please give it a try.

Same issue: it does not work with the experimental parser in GX either.

gxe.ExpectColumnPairValuesAToBeGreaterThanB(
    column_A='MAX_SYS_DATE_TOLERANCE',
    column_B='FIFTH_WD',
    or_equal=True,
    condition_parser='great_expectations__experimental__',
    row_condition='col("FREQ")=="FIFTH_WD"',
)

The code above is mine and it works, but it uses a different expectation. A few things I would try:

  • rename column
  • try different expectations with row_condition

Refer to the older doc if required: Conditional Expectations | Great Expectations

Other than that, I'm not sure what is causing the issue.
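
To make the "try different combinations" idea systematic, here is a rough sketch that enumerates the condition_parser names mentioned in this thread against two ways of writing the same filter. Everything here is plain Python: trial_matrix, PARSERS and CONDITIONS are hypothetical names of mine, the SQL-string form of the condition is my assumption about how the filter would be written as a Spark SQL expression, and which parser names a given GX version accepts still has to be checked against its docs.

```python
from itertools import product

# Parser names that appear in this thread. Whether a given GX version
# accepts each of them is an assumption to verify, not a documented fact here.
PARSERS = ["spark", "great_expectations", "great_expectations__experimental__"]

# The same row filter written in two styles (both are illustrative).
CONDITIONS = {
    "sql_string": "City = 'Ibadan'",        # assumed Spark SQL expression form
    "col_syntax": 'col("City")=="Ibadan"',  # col() form used in this thread
}


def trial_matrix(column, parsers=PARSERS, conditions=CONDITIONS):
    """Return (style, kwargs) pairs covering every parser/condition combination."""
    trials = []
    for parser, (style, condition) in product(parsers, conditions.items()):
        kwargs = {
            "column": column,
            "condition_parser": parser,
            "row_condition": condition,
        }
        trials.append((style, kwargs))
    return trials


for style, kwargs in trial_matrix("Price"):
    print(style, "|", kwargs["condition_parser"], "|", kwargs["row_condition"])
```

Each kwargs dict could then be passed to an expectation, for example gx.expectations.ExpectColumnValuesToBeBetween(min_value=1, max_value=4000, **kwargs), and validated against the same batch while noting which combinations raise the column resolution error.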

gx.expectations.ExpectColumnValuesToBeBetween(
    column="Price",
    min_value=1,
    max_value=4000,
    row_condition='col("City")=="Ibadan"',
    condition_parser="great_expectations__experimental__",
)
This is my code. I have tried both 'great_expectations' and 'great_expectations__experimental__' as the condition_parser, and neither worked.