Hi,
I am using Great Expectations. My source data has only 97 records, while the reference data has 5 lakh (500,000) records. I need to check whether my source dataset is a subset of the reference dataset. I am using the expectation ‘expect_column_values_to_be_in_set’ and I am running Great Expectations on Databricks. The expectation takes around 55 minutes to return a result. Can someone please help me optimize the performance and reduce the run time?
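To make the check concrete, here is a minimal sketch of the logic I am after, written in plain Python rather than Great Expectations. The function name `values_not_in_reference` and the sample values are made up for illustration and are not my real data or code.

```python
def values_not_in_reference(source_values, reference_values):
    """Return the source values that do not appear in the reference data."""
    # Build the lookup set once so each membership test is cheap on average.
    reference_set = set(reference_values)
    return [value for value in source_values if value not in reference_set]


# Hypothetical sample data standing in for the 97 source records
# and the 500,000 reference records.
source = ["A1", "B2", "C3"]
reference = ["A1", "B2", "D4", "E5"]

missing = values_not_in_reference(source, reference)
is_subset = not missing  # True only when every source value is in the reference

print("Missing values:", missing)
print("Source is a subset of reference:", is_subset)
```

In other words, I only need to know which of the 97 source values, if any, are absent from the reference column.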