Validate different dataframes with respective expectation suites using checkpoint

Try passing your dataframe to checkpoint.run() through batch_parameters:

validation_results = checkpoint.run(batch_parameters=batch_parameters)

Do you still get the error when you do this?

When a dataframe is the input data, checkpoint.run() always requires you to pass in the dataframe to validate as batch parameters. That dataframe then becomes the batch that is validated. Unfortunately, a Checkpoint can't store the dataframe itself, or any instructions for loading it.
My guess is that it also can't validate both of your datasets in a single run: it will just pass the same dataframe to both validation definitions.
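
If that guess is right, one possible workaround is to keep a separate checkpoint per dataframe and run each one with its own batch parameters. Here is a rough sketch of that idea, not something I have tested against your setup. The helper name run_checkpoints_per_dataframe and the labels are made up for illustration, and I'm assuming each checkpoint's batch definition is a whole-dataframe one, so the batch parameters take the form {"dataframe": df}:

```python
from typing import Any, Dict, Tuple


def run_checkpoints_per_dataframe(
    pairs: Dict[str, Tuple[Any, Any]],
) -> Dict[str, Any]:
    """Run each checkpoint against its own dataframe.

    `pairs` maps a label to (checkpoint, dataframe). Each checkpoint is
    assumed to expose run(batch_parameters=...), as in the call above.
    """
    results = {}
    for label, (checkpoint, dataframe) in pairs.items():
        # Assumption: the batch definition expects the "dataframe" key.
        batch_parameters = {"dataframe": dataframe}
        results[label] = checkpoint.run(batch_parameters=batch_parameters)
    return results
```

You would call it with something like run_checkpoints_per_dataframe({"orders": (orders_checkpoint, orders_df), "customers": (customers_checkpoint, customers_df)}), where each checkpoint holds only the validation definition that belongs to that dataframe. Those names are placeholders.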

If possible, consider using Databricks SQL instead. That allows you to perform the exact workflow you have here.

I wrote some sample code showing how to do that here: GX 1.0 and Databricks - #3 by ToivoMattila
The GX documentation for it is here: Connect to SQL data | Great Expectations
