Hi fellow GE enthusiasts,
I would like to get the set of records that failed a given batch validation against an expectation suite. I'm currently working with an Athena datasource because the data is in S3, but I'm curious whether switching to a Spark datasource would help.
Ideally, I would read a dataset from S3, run validations on it, and split the data into failed and passed datasets to write back to S3.
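To make the goal concrete, here is a rough sketch of the split I have in mind. It is plain Python and not GE's API: `CHECKS` and `split_by_checks` are made-up names, the two checks are placeholders standing in for an expectation suite, and the in-memory records stand in for data that would really be read from S3.

```python
# Hypothetical row-level checks standing in for an expectation suite.
# Each check takes a single value and returns True when the value is valid.
CHECKS = {
    "id": lambda v: v is not None,
    "amount": lambda v: v is not None and 0 <= v <= 1000,
}


def split_by_checks(records, checks):
    """Return (passed, failed); a record fails if any of its checks fails."""
    passed, failed = [], []
    for record in records:
        ok = all(check(record.get(column)) for column, check in checks.items())
        (passed if ok else failed).append(record)
    return passed, failed


# Placeholder data; in practice this would be the dataset loaded from S3.
records = [
    {"id": 1, "amount": 10},
    {"id": None, "amount": 20},
    {"id": 3, "amount": 5000},
]

passed, failed = split_by_checks(records, CHECKS)
```

What I'm hoping for is the equivalent of `passed` and `failed` coming out of a validation run, so that each can be written back to S3 separately.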
Thank you, your advice is appreciated!
Kevin