Does any Expectation exist to compare 2 different partitions in the same batch?

Hi All,

DataSource : Spark
Hosted Environment : Databricks

Use Case : I need an Expectation that compares 2 months of data in the same batch by counting the total number of stores in each month; the percentage difference between the two counts should be within +/- 5%.

More precisely, as shown in the attached screenshot, I have a batch holding 2 months of data. I need to count the total number of stores in each month and then find the % variance by applying the formula below:

Screenshot 2020-11-19 140428

[ (latest month store count - previous month store count ) / latest month store count ]

The % variance should be within the +/- 5% threshold. If it doesn’t meet the threshold, the Expectation must be marked as failed. Any thoughts would be greatly appreciated!
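
To make the check concrete, here is a minimal plain-Python sketch of the arithmetic I have in mind. It is not a Great Expectations API; the function names, the `(month, store_id)` row layout, the "YYYY-MM" month format, and the sample numbers are all assumptions for illustration. I am also assuming that "total number of stores" means distinct stores per month. In Spark the two counts would presumably come from a group-by on the month column instead of a Python loop.

```python
from collections import defaultdict


def store_count_variance(rows):
    """Return (latest - previous) / latest for the distinct store counts.

    `rows` is an iterable of (month, store_id) pairs covering exactly two
    months. Months are assumed to be "YYYY-MM" strings so that sorting them
    puts the previous month first and the latest month second.
    """
    stores_by_month = defaultdict(set)
    for month, store_id in rows:
        stores_by_month[month].add(store_id)

    if len(stores_by_month) != 2:
        raise ValueError("expected exactly two months in the batch")

    previous_month, latest_month = sorted(stores_by_month)
    latest = len(stores_by_month[latest_month])
    previous = len(stores_by_month[previous_month])
    return (latest - previous) / latest


def within_threshold(variance, threshold=0.05):
    """True when the variance lies inside +/- threshold (5% by default)."""
    return abs(variance) <= threshold


# Hypothetical sample batch: 100 stores in September, 104 in October.
rows = [("2020-09", s) for s in range(100)] + [("2020-10", s) for s in range(104)]
variance = store_count_variance(rows)
passed = within_threshold(variance)
```

With this sample data the variance is 4 / 104, roughly 3.8%, so the check passes; a jump from 90 to 100 stores would give 10% and should be marked as failed.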

Thanks,
Karthik
