Is it possible to see the number of records in a PySpark DataFrame that didn't pass validation?

hey there @woodbine welcome to our community

Can you try adding unexpected_index_column_names to your result format? If your DataFrame has a unique identifier column (like an ID or record number), specify that column in unexpected_index_column_names. The validation output will then include the identifiers of the failing records, next to the unexpected_count field that gives the number of records that failed.

result_format = {
    "result_format": "SUMMARY",
    "unexpected_index_column_names": ["id_column"],
}

Replace "id_column" with the name of your unique identifier column.
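Once the validation has run, something like the sketch below could pull out the number of failing records and their identifiers. It does not call Great Expectations itself: `sample_result` is a hand-written dictionary that only mimics the `result` section of one failed expectation, and `summarize_failures`, the `age` column, and all the values are made up for illustration. The exact shape of the index entries may differ between versions, and with SUMMARY the index list is, as far as I know, only a partial sample, so treat this as a starting point rather than the official way to do it.

```python
# Hand-written sample that mimics the "result" dictionary of one failed
# expectation. The key names follow Great Expectations' result fields, but
# the values and the "age" column are assumptions made up for illustration.
sample_result = {
    "element_count": 5,
    "unexpected_count": 2,
    "unexpected_percent": 40.0,
    "partial_unexpected_list": [-1, 250],
    "partial_unexpected_index_list": [
        {"id_column": 3, "age": -1},
        {"id_column": 5, "age": 250},
    ],
}


def summarize_failures(result, id_column="id_column"):
    """Return (number of failing records, ids of the failing records listed).

    Hypothetical helper: it only reads keys from a plain dictionary.
    """
    # Total number of records that did not pass this expectation.
    failed_count = result.get("unexpected_count", 0)
    # Identifier values for the failing records that were reported.
    index_rows = result.get("partial_unexpected_index_list") or []
    failed_ids = [row[id_column] for row in index_rows if id_column in row]
    return failed_count, failed_ids


failed_count, failed_ids = summarize_failures(sample_result)
print(f"{failed_count} record(s) failed; ids: {failed_ids}")
```

In a real run you would pass the `result` dictionary of each expectation from your own validation output instead of `sample_result`.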
