Hello,
I am using a checkpoint with the "result_format": "COMPLETE" option so that I can use the complete JSON result in custom Python code that flags issues row by row using a unique identifier (in this case point_id).
The issue I am having is that I also want a validation result HTML summary page (/dbfs/mnt/gx/uncommitted/data_docs/local_site/validations/exp_suite/20240801-160244-gx-run-exp_suite/20240801T160244.027774Z/memory_datasource-exp_suite.html) to be built, and I cannot find an option that stops the store_evaluation_params action from writing all of the COMPLETE failed-check results out to my GX data context. To make matters worse, I am on Databricks using a service principal to connect to Azure, where my GX context is located, which makes checkpoint runtime VERY slow (essentially unusable) when this happens.
Since I could not find a way to stop this, I attempted to write a custom action for store_evaluation_params. Unfortunately, due to issues with the service principal, GX does not appear to be able to locate my plugins or custom_actions on the Azure blob location, whereas it can when they are located locally on dbfs. I can read the rest of my GX context just fine, but the Python files with custom actions in the plugins directory are never imported; instead I get an error indicating that the module cannot be located. I have set the path in gx.yml and defined the modules, and I have also appended the path to sys.path in Python, without any luck.
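For anyone trying to reproduce the import problem, here is a minimal sketch of the kind of check involved. It assumes the plugins directory is reachable as an ordinary filesystem path (for example through a mount); `can_import` and the names passed to it are hypothetical helpers for illustration and are not part of GX.

```python
import importlib.util
import sys
from pathlib import Path


def can_import(module_name: str, plugins_dir: str) -> bool:
    """Return True if a top-level module resolves once plugins_dir is on sys.path.

    Hypothetical diagnostic helper: it only checks Python's own import
    machinery and says nothing about how GX resolves module_name.
    """
    plugins_path = str(Path(plugins_dir))
    if plugins_path not in sys.path:
        sys.path.insert(0, plugins_path)
    # Drop cached directory listings so newly added files are visible.
    importlib.invalidate_caches()
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(module_name) is not None
```

If this returns False for the plugins path, the problem is at the Python import level (the files are not visible as a normal directory) rather than in the GX configuration.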
I am wondering if:
- MOST PREFERRED OUTCOME: Is there a way to generate a validation result summary HTML page with "result_format": "COMPLETE" set, but without store_evaluation_params writing all of the results out to my GX context?
OR
- Is there a better way to implement a custom action that makes this modification to the validation result without depending on GX locating the plugins directory?
My checkpoint:
checkpoint = Checkpoint(
    name=checkpoint_name,
    expectation_suite_name=expectation_suite_name,
    data_context=context,
    run_name_template=f"%Y%m%d-%H%M%S-gx-run-{expectation_suite_name}",
    validations=[
        {
            "expectation_suite_name": expectation_suite_name,
        }
    ],
    action_list=[
        {
            "name": "store_evaluation_params",
            "action": {
                "class_name": "StoreEvaluationParametersAction"
            }
        },
        {
            "name": "update_data_docs",
            "action": {
                "class_name": "UpdateDataDocsAction"
            }
        }
    ],
    runtime_configuration={
        "result_format": {
            "result_format": "COMPLETE",
            "unexpected_index_column_names": ["point_id"],
            "return_unexpected_index_query": True,
        },
    },
)
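To make the row-by-row flagging concrete, here is a minimal sketch of what I mean. It works on the validation result after it has been converted to a plain dict, and it assumes each entry of unexpected_index_list is a dict containing the point_id key; `failed_checks_by_id` and the sample data are made up for illustration.

```python
from collections import defaultdict


def failed_checks_by_id(validation_result: dict, id_column: str = "point_id") -> dict:
    """Map each identifier to the names of the expectations it failed.

    Assumes validation_result is the JSON/dict form of a COMPLETE result and
    that unexpected_index_list holds dicts keyed by id_column.
    """
    flagged = defaultdict(list)
    for check in validation_result.get("results", []):
        if check.get("success"):
            continue  # only failed checks carry rows worth flagging
        name = check.get("expectation_config", {}).get("expectation_type", "unknown")
        for row in check.get("result", {}).get("unexpected_index_list") or []:
            flagged[row[id_column]].append(name)
    return dict(flagged)


# Made-up sample shaped like a COMPLETE result.
sample = {
    "results": [
        {
            "success": False,
            "expectation_config": {"expectation_type": "expect_column_values_to_not_be_null"},
            "result": {"unexpected_index_list": [{"point_id": 7}, {"point_id": 9}]},
        },
        {"success": True, "expectation_config": {}, "result": {}},
    ]
}
flags = failed_checks_by_id(sample)
```

This is why I need COMPLETE at run time: the per-row index list is the only part of the result my flagging code reads.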
My custom action:
from great_expectations.checkpoint.actions import StoreValidationResultAction
from great_expectations.core import ExpectationSuiteValidationResult
from great_expectations.data_context.types.resource_identifiers import (
    ValidationResultIdentifier,
)


class CustomStoreValidationResultAction(StoreValidationResultAction):
    def _run(
        self,
        validation_result_suite: ExpectationSuiteValidationResult,
        validation_result_suite_identifier: ValidationResultIdentifier,
        data_asset: dict,
        payload=None,
        expectation_suite_identifier=None,
        checkpoint_identifier=None,
    ):
        # Create a new ExpectationSuiteValidationResult with modified results
        modified_results = []
        for result in validation_result_suite.results:
            modified_result = result.to_json_dict()
            # Drop the per-row index list and replace the raw values with a placeholder
            if "unexpected_index_list" in modified_result.get("result", {}):
                del modified_result["result"]["unexpected_index_list"]
            if "unexpected_list" in modified_result.get("result", {}):
                modified_result["result"]["unexpected_list"] = ["<removed for storage>"]
            modified_results.append(modified_result)
        modified_result_suite = ExpectationSuiteValidationResult(
            results=modified_results,
            success=validation_result_suite.success,
            statistics=validation_result_suite.statistics,
            evaluation_parameters=validation_result_suite.evaluation_parameters,
            meta=validation_result_suite.meta,
        )
        # Call the parent class method with the modified results
        super()._run(
            validation_result_suite=modified_result_suite,
            validation_result_suite_identifier=validation_result_suite_identifier,
            data_asset=data_asset,
            payload=payload,
            expectation_suite_identifier=expectation_suite_identifier,
            checkpoint_identifier=checkpoint_identifier,
        )
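The stripping step itself does not depend on GX, so it can be tried outside of a custom action. Below is a minimal sketch of the same idea on a plain dict; `strip_row_level_details` is a hypothetical name, and it assumes the validation result has already been converted to its JSON/dict form.

```python
import copy

PLACEHOLDER = "<removed for storage>"


def strip_row_level_details(result_json: dict) -> dict:
    """Return a slimmed copy of a validation result dict.

    The input is left untouched so the full COMPLETE result can still be
    used for row-by-row flagging afterwards.
    """
    slim = copy.deepcopy(result_json)
    for check in slim.get("results", []):
        details = check.get("result") or {}
        # Remove the per-row index list entirely.
        details.pop("unexpected_index_list", None)
        # Keep the key but replace the raw values with a marker.
        if "unexpected_list" in details:
            details["unexpected_list"] = [PLACEHOLDER]
    return slim


# Made-up input for illustration.
full = {
    "results": [
        {
            "success": False,
            "result": {
                "unexpected_count": 2,
                "unexpected_list": [None, None],
                "unexpected_index_list": [{"point_id": 7}, {"point_id": 9}],
            },
        }
    ]
}
slim = strip_row_level_details(full)
```

Working on a deep copy is deliberate here: the summary page only needs the slim version, while my flagging code still needs the original.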
I also noticed this GX docs page for Configure Actions is down - https://docs.greatexpectations.io/docs/oss/guides/validation/validation_actions/actions_lp/
Any ideas are appreciated!