Hi
Background:
I am running GE in a hosted environment, with the DataContext created during the run.
I have two different jobs: one for Validation (daily) and another for Profiling (weekly/monthly). Both use the same S3 bucket for the expectation and validation stores, but with different prefixes. I build data_docs on every job (table) run.
For example:
validation.py - parameters
stores={
    "expectations_S3_store": {
        "class_name": "ExpectationsStore",
        "store_backend": {
            "class_name": "TupleS3StoreBackend",
            "bucket": "s3_bucket",
            "prefix": "validation_expectation_prefix",
        },
    },
    "validations_S3_store": {
        "class_name": "ValidationsStore",
        "store_backend": {
            "class_name": "TupleS3StoreBackend",
            "bucket": "s3_bucket",
            "prefix": "validations",
        },
    },
    "evaluation_parameter_store": {"class_name": "EvaluationParameterStore"},
},
profiling.py - parameters
stores={
    "expectations_S3_store": {
        "class_name": "ExpectationsStore",
        "store_backend": {
            "class_name": "TupleS3StoreBackend",
            "bucket": "s3_bucket",
            "prefix": "profiling_expectation_prefix",
        },
    },
    "validations_S3_store": {
        "class_name": "ValidationsStore",
        "store_backend": {
            "class_name": "TupleS3StoreBackend",
            "bucket": "s3_bucket",
            "prefix": "profiling/results",
        },
    },
    "evaluation_parameter_store": {"class_name": "EvaluationParameterStore"},
},
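To make it clear that the two jobs differ only in their store prefixes, here is a minimal sketch that generates both configurations from one helper. The function name build_stores and the nested s3_store helper are hypothetical, introduced only for illustration; they are not part of GE, and "s3_bucket" is the same placeholder used above. The sketch builds plain dictionaries and does not touch S3 or GE itself.

```python
from typing import Any, Dict


def build_stores(bucket: str, expectations_prefix: str, validations_prefix: str) -> Dict[str, Any]:
    """Return a stores mapping shaped like the two configurations above (hypothetical helper)."""

    def s3_store(class_name: str, prefix: str) -> Dict[str, Any]:
        # One S3-backed store entry; only the class name and prefix vary.
        return {
            "class_name": class_name,
            "store_backend": {
                "class_name": "TupleS3StoreBackend",
                "bucket": bucket,
                "prefix": prefix,
            },
        }

    return {
        "expectations_S3_store": s3_store("ExpectationsStore", expectations_prefix),
        "validations_S3_store": s3_store("ValidationsStore", validations_prefix),
        "evaluation_parameter_store": {"class_name": "EvaluationParameterStore"},
    }


# validation.py and profiling.py would each pass their own prefixes.
validation_stores = build_stores("s3_bucket", "validation_expectation_prefix", "validations")
profiling_stores = build_stores("s3_bucket", "profiling_expectation_prefix", "profiling/results")

# Collect, per store, the pair of prefixes used by the two jobs.
differing = {
    name: (
        validation_stores[name]["store_backend"]["prefix"],
        profiling_stores[name]["store_backend"]["prefix"],
    )
    for name in ("expectations_S3_store", "validations_S3_store")
}
print(differing)
```

Each job would pass the resulting dictionary as its stores= parameter, so the bucket is shared and only the prefixes separate the two jobs' expectations and validation results.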
Issue:
Everything runs as expected when I run Validation or Profiling on its own. But when I run Profiling after Validation, all the Validation results (html files & index.html) get replaced with Profiling results. Similarly, if I run Validation after Profiling, all Profiling results get replaced with Validation results. This happens not just with the same datasets, but also with completely different datasets for Validation and Profiling.
Please suggest what I can do to keep both Validation and Profiling results and combine them into the same index.html. When I run everything from my local machine, both sets of results are kept, but somehow in the hosted environment (serverless) one overwrites the other.
Thank you for your help.