Aleksei
October 26, 2023, 12:45pm
1
Hello, community.
I want to raise the topic of a modern stack for GX in production.
As I see it, for production use:
A fluent data source is better than a block config data source
Checkpoint is better than a validator
Checkpoint is better than SimpleCheckpoint
A Data Assistant is better than a rule-based profiler
Thoughts?
Hi Aleksei,
Thanks for raising such an important topic. To answer that question, I think it would be quite important to map these concepts to the related documentation and the steps along the GX workflow. I have to review the new updates to the docs and I’ll share any insights in this thread.
Aleksei
November 15, 2023, 8:00pm
3
@CesarGarcia thank you! Yes, it would be great to point out best practices in the GX docs. But before that, they need approval from the community and GX.
Aleksei
November 15, 2023, 8:01pm
4
Adding a point about expectations:
JSON as expectation storage is better than code
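To illustrate the JSON point, here is a minimal sketch using only the Python standard library. The key names (expectation_suite_name, expectations, expectation_type, kwargs) follow the suite files that GX 0.x writes to its expectations store, which is an assumption about the version in use; the suite name, columns, and helper functions are made up for the example.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical suite; key names assume the GX 0.x suite JSON layout.
suite = {
    "expectation_suite_name": "orders.warning",
    "expectations": [
        {
            "expectation_type": "expect_column_values_to_not_be_null",
            "kwargs": {"column": "order_id"},
            "meta": {},
        },
        {
            "expectation_type": "expect_column_values_to_be_between",
            "kwargs": {"column": "amount", "min_value": 0},
            "meta": {},
        },
    ],
    "meta": {},
}


def save_suite(suite: dict, directory: Path) -> Path:
    """Write the suite with sorted keys so version-control diffs stay small."""
    path = directory / f"{suite['expectation_suite_name']}.json"
    path.write_text(json.dumps(suite, indent=2, sort_keys=True) + "\n")
    return path


def load_suite(path: Path) -> dict:
    """Read a suite back as a plain dict."""
    return json.loads(path.read_text())


# Round-trip the suite through a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_suite(suite, Path(tmp))
    reloaded = load_suite(saved_path)
```

The appeal of this form is that a suite becomes a reviewable file: changes show up as ordinary diffs, and the same file can be shipped to any environment without running the code that produced it.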
Any news on this topic? We are currently exploring the possibility of deploying and running expectations on Kubernetes. The setup we have in mind would be:
Build a Docker image with dependencies
Store great_expectations.yml as a ConfigMap and use Kustomize to manage environments
Use Airflow to trigger the checkpoint run
Use a sidecar pod with access to expectation suites
We are currently struggling with how to integrate this process into CI/CD and what the development process would look like. Any ideas?
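One possible CI step, sketched under the assumption that suites are kept as JSON files in the repository: check every suite file before the image is built. This uses only the standard library and does not call GX itself; lint_suites, the required keys, and the sample files are hypothetical and would need adjusting to the suite schema of the GX version in use.

```python
import json
import tempfile
from pathlib import Path

# Keys assumed to be present in every suite file (GX 0.x layout).
REQUIRED_KEYS = {"expectation_suite_name", "expectations"}


def lint_suites(suite_dir: Path) -> list:
    """Return human-readable problems; an empty list means all suites passed."""
    problems = []
    for path in sorted(suite_dir.rglob("*.json")):
        try:
            suite = json.loads(path.read_text())
        except json.JSONDecodeError as exc:
            problems.append(f"{path.name}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(suite, dict):
            problems.append(f"{path.name}: top level is not an object")
            continue
        missing = REQUIRED_KEYS - suite.keys()
        if missing:
            problems.append(f"{path.name}: missing keys {sorted(missing)}")
            continue
        for index, expectation in enumerate(suite["expectations"]):
            if not isinstance(expectation, dict) or "expectation_type" not in expectation:
                problems.append(f"{path.name}: expectation {index} has no expectation_type")
    return problems


# Demo with one valid and one broken suite file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    good = {
        "expectation_suite_name": "orders.warning",
        "expectations": [
            {"expectation_type": "expect_column_to_exist", "kwargs": {"column": "id"}}
        ],
    }
    (root / "orders.json").write_text(json.dumps(good))
    (root / "broken.json").write_text(json.dumps({"expectations": []}))
    problems = lint_suites(root)
```

In a pipeline, a non-empty result would fail the job, so a malformed suite never reaches the cluster.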
Is there also a best practice for always running the same actions for all checkpoints?
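One way to approach the shared-actions question, as a rough sketch rather than an official pattern: keep a single default action list and build every checkpoint config from it. The action class names are ones that ship with GX 0.x; the checkpoint names, suite names, and the checkpoint_config helper are hypothetical.

```python
import copy

# Shared actions applied to every checkpoint (class names from GX 0.x).
DEFAULT_ACTIONS = [
    {
        "name": "store_validation_result",
        "action": {"class_name": "StoreValidationResultAction"},
    },
    {
        "name": "update_data_docs",
        "action": {"class_name": "UpdateDataDocsAction"},
    },
]


def checkpoint_config(name, suite_name, extra_actions=None):
    """Build a checkpoint config dict that always starts with the shared actions."""
    # Deep copy so that editing one config never changes the shared defaults.
    actions = copy.deepcopy(DEFAULT_ACTIONS) + list(extra_actions or [])
    return {
        "name": name,
        "expectation_suite_name": suite_name,
        "action_list": actions,
    }


orders = checkpoint_config("orders_checkpoint", "orders.warning")
customers = checkpoint_config(
    "customers_checkpoint",
    "customers.warning",
    # Extra action for this checkpoint only; its required settings
    # (for example the webhook) are omitted in this sketch.
    extra_actions=[
        {"name": "notify", "action": {"class_name": "SlackNotificationAction"}}
    ],
)
```

Dicts like these could then be handed to the data context when registering checkpoints; whether the keys match the checkpoint signature should be checked against the installed GX version.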
Anything else to look out for? Anything missing?