Regarding setting up validation and data docs

I am looking at this page and this page to set up validation and data docs storage.

Let’s say I do this, and after a year or so of using Great Expectations on a daily basis I run the following command:

great_expectations --v3-api docs build --site-name gs_site

Will Great Expectations download everything from GCS to rebuild the data docs? I am worried that rebuilding the data docs will cause our GCS costs to grow over time, because Great Expectations will keep hitting GCS to fetch all validations from the past year.

I considered setting up Postgres for validation result storage as per this page, but it does not tell me what table structure I need to create in the schema. Will Great Expectations create the tables for me when I run this for the first time, or do I need to run some SQL to set it up? If so, where can I find the SQL to set up Postgres?

I would prefer GCS, but I don’t want to end up with an ever-increasing cost due to rebuilding the index page for the data docs.

The data docs index.html file currently keeps growing in size with each run. Won’t this become an issue over time? Every time it is regenerated, it needs to know the full history of all runs. At some point this can become a problem, both in cost and in user experience.

Hi @aseembansal -

You are correct in how you’re thinking about docs build: it builds from everything you have in your validation store. Most people set a limit on the number of validations shown, though. You can also create a job that periodically archives validations older than a given period.
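
For the periodic archive job, here is a minimal Python sketch of the selection step only. It is not part of Great Expectations: the function name `select_for_archive`, the 90-day default, and the object names in the example are all hypothetical, and it assumes you already have a listing of (object name, last-modified time) pairs from whatever storage client you use. Actually moving or deleting the objects in GCS is left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Tuple


def select_for_archive(
    objects: Iterable[Tuple[str, datetime]],
    now: datetime,
    max_age_days: int = 90,  # assumed retention window, adjust to taste
) -> List[str]:
    """Return names of objects last modified more than max_age_days before now."""
    cutoff = now - timedelta(days=max_age_days)
    return [name for name, modified in objects if modified < cutoff]


# Hypothetical listing of validation result objects (names are made up).
now = datetime(2022, 1, 1, tzinfo=timezone.utc)
listing = [
    ("validations/suite_a/run_old.json", now - timedelta(days=200)),
    ("validations/suite_a/run_recent.json", now - timedelta(days=3)),
]
to_archive = select_for_archive(listing, now)
print(to_archive)
```

Only the 200-day-old object is selected here. Once the older results are moved out of the location the validation store reads from, a docs build should no longer have to fetch them.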

It’s a valid option to set up your metadata in Postgres. I would suggest following our new guides instead of the legacy docs. Here’s the guide for setting up with Postgres: "How to configure a Validation Result store to PostgreSQL" (Great Expectations docs).