I’m trying to set up GCS hosting for Data Docs. I know there’s a tutorial for AWS. I saw GCS support mentioned a few times in the docs, but couldn’t find a tutorial. Could anyone point me towards a good place to start?
A few things:
- You’ll need credentials configured correctly.
- You’ll need to configure a data docs site as follows in your great_expectations.yml:

    data_docs_sites:
      gcs_site:
        class_name: SiteBuilder
        store_backend:
          class_name: TupleGCSStoreBackend
          bucket: YOUR_GCP_BUCKET
          prefix: OPTIONAL_PREFIX
          project: YOUR_GCP_PROJECT
        site_index_builder:
          class_name: DefaultSiteIndexBuilder

- You’ll then probably run into this bug that I’m fixing now: https://github.com/great-expectations/great_expectations/issues/1393
- Then you may notice that some of the links between data docs pages don’t work. I will file this and begin work on these bugs as well.
- Once these bugs are worked out I plan on making a “how to” guide in our official docs.
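In case the nesting of that config is hard to read, here is the same structure written out as a plain Python dict. This is only a sketch: `gcs_data_docs_site` is a hypothetical helper, not part of Great Expectations, and the bucket and project values are placeholders.

```python
import json


def gcs_data_docs_site(bucket, project, prefix=""):
    """Return a data_docs_sites mapping that mirrors the YAML config above.

    Hypothetical helper for illustration; it only builds a dict and does
    not talk to Great Expectations or to GCS.
    """
    return {
        "data_docs_sites": {
            "gcs_site": {
                "class_name": "SiteBuilder",
                # The store backend holds the GCS-specific settings.
                "store_backend": {
                    "class_name": "TupleGCSStoreBackend",
                    "bucket": bucket,
                    "prefix": prefix,
                    "project": project,
                },
                "site_index_builder": {
                    "class_name": "DefaultSiteIndexBuilder",
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the structure so it can be compared with great_expectations.yml.
    config = gcs_data_docs_site("YOUR_GCP_BUCKET", "YOUR_GCP_PROJECT")
    print(json.dumps(config, indent=2))
```

The point is just that `store_backend` and `site_index_builder` sit under `gcs_site`, and the bucket, prefix, and project sit under `store_backend`.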
How did you configure credentials?
I presume via gcloud auth on the command line. I’m not sure how permissions are typically saved in GE, but GCS commonly relies on service accounts, which you can authenticate as using JSON key files. Service accounts in GCP are accounts created specifically for programmatic use on a specific task.
So, the nice thing about GE in this case is that it doesn’t even know about your credentials: it relies on the google-cloud library, which handles service accounts for you. You might need to create a service account, download the key, and set the environment variable like this:
export GOOGLE_APPLICATION_CREDENTIALS=path/to/service-account-key-sdfjwefsdf.json
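If the site build fails with an authentication error, a quick sanity check on the key file can save some time. The sketch below uses only the standard library and assumes the usual service account key layout (a JSON object with `type`, `project_id`, `private_key`, and `client_email`). `check_service_account_key` is a hypothetical helper name, and it does not contact Google at all.

```python
import json
import os


def check_service_account_key(env_var="GOOGLE_APPLICATION_CREDENTIALS"):
    """Return a list of problems with the key file; an empty list means it looks fine.

    Hypothetical helper: it only inspects the local file, it does not
    verify that the key is valid or has access to any bucket.
    """
    path = os.environ.get(env_var)
    if not path:
        return [f"{env_var} is not set"]
    if not os.path.isfile(path):
        return [f"{env_var} points to a missing file: {path}"]
    try:
        with open(path, encoding="utf-8") as fh:
            key = json.load(fh)
    except (OSError, ValueError) as exc:
        return [f"could not read {path} as JSON: {exc}"]
    if not isinstance(key, dict):
        return [f"{path} does not contain a JSON object"]

    problems = []
    if key.get("type") != "service_account":
        problems.append('"type" is not "service_account"')
    for field in ("project_id", "private_key", "client_email"):
        if not key.get(field):
            problems.append(f'missing field "{field}"')
    return problems


if __name__ == "__main__":
    issues = check_service_account_key()
    print("key file looks fine" if not issues else "\n".join(issues))
```

Run it in the same shell where you exported the variable, since the check reads the environment of the current process.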
Side note: the first bug is fixed and merged, and a colleague has fixes for the other bugs that I’m hoping to ship tomorrow!
Alright, I have some good news! In the upcoming 0.10.9 release, which is shipping this morning, GCS data docs is verified working, with a small caveat!
The caveat is that if you have a prefix configured, your site will not have the correct URLs, so until this bug is fixed you will need to operate with prefix: "" (see https://github.com/great-expectations/great_expectations/issues/1398).
0.10.11 also has related fixes.
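To summarize the version notes in this thread as a check you could run before building the site, here is a small sketch. The function names are hypothetical. Since the thread does not say which release fixes the prefix bug, the sketch warns about any non-empty prefix regardless of version.

```python
import re


def parse_version(version):
    """Turn a string like '0.10.9' into (0, 10, 9), ignoring trailing suffixes."""
    return tuple(int(n) for n in re.findall(r"\d+", version)[:3])


def gcs_data_docs_warnings(ge_version, prefix):
    """Return warnings based on the caveats mentioned in this thread.

    Hypothetical helper; the rules come from the posts above, not from
    the Great Expectations release notes.
    """
    warnings = []
    if parse_version(ge_version) < (0, 10, 9):
        warnings.append("GCS data docs were only verified working as of 0.10.9")
    if prefix:
        # Assumption: the fix release for issue 1398 is not stated here,
        # so a non-empty prefix is flagged on every version.
        warnings.append('a non-empty prefix may produce incorrect URLs (issue 1398); try prefix: ""')
    return warnings


if __name__ == "__main__":
    for message in gcs_data_docs_warnings("0.10.8", "OPTIONAL_PREFIX"):
        print("warning:", message)
```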