We have noticed some confusion around writing to S3 from a Databricks environment (e.g. writing to metadata stores / data docs). That is understandable, as setting up access to S3 from within Databricks is a bit involved. Please see the Databricks documentation on mounting S3 buckets with DBFS. Once the bucket is mounted, you can access files in your S3 bucket as if they were local files.
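For reference, mounting typically looks something like the snippet below. This is a minimal sketch, assuming your AWS access keys are stored in a Databricks secret scope; the scope and key names, bucket name, and mount point are placeholders to replace with your own values.

```python
# Run inside a Databricks notebook, where `dbutils` is available.
# The secret scope/key names, bucket name, and mount point are placeholders.
ACCESS_KEY = dbutils.secrets.get(scope="aws", key="access_key")
SECRET_KEY = dbutils.secrets.get(scope="aws", key="secret_key")
ENCODED_SECRET_KEY = SECRET_KEY.replace("/", "%2F")

AWS_BUCKET_NAME = "my-great-expectations-bucket"
MOUNT_NAME = "great_expectations"

dbutils.fs.mount(
    source=f"s3a://{ACCESS_KEY}:{ENCODED_SECRET_KEY}@{AWS_BUCKET_NAME}",
    mount_point=f"/mnt/{MOUNT_NAME}",
)

# Once mounted, the bucket shows up under /dbfs, so Great Expectations
# (and any other Python code) can treat it like a local directory.
display(dbutils.fs.ls(f"/mnt/{MOUNT_NAME}"))
```

After mounting, a path such as `/dbfs/mnt/great_expectations/uncommitted/data_docs` (following the placeholder mount name above) can be used anywhere a local filesystem path is expected in your `great_expectations.yml`.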
You can refer to these documents for tips on setting up Great Expectations from within a Databricks environment (a short configuration sketch follows the list):
- Deploying Great Expectations in a hosted environment without file system or CLI
- How to instantiate a Data Context on Databricks Spark cluster
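For the in-code Data Context approach described in the second guide, some users point their metadata stores and data docs directly at S3 rather than at a DBFS mount. The snippet below is a minimal sketch, assuming a Great Expectations version that exposes `DataContextConfig` and `S3StoreBackendDefaults`; the bucket name is a placeholder.

```python
from great_expectations.data_context import BaseDataContext
from great_expectations.data_context.types.base import (
    DataContextConfig,
    S3StoreBackendDefaults,
)

# Point all metadata stores and data docs at a single S3 bucket
# (replace the bucket name with your own).
project_config = DataContextConfig(
    store_backend_defaults=S3StoreBackendDefaults(
        default_bucket_name="my-great-expectations-bucket"
    ),
)

context = BaseDataContext(project_config=project_config)

# From here you can add datasources, run validations, and build data docs,
# with results written to S3 rather than the local file system.
```

Because this configuration lives entirely in code, it works on clusters without a writable project directory or CLI access, which is the situation the first guide addresses.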
We have not yet tested this ourselves, but some of our users have had success with Databricks → S3, so please comment here with any concerns or success stories!