Hi @hdamczy, apologies for the delay.
I dug into this some more and was able to get the tutorial writing to my directory of choice on Databricks - I think this is what you're looking for. There are two things I had to adjust to get it working.
- Set the context project root directory. The `project_dir` here can be a little tricky: if you're using a DBFS directory (one you might otherwise access via `dbutils.fs.ls("/tmp/discourse1427")`), you want to make sure you begin the path with `/dbfs`, as shown below. If the directory you're using is not on DBFS but is available at some other path on the Databricks machine you're running the code on, just use the full path name.

  ```python
  project_dir = "/dbfs/tmp/discourse1427/"
  context = gx.get_context(project_root_dir=project_dir)
  ```
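If it helps, that DBFS-URI-to-local-path mapping can be captured in a small helper. Note `to_local_dbfs_path` is just an illustrative name I made up for this sketch - it isn't part of the GX or Databricks APIs:

```python
def to_local_dbfs_path(path: str) -> str:
    """Map a DBFS path (as seen by dbutils.fs) to the /dbfs local mount.

    Examples of the intended mapping:
      "dbfs:/tmp/discourse1427" -> "/dbfs/tmp/discourse1427"
      "/tmp/discourse1427"      -> "/dbfs/tmp/discourse1427"
      "/dbfs/tmp/discourse1427" -> unchanged
    """
    if path.startswith("/dbfs/"):
        # Already a local-mount path; leave it alone.
        return path
    if path.startswith("dbfs:/"):
        # Strip the URI scheme and re-root under /dbfs.
        return "/dbfs/" + path[len("dbfs:/"):].lstrip("/")
    # Plain DBFS-root path like "/tmp/..." - prefix the mount point.
    return "/dbfs/" + path.lstrip("/")


project_dir = to_local_dbfs_path("/tmp/discourse1427")  # -> "/dbfs/tmp/discourse1427"
```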
- The Data Docs config automatically creates the default `local_site` site with a temp directory path - this is what you're seeing in the most recent output you included. Instead, create a new site config with your desired file path in the `base_directory`, and remove `local_site`, like this:

  ```python
  context.add_data_docs_site(
      site_config={
          "class_name": "SiteBuilder",
          "store_backend": {
              "class_name": "TupleFilesystemStoreBackend",
              "base_directory": "/dbfs/tmp/discourse1427/data_docs",
          },
          "site_index_builder": {"class_name": "DefaultSiteIndexBuilder"},
      },
      site_name="my_new_data_docs_site",
  )
  context.delete_data_docs_site(site_name="local_site")
  ```
```python
# Sanity check: confirm the context is rooted where you expect
print(f"project_dir: {project_dir}")
print(f"context: {context}")
```
After I made these changes, I was able to run through the tutorial and verify that the Data Docs output was written to the folder I specified in the site config `base_directory`.
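One quick way to confirm the output landed is to check for the generated index page under the site's `base_directory`. `find_data_docs_index` below is a hypothetical helper I'm using for illustration, not a GX API:

```python
import os
from typing import Optional


def find_data_docs_index(base_directory: str) -> Optional[str]:
    """Return the path to the Data Docs index.html under base_directory, if present."""
    candidate = os.path.join(base_directory, "index.html")
    return candidate if os.path.isfile(candidate) else None


# On Databricks this would be something like:
# find_data_docs_index("/dbfs/tmp/discourse1427/data_docs")
```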