Data Docs in Azure ADLS $web subdirectory not working

Hi Rachel,

Thank you for your additional reply. I first installed the latest version, 0.18.10, on my Databricks cluster.
As you suggested, I’ve changed the SimpleCheckpoint to a regular Checkpoint.

context.add_or_update_checkpoint(
    name=onboarding_checkpoint_name,
    batch_request=batch_request,
    expectation_suite_name=onboarding_suite_name,
)

Furthermore, I experimented a bit with the parameters container, prefix, and filepath_prefix.

When I use these settings:

"container": "\\$web", 
"prefix": "ct10/", 
"filepath_prefix": "ct10/",

and run the checkpoint with
datasources_block_checkpoint_result = context.get_checkpoint("onboarding_checkpoint_name").run()

I get the following error message:
great_expectations.exceptions.exceptions.StoreBackendError: Unable to initialize TupleStoreBackend: filepath_prefix may not end with a forbidden substring. Current forbidden substrings are ['/', '\\']
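
To make clear how I read this message, here is a minimal sketch of the check it seems to describe. It is reconstructed only from the error text above and is not the actual GX source; the function name validate_filepath_prefix and the use of ValueError are my own assumptions.

```python
# Sketch of the check implied by the error message; NOT the actual GX source.
# The forbidden substrings are copied from the error text.
FORBIDDEN_SUBSTRINGS = ["/", "\\"]


def validate_filepath_prefix(filepath_prefix: str) -> str:
    """Return the prefix unchanged, or raise if it ends with a forbidden substring."""
    for substring in FORBIDDEN_SUBSTRINGS:
        if filepath_prefix.endswith(substring):
            # Hypothetical exception type; GX raises StoreBackendError instead.
            raise ValueError(
                "filepath_prefix may not end with a forbidden substring. "
                f"Current forbidden substrings are {FORBIDDEN_SUBSTRINGS}"
            )
    return filepath_prefix


validate_filepath_prefix("ct10")  # accepted: no trailing slash
try:
    validate_filepath_prefix("ct10/")  # rejected: ends with "/"
except ValueError as err:
    print(err)
```

If this reading is right, "ct10" passes and "ct10/" is rejected, which matches what I see.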

For all other variants without a trailing slash in filepath_prefix, I get a traceback like the one in error-message-1.

I forked the code in Git from version 0.18.x and tried to narrow down the error in VSCode.
Unfortunately, this was beyond my current programming skills. :blush:

But I’ve found that the GX code is quite capable of writing “something” to the ADLS container with these parameters. However, what it writes depends on the settings, as follows.

When I use these settings:

"container":  "\$web",
#"prefix": "ct10/",
#"filepath_prefix": "ct10",

files are written as shown in Pic1, marked in yellow.
GX puts the files in subdirectories such as expectations and validations and creates an index.html that you can open in the browser to see the results. The links in these .html files work properly.

But when I use these settings:

"container":  "\$web",
#"prefix": "ct10/",
"filepath_prefix": "ct10",

files are written as shown in Pic1, marked in turquoise.
GX puts everything in one subdirectory (ct10/), and the index.html is missing.
IMHO that is also why error-message-1 occurs: GX can’t find the index.html to modify it.
But I couldn’t figure out why.
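
To compare the two layouts without screenshots, a small helper like the following could summarize what ended up in the container. This is only a sketch: summarize_layout is a hypothetical helper of mine, and the blob names in the example are made up to mirror the two cases above. In practice the names could come from something like ContainerClient.list_blobs() in azure-storage-blob.

```python
from collections import defaultdict
from typing import Dict, Iterable


def summarize_layout(blob_names: Iterable[str], site_prefix: str = "") -> Dict[str, object]:
    """Group blob names by the first directory below site_prefix and flag index.html."""
    prefix = site_prefix.strip("/")
    base = f"{prefix}/" if prefix else ""
    groups = defaultdict(list)
    has_index = False
    for name in blob_names:
        if not name.startswith(base):
            continue  # blob lives outside the site prefix
        relative = name[len(base):]
        if relative == "index.html":
            has_index = True
        # "." stands for files sitting directly under the site prefix
        top = relative.split("/", 1)[0] if "/" in relative else "."
        groups[top].append(relative)
    return {"has_index": has_index, "groups": dict(groups)}


# Hypothetical blob names mirroring the yellow case (no filepath_prefix)
yellow = ["index.html", "expectations/my_suite.html", "validations/my_suite/run_1.html"]
# Hypothetical blob names mirroring the turquoise case (filepath_prefix "ct10")
turquoise = ["ct10/my_suite.html", "ct10/run_1.html"]

print(summarize_layout(yellow))
print(summarize_layout(turquoise, site_prefix="ct10"))
```

With these made-up names, the first call reports an index.html plus separate expectations and validations groups, while the second reports no index.html and a single flat group.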

Pic1

error-message-1:

Error running action with name update_data_docs
Traceback (most recent call last):
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/validation_operators/validation_operators.py", line 478, in _run_actions
    action_result = self.actions[name].run(
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/checkpoint/actions.py", line 100, in run
    return self._run(
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/checkpoint/actions.py", line 1182, in _run
    self.data_context.build_data_docs(
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/core/usage_statistics/usage_statistics.py", line 266, in usage_statistics_wrapped_method
    result = func(*args, **kwargs)
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/data_context/data_context/abstract_data_context.py", line 5306, in build_data_docs
    return self._build_data_docs(
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/data_context/data_context/abstract_data_context.py", line 5353, in _build_data_docs
    index_page_resource_identifier_tuple = site_builder.build(
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/render/renderer/site_builder.py", line 323, in build
    _, index_links_dict = self.site_index_builder.build(build_index=build_index)
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/render/renderer/site_builder.py", line 756, in build
    self._add_expectations_to_index_links(index_links_dict, skip_and_clean_missing)
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/render/renderer/site_builder.py", line 813, in _add_expectations_to_index_links
    ].remove_key(expectation_suite_site_key)
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/great_expectations/data_context/store/tuple_store_backend.py", line 1201, in remove_key
    blob.delete_blob()
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/azure/core/tracing/decorator.py", line 78, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/azure/storage/blob/_blob_client.py", line 1211, in delete_blob
    process_storage_error(error)
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/azure/storage/blob/_shared/response_handlers.py", line 184, in process_storage_error
    exec("raise error from None")   # pylint: disable=exec-used # nosec
  File "<string>", line 1, in <module>
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/azure/storage/blob/_blob_client.py", line 1209, in delete_blob
    self._client.blob.delete(**options)
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/azure/core/tracing/decorator.py", line 78, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/azure/storage/blob/_generated/operations/_blob_operations.py", line 2116, in delete
    map_error(status_code=response.status_code, response=response, error_map=error_map)
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/azure/core/exceptions.py", line 164, in map_error
    raise error
azure.core.exceptions.ResourceNotFoundError: The specified blob does not exist.
RequestId:78ae661c-901e-0066-6e15-6c609b000000
Time:2024-03-01T20:17:12.5179477Z
ErrorCode:BlobNotFound
Content: <?xml version="1.0" encoding="utf-8"?><Error><Code>BlobNotFound</Code><Message>The specified blob does not exist.
RequestId:78ae661c-901e-0066-6e15-6c609b000000
Time:2024-03-01T20:17:12.5179477Z</Message></Error>