Use S3 as data source 2022

I am having trouble connecting to our AWS S3 bucket.

I followed the manual and ended up with this configuration:

enaible_yaml = f"""
name: enaible_s3_datasource
class_name: Datasource
execution_engine:
    class_name: PandasExecutionEngine
data_connectors:
    default_runtime_data_connector_name:
        class_name: RuntimeDataConnector
        batch_identifiers:
            - default_identifier_name
    default_inferred_data_connector_name:
        class_name: InferredAssetS3DataConnector
        bucket: enaible-public-data
        prefix: data_quality/final
        default_regex:
          pattern: (.*)/(.*)\.parquet
          group_names:
            - prefix
            - data_asset_name
"""
print(enaible_yaml)

But I get this error when I test the YAML configuration:

ValueError: S3 query may not have been configured correctly.

I also need to load Parquet files rather than CSV.
great_expectations, version 0.15.29

Any enlightenment will be appreciated.

Thanks in advance

I can’t offer a solution, but have you tried testing this against a newly created, empty S3 bucket?

See: great_expectations/util.py at 5c21d539dd5f280fa0afb55a506b1e51d871bfbf · great-expectations/great_expectations · GitHub

which seems to indicate that the S3 bucket might be misconfigured. I connected to an S3 bucket successfully, with no error, using the same configuration as yours on GE 0.15.32.
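If it helps narrow things down, here is a rough sketch for checking the bucket outside of Great Expectations. It lists the keys under your prefix with boto3 and shows which ones match your regex. My reading of the linked code is that the error appears when the S3 listing comes back without any objects, but that is an assumption on my part. The helper names (list_matching_keys, check_bucket) are made up for this example, and I am also assuming the pattern is matched against the full object key. Note that a single list_objects_v2 call returns at most 1000 keys.

```python
import re


def list_matching_keys(s3_client, bucket, prefix, pattern):
    """Return (all_keys, matching_keys) for the objects under a prefix.

    s3_client is anything with a boto3-style list_objects_v2 method.
    """
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    # When nothing is found, the response has no "Contents" entry at all.
    contents = response.get("Contents", [])
    keys = [obj["Key"] for obj in contents]
    regex = re.compile(pattern)
    # Assumption: the pattern is applied to the full key, from its start.
    matching = [key for key in keys if regex.match(key)]
    return keys, matching


def check_bucket():
    """Run the check against the bucket from the question (needs AWS credentials)."""
    import boto3  # imported here so the helper above works without boto3

    client = boto3.client("s3")
    keys, matching = list_matching_keys(
        client,
        bucket="enaible-public-data",
        prefix="data_quality/final",
        pattern=r"(.*)/(.*)\.parquet",
    )
    print(f"{len(keys)} keys under the prefix, {len(matching)} match the pattern")
    for key in matching:
        print(key)
```

Call check_bucket() with the same credentials your Great Expectations process uses. If it reports 0 keys, I would look at the bucket name, the prefix, or the permissions (listing the bucket usually requires s3:ListBucket) before touching the datasource YAML.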