Spark Connect to S3

Hi All,

I’m using InferredAssetS3DataConnector to connect to S3, but it fails with the message “ValueError: S3 query may not have been configured correctly.”
After debugging your code, I found a bug in how the prefix is transformed, in this line:

self._prefix = os.path.join(prefix, "")

That line appends a trailing slash to the prefix. You can easily reproduce the issue with this code:

import os
prefix = "weather.csv"

prefix_new = os.path.join(prefix, "")
print(prefix_new) 

It prints weather.csv/ instead of the expected weather.csv, so a prefix that points at a single file no longer matches that file.
The same issue occurs in the class “ConfiguredAssetS3DataConnector”. Could you please consider changing the assignment as follows:

self._bucket = bucket
# self._prefix = os.path.join(prefix, "")  # causes the issue
self._prefix = prefix  # potential solution
self._delimiter = delimiter
self._max_keys = max_keys
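
To illustrate why the trailing slash matters, here is a minimal sketch. It assumes S3 treats the prefix as a plain string prefix of each object key. The key list and the helper keys_matching are made up for this example and are not part of the library. I use posixpath.join so the result is the same on every platform; on Linux and macOS, os.path.join behaves the same way.

```python
import posixpath

# Hypothetical object keys in a bucket (made up for this example).
keys = ["weather.csv", "weather/2020.csv", "weather/2021.csv"]


def keys_matching(prefix, all_keys):
    """Mimic S3 prefix filtering: keep keys that start with the prefix."""
    return [key for key in all_keys if key.startswith(prefix)]


file_prefix = "weather.csv"
joined_prefix = posixpath.join(file_prefix, "")  # appends a trailing slash

print(keys_matching(file_prefix, keys))    # the file is found
print(keys_matching(joined_prefix, keys))  # nothing starts with "weather.csv/"
```

A prefix that names a folder, such as "weather", still works after the join, so the problem only shows up when the prefix is a full object key.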

Can you please take a look and fix it in the next release?

Best regards,
Xavier