I am using Great Expectations (GE) to run assertions against a dataset in Redshift!
Our Redshift cluster requires us to use rotating secrets via AWS Secrets Manager. In this connection method, an IAM user is granted a temporary username and password with access to our Redshift cluster for 900 seconds. This secret is obtained via a boto3 method (`get_cluster_credentials`).
My understanding is that GE datasources can currently only be configured to use hardcoded credentials. Is it possible to run GE using temporary credentials obtained at runtime?
Was able to figure this out with some help.
- Configure the Redshift datasource to use environment variables, as Eugene explained in this helpful post:
```yaml
datasources:
  datawarehouse:
    class_name: SqlAlchemyDatasource
    data_asset_type:
      class_name: SqlAlchemyDataset
      module_name:
    credentials:
      drivername: postgresql+psycopg2
      host: myRedshiftHost
      port: '5439'
      database: myRedshiftDb
      username: ${GE_REDSHIFT_USERNAME}
      password: ${GE_REDSHIFT_PASSWORD}
```
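The `${...}` references in the config are resolved from the process environment when GE loads it. A rough illustration of the substitution idea, using `string.Template` rather than GE's actual internals (the host, port, and database values here just mirror the placeholder config above):

```python
import os
from string import Template

# Pretend these were just exported by the credential-fetching script.
os.environ["GE_REDSHIFT_USERNAME"] = "temp_user"
os.environ["GE_REDSHIFT_PASSWORD"] = "temp_pass"

# GE resolves ${VAR} placeholders against the environment; string.Template
# performs the same style of ${VAR} substitution.
conn = Template(
    "postgresql+psycopg2://${GE_REDSHIFT_USERNAME}:${GE_REDSHIFT_PASSWORD}"
    "@myRedshiftHost:5439/myRedshiftDb"
).substitute(os.environ)
print(conn)
# → postgresql+psycopg2://temp_user:temp_pass@myRedshiftHost:5439/myRedshiftDb
```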
- Before invoking GE, set the environment variables using a separate Python script. See this StackOverflow post for an explanation of how to do this in boto3:
```python
import os

import boto3

# Request temporary credentials (valid for 900 seconds by default).
# RedshiftUser, RedshiftDb, and RedshiftClusterId are placeholders for
# your own values.
cluster_creds = boto3.client('redshift').get_cluster_credentials(
    DbUser=RedshiftUser,
    DbName=RedshiftDb,
    ClusterIdentifier=RedshiftClusterId,
    AutoCreate=False,
)

# Export the credentials where the GE config expects to find them.
os.environ['GE_REDSHIFT_USERNAME'] = cluster_creds['DbUser']
os.environ['GE_REDSHIFT_PASSWORD'] = cluster_creds['DbPassword']
```
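For reuse, the snippet above could be wrapped in a small helper. This is just a sketch: the function name `export_redshift_env` is my own, and taking the client as a parameter (instead of calling `boto3.client('redshift')` directly) is a testability choice, not something from the original post:

```python
import os


def export_redshift_env(redshift_client, db_user, db_name, cluster_id):
    """Fetch temporary Redshift credentials and export them for GE.

    `redshift_client` would normally be boto3.client('redshift');
    accepting it as a parameter lets the helper be exercised with a stub.
    """
    creds = redshift_client.get_cluster_credentials(
        DbUser=db_user,
        DbName=db_name,
        ClusterIdentifier=cluster_id,
        AutoCreate=False,
    )
    # Export where the ${GE_REDSHIFT_*} placeholders in the GE config
    # expect to find the values.
    os.environ['GE_REDSHIFT_USERNAME'] = creds['DbUser']
    os.environ['GE_REDSHIFT_PASSWORD'] = creds['DbPassword']
    return creds
```

Call it once at the start of your job, before GE reads its configuration, so the substituted credentials are still within their 900-second window when validation runs.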