Hello GX community!
tl;dr: should GX work with Postgres (15.1) and SQLAlchemy==1.3.24?
I’m currently doing research into multiple data observability/data integrity tools at work; GX is obviously one of them.
I’m going step-by-step through the “SQL” tutorial.
For context, the rest of the post will refer to a test
table in the postgres
database in the “system” Postgres.
I created it to make sure that it isn’t a problem with a dockerized postgres instance that GX will actually connect to (and which I initially used).
The issue described later occurs with both, which leads me to think that this isn’t a problem with Postgres nor the tables. Also, it works with SQLAlchemy >= 2.0
Steps to reproduce:
Setting up the environment:
mkdir delete_me
cd delete_me
virtualenv .venv -p python3.9
. .venv/bin/activate
pip install great-expectations==0.18.14
pip install sqlalchemy==1.3.24
pip install psycopg2==2.9.9
Let’s go through the tutorial then! In python REPL:
import great_expectations as gx
path_to_dir = "./contexts"
context = gx.get_context(project_root_dir=path_to_dir)
# Connection string confirmed to work by running psql postgresql://Work@localhost:5432/postgres in system shell
connection_string = "postgresql://Work@localhost:5432/postgres"
local = context.sources.add_postgres(name="local", connection_string=connection_string)
# Works until here
local.add_table_asset(name="test", table_name="test") # Fails here
Stacktrace as follows:
Traceback (most recent call last):
File "~/delete_me/.venv/lib/python3.9/site-packages/great_expectations/datasource/fluent/sql_datasource.py", line 928, in test_connection
connection.execute(sa.select(1, table).limit(1))
File "<string>", line 2, in select
File "<string>", line 2, in __init__
File "~/delete_me/.venv/lib/python3.9/site-packages/sqlalchemy/util/deprecations.py", line 139, in warned
return fn(*args, **kwargs)
File "~/delete_me/.venv/lib/python3.9/site-packages/sqlalchemy/sql/selectable.py", line 3117, in __init__
for c in columns:
TypeError: 'int' object is not iterable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "~/delete_me/.venv/lib/python3.9/site-packages/great_expectations/datasource/fluent/sql_datasource.py", line 1185, in add_table_asset
return self._add_asset(asset)
File "~/delete_me/.venv/lib/python3.9/site-packages/great_expectations/datasource/fluent/interfaces.py", line 553, in _add_asset
asset.test_connection()
File "~/delete_me/.venv/lib/python3.9/site-packages/great_expectations/datasource/fluent/sql_datasource.py", line 933, in test_connection
raise TestConnectionError(
great_expectations.datasource.fluent.interfaces.TestConnectionError: Attempt to connect to table: test failed because the test query failed. Ensure the table exists and the user has access to select data from the table: 'int' object is not iterable
I have set a breakpoint()
in delete_me/.venv/lib/python3.9/site-packages/sqlalchemy/sql/selectable.py
(link to github) to find out that:
-
argument/variable “columns” should be either a
ColumnElement
, or aFromClause
(typicallyTable
orAlias
, to be extracted to a collection ofColumnElement
s); also acceptable areTextClause
or ORM-mapped classes (from docstring). -
columns
is a list containing only one element of typesqlalchemy.sql.elements.Cast
(repr:CAST('test plain returns' AS VARCHAR(60))
) - which as far as I can tell is not what should be passed.
I tried replicating the same steps while on SQLAlchemy>=2.0 and it worked without any issues.
Which leads us back to the initial question: should GX work with SQLAlchemy==1.3.24 and Postgres? I tried: googling for an answer, looked for one in the docs, searched open and closed GitHub issues, and on this forum – but I was unable to find anything useful; hell, it even seems that nobody ran into a similar issue earlier, which gives me pause.
System/versions info:
- Python 3.9.19
- great-expectations==0.18.14
- SQLAlchemy==1.3.24
- psycopg2==2.9.9
- PostgreSQL 15.1 on x86_64-apple-darwin20.6.0, compiled by Apple clang version 12.0.0 (clang-1200.0.32.29), 64-bit
- MacOS Monterey 12.5 (Apple Silicon/M1 Pro chip)