I have this folder structure for my Great Expectations project:
great_expectations/
    dataset/
        __init__.py
        oracle_dataset.py
    datasource/
        __init__.py
        oracle_datasource.py
    great_expectations.yml
datasource/__init__.py:
    from .oracle_datasource import OracleDatasource
dataset/__init__.py:
    from .oracle_dataset import OracleDataset
great_expectations.yml:
datasources:
  db_name:
    credentials: ${db_name}
    data_asset_type:
      class_name: OracleDataset
      module_name: .dataset
    class_name: OracleDatasource
    module_name: .datasource
Python relative imports are very confusing to me, and I am also not sure which directory the great_expectations commands use as the reference when resolving module names. When I run:
    great_expectations suite new
I get the error message:
    ValueError: no package specified for '.datasource' (required for relative module names)
Even after trying everything below, I still think the .yml above is the way to go. I’m guessing there is something I don’t understand about relative imports that needs to be handled in __init__.py or elsewhere.
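For context, the wording of this error matches what Python's own importlib emits when a relative module name is resolved without a package to anchor it. Below is a minimal sketch in plain Python, not Great Expectations code. The helper `try_resolve` and the package name `my_project` are hypothetical, and I am only assuming the CLI ends up in a similar code path:

```python
import importlib.util


def try_resolve(name, package=None):
    """Return the absolute module name, or the error text if resolution fails."""
    try:
        return importlib.util.resolve_name(name, package)
    except (ValueError, ImportError) as exc:
        # ValueError before Python 3.9, ImportError from 3.9 onwards.
        return f"{type(exc).__name__}: {exc}"


# A leading dot means "relative to some package", so with no package it cannot resolve.
no_anchor = try_resolve(".datasource")

# "my_project" is a hypothetical package name used only for illustration.
with_anchor = try_resolve(".datasource", package="my_project")

print(no_anchor)
print(with_anchor)
```

On Python 3.9 and later the exception type is ImportError rather than ValueError, but the message text is the same. So a module_name that starts with a dot only works if whatever loads it also supplies a package.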
Edit: I have also tried:
datasources:
  db_name:
    credentials: ${db_name}
    data_asset_type:
      class_name: OracleDataset
      module_name: great_expectations.dataset
    class_name: OracleDatasource
    module_name: great_expectations.datasource
This fails with:
    The module: 'great_expectations.datasource' does not contain the class: 'OracleDatasource'.
I think this message means it is looking in the installed great_expectations library rather than in my project folder of the same name.
and this:
datasources:
  db_name:
    credentials: ${db_name}
    data_asset_type:
      class_name: OracleDataset
      module_name: dataset
    class_name: OracleDatasource
    module_name: datasource
This fails with:
    No module named "datasource" could be found in the repository. Please make sure that the file, corresponding to this package and module, exists and that dynamic loading of code modules, templates, and assets is supported in your execution environment. This error is unrecoverable.
I think this means it is looking outside of the library but can’t find the file.
and this:
datasources:
  db_name:
    credentials: ${db_name}
    data_asset_type:
      class_name: OracleDataset
      module_name: dataset.oracle_dataset
    class_name: OracleDatasource
    module_name: datasource.oracle_datasource
This fails with:
    No module named "datasource.oracle_datasource" could be found in the repository. Please make sure that the file, corresponding to this package and module, exists and that dynamic loading of code modules, templates, and assets is supported in your execution environment. This error is unrecoverable.
Again, I think this means it is looking outside of the library but can’t find the file.
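To illustrate what I mean by "reference directory": in plain Python, an absolute module name such as datasource is only importable if the folder that contains it is on sys.path. The sketch below builds a throwaway package in a temporary directory to show this. The names `demo_datasource` and `OracleDemo` are hypothetical stand-ins for my real files, and it says nothing about which directories Great Expectations itself searches:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # Build a tiny package that mirrors the layout of my datasource/ folder.
    pkg = Path(root) / "demo_datasource"
    pkg.mkdir()
    (pkg / "oracle_demo.py").write_text("class OracleDemo:\n    pass\n")
    (pkg / "__init__.py").write_text("from .oracle_demo import OracleDemo\n")

    # The parent folder is not on sys.path yet, so the absolute name is not found.
    try:
        importlib.import_module("demo_datasource")
        found_before = True
    except ModuleNotFoundError:
        found_before = False

    # Once the parent folder is on sys.path, the same name resolves.
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    module = importlib.import_module("demo_datasource")
    found_after = hasattr(module, "OracleDemo")

    # Clean up so the demo leaves no trace in the interpreter.
    sys.path.remove(root)
    sys.modules.pop("demo_datasource.oracle_demo", None)
    sys.modules.pop("demo_datasource", None)

print(found_before, found_after)
```

If the same rule applies here, the last two attempts would fail simply because the folder holding dataset/ and datasource/ is not among the directories being searched.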