Hello,
I’m trying to create a datasource for a Delta table. I’m working entirely on Databricks (so I’m not running great_expectations locally).
from great_expectations.data_context.types.base import DatasourceConfig

my_spark_datasource_config = DatasourceConfig(
    class_name="SparkDFDatasource",
    batch_kwargs_generators={
        "subdir_reader": {
            "class_name": "DatabricksTableBatchKwargsGenerator",
            "base_directory": "delta_table_name or path",
            "reader_method": "delta",
        }
    },
)
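
I then register it with the context along these lines (I adapted this from the “How to instantiate a Data Context on Databricks” guide, so I’m not sure datasourceConfigSchema is even the right way to serialize this config):

from great_expectations.data_context.types.base import datasourceConfigSchema

# Serialize the DatasourceConfig and attach it to the data context
context.add_datasource(
    name="my_spark_datasource",
    **datasourceConfigSchema.dump(my_spark_datasource_config),
)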
But I get the feeling that this is not the way to do it? Adding the directory doesn’t feel very useful (and adding the path isn’t useful for Delta Lake, since the directory just contains snappy.parquet files…). I could leave it blank (is this bad practice?), select the rows I want from the Delta table into a dataframe, and test the dataframe without ever pointing GE at the Delta table, something like the sketch below. But that defeats the purpose of a datasource in the first place, no?
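
To be concrete, this is roughly what I mean by the dataframe-only approach (using the legacy SparkDFDataset API; the table, filter, and column names are just placeholders):

from great_expectations.dataset import SparkDFDataset

# Read the Delta table straight into a Spark dataframe
# ("my_db.my_delta_table" is a placeholder name)
df = spark.table("my_db.my_delta_table").where("event_date = '2021-01-01'")

# Wrap the dataframe and run expectations on it directly,
# without a datasource pointing at the Delta table
ge_df = SparkDFDataset(df)
ge_df.expect_column_values_to_not_be_null("id")
results = ge_df.validate()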
Sorry, I’m a noob at using GE and I’m a bit lost reading the documentation.