Hello everyone,
I’m new to this and would like to ask whether it’s possible to set up the GX Core library so that the same suite of checks can run across multiple, unrelated files.
We handle several projects that may ingest dozens of files each day, and I need to create expectations to validate data integrity. My current process imports each file, cleans it up by removing fully null rows and standardising column names, and saves it as a Parquet file (a rough sketch of this step is below). The Parquet files are later loaded into a Lakehouse Delta table, where I run further checks: that the column names are correct, that the row counts match between the raw file and the landed table (accounting for the deleted empty rows), and so on.
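For context, here is a minimal sketch of what that cleanup step looks like today. It's plain pandas; the folder names (`raw`, `staged`) and the CSV input format are placeholders for our actual layout:

```python
import pandas as pd
from pathlib import Path

RAW_DIR = Path("raw")        # placeholder input folder
STAGED_DIR = Path("staged")  # placeholder Parquet output folder

def clean_and_stage(path: Path) -> tuple[int, int]:
    """Clean one raw file and stage it as Parquet.

    Returns (raw_row_count, staged_row_count) so the counts can later
    be reconciled against the landed Delta table.
    """
    df = pd.read_csv(path)
    raw_rows = len(df)

    # Drop rows where every column is null
    df = df.dropna(how="all")

    # Standardise column names: lowercase, underscores instead of spaces
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]

    df.to_parquet(STAGED_DIR / f"{path.stem}.parquet", index=False)
    return raw_rows, len(df)

for f in sorted(RAW_DIR.glob("*.csv")):
    raw_rows, staged_rows = clean_and_stage(f)
    print(f"{f.name}: {raw_rows} raw rows, {staged_rows} after cleanup")
```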
From my reading of the GX Core documentation, it seems that columns need to be explicitly defined and that multi-file processing isn’t directly supported; my rough idea for working around this is sketched below. I do like the features offered by GX Core, particularly the documentation capabilities, which would be very useful for our workflow.
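To frame the question, this is roughly what I had in mind: reuse one dataframe asset and treat each file as a batch. It's a sketch against the GX Core 1.x fluent API as I understand it from the docs, so the data source/asset names (`pandas_src`, `staged_files`) and the column list are my assumptions, not a working setup:

```python
import great_expectations as gx
import pandas as pd
from pathlib import Path

context = gx.get_context()

# One dataframe asset reused for every file; each file becomes a batch.
data_source = context.data_sources.add_pandas(name="pandas_src")
asset = data_source.add_dataframe_asset(name="staged_files")
batch_def = asset.add_batch_definition_whole_dataframe("per_file")

# Expectations shared across all files; column set is per-project config.
expected_columns = ["order_id", "order_date", "amount"]  # placeholder
expectations = [
    gx.expectations.ExpectTableColumnsToMatchSet(column_set=expected_columns),
    gx.expectations.ExpectColumnValuesToNotBeNull(column="order_id"),
]

for f in sorted(Path("staged").glob("*.parquet")):
    df = pd.read_parquet(f)
    batch = batch_def.get_batch(batch_parameters={"dataframe": df})
    for expectation in expectations:
        result = batch.validate(expectation)
        if not result.success:
            print(f"{f.name}: failed {expectation.__class__.__name__}")
```

I don't know whether this is idiomatic, or whether there is a better-supported way to point GX at a whole directory of files, which is part of what I'm asking.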
I’m also considering writing a custom expectation in Python that scans a column, infers the most frequently used format within it, treats that as the expected format, and then reports any anomalies against it (a plain-Python starting point is sketched below).
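As a starting point for that idea, before wrapping it in GX's custom-expectation machinery, I was picturing something like this. The masking scheme (digits become `9`, letters become `A`) is just one way to fingerprint a format and is entirely my assumption:

```python
import re
import pandas as pd

def infer_format_anomalies(series: pd.Series) -> pd.Series:
    """Mask each value (digits -> 9, letters -> A), take the most common
    mask as the expected format, and return the values that deviate."""
    def mask(value: str) -> str:
        value = re.sub(r"\d", "9", value)
        return re.sub(r"[A-Za-z]", "A", value)

    masks = series.dropna().astype(str).map(mask)
    expected = masks.mode().iat[0]  # most frequent format wins
    bad_idx = masks[masks != expected].index
    return series.loc[bad_idx]

s = pd.Series(["2024-01-05", "2024-02-17", "05/03/2024", None])
print(infer_format_anomalies(s))  # flags "05/03/2024"
```

If anyone has built a comparable expectation in GX Core, I'd be keen to hear whether this inference approach held up in practice.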
Before I proceed further, I’d be interested to know how others have implemented GX Core in similar projects. Thank you.
I’m currently working on the ingestion process, while the rest of the work is handled by our data engineering team.