Desperate for examples of full custom expectations

I cannot for the life of me find good examples of custom expectations written for v1.x. I’m only working in pandas, so I’d think it would be easy, but I’m having a really tough time, and now I’m down to the wire.

If it helps, the two things I’m trying to do are:

  1. detect rows outside of n IQRs of the median of a column (a rough sketch of my current workaround follows this list)
  2. for a table that includes two columns, Date and HourEnding, confirm that those columns form a unique key, AND that every row in a reference table appears in the target table as long as its Date falls between the earliest and latest Date in the target table (also sketched below).
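To make it concrete (and in case someone can tell me the "proper" custom-expectation way to do this in 1.x), here is the stopgap sketch I've been working from. It isn't a custom expectation at all: for (1) I compute the median/IQR bounds in plain pandas and hand them to the built-in ExpectColumnValuesToBeBetween, and for (2) I'm hoping the built-in ExpectCompoundColumnsToBeUnique covers the key check, with the reference-table containment done in pandas before validation even starts. The column name "value", n_iqr, the suite name, and the toy DataFrames are all placeholders for my real ones.

import great_expectations as gx
import great_expectations.expectations as gxe
import pandas as pd

context = gx.get_context()
suite = context.suites.add(gx.ExpectationSuite(name="sketch_suite"))

# Toy stand-ins for the real tables, just so the sketch runs end to end.
df = pd.DataFrame({
    "Date": ["2024-01-01", "2024-01-01", "2024-01-02"],
    "HourEnding": [1, 2, 1],
    "value": [10.0, 11.0, 250.0],
})
ref_df = pd.DataFrame({
    "Date": ["2024-01-01", "2024-01-02"],
    "HourEnding": [1, 1],
})

# (1) Flag values outside n IQRs of the column median by precomputing the
# bounds in pandas and using the built-in between-values expectation.
n_iqr = 3
median = df["value"].median()
iqr = df["value"].quantile(0.75) - df["value"].quantile(0.25)
suite.add_expectation(
    gxe.ExpectColumnValuesToBeBetween(
        column="value",
        min_value=median - n_iqr * iqr,
        max_value=median + n_iqr * iqr,
    )
)

# (2a) Date + HourEnding should form a unique key.
suite.add_expectation(
    gxe.ExpectCompoundColumnsToBeUnique(column_list=["Date", "HourEnding"])
)

# (2b) Every reference row whose Date falls inside the target table's date
# range must appear in the target table; checked in plain pandas, outside GX.
in_range = ref_df[ref_df["Date"].between(df["Date"].min(), df["Date"].max())]
joined = in_range.merge(df, on=["Date", "HourEnding"], how="left", indicator=True)
missing = joined[joined["_merge"] == "left_only"]
assert missing.empty, f"{len(missing)} reference rows missing from target table"

The suite then gets wired up to a pandas data source and batch definition the same way as in the snippet further down; the obvious downside is that the bounds and the containment check live outside the suite, which is exactly why I'd rather have real custom expectations.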

I’ve spent days trying to parse everything in the repo under great_expectations/great_expectations/expectations at develop · great-expectations/great_expectations · GitHub and under the docs snippets at great_expectations/docs/docusaurus/docs/snippets at develop · great-expectations/great_expectations · GitHub, to no avail. I’ve also tried the v0.18 tutorials, but things seem to have changed a lot since then, and the release notes just don’t explain any of the breaking changes.

If anyone can point me in the right direction, I’d really appreciate it. Alternatively, if someone would be willing to work with me for an hour, I’d happily pay a consulting fee.

Thanks,
Mike

Here is the Spark data source code I wrote; I think it should be pretty straightforward to change it to pandas: simply call add_pandas instead of add_spark (there’s a sketch of that swap right after the snippet).

import great_expectations as gx
import great_expectations.expectations as gxe

# However you normally get your context; gx.get_context() is the usual entry point.
context = gx.get_context()

# Create the expectation suite.
expectation_suite_name = "my_expectation_suite_name"
suite = context.suites.add(
    gx.ExpectationSuite(
        name=expectation_suite_name
    )
)

# Add a simple not-null expectation with some metadata attached.
suite.add_expectation(
    gxe.ExpectColumnValuesToNotBeNull(
        column="id",
        meta={
            "priority": "P1",
            "notes": {
                "format": "markdown",
                "content": "Critical! Missing KEY for Construction Method Category - Hard Delete",
            },
        },
    )
)

# Ask for full results, including the query for unexpected rows.
result_format = {
    "result_format": "COMPLETE",
    "return_unexpected_index_query": True
}

# However you build your Spark dataframe.
df = my_get_df_method(...)

# Register the Spark data source, a dataframe asset, and a whole-dataframe batch definition.
data_source = context.data_sources.add_spark(
    name="my_spark",
    persist=True
)

data_asset_name = "my_data_asset_name"
data_asset = data_source.add_dataframe_asset(name=data_asset_name)

batch_definition = data_asset.add_batch_definition_whole_dataframe(
    name="my_data_asset_name"
)

# Tie the batch definition and the suite together in a validation definition.
validation_definition = gx.ValidationDefinition(
    data=batch_definition,
    suite=suite,
    name=data_asset_name
)

context.validation_definitions.add(validation_definition)

# The dataframe itself is only passed in at run time.
batch_parameters = {"dataframe": df}

validation_result = validation_definition.run(
    batch_parameters=batch_parameters,
    result_format=result_format
)
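For the pandas version I mentioned above, the only lines I’d expect to change are the data source registration; something like the following, assuming the pandas dataframe data source otherwise behaves like the Spark one (I dropped persist, since I believe that option is Spark-specific). The asset, batch definition, validation definition, and run call stay the same.

# pandas instead of Spark: only the data source registration changes.
data_source = context.data_sources.add_pandas(name="my_pandas")

data_asset = data_source.add_dataframe_asset(name=data_asset_name)

batch_definition = data_asset.add_batch_definition_whole_dataframe(
    name="my_data_asset_name"
)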

hi Mike,

Were these pages in our docs not helpful for getting started?

If not, please share where you are facing the challenge.