By default, you create an Expectation Suite by building a batch from sample data and adding Expectations to the suite by calling expect_* methods. Each of these calls evaluates the Expectation against the sample batch and adds it to the suite.
@ryan describes another great use case: you want to add Expectations to a suite based on your own knowledge of the data, without validating them against a sample (or the sample does not exist).
Approach 1
Keep using the code that adds Expectations to a suite and runs validation as it goes, but “trick” it by providing an empty “dummy” batch of data. Set the interactive_evaluation flag to False so that the Expectations are not evaluated against the empty batch; this saves time and keeps the validation from complaining. The code snippet below shows a complete example.
The advantage of this approach is that when you start typing “batch.expect”, auto-complete in Jupyter or in the IDE of your choice displays all the Expectation types and their documentation.
import great_expectations as ge
import pandas as pd

context = ge.data_context.DataContext()

expectation_suite_name = "my_new_expectation_suite"  # TODO: replace with the name of your suite
suite = context.create_expectation_suite(
    expectation_suite_name, overwrite_existing=True
)

batch_kwargs = {
    "dataset": pd.DataFrame(),  # the simplest possible "dummy" dataset
    "datasource": "my_datasource",  # TODO: replace with the name of your datasource
    "data_asset_name": "some_name",
}
batch = context.get_batch(batch_kwargs, suite)

# Do not evaluate each Expectation against the (empty) dummy batch
batch.set_config_value("interactive_evaluation", False)

batch.expect_column_max_to_be_between("mycolumn", min_value=1, max_value=100)
# TODO: add more expectations

suite = batch.get_expectation_suite()
context.save_expectation_suite(suite, expectation_suite_name)
Approach 2
Don’t create a batch, not even a “dummy” one. Instead, create an Expectation Suite and add Expectations to it directly by specifying their configurations.
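To give a feel for what “specifying their configurations” means, here is a minimal sketch that uses only plain Python dicts and no Great Expectations calls. It assumes each Expectation configuration boils down to an expectation type plus its keyword arguments, which mirrors the shape of a saved suite's JSON. The helpers make_expectation and make_suite are hypothetical names made up for this illustration. In the library itself, I believe the counterpart is an ExpectationConfiguration object passed to suite.add_expectation, but treat the notebook as the authoritative version of those calls.

```python
import json


def make_expectation(expectation_type, **kwargs):
    """Hypothetical helper: one Expectation configuration as a plain dict."""
    return {"expectation_type": expectation_type, "kwargs": kwargs, "meta": {}}


def make_suite(name, expectations):
    """Hypothetical helper: a suite-shaped dict holding the configurations."""
    return {
        "expectation_suite_name": name,
        "expectations": list(expectations),
        "meta": {},
    }


# No batch and no sample data: the configurations come from what you
# already know about the data.
suite_dict = make_suite(
    "my_new_expectation_suite",
    [
        make_expectation(
            "expect_column_max_to_be_between",
            column="mycolumn",
            min_value=1,
            max_value=100,
        ),
        make_expectation("expect_column_values_to_not_be_null", column="mycolumn"),
    ],
)

# Serialize to JSON to inspect the resulting structure.
suite_json = json.dumps(suite_dict, indent=2)
print(suite_json)
```

Nothing here is evaluated against data, which is the point of this approach: the suite is just a list of declarative configurations that can be validated later, once a real batch is available.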
This notebook contains a complete example:
https://github.com/great-expectations/great_expectations/blob/eugene/example_notebook_create_suite_without_sample_202012/examples/notebooks/create_expectation_suite_without_sample_data.ipynb
This short video provides the details:
https://www.loom.com/share/4eb133fadf6e427984e531573b661e29
Please tell us in the comments which approach you like more and why.