Why would a minimum value be defined in a range?

I’m new to GX and trying to get my head around everything. Looking at [expect_column_min_to_be_between](https://greatexpectations.io/expectations/expect_column_min_to_be_between), I’m just not grokking why I would set a minimum and maximum value for what the minimum value of a column should be. If the minimum value should be between 3 and 5, it would seem like that means the minimum is 3.

Am I missing something really fundamental about that expectation or is the setting of a minimum and maximum value an artifact of something else and I just need to accept that I have to set both parameters to be whatever value I want for the minimum value?

You’re right that if you know the exact range of values that a column will have, then you would know the exact minimum. But if there is variation between different batches of data, then it may make more sense to expect the minimum of any given batch to fall in a range. For example, you might expect the minimum transaction value to be between 0 and 1 dollars: the exact minimum may vary within a day, but you want to ensure there are always small transactions in your data.

I appreciate you responding, @jpcampbell42. I’m afraid I’m still not seeing it though. Can you give a real world example of when a range for the minimum would be appropriate?


Stock market data is a good example of this. Suppose we are using intraday stock market data for CAT. Picking one source, the Alpha Vantage API, we’d get multiple datapoints for each day, depending on the interval we selected. To ensure that all of the data are reasonable, I might set an expectation like expect_column_min_to_be_between("close", 200, 300). That way, I’d know that there were no anomalously low closing prices in that day, and also that the whole series hadn’t shifted above the range I expect. I’m guarding against cases where the wrong ticker is reported, or where I need to add logic for a stock split, for example.
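
To make the semantics concrete, here is a minimal sketch in plain Python of what a "minimum is between" check does for a single batch. This is not the GX API; the helper name `min_is_between` and the prices are made up for illustration.

```python
def min_is_between(values, min_value, max_value):
    """Return True if the smallest value lies in [min_value, max_value]."""
    observed_min = min(values)
    return min_value <= observed_min <= max_value


# Hypothetical intraday closing prices for two batches (made-up numbers).
normal_day = [251.3, 249.8, 252.1, 250.4]
bad_day = [251.3, 24.98, 252.1, 250.4]  # one misreported price

print(min_is_between(normal_day, 200, 300))  # True: min is 249.8
print(min_is_between(bad_day, 200, 300))     # False: min is 24.98
```

The lower bound catches a stray low value, while the upper bound catches a batch whose smallest value is unexpectedly high, which a single fixed minimum could not express.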

If I wanted to get more sophisticated, I might use evaluation parameters to ensure that the min value for today is within some range of yesterday’s min value. In the OSS library there’s a fair amount of work to maintain a store of those historical metrics, but you can set up the expectation like this:
expect_column_min_to_be_between("close", {"$PARAMETER": "prev_min_close * 0.9"}, {"$PARAMETER": "prev_min_close * 1.1"})

In that case, you’d provide the previous minimum close (prev_min_close) when you run the validation, and GX will validate that the minimum in the new dataset is within +/- 10% of it.
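
The arithmetic behind that parameterized range can be sketched in plain Python as follows. Again, this only illustrates the logic and is not how GX evaluates parameters internally; the function name `min_within_tolerance` and the sample prices are assumptions.

```python
def min_within_tolerance(values, prev_min_close, tolerance=0.10):
    """Check that min(values) is within +/- tolerance of prev_min_close."""
    lower = prev_min_close * (1 - tolerance)  # e.g. prev_min_close * 0.9
    upper = prev_min_close * (1 + tolerance)  # e.g. prev_min_close * 1.1
    observed_min = min(values)
    return lower <= observed_min <= upper


# Made-up numbers: yesterday's minimum close was 250, so the range is 225 to 275.
prev_min_close = 250.0
today = [248.0, 252.5, 251.0]
after_split = [125.0, 126.2, 125.8]  # e.g. an unhandled 2-for-1 split

print(min_within_tolerance(today, prev_min_close))        # True
print(min_within_tolerance(after_split, prev_min_close))  # False
```

Because the bounds move with yesterday's value, the same expectation keeps working as the price drifts over time, which a hard-coded range like 200 to 300 would not.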

Ah! Thank you. The parameterization example really makes the need for that clear.