Sure!
Stock market data is a good example of this. Suppose we are using intraday stock market data for CAT. Picking one source, the Alpha Vantage API, we’d get multiple data points for each day, depending on the interval we selected. To ensure that all of the data are reasonable, I might set an expectation like expect_column_min_to_be_between("close", 200, 300). That way, I’d know that there were no anomalous closing prices that day; I’m guarding against cases where the wrong ticker is reported, or where I need to add logic for a stock split, for example.
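To make that concrete, here is a minimal sketch using the 0.x-style pandas API (great_expectations.from_pandas); the inline DataFrame stands in for whatever intraday bars you pulled from Alpha Vantage, and the timestamps and prices are made up for illustration:

```python
import great_expectations as ge
import pandas as pd

# Hypothetical 5-minute bars for one trading day (values are illustrative)
df = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-02 09:30", periods=5, freq="5min"),
    "close": [248.1, 249.3, 247.8, 250.2, 249.0],
})

# Wrap the frame so expectation methods are available on it
batch = ge.from_pandas(df)

# Fails if the day's minimum close falls below 200 (or above 300),
# e.g. the wrong ticker was reported or a split wasn't handled
result = batch.expect_column_min_to_be_between("close", min_value=200, max_value=300)
print(result.success)  # True for this sample data
```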
If I wanted to get more sophisticated, I might use evaluation parameters to ensure that the min value for today is within some range of yesterday’s min value. In the OSS library, maintaining a store of those historical metrics takes a fair amount of work, but you can set up the expectation itself like this:
expect_column_min_to_be_between("close", min_value={"$PARAMETER": "prev_min_close * 0.9"}, max_value={"$PARAMETER": "prev_min_close * 1.1"})
In that case, you’d provide the previous day’s minimum close when you run the validation, and GX will validate that the minimum in the new dataset is within +/- 10% of that value.
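Here is a minimal sketch of how that might look end to end, again with the 0.x-style API; prev_min_close is just the parameter name we chose, and 245.0 stands in for yesterday’s stored minimum, which you’d look up from wherever you keep those historical metrics:

```python
# Make yesterday's minimum available so the expectation can be
# evaluated immediately (245.0 is a stand-in value)
batch.set_evaluation_parameter("prev_min_close", 245.0)

batch.expect_column_min_to_be_between(
    "close",
    min_value={"$PARAMETER": "prev_min_close * 0.9"},
    max_value={"$PARAMETER": "prev_min_close * 1.1"},
)

# Or supply (or override) the parameter when running the whole suite
results = batch.validate(evaluation_parameters={"prev_min_close": 245.0})
print(results.success)
```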