From a GE User:
I’m hoping to generate distribution visuals of a particular column as shown in the link below. Do I have to profile the data source first, or can I use the expect_column_kl_divergence_to_be_less_than
expectation to visualize a histogram of counts for a column?
https://docs.greatexpectations.io/en/0.7.11/guides/profiling.html
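For context on what that expectation checks: `expect_column_kl_divergence_to_be_less_than` compares the observed distribution of a column against an expected partition using KL divergence. A minimal, library-free sketch of that statistic (the bin probabilities below are made up for illustration; the real expectation builds the partition from your data):

```python
import math

def kl_divergence(p, q):
    """KL divergence D(p || q) between two discrete distributions
    given as probabilities over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical observed column histogram, normalized to probabilities
observed = [0.1, 0.4, 0.5]
# Hypothetical expected partition (e.g. from a reference batch)
expected = [0.2, 0.3, 0.5]

# The expectation passes when the divergence is below your threshold
assert kl_divergence(observed, expected) < 0.1
```

The expectation itself also stores the observed histogram in its validation result, which is what Data Docs renders as the distribution visual.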
@sam Do you mind chiming in with some thoughts on this question?
Here is how to profile a data asset in order to produce descriptive documentation for it.
Note: these instructions apply only to the V2 API. We will publish a guide for the V3 API separately.
These instructions assume that you already have great_expectations.yml with a configured Datasource.
This example profiles a CSV file with Pandas.
import great_expectations as ge

context = ge.data_context.DataContext()

# Replace with the datasource name from your great_expectations.yml
datasource_name = "YOUR DATASOURCE NAME"

# batch_kwargs tell the datasource which file to load
batch_kwargs = {"path": "MY CSV FILE PATH", "datasource": datasource_name}

profiling_results = context.profile_data_asset(
    datasource_name,
    batch_kwargs=batch_kwargs,
)
Then run:
great_expectations docs build
Your Data Docs will be updated with descriptive profiling results.
Hi! Is the published guide for the V3 API available already? If yes, where can I find it?