From a GE User:
I’m hoping to generate distribution visuals of a particular column as shown in the link below. Do I have to profile the data source first, or can I use the expect_column_kl_divergence_to_be_less_than
expectation to visualize a histogram of counts for a column?
https://docs.greatexpectations.io/en/0.7.11/guides/profiling.html
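For context on what that expectation checks: `expect_column_kl_divergence_to_be_less_than` compares the observed distribution of a column against an expected partition using KL divergence. A minimal, library-free sketch of that statistic (the bin probabilities below are made up for illustration; the real expectation builds the partition from your data):

```python
import math

def kl_divergence(p, q):
    """KL divergence D(p || q) between two discrete distributions
    given as probabilities over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical observed column histogram, normalized to probabilities
observed = [0.1, 0.4, 0.5]
# Hypothetical expected partition (e.g. from a reference batch)
expected = [0.2, 0.3, 0.5]

# The expectation passes when the divergence is below your threshold
assert kl_divergence(observed, expected) < 0.1
```

The expectation itself also stores the observed histogram in its validation result, which is what Data Docs renders as the distribution visual.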
@sam Do you mind chiming in with some thoughts on this question?
Here is how to profile a data asset in order to produce descriptive documentation for it.
Note: these instructions apply only to the V2 API. We will publish a guide for the V3 API separately.
These instructions assume that you already have great_expectations.yml with a configured Datasource.
This example profiles a CSV file with Pandas.
import great_expectations as ge

context = ge.data_context.DataContext()

# Replace with the datasource name from your great_expectations.yml
datasource_name = "YOUR DATASOURCE NAME"

# batch_kwargs tell the datasource which file to load
batch_kwargs = {"path": "MY CSV FILE PATH", "datasource": datasource_name}

profiling_results = context.profile_data_asset(
    datasource_name,
    batch_kwargs=batch_kwargs,
)
Then run:
great_expectations docs build
Your Data Docs will be updated with descriptive profiling results.
Hi! Is the published guide for the V3 API available already? If yes, where can I find it?