View in #gx-community-support on Slack
@Shannon_Broskie: Good morning! Quick question regarding the requirements to run in Databricks. I am testing out the framework for the first time and got to the point where a batch.validate was being run. This was on a Spark dataframe. I received an exception stating: “[NOT_SUPPORTED_WITH_SERVERLESS] PERSIST TABLE is not supported on serverless compute.”
All of our compute is set up as serverless for a variety of architectural reasons. Does Great Expectations not work at all if the only option I have is serverless compute? Just making sure I don’t go too far down this path if it will not work with the way our Databricks compute is set up.
Thanks!
@Ken: @Tyler_Hoffman_(GX) can you provide some insight here?
@Tyler_Hoffman_(GX): Hey @Shannon_Broskie, we attempt to persist the dataframe by default as a performance optimization, but since serverless compute doesn’t allow that, you can opt your datasource out of persisting it by passing persist=False when you create your datasource, e.g. context.data_sources.add_spark(name=name, persist=False). Hope that helps!
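For anyone finding this thread later, here is a minimal sketch of the end-to-end flow on serverless compute, assuming the GX Core 1.x fluent API; the data source, asset, and batch definition names, the `id` column, and the ExpectColumnValuesToNotBeNull check are illustrative placeholders, not details from the thread above.

```python
import great_expectations as gx

# df is an existing Spark DataFrame, e.g. produced by spark.read in a Databricks notebook
context = gx.get_context()

# persist=False opts out of the dataframe persistence step that serverless compute rejects
data_source = context.data_sources.add_spark(name="spark_serverless", persist=False)
data_asset = data_source.add_dataframe_asset(name="my_dataframe_asset")
batch_definition = data_asset.add_batch_definition_whole_dataframe("my_batch_definition")

# Build a batch from the in-memory dataframe and validate a single expectation
batch = batch_definition.get_batch(batch_parameters={"dataframe": df})
expectation = gx.expectations.ExpectColumnValuesToNotBeNull(column="id")
result = batch.validate(expectation)
print(result.success)
```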
@Shannon_Broskie: Thank you very much Tyler and Ken. I’ll give this a try as soon as I can.
The persist=False worked. I was able to run that first expectation. Thanks for the assistance!
@Tyler_Hoffman_(GX): Awesome, glad to hear it!