View in #gx-community-support on Slack
@Shannon_Broskie: Good morning! Quick question regarding the requirements to run in Databricks. I am testing out the framework for the first time and got to the point where a batch.validate was being run. This was on a Spark dataframe. I received an exception stating: “[NOT_SUPPORTED_WITH_SERVERLESS] PERSIST TABLE is not supported on serverless compute.”
All of our compute is set up as serverless for a variety of architectural reasons. Does Great Expectations not work at all if the only option I have is serverless compute? Just making sure I don’t go too far down this path if it will not work with the way our Databricks compute is set up.
Thanks!
@Ken: @Tyler_Hoffman_(GX) can you provide some insight here?
@Tyler_Hoffman_(GX): Hey @Shannon_Broskie, we attempt to persist the dataframe by default as a performance optimization, but since serverless compute doesn’t allow that, you can opt your datasource out of persisting it by passing persist=False when you create your datasource, e.g. context.data_sources.add_spark(name=name, persist=False). Hope that helps!
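For anyone finding this thread later, here is a minimal sketch of the end-to-end flow on serverless compute, assuming the GX Core 1.x fluent API; the data source, asset, and batch definition names, the `id` column, and the ExpectColumnValuesToNotBeNull check are illustrative placeholders, not details from the thread above.

```python
import great_expectations as gx

# df is an existing Spark DataFrame, e.g. produced by spark.read in a Databricks notebook
context = gx.get_context()

# persist=False opts out of the dataframe persistence step that serverless compute rejects
data_source = context.data_sources.add_spark(name="spark_serverless", persist=False)
data_asset = data_source.add_dataframe_asset(name="my_dataframe_asset")
batch_definition = data_asset.add_batch_definition_whole_dataframe("my_batch_definition")

# Build a batch from the in-memory dataframe and validate a single expectation
batch = batch_definition.get_batch(batch_parameters={"dataframe": df})
expectation = gx.expectations.ExpectColumnValuesToNotBeNull(column="id")
result = batch.validate(expectation)
print(result.success)
```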
@Shannon_Broskie: Thank you very much Tyler and Ken. I’ll give this a try as soon as I can.
The persist=False worked. I was able to run that first expectation. Thanks for the assistance!
@Tyler_Hoffman_(GX): Awesome, glad to hear it!