I am trying to run a PySpark script on EMR. The script processes files from an S3 bucket and writes them to another folder. The code was working fine until the first week of March, but it now suddenly fails in the initial phase, where the data context is created. Below is the code responsible for building the data context.
Code:
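from great_expectations.data_context import BaseDataContext
from great_expectations.data_context.types.base import (
    DataContextConfig,
    DatasourceConfig,
    S3StoreBackendDefaults,
)

def build_dq_context(bucket):
    # In-memory project config: one Spark datasource, stores backed by S3
    data_context_config = DataContextConfig(
        datasources={
            DQ_DATASOURCE_NAME: DatasourceConfig(
                class_name="SparkDFDatasource"
            )
        },
        store_backend_defaults=S3StoreBackendDefaults(default_bucket_name=bucket),
    )
    context = BaseDataContext(project_config=data_context_config)
    return context

dq_data_context = build_dq_context(bucket=head_event['bucket'])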
Error:
INFO:great_expectations.core.util:Stopping existing spark context to reconfigure.
Traceback (most recent call last):
  File "/mnt/tmp/spark-134283bd-cba4-4325-a5e3-6a0b711f6be4/etl.py", line 548, in <module>
    defopt.run(main)
  File "/usr/local/lib/python3.7/site-packages/defopt.py", line 169, in run
    return _call_function(parser, args._func, args)
  File "/usr/local/lib/python3.7/site-packages/defopt.py", line 516, in _call_function
    return func(*positionals, **keywords)
  File "/mnt/tmp/spark-134283bd-cba4-4325-a5e3-6a0b711f6be4/etl.py", line 405, in main
    Lookup_parent = spark.createDataFrame(,'cnsmr_prof_sid STRING')
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 605, in createDataFrame
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 630, in _create_dataframe
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 465, in _createFromLocal
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/context.py", line 515, in parallelize
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/context.py", line 441, in defaultParallelism
AttributeError: 'NoneType' object has no attribute 'sc'
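The INFO line followed by this AttributeError suggests that the Spark context behind my existing `spark` session was stopped while the data context was being built, so the later `createDataFrame` call hits a dead context. Here is a minimal sketch of how that could be checked. It assumes that PySpark sets the private `_jsc` handle of a `SparkContext` to `None` when the context is stopped, which is an internal detail and may differ between versions. The helper name `spark_context_is_stopped` is hypothetical, and the `SimpleNamespace` objects are only stand-ins for real sessions so the snippet runs without a cluster.

```python
from types import SimpleNamespace


def spark_context_is_stopped(spark):
    """Return True if the session's underlying SparkContext looks stopped.

    Assumption: a stopped SparkContext has its private `_jsc` attribute
    set to None. This is not a public API.
    """
    return getattr(spark.sparkContext, "_jsc", None) is None


# Stand-ins for a session with a stopped context and one with a live context
stopped_session = SimpleNamespace(sparkContext=SimpleNamespace(_jsc=None))
live_session = SimpleNamespace(sparkContext=SimpleNamespace(_jsc=object()))

print(spark_context_is_stopped(stopped_session))  # True
print(spark_context_is_stopped(live_session))     # False
```

In the real script this would be called as `spark_context_is_stopped(spark)` right after `build_dq_context(...)`. If it returns True, the session could presumably be re-acquired with `SparkSession.builder.getOrCreate()` before calling `createDataFrame`.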
When I tried running it manually in a Jupyter Notebook, I got the error below when using class_name="SparkDFDatasource":
Error:
An error was encountered:
Invalid status code '400' from http://localhost:8998/sessions/6/statements/1 with error payload: {"msg":"requirement failed: Session isn't active."}
Can anyone provide some input on this?