Get SparkSession in executor
The story differs between Spark 2 and Spark 3. Basically, there are two ways:
1. SparkSession.getActiveSession, which returns Option[SparkSession]
2. SparkSession.builder().getOrCreate()
For #1, it always returns None on an executor, according to the source code.
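To check #1 for yourself, here is a minimal sketch that calls getActiveSession once on the driver and once inside each task. The object name ActiveSessionCheck, the helper seenInTasks, the app name, and the local[2] master are my own choices for a quick experiment, and I am assuming Spark 2.4 or later on the classpath.

```scala
import org.apache.spark.sql.SparkSession

object ActiveSessionCheck {
  // Runs a tiny job and records, per element, whether the code running
  // inside the task could see an active session.
  def seenInTasks(spark: SparkSession): Array[Boolean] =
    spark.sparkContext
      .parallelize(1 to 4, 2)
      // The closure does not capture `spark`; it only asks the companion object.
      .map(_ => SparkSession.getActiveSession.isDefined)
      .collect()

  def main(args: Array[String]): Unit = {
    // local[2] is an assumption for experimenting on one machine.
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("active-session-check")
      .getOrCreate()

    // On the driver, getOrCreate() has just registered the session as active.
    println(s"driver sees a session: ${SparkSession.getActiveSession.isDefined}")

    // Per the reading of the source above, every entry here should be false.
    println(s"tasks see a session: ${seenInTasks(spark).mkString(", ")}")

    spark.stop()
  }
}
```

Even in local mode, where tasks share a JVM with the driver, the calls inside map run in a task context, so this should be a fair stand-in for a real executor.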
For #2, getOrCreate() calls assertOnDriver() before entering the main code flow. That check behaves differently in Spark 2 and Spark 3.
In Spark 2 (2.4.6), the code is odd, as far as I understand it. An exception is thrown only when the function runs in test mode and on an executor. That means a SparkSession can be created on an executor as long as it is not in test mode. This makes little sense: the code ultimately runs in production, and the behavior in test and production is inconsistent.
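The Spark 2 condition is easier to see as a truth table. The sketch below is a plain Scala model of my reading, not the Spark source: the object Spark2AssertOnDriverModel, the two boolean parameters, and the exception message are stand-ins I made up for illustration.

```scala
import scala.util.Try

object Spark2AssertOnDriverModel {
  // Hypothetical model of the Spark 2.4.6 check described above.
  // isTesting: Spark believes it is running under its test harness.
  // inTask:    the calling thread is executing a task on an executor.
  def assertOnDriver(isTesting: Boolean, inTask: Boolean): Unit =
    if (isTesting && inTask) {
      throw new IllegalStateException(
        "SparkSession should only be created and accessed on the driver.")
    }

  // True when the modelled check rejects the call.
  def rejects(isTesting: Boolean, inTask: Boolean): Boolean =
    Try(assertOnDriver(isTesting, inTask)).isFailure

  def main(args: Array[String]): Unit = {
    // Print all four combinations of the two flags.
    for (isTesting <- Seq(true, false); inTask <- Seq(true, false)) {
      println(s"isTesting=$isTesting inTask=$inTask rejected=${rejects(isTesting, inTask)}")
    }
  }
}
```

Only the isTesting=true, inTask=true row is rejected. The production case (isTesting=false, inTask=true) passes, which is exactly the inconsistency.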
In Spark 3 (3.4.1), the spark.executor.allowSparkContext config was added to control whether a SparkSession can be created on an executor, and assertOnDriver only checks whether the code is running on an executor, which is more reasonable.
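Here is the Spark 3 behavior as a similar plain Scala model. Again this is a sketch of my reading rather than the real source: Spark3GetOrCreateGateModel, gate, and the boolean parameters are hypothetical names, and I am assuming that getOrCreate() skips assertOnDriver when spark.executor.allowSparkContext is set to true.

```scala
import scala.util.Try

object Spark3GetOrCreateGateModel {
  // Hypothetical model: in Spark 3 the check no longer depends on test mode.
  def assertOnDriver(inTask: Boolean): Unit =
    if (inTask) {
      throw new IllegalStateException(
        "SparkSession should only be created and accessed on the driver.")
    }

  // allowSparkContext models the value of spark.executor.allowSparkContext.
  // Assumption: the driver check is skipped entirely when it is true.
  def gate(allowSparkContext: Boolean, inTask: Boolean): Unit =
    if (!allowSparkContext) assertOnDriver(inTask)

  // True when the modelled gate rejects the call.
  def rejects(allowSparkContext: Boolean, inTask: Boolean): Boolean =
    Try(gate(allowSparkContext, inTask)).isFailure

  def main(args: Array[String]): Unit = {
    for (allow <- Seq(true, false); inTask <- Seq(true, false)) {
      println(s"allowSparkContext=$allow inTask=$inTask rejected=${rejects(allow, inTask)}")
    }
  }
}
```

In this model, test and production behave the same way, and creating a session on an executor becomes an explicit opt-in through the config instead of a side effect of not running under tests.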