Get SparkSession in executor

spark

The story differs between Spark 2 and Spark 3. Basically, there are two ways to get a SparkSession:

  1. SparkSession.getActiveSession which returns Option[SparkSession]
  2. SparkSession.builder().getOrCreate()

For #1, it always returns None on an executor, according to the Spark source code.
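
A quick way to check this yourself is a small local job that calls getActiveSession on the driver and again inside a task. This is only a sketch: the object name, app name and local[2] master are placeholders I picked, and it assumes spark-sql is on the classpath. Based on the behavior described above, I would expect the driver lookup to be defined and every task lookup to be empty.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical probe; names and master URL are illustrative only.
object ActiveSessionProbe {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("active-session-probe")
      .getOrCreate()

    // Driver side: look up the active session for the current thread.
    println(s"driver sees a session: ${SparkSession.getActiveSession.isDefined}")

    // Task side: the lambda runs inside tasks, where the lookup
    // is expected to come back empty.
    val seenInTasks = spark.sparkContext
      .parallelize(1 to 4, 2)
      .map(_ => SparkSession.getActiveSession.isDefined)
      .collect()
    println(s"tasks see a session: ${seenInTasks.mkString(", ")}")

    spark.stop()
  }
}
```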

For #2, getOrCreate() calls assertOnDriver() before it enters the main code flow. That check differs between Spark 2 and Spark 3.

In Spark 2 (2.4.6), the code is odd as far as I understand it. The exception is thrown only when the function runs in test mode and on an executor. That means a SparkSession can be created on an executor as long as it is not in test mode. This makes little sense: the code will eventually run in production, and the behavior in test and in production is not consistent.
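
To make the condition concrete, here is a simplified, dependency-free model of that check. It is my paraphrase of the logic, not the actual Spark source: the two Boolean parameters are stand-ins I introduced for "Spark is in test mode" (I believe this corresponds to Utils.isTesting) and "we are inside a task" (TaskContext.get != null).

```scala
import scala.util.Try

// Simplified model of the Spark 2.4.x driver check (not the real source).
object DriverCheckSpark2 {
  // isTesting and inTask are hypothetical stand-ins for Spark's own lookups.
  def assertOnDriver(isTesting: Boolean, inTask: Boolean): Unit =
    if (isTesting && inTask) {
      throw new IllegalStateException(
        "SparkSession should only be created and accessed on the driver.")
    }

  def main(args: Array[String]): Unit = {
    // Walk all four combinations and report which ones are rejected.
    for (isTesting <- Seq(true, false); inTask <- Seq(true, false)) {
      val rejected = Try(assertOnDriver(isTesting, inTask)).isFailure
      println(s"isTesting=$isTesting inTask=$inTask rejected=$rejected")
    }
  }
}
```

Only the combination isTesting=true and inTask=true is rejected, which is exactly the test/production mismatch described above.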

In Spark 3 (3.4.1), the spark.executor.allowSparkContext setting was added to configure whether a SparkSession may be created on an executor. assertOnDriver now only checks whether it is running on an executor, which is more reasonable.
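
The same kind of simplified model for Spark 3 looks like this. Again, it is a sketch and not the real source. The parameter names are made up, and it assumes that the driver check is skipped when spark.executor.allowSparkContext is true, which is how I read the code.

```scala
import scala.util.Try

// Simplified model of the Spark 3.x driver check (not the real source).
object DriverCheckSpark3 {
  // inTask is a hypothetical stand-in for `TaskContext.get != null`.
  def assertOnDriver(inTask: Boolean): Unit =
    if (inTask) {
      throw new IllegalStateException(
        "SparkSession should only be created and accessed on the driver.")
    }

  // allowOnExecutor models the value of spark.executor.allowSparkContext.
  // Assumption: the check runs only when the flag is false.
  def checkBeforeGetOrCreate(allowOnExecutor: Boolean, inTask: Boolean): Unit =
    if (!allowOnExecutor) assertOnDriver(inTask)

  def main(args: Array[String]): Unit = {
    for (allow <- Seq(false, true); inTask <- Seq(true, false)) {
      val rejected = Try(checkBeforeGetOrCreate(allow, inTask)).isFailure
      println(s"allowSparkContext=$allow inTask=$inTask rejected=$rejected")
    }
  }
}
```

Here test mode no longer plays a part: a call from inside a task is rejected unless the flag is explicitly turned on.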
