1.SparkContext.scala
When a SparkContext is constructed with new, the statements in its class body run.
One of them creates the TaskScheduler and the SchedulerBackend; the SchedulerBackend is the driver's channel to the outside world, and I think of it as the coarse-grained side of the Driver.
Right after the TaskScheduler is created, it is initialized with scheduler.initialize(backend).
Then the TaskScheduler's M-start is called:
val (sched, ts) = SparkContext.createTaskScheduler(this, master, deployMode)
_schedulerBackend = sched
_taskScheduler = ts
_dagScheduler = new DAGScheduler(this)
_heartbeatReceiver.ask[Boolean](TaskSchedulerIsSet)
 
 
// create and start the heartbeater for collecting memory metrics
_heartbeater = new Heartbeater(env.memoryManager,
  () => SparkContext.this.reportHeartBeat(),
  "driver-heartbeater",
  conf.get(EXECUTOR_HEARTBEAT_INTERVAL))
_heartbeater.start()
 
 
// start TaskScheduler after taskScheduler sets DAGScheduler reference in DAGScheduler's
// constructor
_taskScheduler.start()
 
 
 
 
 
private def createTaskScheduler(
    sc: SparkContext,
    master: String,
    deployMode: String): (SchedulerBackend, TaskScheduler) = {
  import SparkMasterRegex._
 
 
  // When running locally, don't try to re-execute tasks on failure.
  val MAX_LOCAL_TASK_FAILURES = 1
 
 
  master match {
    case "local" =>
      val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
      val backend = new LocalSchedulerBackend(sc.getConf, scheduler, 1)
      scheduler.initialize(backend)
      (backend, scheduler)
 
 
    case LOCAL_N_REGEX(threads) =>
      def localCpuCount: Int = Runtime.getRuntime.availableProcessors()
      // local[*] estimates the number of cores on the machine; local[N] uses exactly N threads.
      val threadCount = if (threads == "*") localCpuCount else threads.toInt
      if (threadCount <= 0) {
        throw new SparkException(s"Asked to run locally with $threadCount threads")
      }
      val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
      val backend = new LocalSchedulerBackend(sc.getConf, scheduler, threadCount)
      scheduler.initialize(backend)
      (backend, scheduler)
 
 
    case LOCAL_N_FAILURES_REGEX(threads, maxFailures) =>
      def localCpuCount: Int = Runtime.getRuntime.availableProcessors()
      // local[*, M] means the number of cores on the computer with M failures
      // local[N, M] means exactly N threads with M failures
      val threadCount = if (threads == "*") localCpuCount else threads.toInt
      val scheduler = new TaskSchedulerImpl(sc, maxFailures.toInt, isLocal = true)
      val backend = new LocalSchedulerBackend(sc.getConf, scheduler, threadCount)
      scheduler.initialize(backend)
      (backend, scheduler)
 
 
    case SPARK_REGEX(sparkUrl) =>
      val scheduler = new TaskSchedulerImpl(sc)
      val masterUrls = sparkUrl.split(",").map("spark://" + _)
      val backend = new StandaloneSchedulerBackend(scheduler, sc, masterUrls)
      scheduler.initialize(backend)
      (backend, scheduler)
 
 
    case LOCAL_CLUSTER_REGEX(numSlaves, coresPerSlave, memoryPerSlave) =>
      // Check to make sure memory requested <= memoryPerSlave. Otherwise Spark will just hang.
      val memoryPerSlaveInt = memoryPerSlave.toInt
      if (sc.executorMemory > memoryPerSlaveInt) {
        throw new SparkException(
          "Asked to launch cluster with %d MiB RAM / worker but requested %d MiB/worker".format(
            memoryPerSlaveInt, sc.executorMemory))
      }
 
 
      val scheduler = new TaskSchedulerImpl(sc)
      val localCluster = new LocalSparkCluster(
        numSlaves.toInt, coresPerSlave.toInt, memoryPerSlaveInt, sc.conf)
      val masterUrls = localCluster.start()
      val backend = new StandaloneSchedulerBackend(scheduler, sc, masterUrls)
      scheduler.initialize(backend)
      backend.shutdownCallback = (backend: StandaloneSchedulerBackend) => {
        localCluster.stop()
      }
      (backend, scheduler)
 
 
    case masterUrl =>
      val cm = getClusterManager(masterUrl) match {
        case Some(clusterMgr) => clusterMgr
        case None => throw new SparkException("Could not parse Master URL: '" + master + "'")
      }
      try {
        val scheduler = cm.createTaskScheduler(sc, masterUrl)
        val backend = cm.createSchedulerBackend(sc, masterUrl, scheduler)
        cm.initialize(scheduler, backend)
        (backend, scheduler)
      } catch {
        case se: SparkException => throw se
        case NonFatal(e) =>
          throw new SparkException("External scheduler cannot be instantiated", e)
      }
  }
}
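
To make the master-URL dispatch above concrete, here is a minimal user program (hypothetical app name, host and settings). Its master URL matches SPARK_REGEX, so the StandaloneSchedulerBackend branch is taken, and the whole chain described in this post is kicked off inside new SparkContext(conf):

import org.apache.spark.{SparkConf, SparkContext}

object LaunchExecutorDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("launch-executor-demo")        // becomes ApplicationDescription.name
      .setMaster("spark://master-host:7077")     // SPARK_REGEX -> StandaloneSchedulerBackend
      .set("spark.executor.memory", "2g")        // checked against each worker's memoryFree
      .set("spark.executor.cores", "2")          // coresPerExecutor used by the Master
    val sc = new SparkContext(conf)              // class body runs: scheduler created and started
    println(sc.parallelize(1 to 100).count())    // tasks eventually run in CoarseGrainedExecutorBackend
    sc.stop()
  }
}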
 
2.TaskSchedulerImpl.scala
M-initialize stores the SchedulerBackend inside TaskSchedulerImpl and builds the schedulable pool (FIFO or FAIR) that later task scheduling draws from.
M-start simply calls start on the SchedulerBackend and, when not running locally, starts the periodic speculation checker.
def initialize(backend: SchedulerBackend) {
  this.backend = backend
  schedulableBuilder = {
    schedulingMode match {
      case SchedulingMode.FIFO =>
        new FIFOSchedulableBuilder(rootPool)
      case SchedulingMode.FAIR =>
        new FairSchedulableBuilder(rootPool, conf)
      case _ =>
        throw new IllegalArgumentException(s"Unsupported $SCHEDULER_MODE_PROPERTY: " +
        s"$schedulingMode")
    }
  }
  schedulableBuilder.buildPools()
}
 
 
override def start() {
  backend.start()
  if (!isLocal && conf.getBoolean("spark.speculation", false)) {
    logInfo("Starting speculative execution thread")
    speculationScheduler.scheduleWithFixedDelay(new Runnable {
      override def run(): Unit = Utils.tryOrStopSparkContext(sc) {
        checkSpeculatableTasks()
      }
    }, SPECULATION_INTERVAL_MS, SPECULATION_INTERVAL_MS, TimeUnit.MILLISECONDS)
  }
}
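
A minimal sketch (standard configuration keys, values are examples) of the user-side settings these two methods react to:

val conf = new org.apache.spark.SparkConf()
  .set("spark.scheduler.mode", "FAIR")   // initialize(): FairSchedulableBuilder instead of the FIFO default
  .set("spark.speculation", "true")      // start(): enables the speculative-execution checker thread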
 
3.StandaloneSchedulerBackend.scala
Calling M-start builds an ApplicationDescription that carries the command used to launch executors:
Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
    args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
It then constructs a StandaloneAppClient with that description and calls start on it.
 
override def start() {
  super.start()
 
 
  // SPARK-21159. The scheduler backend should only try to connect to the launcher when in client
  // mode. In cluster mode, the code that submits the application to the Master needs to connect
  // to the launcher instead.
  if (sc.deployMode == "client") {
    launcherBackend.connect()
  }
 
 
  // The endpoint for executors to talk to us
  val driverUrl = RpcEndpointAddress(
    sc.conf.get("spark.driver.host"),
    sc.conf.get("spark.driver.port").toInt,
    CoarseGrainedSchedulerBackend.ENDPOINT_NAME).toString
  val args = Seq(
    "--driver-url", driverUrl,
    "--executor-id", "{{EXECUTOR_ID}}",
    "--hostname", "{{HOSTNAME}}",
    "--cores", "{{CORES}}",
    "--app-id", "{{APP_ID}}",
    "--worker-url", "{{WORKER_URL}}")
  val extraJavaOpts = sc.conf.getOption("spark.executor.extraJavaOptions")
    .map(Utils.splitCommandString).getOrElse(Seq.empty)
  val classPathEntries = sc.conf.getOption("spark.executor.extraClassPath")
    .map(_.split(java.io.File.pathSeparator).toSeq).getOrElse(Nil)
  val libraryPathEntries = sc.conf.getOption("spark.executor.extraLibraryPath")
    .map(_.split(java.io.File.pathSeparator).toSeq).getOrElse(Nil)
 
 
  // When testing, expose the parent class path to the child. This is processed by
  // compute-classpath.{cmd,sh} and makes all needed jars available to child processes
  // when the assembly is built with the "*-provided" profiles enabled.
  val testingClassPath =
    if (sys.props.contains("spark.testing")) {
      sys.props("java.class.path").split(java.io.File.pathSeparator).toSeq
    } else {
      Nil
    }
 
 
  // Start executors with a few necessary configs for registering with the scheduler
  val sparkJavaOpts = Utils.sparkJavaOpts(conf, SparkConf.isExecutorStartupConf)
  val javaOpts = sparkJavaOpts ++ extraJavaOpts
  val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
    args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
  val webUrl = sc.ui.map(_.webUrl).getOrElse("")
  val coresPerExecutor = conf.getOption("spark.executor.cores").map(_.toInt)
  // If we're using dynamic allocation, set our initial executor limit to 0 for now.
  // ExecutorAllocationManager will send the real initial limit to the Master later.
  val initialExecutorLimit =
    if (Utils.isDynamicAllocationEnabled(conf)) {
      Some(0)
    } else {
      None
    }
  val appDesc = ApplicationDescription(sc.appName, maxCores, sc.executorMemory, command,
    webUrl, sc.eventLogDir, sc.eventLogCodec, coresPerExecutor, initialExecutorLimit)
  client = new StandaloneAppClient(sc.env.rpcEnv, masters, appDesc, this, conf)
  client.start()
  launcherBackend.setState(SparkAppHandle.State.SUBMITTED)
  waitForRegistration()
  launcherBackend.setState(SparkAppHandle.State.RUNNING)
}
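
For reference, a sketch (standard configuration keys, example values) of the settings that feed extraJavaOpts, classPathEntries and libraryPathEntries above; they end up inside the Command and therefore on every executor's command line:

val conf = new org.apache.spark.SparkConf()
  .set("spark.executor.extraJavaOptions", "-XX:+UseG1GC")      // -> extraJavaOpts
  .set("spark.executor.extraClassPath", "/opt/libs/extra.jar") // -> classPathEntries
  .set("spark.executor.extraLibraryPath", "/opt/native")       // -> libraryPathEntries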
 
4.StandaloneAppClient.scala
StandaloneAppClient is the RpcEndpoint through which the Driver communicates with the outside world; anything that wants to talk to the driver must first obtain a reference to that endpoint.
def start() {
  // Just launch an rpcEndpoint; it will call back into the listener.
  endpoint.set(rpcEnv.setupEndpoint("AppClient", new ClientEndpoint(rpcEnv)))
}
 
5.ClientEndpoint.scala (an inner class of StandaloneAppClient)
ClientEndpoint is an RPC endpoint, effectively the Driver's server side in this exchange.
Once started, its M-onStart runs and registers with the Master; the RegisterApplication message it sends carries the appDescription, which includes the command for launching Executors.
private class ClientEndpoint(override val rpcEnv: RpcEnv) extends ThreadSafeRpcEndpoint
 
override def onStart(): Unit = {
  try {
    registerWithMaster(1)
  } catch {
    case e: Exception =>
      logWarning("Failed to connect to master", e)
      markDisconnected()
      stop()
  }
}
 
 
private def tryRegisterAllMasters(): Array[JFuture[_]] = {
  for (masterAddress <- masterRpcAddresses) yield {
    registerMasterThreadPool.submit(new Runnable {
      override def run(): Unit = try {
        if (registered.get) {
          return
        }
        logInfo("Connecting to master " + masterAddress.toSparkURL + "...")
        val masterRef = rpcEnv.setupEndpointRef(masterAddress, Master.ENDPOINT_NAME)
        masterRef.send(RegisterApplication(appDescription, self))
      } catch {
        case ie: InterruptedException => // Cancelled
        case NonFatal(e) => logWarning(s"Failed to connect to master $masterAddress", e)
      }
    })
  }
}
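
Side note: with a high-availability standalone cluster, several masters can be listed in one URL (hostnames below are hypothetical). createTaskScheduler splits it on "," and tryRegisterAllMasters() above submits one registration attempt per master; the first successful registration wins.

val conf = new org.apache.spark.SparkConf()
  .setMaster("spark://master-a:7077,master-b:7077")  // masterRpcAddresses gets two entries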
 
6.Master
The RegisterApplication handler invokes registerApplication and then schedule.
registerApplication caches the application in the Master's bookkeeping structures and adds it to waitingApps.
schedule calls startExecutorsOnWorkers (shown below trimmed to the executor path; the full method also launches any waiting drivers first).
startExecutorsOnWorkers decides how many cores each usable worker gets,
allocateWorkerResourceToExecutors turns that allocation into executors on a worker (see the worked example after its code), and
launchExecutor sends the LaunchExecutor command to the Worker to start the executor.
case RegisterApplication(description, driver) =>
  // TODO Prevent repeated registrations from some driver
  if (state == RecoveryState.STANDBY) {
    // ignore, don't send response
  } else {
    logInfo("Registering app " + description.name)
    val app = createApplication(description, driver)
    registerApplication(app)
    logInfo("Registered app " + description.name + " with ID " + app.id)
    persistenceEngine.addApplication(app)
    driver.send(RegisteredApplication(app.id, self))
    schedule()
  }
 
 
private def registerApplication(app: ApplicationInfo): Unit = {
  val appAddress = app.driver.address
  if (addressToApp.contains(appAddress)) {
    logInfo("Attempted to re-register application at same address: " + appAddress)
    return
  }
 
  applicationMetricsSystem.registerSource(app.appSource)
  apps += app
  idToApp(app.id) = app
  endpointToApp(app.driver) = app
  addressToApp(appAddress) = app
  waitingApps += app
}
 
 
 
private def schedule(): Unit = {
  startExecutorsOnWorkers()
}
 
 
 
private def startExecutorsOnWorkers(): Unit = {
  // Right now this is a very simple FIFO scheduler. We keep trying to fit in the first app
  // in the queue, then the second app, etc.
  for (app <- waitingApps) {
    val coresPerExecutor = app.desc.coresPerExecutor.getOrElse(1)
    // If the cores left is less than the coresPerExecutor,the cores left will not be allocated
    if (app.coresLeft >= coresPerExecutor) {
      // Filter out workers that don't have enough resources to launch an executor
      val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
        .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
          worker.coresFree >= coresPerExecutor)
        .sortBy(_.coresFree).reverse
      val assignedCores = scheduleExecutorsOnWorkers(app, usableWorkers, spreadOutApps)
 
 
      // Now that we've decided how many cores to allocate on each worker, let's allocate them
      for (pos <- 0 until usableWorkers.length if assignedCores(pos) > 0) {
        allocateWorkerResourceToExecutors(
          app, assignedCores(pos), app.desc.coresPerExecutor, usableWorkers(pos))
      }
    }
  }
}
 
 
private def allocateWorkerResourceToExecutors(
    app: ApplicationInfo,
    assignedCores: Int,
    coresPerExecutor: Option[Int],
    worker: WorkerInfo): Unit = {
  // If the number of cores per executor is specified, we divide the cores assigned
  // to this worker evenly among the executors with no remainder.
  // Otherwise, we launch a single executor that grabs all the assignedCores on this worker.
  val numExecutors = coresPerExecutor.map { assignedCores / _ }.getOrElse(1)
  val coresToAssign = coresPerExecutor.getOrElse(assignedCores)
  for (i <- 1 to numExecutors) {
    val exec = app.addExecutor(worker, coresToAssign)
    launchExecutor(worker, exec)
    app.state = ApplicationState.RUNNING
  }
}
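
A worked example (numbers are hypothetical) of the arithmetic in allocateWorkerResourceToExecutors:

val assignedCores = 6                          // cores the Master assigned on this worker
val coresPerExecutor: Option[Int] = Some(2)    // spark.executor.cores, if set

val numExecutors  = coresPerExecutor.map(assignedCores / _).getOrElse(1)  // 6 / 2 = 3 executors
val coresToAssign = coresPerExecutor.getOrElse(assignedCores)             // 2 cores each
// With coresPerExecutor = None the same worker would get a single executor holding all 6 cores.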
 
 
 
private def launchExecutor(worker: WorkerInfo, exec: ExecutorDesc): Unit = {
  logInfo("Launching executor " + exec.fullId + " on worker " + worker.id)
  worker.addExecutor(exec)
  worker.endpoint.send(LaunchExecutor(masterUrl,
    exec.application.id, exec.id, exec.application.desc, exec.cores, exec.memory))
  exec.application.driver.send(
    ExecutorAdded(exec.id, worker.id, worker.hostPort, exec.cores, exec.memory))
}
 
 
7.Worker
case LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) =>
  if (masterUrl != activeMasterUrl) {
    logWarning("Invalid Master (" + masterUrl + ") attempted to launch executor.")
  } else {
    try {
      logInfo("Asked to launch executor %s/%d for %s".format(appId, execId, appDesc.name))
      // Create the executor's working directory
      val executorDir = new File(workDir, appId + "/" + execId)
      if (!executorDir.mkdirs()) {
        throw new IOException("Failed to create directory " + executorDir)
      }
 
 
      // Create local dirs for the executor. These are passed to the executor via the
      // SPARK_EXECUTOR_DIRS environment variable, and deleted by the Worker when the
      // application finishes.
      val appLocalDirs = appDirectories.getOrElse(appId, {
        val localRootDirs = Utils.getOrCreateLocalRootDirs(conf)
        val dirs = localRootDirs.flatMap { dir =>
          try {
            val appDir = Utils.createDirectory(dir, namePrefix = "executor")
            Utils.chmod700(appDir)
            Some(appDir.getAbsolutePath())
          } catch {
            case e: IOException =>
              logWarning(s"${e.getMessage}. Ignoring this directory.")
              None
          }
        }.toSeq
        if (dirs.isEmpty) {
          throw new IOException("No subfolder can be created in " +
            s"${localRootDirs.mkString(",")}.")
        }
        dirs
      })
      appDirectories(appId) = appLocalDirs
      val manager = new ExecutorRunner(
        appId,
        execId,
        appDesc.copy(command = Worker.maybeUpdateSSLSettings(appDesc.command, conf)),
        cores_,
        memory_,
        self,
        workerId,
        host,
        webUi.boundPort,
        publicAddress,
        sparkHome,
        executorDir,
        workerUri,
        conf,
        appLocalDirs, ExecutorState.RUNNING)
      executors(appId + "/" + execId) = manager
      manager.start()
      coresUsed += cores_
      memoryUsed += memory_
      sendToMaster(ExecutorStateChanged(appId, execId, manager.state, None, None))
    } catch {
      case e: Exception =>
        logError(s"Failed to launch executor $appId/$execId for ${appDesc.name}.", e)
        if (executors.contains(appId + "/" + execId)) {
          executors(appId + "/" + execId).kill()
          executors -= appId + "/" + execId
        }
        sendToMaster(ExecutorStateChanged(appId, execId, ExecutorState.FAILED,
          Some(e.toString), None))
    }
  }
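
A small sketch (hypothetical work dir and ids) of the directory the handler above creates; the stdout/stderr files written in section 8 land in it:

import java.io.File

val workDir = new File("/opt/spark/work")   // the Worker's work directory (assumption)
val appId   = "app-20190101120000-0000"     // hypothetical application id
val execId  = 0
val executorDir = new File(workDir, appId + "/" + execId)
// -> /opt/spark/work/app-20190101120000-0000/0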
 
8.ExecutorRunner (the executor manager, i.e. the class that launches the executor process)
Recall that in StandaloneSchedulerBackend the launch command was built as:
val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
    args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
  val webUrl = sc.ui.map(_.webUrl).getOrElse("")
 
The command run by M-fetchAndRunExecutor is exactly this command from StandaloneSchedulerBackend.
private[worker] def start() {
  workerThread = new Thread("ExecutorRunner for " + fullId) {
    override def run() { fetchAndRunExecutor() }
  }
  workerThread.start()
  // Shutdown hook that kills actors on shutdown.
  shutdownHook = ShutdownHookManager.addShutdownHook { () =>
    // It's possible that we arrive here before calling `fetchAndRunExecutor`, then `state` will
    // be `ExecutorState.RUNNING`. In this case, we should set `state` to `FAILED`.
    if (state == ExecutorState.RUNNING) {
      state = ExecutorState.FAILED
    }
    killProcess(Some("Worker shutting down")) }
}
 
 
 
 
private def fetchAndRunExecutor() {
  try {
    // Launch the process
    val subsOpts = appDesc.command.javaOpts.map {
      Utils.substituteAppNExecIds(_, appId, execId.toString)
    }
    val subsCommand = appDesc.command.copy(javaOpts = subsOpts)
    val builder = CommandUtils.buildProcessBuilder(subsCommand, new SecurityManager(conf),
      memory, sparkHome.getAbsolutePath, substituteVariables)
    val command = builder.command()
    val formattedCommand = command.asScala.mkString("\"", "\" \"", "\"")
    logInfo(s"Launch command: $formattedCommand")
 
 
    builder.directory(executorDir)
    builder.environment.put("SPARK_EXECUTOR_DIRS", appLocalDirs.mkString(File.pathSeparator))
    // In case we are running this from within the Spark Shell, avoid creating a "scala"
    // parent process for the executor command
    builder.environment.put("SPARK_LAUNCH_WITH_SCALA", "0")
 
 
    // Add webUI log urls
    val baseUrl =
      if (conf.getBoolean("spark.ui.reverseProxy", false)) {
        s"/proxy/$workerId/logPage/?appId=$appId&executorId=$execId&logType="
      } else {
        s"http://$publicAddress:$webUiPort/logPage/?appId=$appId&executorId=$execId&logType="
      }
    builder.environment.put("SPARK_LOG_URL_STDERR", s"${baseUrl}stderr")
    builder.environment.put("SPARK_LOG_URL_STDOUT", s"${baseUrl}stdout")
 
 
    process = builder.start()
    val header = "Spark Executor Command: %s\n%s\n\n".format(
      formattedCommand, "=" * 40)
 
 
    // Redirect its stdout and stderr to files
    val stdout = new File(executorDir, "stdout")
    stdoutAppender = FileAppender(process.getInputStream, stdout, conf)
 
 
    val stderr = new File(executorDir, "stderr")
    Files.write(header, stderr, StandardCharsets.UTF_8)
    stderrAppender = FileAppender(process.getErrorStream, stderr, conf)
 
 
    // Wait for it to exit; executor may exit with code 0 (when driver instructs it to shutdown)
    // or with nonzero exit code
    val exitCode = process.waitFor()
    state = ExecutorState.EXITED
    val message = "Command exited with code " + exitCode
    worker.send(ExecutorStateChanged(appId, execId, state, Some(message), Some(exitCode)))
  } catch {
    case interrupted: InterruptedException =>
      logInfo("Runner thread for executor " + fullId + " interrupted")
      state = ExecutorState.KILLED
      killProcess(None)
    case e: Exception =>
      logError("Error running executor", e)
      state = ExecutorState.FAILED
      killProcess(Some(e.toString))
  }
}
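
The substituteVariables argument passed to buildProcessBuilder is what fills in the {{EXECUTOR_ID}}, {{HOSTNAME}}, {{CORES}}, {{APP_ID}} and {{WORKER_URL}} placeholders from section 3. An illustrative sketch of that substitution (not Spark's actual implementation):

// Illustrative only: replaces the template placeholders in one argument
// with this executor's concrete values before the process is launched.
def substitute(arg: String, execId: String, host: String,
               cores: Int, appId: String, workerUrl: String): String =
  arg.replace("{{EXECUTOR_ID}}", execId)
    .replace("{{HOSTNAME}}", host)
    .replace("{{CORES}}", cores.toString)
    .replace("{{APP_ID}}", appId)
    .replace("{{WORKER_URL}}", workerUrl)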
 
9. CoarseGrainedExecutorBackend
The coarse-grained Executor: it mainly communicates with the Driver and is itself an RPC server; inside it, the fine-grained Executor is created, which runs tasks on a thread pool (see the sketch after the class at the end of this section).
Note: each CoarseGrainedExecutorBackend has exactly one SparkEnv:
val env = SparkEnv.createExecutorEnv(
      driverConf, executorId, hostname, cores, cfg.ioEncryptionKey, isLocal = false)
 
def main(args: Array[String]) {
  var driverUrl: String = null
  var executorId: String = null
  var hostname: String = null
  var cores: Int = 0
  var appId: String = null
  var workerUrl: Option[String] = None
  val userClassPath = new mutable.ListBuffer[URL]()
 
 
  var argv = args.toList
  while (!argv.isEmpty) {
    argv match {
      case ("--driver-url") :: value :: tail =>
        driverUrl = value
        argv = tail
      case ("--executor-id") :: value :: tail =>
        executorId = value
        argv = tail
      case ("--hostname") :: value :: tail =>
        hostname = value
        argv = tail
      case ("--cores") :: value :: tail =>
        cores = value.toInt
        argv = tail
      case ("--app-id") :: value :: tail =>
        appId = value
        argv = tail
      case ("--worker-url") :: value :: tail =>
        // Worker url is used in spark standalone mode to enforce fate-sharing with worker
        workerUrl = Some(value)
        argv = tail
      case ("--user-class-path") :: value :: tail =>
        userClassPath += new URL(value)
        argv = tail
      case Nil =>
      case tail =>
        // scalastyle:off println
        System.err.println(s"Unrecognized options: ${tail.mkString(" ")}")
        // scalastyle:on println
        printUsageAndExit()
    }
  }
 
 
  if (driverUrl == null || executorId == null || hostname == null || cores <= 0 ||
    appId == null) {
    printUsageAndExit()
  }
 
 
  run(driverUrl, executorId, hostname, cores, appId, workerUrl, userClassPath)
  System.exit(0)
}
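
Putting sections 3, 6, 7 and 8 together, the process the Worker spawns invokes this main roughly as follows (all values hypothetical, classpath and JVM flags omitted):

// java ... org.apache.spark.executor.CoarseGrainedExecutorBackend \
//   --driver-url spark://CoarseGrainedScheduler@192.168.1.10:35001 \
//   --executor-id 0 \
//   --hostname worker-1 \
//   --cores 2 \
//   --app-id app-20190101120000-0000 \
//   --worker-url spark://Worker@worker-1:41234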
 
 
 
private def run(
    driverUrl: String,
    executorId: String,
    hostname: String,
    cores: Int,
    appId: String,
    workerUrl: Option[String],
    userClassPath: Seq[URL]) {
 
 
  Utils.initDaemon(log)
 
 
  SparkHadoopUtil.get.runAsSparkUser { () =>
    // Debug code
    Utils.checkHost(hostname)
 
 
    // Bootstrap to fetch the driver's Spark properties.
    val executorConf = new SparkConf
    val fetcher = RpcEnv.create(
      "driverPropsFetcher",
      hostname,
      -1,
      executorConf,
      new SecurityManager(executorConf),
      clientMode = true)
    val driver = fetcher.setupEndpointRefByURI(driverUrl)
    val cfg = driver.askSync[SparkAppConfig](RetrieveSparkAppConfig)
    val props = cfg.sparkProperties ++ Seq[(String, String)](("spark.app.id", appId))
    fetcher.shutdown()
 
 
    // Create SparkEnv using properties we fetched from the driver.
    val driverConf = new SparkConf()
    for ((key, value) <- props) {
      // this is required for SSL in standalone mode
      if (SparkConf.isExecutorStartupConf(key)) {
        driverConf.setIfMissing(key, value)
      } else {
        driverConf.set(key, value)
      }
    }
 
 
    cfg.hadoopDelegationCreds.foreach { tokens =>
      SparkHadoopUtil.get.addDelegationTokens(tokens, driverConf)
    }
 
 
    val env = SparkEnv.createExecutorEnv(
      driverConf, executorId, hostname, cores, cfg.ioEncryptionKey, isLocal = false)
 
 
    env.rpcEnv.setupEndpoint("Executor", new CoarseGrainedExecutorBackend(
      env.rpcEnv, driverUrl, executorId, hostname, cores, userClassPath, env))
    workerUrl.foreach { url =>
      env.rpcEnv.setupEndpoint("WorkerWatcher", new WorkerWatcher(env.rpcEnv, url))
    }
    env.rpcEnv.awaitTermination()
  }
}
 
private[spark] class CoarseGrainedExecutorBackend(
    override val rpcEnv: RpcEnv,
    driverUrl: String,
    executorId: String,
    hostname: String,
    cores: Int,
    userClassPath: Seq[URL],
    env: SparkEnv)
  extends ThreadSafeRpcEndpoint with ExecutorBackend with Logging {
 
  private[this] val stopping = new AtomicBoolean(false)
  var executor: Executor = null
  @volatile var driver: Option[RpcEndpointRef] = None
 
  // If this CoarseGrainedExecutorBackend is changed to support multiple threads, then this may need
  // to be changed so that we don't share the serializer instance across threads
  private[this] val ser: SerializerInstance = env.closureSerializer.newInstance()
 
  override def onStart() {
    logInfo("Connecting to driver: " + driverUrl)
    rpcEnv.asyncSetupEndpointRefByURI(driverUrl).flatMap { ref =>
      // This is a very fast action so we can use "ThreadUtils.sameThread"
      driver = Some(ref)
      ref.ask[Boolean](RegisterExecutor(executorId, self, hostname, cores, extractLogUrls))
    }(ThreadUtils.sameThread).onComplete {
      // This is a very fast action so we can use "ThreadUtils.sameThread"
      case Success(msg) =>
        // Always receive `true`. Just ignore it
      case Failure(e) =>
        exitExecutor(1, s"Cannot register with driver: $driverUrl", e, notifyDriver = false)
    }(ThreadUtils.sameThread)
  }
 
  def extractLogUrls: Map[String, String] = {
    val prefix = "SPARK_LOG_URL_"
    sys.env.filterKeys(_.startsWith(prefix))
      .map(e => (e._1.substring(prefix.length).toLowerCase(Locale.ROOT), e._2))
  }
 
  override def receive: PartialFunction[Any, Unit] = {
    case RegisteredExecutor =>
      logInfo("Successfully registered with driver")
      try {
        executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)
      } catch {
        case NonFatal(e) =>
          exitExecutor(1, "Unable to create executor due to " + e.getMessage, e)
      }
 
    case RegisterExecutorFailed(message) =>
      exitExecutor(1, "Slave registration failed: " + message)
 
    case LaunchTask(data) =>
      if (executor == null) {
        exitExecutor(1, "Received LaunchTask command but executor was null")
      } else {
        val taskDesc = TaskDescription.decode(data.value)
        logInfo("Got assigned task " + taskDesc.taskId)
        executor.launchTask(this, taskDesc)
      }
 
    case KillTask(taskId, _, interruptThread, reason) =>
      if (executor == null) {
        exitExecutor(1, "Received KillTask command but executor was null")
      } else {
        executor.killTask(taskId, interruptThread, reason)
      }
 
    case StopExecutor =>
      stopping.set(true)
      logInfo("Driver commanded a shutdown")
      // Cannot shutdown here because an ack may need to be sent back to the caller. So send
      // a message to self to actually do the shutdown.
      self.send(Shutdown)
 
    case Shutdown =>
      stopping.set(true)
      new Thread("CoarseGrainedExecutorBackend-stop-executor") {
        override def run(): Unit = {
          // executor.stop() will call `SparkEnv.stop()` which waits until RpcEnv stops totally.
          // However, if `executor.stop()` runs in some thread of RpcEnv, RpcEnv won't be able to
          // stop until `executor.stop()` returns, which becomes a dead-lock (See SPARK-14180).
          // Therefore, we put this line in a new thread.
          executor.stop()
        }
      }.start()
  }
 
  override def onDisconnected(remoteAddress: RpcAddress): Unit = {
    if (stopping.get()) {
      logInfo(s"Driver from $remoteAddress disconnected during shutdown")
    } else if (driver.exists(_.address == remoteAddress)) {
      exitExecutor(1, s"Driver $remoteAddress disassociated! Shutting down.", null,
        notifyDriver = false)
    } else {
      logWarning(s"An unknown ($remoteAddress) driver disconnected.")
    }
  }
 
  override def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer) {
    val msg = StatusUpdate(executorId, taskId, state, data)
    driver match {
      case Some(driverRef) => driverRef.send(msg)
      case None => logWarning(s"Drop $msg because has not yet connected to driver")
    }
  }
 
  /**
   * This function can be overloaded by other child classes to handle
   * executor exits differently. For e.g. when an executor goes down,
   * back-end may not want to take the parent process down.
   */
  protected def exitExecutor(code: Int,
                             reason: String,
                             throwable: Throwable = null,
                             notifyDriver: Boolean = true) = {
    val message = "Executor self-exiting due to : " + reason
    if (throwable != null) {
      logError(message, throwable)
    } else {
      logError(message)
    }
 
    if (notifyDriver && driver.nonEmpty) {
      driver.get.ask[Boolean](
        RemoveExecutor(executorId, new ExecutorLossReason(reason))
      ).onFailure { case e =>
        logWarning(s"Unable to notify the driver due to " + e.getMessage, e)
      }(ThreadUtils.sameThread)
    }
 
    System.exit(code)
  }
}
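
The LaunchTask case above hands the decoded TaskDescription to executor.launchTask. A schematic sketch (not the real Executor class) of what that does: wrap the task in a Runnable and submit it to a thread pool, which is why one CoarseGrainedExecutorBackend can run many tasks concurrently:

import java.util.concurrent.Executors

// Schematic stand-in for org.apache.spark.executor.Executor: real Spark wraps the
// task in a TaskRunner and reports results back through statusUpdate.
class MiniExecutor {
  private val threadPool = Executors.newCachedThreadPool()

  def launchTask(task: () => Unit): Unit =
    threadPool.submit(new Runnable {
      override def run(): Unit = task()
    })
}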
 
