spark1.1.0源码阅读-taskScheduler
1. sparkContext中设置createTaskScheduler
case "yarn-standalone" | "yarn-cluster" =>
if (master == "yarn-standalone") {
logWarning(
"\"yarn-standalone\" is deprecated as of Spark 1.0. Use \"yarn-cluster\" instead.")
}
val scheduler = try {
val clazz = Class.forName("org.apache.spark.scheduler.cluster.YarnClusterScheduler")
val cons = clazz.getConstructor(classOf[SparkContext])
cons.newInstance(sc).asInstanceOf[TaskSchedulerImpl]
} catch {
// TODO: Enumerate the exact reasons why it can fail
// But irrespective of it, it means we cannot proceed !
case e: Exception => {
throw new SparkException("YARN mode not available ?", e)
}
}
val backend = new CoarseGrainedSchedulerBackend(scheduler, sc.env.actorSystem)
scheduler.initialize(backend) //调用实现类的initialize函数
scheduler
在taskSchedulerImpl.scala中
def initialize(backend: SchedulerBackend) {
this.backend = backend
// temporarily set rootPool name to empty
rootPool = new Pool("", schedulingMode, 0, 0)
schedulableBuilder = {
schedulingMode match {
case SchedulingMode.FIFO =>
new FIFOSchedulableBuilder(rootPool)
case SchedulingMode.FAIR =>
new FairSchedulableBuilder(rootPool, conf)
}
}
schedulableBuilder.buildPools()
}
2. submitTasks
override def submitTasks(taskSet: TaskSet) {
val tasks = taskSet.tasks
logInfo("Adding task set " + taskSet.id + " with " + tasks.length + " tasks")
this.synchronized {
val manager = new TaskSetManager(this, taskSet, maxTaskFailures)
activeTaskSets(taskSet.id) = manager
schedulableBuilder.addTaskSetManager(manager, manager.taskSet.properties) if (!isLocal && !hasReceivedTask) {
starvationTimer.scheduleAtFixedRate(new TimerTask() {
override def run() {
if (!hasLaunchedTask) {
logWarning("Initial job has not accepted any resources; " +
"check your cluster UI to ensure that workers are registered " +
"and have sufficient memory")
} else {
this.cancel()
}
}
}, STARVATION_TIMEOUT, STARVATION_TIMEOUT)
}
hasReceivedTask = true
}
backend.reviveOffers()
}
3. CoarseGrainedSchedulerBackend的reviveOffers
override def reviveOffers() {
driverActor ! ReviveOffers //将msg发给CoarseGrainedSchedulerBackend的driverActor
}
case ReviveOffers =>
makeOffers()
// Make fake resource offers on all executors
def makeOffers() {
launchTasks(scheduler.resourceOffers(
executorHost.toArray.map {case (id, host) => new WorkerOffer(id, host, freeCores(id))}))
}
/**
2 * Represents free resources available on an executor.
*/
private[spark]
case class WorkerOffer(executorId: String, host: String, cores: Int)
1 /**
2 * Called by cluster manager to offer resources on slaves. We respond by asking our active task
3 * sets for tasks in order of priority. We fill each node with tasks in a round-robin manner so
4 * that tasks are balanced across the cluster.
5 */
def resourceOffers(offers: Seq[WorkerOffer]): Seq[Seq[TaskDescription]] = synchronized {
SparkEnv.set(sc.env) // Mark each slave as alive and remember its hostname
for (o <- offers) {
executorIdToHost(o.executorId) = o.host
if (!executorsByHost.contains(o.host)) {
executorsByHost(o.host) = new HashSet[String]()
executorAdded(o.executorId, o.host)
}
} // Randomly shuffle offers to avoid always placing tasks on the same set of workers.
val shuffledOffers = Random.shuffle(offers)
// Build a list of tasks to assign to each worker.
21 val tasks = shuffledOffers.map(o => new ArrayBuffer[TaskDescription](o.cores))
val availableCpus = shuffledOffers.map(o => o.cores).toArray
val sortedTaskSets = rootPool.getSortedTaskSetQueue
for (taskSet <- sortedTaskSets) {
logDebug("parentName: %s, name: %s, runningTasks: %s".format(
taskSet.parent.name, taskSet.name, taskSet.runningTasks))
} // Take each TaskSet in our scheduling order, and then offer it each node in increasing order
// of locality levels so that it gets a chance to launch local tasks on all of them.
var launchedTask = false
for (taskSet <- sortedTaskSets; maxLocality <- TaskLocality.values) {
do {
launchedTask = false
for (i <- 0 until shuffledOffers.size) {
val execId = shuffledOffers(i).executorId
val host = shuffledOffers(i).host
if (availableCpus(i) >= CPUS_PER_TASK) {
for (task <- taskSet.resourceOffer(execId, host, maxLocality)) {
tasks(i) += task
val tid = task.taskId
taskIdToTaskSetId(tid) = taskSet.taskSet.id
taskIdToExecutorId(tid) = execId
activeExecutorIds += execId
executorsByHost(host) += execId
availableCpus(i) -= CPUS_PER_TASK
assert (availableCpus(i) >= 0)
launchedTask = true
}
}
}
} while (launchedTask)
} if (tasks.size > 0) {
hasLaunchedTask = true
}
58 return tasks
}
4. launchTasks
// Launch tasks returned by a set of resource offers
def launchTasks(tasks: Seq[Seq[TaskDescription]]) {
for (task <- tasks.flatten) {
freeCores(task.executorId) -= scheduler.CPUS_PER_TASK
executorActor(task.executorId) ! LaunchTask(task)
}
}
class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: ActorSystem)
extends SchedulerBackend with Logging
{
// Use an atomic variable to track total number of cores in the cluster for simplicity and speed
var totalCoreCount = new AtomicInteger(0)
val conf = scheduler.sc.conf
private val timeout = AkkaUtils.askTimeout(conf) class DriverActor(sparkProperties: Seq[(String, String)]) extends Actor {
private val executorActor = new HashMap[String, ActorRef]
private val executorAddress = new HashMap[String, Address]
private val executorHost = new HashMap[String, String]
private val freeCores = new HashMap[String, Int]
private val totalCores = new HashMap[String, Int]
private val addressToExecutorId = new HashMap[Address, String]
// Driver to executors
case class LaunchTask(task: TaskDescription) extends CoarseGrainedClusterMessage
private[spark] class TaskDescription(
2 val taskId: Long,
3 val executorId: String,
4 val name: String,
5 val index: Int, // Index within this task's TaskSet
6 _serializedTask: ByteBuffer)
7 extends Serializable { // Because ByteBuffers are not serializable, wrap the task in a SerializableBuffer
private val buffer = new SerializableBuffer(_serializedTask) def serializedTask: ByteBuffer = buffer.value override def toString: String = "TaskDescription(TID=%d, index=%d)".format(taskId, index)
}
5. CoarseGrainedSchedulerBackend收到executor的注册之后,记录executor
def receive = {
case RegisterExecutor(executorId, hostPort, cores) =>
Utils.checkHostPort(hostPort, "Host port expected " + hostPort)
if (executorActor.contains(executorId)) {
sender ! RegisterExecutorFailed("Duplicate executor ID: " + executorId)
} else {
logInfo("Registered executor: " + sender + " with ID " + executorId)
sender ! RegisteredExecutor(sparkProperties)
executorActor(executorId) = sender
executorHost(executorId) = Utils.parseHostPort(hostPort)._1
totalCores(executorId) = cores
freeCores(executorId) = cores
executorAddress(executorId) = sender.path.address
addressToExecutorId(sender.path.address) = executorId
totalCoreCount.addAndGet(cores)
makeOffers()
}
executor先向CoarseGrainedSchedulerBackend注册,然后CoarseGrainedSchedulerBackend发task(序列化后)到这个executor上去。
6. CoarseGrainedExecutorBackend跟CoarseGrainedSchedulerBackend通信。
private[spark] class CoarseGrainedExecutorBackend(
driverUrl: String,
executorId: String,
hostPort: String,
cores: Int,
sparkProperties: Seq[(String, String)])
7 extends Actor with ActorLogReceive with ExecutorBackend with Logging { Utils.checkHostPort(hostPort, "Expected hostport") var executor: Executor = null
var driver: ActorSelection = null override def preStart() {
logInfo("Connecting to driver: " + driverUrl)
driver = context.actorSelection(driverUrl)
17 driver ! RegisterExecutor(executorId, hostPort, cores) //注册
context.system.eventStream.subscribe(self, classOf[RemotingLifecycleEvent])
} override def receiveWithLogging = {
case RegisteredExecutor =>
logInfo("Successfully registered with driver")
// Make this host instead of hostPort ?
executor = new Executor(executorId, Utils.parseHostPort(hostPort)._1, sparkProperties,
false) case RegisterExecutorFailed(message) =>
logError("Slave registration failed: " + message)
System.exit(1) case LaunchTask(data) => //收到task
if (executor == null) {
logError("Received LaunchTask command but executor was null")
System.exit(1)
} else {
val ser = SparkEnv.get.closureSerializer.newInstance()
val taskDesc = ser.deserialize[TaskDescription](data.value)
logInfo("Got assigned task " + taskDesc.taskId)
executor.launchTask(this, taskDesc.taskId, taskDesc.name, taskDesc.serializedTask)
}
7. executor.launchTask
def launchTask(
context: ExecutorBackend, taskId: Long, taskName: String, serializedTask: ByteBuffer) {
val tr = new TaskRunner(context, taskId, taskName, serializedTask)
runningTasks.put(taskId, tr)
threadPool.execute(tr)
}
且听下回分解
spark1.1.0源码阅读-taskScheduler的更多相关文章
- spark1.1.0源码阅读-dagscheduler and stage
1. rdd action ->sparkContext.runJob->dagscheduler.runJob def runJob[T, U: ClassTag]( rdd: RDD[ ...
- spark1.1.0源码阅读-executor
1. executor上执行launchTask def launchTask( context: ExecutorBackend, taskId: Long, taskName: String, s ...
- Yii2.0源码阅读-一次请求的完整过程
Yii2.0框架源码阅读,从请求发起,到结束的运行步骤 其实最初阅读是从yii\web\UrlManager这个类开始看起,不断的寻找这个类中方法的调用者,最终回到了yii\web\Applicati ...
- Vue2.0源码阅读笔记(四):nextTick
在阅读 nextTick 的源码之前,要先弄明白 JS 执行环境运行机制,介绍 JS 执行环境的事件循环机制的文章很多,大部分都阐述的比较笼统,甚至有些文章说的是错误的,以下为个人理解,如有错误, ...
- Vue2.0源码阅读笔记--生命周期
一.Vue2.0的生命周期 Vue2.0的整个生命周期有八个:分别是 1.beforeCreate,2.created,3.beforeMount,4.mounted,5.beforeUpdate,6 ...
- Vue2.0源码阅读笔记--双向绑定实现原理
上一篇 文章 了解了Vue.js的生命周期.这篇分析Observe Data过程,了解Vue.js的双向数据绑定实现原理. 一.实现双向绑定的做法 前端MVVM最令人激动的就是双向绑定机制了,实现双向 ...
- Yii2.0源码阅读-从路由到控制器
之前的文章弄清了一次请求的开始到结束.主要讲了Yii Applicaton实例的创建.初始化,UrlManager如何返回Yii中的路由信息,到runAction,最后将Response发送给客户端. ...
- Yii2.0源码阅读-视图(View)渲染过程
之前的文章我们根据源码的分析,弄清了Yii如何处理一次请求,以及根据解析的路由如何调用控制器中的action,那接下来好奇的可能就是,我在控制器action中执行了return $this->r ...
- Vue2.0源码阅读笔记(二):响应式原理
Vue是数据驱动的框架,在修改数据时,视图会进行更新.数据响应式系统使得状态管理变的简单直接,在开发过程中减少与DOM元素的接触.而深入学习其中的原理十分有必要,能够回避一些常见的问题,使开发变的 ...
随机推荐
- F - Rain on your Parade - hdu 2389(二分图匹配,Hk算法)
题意:给一些人和一些伞的坐标,然后每个人都有一定的速度,还有多少时间就会下雨,问最多能有多少人可以拿到伞. 分析:题意很明确,可以用每个人和伞判断一下是否能够达到,如果能就建立一个联系.不过这道题的数 ...
- I - Navigation Nightmare-poj 1984
约翰和他的邻居生活在一个村庄里,他们的道路修建的很特别,都是正东正西或者正南正北,但是呢他们用一种方式描述他们和邻居的位置,比如说 6号 在1号 东面13处,那么我们就可以计算出来这两家的曼哈顿距离, ...
- 573 The Snail(蜗牛)
The Snail A snail is at the bottom of a 6-foot well and wants to climb to the top. The snail can ...
- java中基于TaskEngine类封装实现定时任务
主要包括如下几个类: 文章标题:java中基于TaskEngine类封装实现定时任务 文章地址: http://blog.csdn.net/5iasp/article/details/10950529 ...
- Android学习–Android app 语言切换功能
功能: app用户根据自己的语言喜好,设置app语言.语言设置只针对本app,并在下次启动应用时保留前一次启动设置. 更新语言: public static void changeAppLanguag ...
- C# 将MSMQ消息转换成Json格式 【优化】
C# 将MSMQ消息转换成Json格式 [优化] 转换函数: private string ConvertToJSON(string label, string body) { //TODO: co ...
- cas 官方文档
1. 架构 http://jasig.github.io/cas/4.0.0/planning/Architecture.html System Components The CAS server a ...
- automatically select architectures
各位在用XCode 5.x 打开用XCode 4.x 创建的项目时候.会遇到编译器警告automatically select architectures. 1. This is because th ...
- JavaScript基础(二)
一.外部引用语法<script src="script.js"></script> 二.在页面中的位置 1.我们可以将JavaScript代码放在html文 ...
- hdu 2190
//hdu2190 水题 题意是给一个n*3的教室,用1*1,2*2的砖去铺满,有多少种铺法,一开始没发现这个规律,想了一下,应该是递归. #include <iostream> usi ...