spark1.3.x与spark2.x启动executor不同的cpu core分配方式
***这里的executor在worker上分配策略以spreadOut 为例***
1.3版本关键点:
for (app <- waitingApps if app.coresLeft > 0) { //对还未被完全分配资源的apps处理
val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
.filter(canUse(app, _)).sortBy(_.coresFree).reverse //根据core Free对可用Worker进行降序排序。
val numUsable = usableWorkers.length //可用worker的个数 eg:可用5个worker
val assigned = new Array[Int](numUsable) //候选Worker,每个Worker一个下标,是一个数组,初始化默认都是0
var toAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)//还要分配的cores = 集群中可用Worker的可用cores总和(10), 当前未分配core(5)中找最小的
var pos = 0
while (toAssign > 0) {
if (usableWorkers(pos).coresFree - assigned(pos) > 0) { //以round robin方式在所有可用Worker里判断当前worker空闲cpu是否大于当前数组已经分配core值
toAssign -= 1
assigned(pos) += 1 //当前下标pos的Worker分配1个core +1
}
pos = (pos + 1) % numUsable //round-robin轮询寻找有资源的Worker
}
// Now that we've decided how many cores to give on each node, let's actually give them
for (pos <- 0 until numUsable) {
if (assigned(pos) > 0) { //如果assigned数组中的值>0,将启动一个executor在,指定下标的机器上。
val exec = app.addExecutor(usableWorkers(pos), assigned(pos)) //更新app里的Executor信息
launchExecutor(usableWorkers(pos), exec) //通知可用Worker去启动Executor
app.state = ApplicationState.RUNNING
}
}
}
以上红色代码清晰的展示了在平均分配的场景下,每次会给worker分配1个core,所以说在spark-submit中如果设置了 --executor-cores属性未必起作用;
但在2.x版本的spark中却做了这方面的矫正,它确实会去读取--executor-cores属性中的值,如果该值未设置则依然按照1.3.x的方式执行,代码如下:
private def scheduleExecutorsOnWorkers(
app: ApplicationInfo,
usableWorkers: Array[WorkerInfo],
spreadOutApps: Boolean): Array[Int] = {
val coresPerExecutor = app.desc.coresPerExecutor
val minCoresPerExecutor = coresPerExecutor.getOrElse(1)
val oneExecutorPerWorker = coresPerExecutor.isEmpty
val memoryPerExecutor = app.desc.memoryPerExecutorMB
val numUsable = usableWorkers.length
val assignedCores = new Array[Int](numUsable) // Number of cores to give to each worker
val assignedExecutors = new Array[Int](numUsable) // Number of new executors on each worker
var coresToAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum) /** Return whether the specified worker can launch an executor for this app. */
def canLaunchExecutor(pos: Int): Boolean = {
val keepScheduling = coresToAssign >= minCoresPerExecutor
val enoughCores = usableWorkers(pos).coresFree - assignedCores(pos) >= minCoresPerExecutor // If we allow multiple executors per worker, then we can always launch new executors.
// Otherwise, if there is already an executor on this worker, just give it more cores.
val launchingNewExecutor = !oneExecutorPerWorker || assignedExecutors(pos) == 0
if (launchingNewExecutor) {
val assignedMemory = assignedExecutors(pos) * memoryPerExecutor
val enoughMemory = usableWorkers(pos).memoryFree - assignedMemory >= memoryPerExecutor
val underLimit = assignedExecutors.sum + app.executors.size < app.executorLimit
keepScheduling && enoughCores && enoughMemory && underLimit
} else {
// We're adding cores to an existing executor, so no need
// to check memory and executor limits
keepScheduling && enoughCores
}
} // Keep launching executors until no more workers can accommodate any
// more executors, or if we have reached this application's limits
var freeWorkers = (0 until numUsable).filter(canLaunchExecutor)
while (freeWorkers.nonEmpty) {
freeWorkers.foreach { pos =>
var keepScheduling = true
while (keepScheduling && canLaunchExecutor(pos)) {
coresToAssign -= minCoresPerExecutor
assignedCores(pos) += minCoresPerExecutor // If we are launching one executor per worker, then every iteration assigns 1 core
// to the executor. Otherwise, every iteration assigns cores to a new executor.
if (oneExecutorPerWorker) {
assignedExecutors(pos) = 1
} else {
assignedExecutors(pos) += 1
} // Spreading out an application means spreading out its executors across as
// many workers as possible. If we are not spreading out, then we should keep
// scheduling executors on this worker until we use all of its resources.
// Otherwise, just move on to the next worker.
if (spreadOutApps) {
keepScheduling = false
}
}
}
freeWorkers = freeWorkers.filter(canLaunchExecutor)
}
assignedCores
}
spark1.3.x与spark2.x启动executor不同的cpu core分配方式的更多相关文章
- worker启动executor源码分析-executor.clj
在"supervisor启动worker源码分析-worker.clj"一文中,我们详细讲解了worker是如何初始化的.主要通过调用mk-worker函数实现的.在启动worke ...
- [Spark内核] 第32课:Spark Worker原理和源码剖析解密:Worker工作流程图、Worker启动Driver源码解密、Worker启动Executor源码解密等
本課主題 Spark Worker 原理 Worker 启动 Driver 源码鉴赏 Worker 启动 Executor 源码鉴赏 Worker 与 Master 的交互关系 [引言部份:你希望读者 ...
- Spark Worker原理和源码剖析解密:Worker工作流程图、Worker启动Driver源码解密、Worker启动Executor源码解密等
本课主题 Spark Worker 原理 Worker 启动 Driver 源码鉴赏 Worker 启动 Executor 源码鉴赏 Worker 与 Master 的交互关系 Spark Worke ...
- Android Activity启动黑/白屏原因与解决方式
Android Activity启动黑/白屏原因与解决方式 我们新建一个HelloWorld项目,运行在手机上时,Activity打开之前会有一个动画,而这个动画里是全白或者全黑的(取决于你的主题是亮 ...
- docker 启动报错:Docker.Core.Backend.BackendException: Error response from daemon: open \\.\pipe\docker_e
win10 docker启动后报错: Docker.Core.Backend.BackendException:Error response from daemon: open \\.\pipe\do ...
- 04_线程的创建和启动_使用Callable和Future的方式
[简述] 从java5开始,java提供了Callable接口,这个接口可以是Runnable接口的增强版, Callable接口提供了一个call()方法作为线程执行体,call()方法比run() ...
- web容器启动后自动执行程序的几种方式比较
1. 背景 1.1. 背景介绍 在web项目中我们有时会遇到这种需求,在web项目启动后需要开启线程去完成一些重要的工作,例如:往数据库中初始化一些数据,开启线程,初始化消息队 ...
- pythoncharm 中解决启动server时出现 “django.core.exceptions.ImproperlyConfigured: Requested setting DEBUG, but settings are not configured”的错误
背景介绍 最近,尝试着用pythoncharm 这个All-star IDE来搞一搞Django,于是乎,下载专业版,PJ等等一系列操作之后,终于得偿所愿.可以开工了. 错误 在园子里找了一篇初学者的 ...
- Oracle数据库启动出现ORA-27101错误之ORA-19815处理方式及数据备份
ORA-27101: sharedmemory realm does not exist之ORA-19815处理 重启数据库(数据库:muphy),登陆是越到错误: ORA-27101: shared ...
随机推荐
- java学习教程与笔记
一个java学习教程:http://www.jikexueyuan.com/path/java/#stage1 集合类学习: java中结合类很多,但用得比较多的一般有三种,当然,其它语言也是,主要是 ...
- Linux 虚拟机上安装linux系统 (ip:子网掩码,网关,dns,交换机,路由知识回顾)
一 安装虚拟机 二 虚拟机上配置好在安装linux系统 三 知识回顾 交换机:主机在局域网内的身份是MAC地址(可以通过[交换机广播:交换机通过被动学习来建立一张“接口号”和“MAC地址”的对照表]或 ...
- Groovy闭包
定义 闭包(Closure)是一种数据类型,它代表一段可执行的代码.它可以作为方法的参数,或者返回值,也可以独立运行,定义如下: def xxx = {parameters -> code} ...
- Python编码规范(PEP8)
Introduction 介绍 本文提供的Python代码编码规范基于Python主要发行版本的标准库.Python的C语言实现的C代码规范请查看相应的PEP指南1. 这篇文档以及PEP 257(文档 ...
- scrapy 通过FormRequest模拟登录再继续
1.参考 https://doc.scrapy.org/en/latest/topics/spiders.html#scrapy.spiders.Spider.start_requests 自动提交 ...
- linux 硬盘满了如何处理
事件源于在服务器运行一个脚本程序… 好好的脚本突然报错,还以为脚本出现问题了.细看报错原因(具体报错信息已经忘记了),是没有可用空间.从没遇见过这个情况,怎么办呢? 一.确定是不是真的是磁盘空间不足 ...
- Codeforces 1109D Sasha and Interesting Fact from Graph Theory (看题解) 组合数学
Sasha and Interesting Fact from Graph Theory n 个 点形成 m 个有标号森林的方案数为 F(n, m) = m * n ^ {n - 1 - m} 然后就 ...
- python运算符——逻辑运算符
not命令是取反命令,真的变成假的,假的变成真的(True是真的,False是假的) b = Trueprint(not b) 原本是真的,但是加了“not”指令就变成了假的,not指令是一元运算符, ...
- Servlet(六):Cookie
Cookie 学习:问题: HTTP 协议是没有记忆功能的,一次请求结束后,相关数据会被销毁.如果第二次的请求需要使用相同的请求数据怎么办呢?难道是让用户再次请求书写吗?解决: 使用 Cookie 技 ...
- 【转】Spring总结以及在面试中的一些问题
[转]Spring总结以及在面试中的一些问题. 1.谈谈你对spring IOC和DI的理解,它们有什么区别? IoC Inverse of Control 反转控制的概念,就是将原本在程序中手动创建 ...