spark1.3.x与spark2.x启动executor不同的cpu core分配方式
***这里的executor在worker上分配策略以spreadOut 为例***
1.3版本关键点:
for (app <- waitingApps if app.coresLeft > 0) { //对还未被完全分配资源的apps处理
val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
.filter(canUse(app, _)).sortBy(_.coresFree).reverse //根据core Free对可用Worker进行降序排序。
val numUsable = usableWorkers.length //可用worker的个数 eg:可用5个worker
val assigned = new Array[Int](numUsable) //候选Worker,每个Worker一个下标,是一个数组,初始化默认都是0
var toAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)//还要分配的cores = 集群中可用Worker的可用cores总和(10), 当前未分配core(5)中找最小的
var pos = 0
while (toAssign > 0) {
if (usableWorkers(pos).coresFree - assigned(pos) > 0) { //以round robin方式在所有可用Worker里判断当前worker空闲cpu是否大于当前数组已经分配core值
toAssign -= 1
assigned(pos) += 1 //当前下标pos的Worker分配1个core +1
}
pos = (pos + 1) % numUsable //round-robin轮询寻找有资源的Worker
}
// Now that we've decided how many cores to give on each node, let's actually give them
for (pos <- 0 until numUsable) {
if (assigned(pos) > 0) { //如果assigned数组中的值>0,将启动一个executor在,指定下标的机器上。
val exec = app.addExecutor(usableWorkers(pos), assigned(pos)) //更新app里的Executor信息
launchExecutor(usableWorkers(pos), exec) //通知可用Worker去启动Executor
app.state = ApplicationState.RUNNING
}
}
}
以上红色代码清晰的展示了在平均分配的场景下,每次会给worker分配1个core,所以说在spark-submit中如果设置了 --executor-cores属性未必起作用;
但在2.x版本的spark中却做了这方面的矫正,它确实会去读取--executor-cores属性中的值,如果该值未设置则依然按照1.3.x的方式执行,代码如下:
private def scheduleExecutorsOnWorkers(
app: ApplicationInfo,
usableWorkers: Array[WorkerInfo],
spreadOutApps: Boolean): Array[Int] = {
val coresPerExecutor = app.desc.coresPerExecutor
val minCoresPerExecutor = coresPerExecutor.getOrElse(1)
val oneExecutorPerWorker = coresPerExecutor.isEmpty
val memoryPerExecutor = app.desc.memoryPerExecutorMB
val numUsable = usableWorkers.length
val assignedCores = new Array[Int](numUsable) // Number of cores to give to each worker
val assignedExecutors = new Array[Int](numUsable) // Number of new executors on each worker
var coresToAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum) /** Return whether the specified worker can launch an executor for this app. */
def canLaunchExecutor(pos: Int): Boolean = {
val keepScheduling = coresToAssign >= minCoresPerExecutor
val enoughCores = usableWorkers(pos).coresFree - assignedCores(pos) >= minCoresPerExecutor // If we allow multiple executors per worker, then we can always launch new executors.
// Otherwise, if there is already an executor on this worker, just give it more cores.
val launchingNewExecutor = !oneExecutorPerWorker || assignedExecutors(pos) == 0
if (launchingNewExecutor) {
val assignedMemory = assignedExecutors(pos) * memoryPerExecutor
val enoughMemory = usableWorkers(pos).memoryFree - assignedMemory >= memoryPerExecutor
val underLimit = assignedExecutors.sum + app.executors.size < app.executorLimit
keepScheduling && enoughCores && enoughMemory && underLimit
} else {
// We're adding cores to an existing executor, so no need
// to check memory and executor limits
keepScheduling && enoughCores
}
} // Keep launching executors until no more workers can accommodate any
// more executors, or if we have reached this application's limits
var freeWorkers = (0 until numUsable).filter(canLaunchExecutor)
while (freeWorkers.nonEmpty) {
freeWorkers.foreach { pos =>
var keepScheduling = true
while (keepScheduling && canLaunchExecutor(pos)) {
coresToAssign -= minCoresPerExecutor
assignedCores(pos) += minCoresPerExecutor // If we are launching one executor per worker, then every iteration assigns 1 core
// to the executor. Otherwise, every iteration assigns cores to a new executor.
if (oneExecutorPerWorker) {
assignedExecutors(pos) = 1
} else {
assignedExecutors(pos) += 1
} // Spreading out an application means spreading out its executors across as
// many workers as possible. If we are not spreading out, then we should keep
// scheduling executors on this worker until we use all of its resources.
// Otherwise, just move on to the next worker.
if (spreadOutApps) {
keepScheduling = false
}
}
}
freeWorkers = freeWorkers.filter(canLaunchExecutor)
}
assignedCores
}
spark1.3.x与spark2.x启动executor不同的cpu core分配方式的更多相关文章
- worker启动executor源码分析-executor.clj
在"supervisor启动worker源码分析-worker.clj"一文中,我们详细讲解了worker是如何初始化的.主要通过调用mk-worker函数实现的.在启动worke ...
- [Spark内核] 第32课:Spark Worker原理和源码剖析解密:Worker工作流程图、Worker启动Driver源码解密、Worker启动Executor源码解密等
本課主題 Spark Worker 原理 Worker 启动 Driver 源码鉴赏 Worker 启动 Executor 源码鉴赏 Worker 与 Master 的交互关系 [引言部份:你希望读者 ...
- Spark Worker原理和源码剖析解密:Worker工作流程图、Worker启动Driver源码解密、Worker启动Executor源码解密等
本课主题 Spark Worker 原理 Worker 启动 Driver 源码鉴赏 Worker 启动 Executor 源码鉴赏 Worker 与 Master 的交互关系 Spark Worke ...
- Android Activity启动黑/白屏原因与解决方式
Android Activity启动黑/白屏原因与解决方式 我们新建一个HelloWorld项目,运行在手机上时,Activity打开之前会有一个动画,而这个动画里是全白或者全黑的(取决于你的主题是亮 ...
- docker 启动报错:Docker.Core.Backend.BackendException: Error response from daemon: open \\.\pipe\docker_e
win10 docker启动后报错: Docker.Core.Backend.BackendException:Error response from daemon: open \\.\pipe\do ...
- 04_线程的创建和启动_使用Callable和Future的方式
[简述] 从java5开始,java提供了Callable接口,这个接口可以是Runnable接口的增强版, Callable接口提供了一个call()方法作为线程执行体,call()方法比run() ...
- web容器启动后自动执行程序的几种方式比较
1. 背景 1.1. 背景介绍 在web项目中我们有时会遇到这种需求,在web项目启动后需要开启线程去完成一些重要的工作,例如:往数据库中初始化一些数据,开启线程,初始化消息队 ...
- pythoncharm 中解决启动server时出现 “django.core.exceptions.ImproperlyConfigured: Requested setting DEBUG, but settings are not configured”的错误
背景介绍 最近,尝试着用pythoncharm 这个All-star IDE来搞一搞Django,于是乎,下载专业版,PJ等等一系列操作之后,终于得偿所愿.可以开工了. 错误 在园子里找了一篇初学者的 ...
- Oracle数据库启动出现ORA-27101错误之ORA-19815处理方式及数据备份
ORA-27101: sharedmemory realm does not exist之ORA-19815处理 重启数据库(数据库:muphy),登陆是越到错误: ORA-27101: shared ...
随机推荐
- Flask+Nginx+Supervisor+Gunicorn+HTTPS部署教程(CentOs)
写在前面 之前的文章中,我们详细讲述了怎样安装 Nginx,Python,Supervisor,Gunicorn,HTTPS.经本人多次测试是完全可以跑通的,那么本篇将介绍怎样将这些组合起来运行一个H ...
- SonarLint 代码质量管理
Below are the instructions of how to install and use SonarLint. Install SonarLint Extensions in VS20 ...
- ExtJs5的基本理论概念
概述 理解ExtJs里面的一些基本关键字的概念是使用ExtJs搭建MMVC框架的基础,在ExtJs中,我们通常遇到ExtJs的配置和启动项Ext.application(),该方法是ExtJs程序初始 ...
- Codeforces.871D.Paths(莫比乌斯反演 根号分治)
题目链接 \(Description\) 给定\(n\),表示有一张\(n\)个点的无向图,两个点\(x,y\)之间有权值为\(1\)的边当且仅当\(\gcd(x,y)\neq1\).求\(1\sim ...
- git彻底删除某个文件的全部log历史记录
git filter-branch -f --tree-filter 'rm -rf vendor/gems' HEAD git push origin --force如果log历史较多 可能需要一点 ...
- DWM1000 测距原理简单分析 之 SS-TWR代码分析2 -- [蓝点无限]
蓝点DWM1000 模块已经打样测试完毕,有兴趣的可以申请购买了,更多信息参见 蓝点论坛 正文: 首先将SS 原理介绍中的图片拿过来,将图片印在脑海里. 对于DeviceA 和 DeviceB来说,初 ...
- 编程菜鸟的日记-初学尝试编程-C++ Primer Plus 第5章编程练习4
#include <iostream>using namespace std;const MAXSIZE=12;int main(){ char *month[MAXSIZE]={&quo ...
- bat入门--第一个bat文件
所谓的批处理就是从记事本开始进行的. 1.新建一个记事本文件, 2, 打开的记事本上敲入一行字:@echo off 意思:隐藏以下输入的代码(off改成on是打开代码显示). 3.再输入:echo h ...
- Java面试宝典2018
转 Java面试宝典2018 一. Java基础部分…………………………………………………………………………………….. 7 1.一个“.java”源文件中是否可以包括多个类(不是内部类)?有什么限制 ...
- python飞机大战代码
import pygame from pygame.locals import * from pygame.sprite import Sprite import random import time ...