The Spark driver is the process that submits and runs the user's application.

1. SparkConf

SparkConf is responsible for loading SparkContext's configuration parameters; it maintains the various `spark.*` properties in a ConcurrentHashMap.

    class SparkConf(loadDefaults: Boolean) extends Cloneable with Logging with Serializable {

      import SparkConf._

      /** Create a SparkConf that loads defaults from system properties and the classpath */
      def this() = this(true)

      /**
       * A ConcurrentHashMap holding the Spark configuration key/value pairs.
       */
      private val settings = new ConcurrentHashMap[String, String]()

      @transient private lazy val reader: ConfigReader = {
        val _reader = new ConfigReader(new SparkConfigProvider(settings))
        _reader.bindEnv(new ConfigProvider {
          override def get(key: String): Option[String] = Option(getenv(key))
        })
        _reader
      }

      if (loadDefaults) {
        loadFromSystemProperties(false)
      }

      /**
       * Load the spark.* configuration properties.
       * @param silent
       * @return
       */
      private[spark] def loadFromSystemProperties(silent: Boolean): SparkConf = {
        // Load any spark.* system properties; only keys starting with "spark." are picked up
        for ((key, value) <- Utils.getSystemProperties if key.startsWith("spark.")) {
          set(key, value, silent)
        }
        this
      }
    }
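
As a quick illustration of the map-backed behaviour described above, here is a minimal, self-contained sketch; `spark.missing.key` is a made-up key used only to show the default-value path:

    import org.apache.spark.SparkConf

    object SparkConfDemo {
      def main(args: Array[String]): Unit = {
        // The no-arg constructor (loadDefaults = true) also picks up -Dspark.* JVM properties
        val conf = new SparkConf()
          .setAppName("conf-demo")
          .setMaster("local[2]")
          .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")

        // Reads and writes both go through the ConcurrentHashMap-backed settings
        println(conf.get("spark.app.name"))                 // conf-demo
        println(conf.get("spark.missing.key", "fallback"))  // made-up key, returns the default
      }
    }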

2. SparkContext

2.1 Creating the Spark execution environment (SparkEnv)

SparkEnv is Spark's execution environment object; it bundles the many components an Executor needs in order to run.

It is created mainly through SparkEnv.createSparkEnv; during SparkContext initialization, only the SparkEnv itself is created at this point.

    def isLocal: Boolean = Utils.isLocalMaster(_conf)

    // An asynchronous listener bus for Spark events
    // The listener pattern is used to dispatch the various event types
    private[spark] val listenerBus = new LiveListenerBus(this)

    // This function allows components created by SparkEnv to be mocked in unit tests:
    private[spark] def createSparkEnv(
        conf: SparkConf,
        isLocal: Boolean,
        listenerBus: LiveListenerBus): SparkEnv = {
      // Create the driver-side SparkEnv
      SparkEnv.createDriverEnv(conf, isLocal, listenerBus, SparkContext.numDriverCores(master))
    }

Stepping into createDriverEnv, we find that it delegates to the create method, which builds a SparkEnv for either the Driver or an Executor.

Navigating to createExecutorEnv shows that it is invoked by CoarseGrainedExecutorBackend.

Let's look at what create() actually does, step by step.

2.1.1 Creating the SecurityManager

    // Create the SecurityManager
    val securityManager = new SecurityManager(conf, ioEncryptionKey)
    ioEncryptionKey.foreach { _ =>
      if (!securityManager.isSaslEncryptionEnabled()) {
        logWarning("I/O encryption enabled without RPC encryption: keys will be visible on the " +
          "wire.")
      }
    }
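
For context, a hedged sketch of the configuration combination that warning guards against; these are standard Spark 2.x property names:

    import org.apache.spark.SparkConf

    // Sketch: enabling I/O encryption together with SASL-encrypted RPC, so the
    // generated ioEncryptionKey is not sent over the wire in plaintext
    val conf = new SparkConf()
      .set("spark.io.encryption.enabled", "true")              // triggers ioEncryptionKey creation
      .set("spark.authenticate", "true")                       // enable authentication
      .set("spark.authenticate.enableSaslEncryption", "true")  // encrypt RPC traffic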

2.1.2 Creating the RpcEnv

    val systemName = if (isDriver) driverSystemName else executorSystemName
    val rpcEnv = RpcEnv.create(systemName, bindAddress, advertiseAddress, port, conf,
      securityManager, clientMode = !isDriver)
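
The RPC classes are `private[spark]`, so the following sketch only compiles inside the `org.apache.spark` namespace; it illustrates the endpoint/ref model that the rest of create() builds on (EchoEndpoint is a made-up name):

    package org.apache.spark.rpc

    // A minimal endpoint: replies to any String message with an echo.
    class EchoEndpoint(override val rpcEnv: RpcEnv) extends RpcEndpoint {
      override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
        case msg: String => context.reply(s"echo: $msg")
      }
    }

    // Registration and lookup mirror what SparkEnv itself does:
    //   val ref = rpcEnv.setupEndpoint("echo", new EchoEndpoint(rpcEnv))
    //   val answer = ref.askSync[String]("hi")   // "echo: hi" (askSync exists in Spark 2.2+)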

2.1.3 Creating the serializer via reflection (JavaSerializer by default)

    // Create an instance of the class with the given name, possibly initializing it with our conf
    def instantiateClass[T](className: String): T = {
      val cls = Utils.classForName(className)
      // Look for a constructor taking a SparkConf and a boolean isDriver, then one taking just
      // SparkConf, then one taking no arguments
      try {
        cls.getConstructor(classOf[SparkConf], java.lang.Boolean.TYPE)
          .newInstance(conf, new java.lang.Boolean(isDriver))
          .asInstanceOf[T]
      } catch {
        case _: NoSuchMethodException =>
          try {
            cls.getConstructor(classOf[SparkConf]).newInstance(conf).asInstanceOf[T]
          } catch {
            case _: NoSuchMethodException =>
              cls.getConstructor().newInstance().asInstanceOf[T]
          }
      }
    }

    // Create an instance of the class named by the given SparkConf property, or defaultClassName
    // if the property is not set, possibly initializing it with our conf
    def instantiateClassFromConf[T](propertyName: String, defaultClassName: String): T = {
      instantiateClass[T](conf.get(propertyName, defaultClassName))
    }

    val serializer = instantiateClassFromConf[Serializer](
      "spark.serializer", "org.apache.spark.serializer.JavaSerializer")
    logDebug(s"Using serializer: ${serializer.getClass}")
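
Whichever class reflection picks, the resulting Serializer is used through per-thread instances; a minimal sketch with the default JavaSerializer:

    import java.nio.ByteBuffer
    import org.apache.spark.SparkConf
    import org.apache.spark.serializer.JavaSerializer

    object SerializerDemo {
      def main(args: Array[String]): Unit = {
        // A SerializerInstance is not thread-safe: create one per thread
        val instance = new JavaSerializer(new SparkConf()).newInstance()

        val bytes: ByteBuffer = instance.serialize(Seq(1, 2, 3))
        val restored = instance.deserialize[Seq[Int]](bytes)
        println(restored) // List(1, 2, 3)
      }
    }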

2.1.4 Creating the SerializerManager

    val serializerManager = new SerializerManager(serializer, conf, ioEncryptionKey)

    // Closures are always serialized with Java serialization, independent of spark.serializer
    val closureSerializer = new JavaSerializer(conf)

2.1.5 Creating the BroadcastManager

    val broadcastManager = new BroadcastManager(isDriver, conf, securityManager)
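
User code reaches this component indirectly through SparkContext.broadcast; a small sketch:

    import org.apache.spark.{SparkConf, SparkContext}

    object BroadcastDemo {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("bc-demo").setMaster("local[2]"))

        // sc.broadcast goes through SparkEnv's BroadcastManager under the hood
        val lookup = sc.broadcast(Map(1 -> "a", 2 -> "b"))
        val mapped = sc.parallelize(Seq(1, 2)).map(i => lookup.value(i)).collect()
        println(mapped.mkString(",")) // a,b

        sc.stop()
      }
    }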

2.1.6 Creating the MapOutputTracker

    def registerOrLookupEndpoint(
        name: String, endpointCreator: => RpcEndpoint):
      RpcEndpointRef = {
      if (isDriver) {
        logInfo("Registering " + name)
        rpcEnv.setupEndpoint(name, endpointCreator)
      } else {
        RpcUtils.makeDriverRef(name, conf, rpcEnv)
      }
    }

    // Create the MapOutputTracker; the Driver and Executor sides differ
    val mapOutputTracker = if (isDriver) {
      // The Driver side needs the BroadcastManager
      new MapOutputTrackerMaster(conf, broadcastManager, isLocal)
    } else {
      new MapOutputTrackerWorker(conf)
    }

    // Have to assign trackerEndpoint after initialization as MapOutputTrackerEndpoint
    // requires the MapOutputTracker itself
    mapOutputTracker.trackerEndpoint = registerOrLookupEndpoint(MapOutputTracker.ENDPOINT_NAME,
      new MapOutputTrackerMasterEndpoint(
        rpcEnv, mapOutputTracker.asInstanceOf[MapOutputTrackerMaster], conf))

2.1.7 Creating the ShuffleManager

    // Let the user specify short names for shuffle managers
    val shortShuffleMgrNames = Map(
      "sort" -> classOf[org.apache.spark.shuffle.sort.SortShuffleManager].getName,
      "tungsten-sort" -> classOf[org.apache.spark.shuffle.sort.SortShuffleManager].getName)
    val shuffleMgrName = conf.get("spark.shuffle.manager", "sort")
    val shuffleMgrClass = shortShuffleMgrNames.getOrElse(shuffleMgrName.toLowerCase, shuffleMgrName)
    val shuffleManager = instantiateClass[ShuffleManager](shuffleMgrClass)
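
So only the short names "sort" and "tungsten-sort" are predefined (both mapping to SortShuffleManager); anything else is treated as a fully qualified class name. A sketch, where the custom class name is made up:

    import org.apache.spark.SparkConf

    // Both of these resolve to SortShuffleManager via the short-name map above
    val sortConf = new SparkConf().set("spark.shuffle.manager", "sort")

    // A fully qualified name would be handed to instantiateClass as-is
    // (com.example.MyShuffleManager is hypothetical)
    val customConf = new SparkConf().set("spark.shuffle.manager", "com.example.MyShuffleManager")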

2.1.8 Creating the BlockManager

    val useLegacyMemoryManager = conf.getBoolean("spark.memory.useLegacyMode", false)
    val memoryManager: MemoryManager =
      if (useLegacyMemoryManager) {
        new StaticMemoryManager(conf, numUsableCores)
      } else {
        UnifiedMemoryManager(conf, numUsableCores)
      }

    val blockManagerPort = if (isDriver) {
      conf.get(DRIVER_BLOCK_MANAGER_PORT)
    } else {
      conf.get(BLOCK_MANAGER_PORT)
    }

    val blockTransferService =
      new NettyBlockTransferService(conf, securityManager, bindAddress, advertiseAddress,
        blockManagerPort, numUsableCores)

    val blockManagerMaster = new BlockManagerMaster(registerOrLookupEndpoint(
      BlockManagerMaster.DRIVER_ENDPOINT_NAME,
      new BlockManagerMasterEndpoint(rpcEnv, isLocal, conf, listenerBus)),
      conf, isDriver)

    // NB: blockManager is not valid until initialize() is called later.
    val blockManager = new BlockManager(executorId, rpcEnv, blockManagerMaster,
      serializerManager, conf, memoryManager, mapOutputTracker, shuffleManager,
      blockTransferService, securityManager, numUsableCores)
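
The memory-manager branch above is driven entirely by configuration; a sketch of the knobs on each side (the values are illustrative defaults, not recommendations):

    import org.apache.spark.SparkConf

    // Unified memory management, the default path (Spark 1.6+)
    val unified = new SparkConf()
      .set("spark.memory.fraction", "0.6")        // share of (heap - reserved) for execution + storage
      .set("spark.memory.storageFraction", "0.5") // part of that share shielded from eviction

    // Legacy static split, taken only when the flag checked above is set
    val legacy = new SparkConf()
      .set("spark.memory.useLegacyMode", "true")
      .set("spark.storage.memoryFraction", "0.6")
      .set("spark.shuffle.memoryFraction", "0.2")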

2.1.9 Creating the MetricsSystem

    val metricsSystem = if (isDriver) {
      // Don't start metrics system right now for Driver.
      // We need to wait for the task scheduler to give us an app ID.
      // Then we can start the metrics system.
      MetricsSystem.createMetricsSystem("driver", conf, securityManager)
    } else {
      // We need to set the executor ID before the MetricsSystem is created because sources and
      // sinks specified in the metrics configuration file will want to incorporate this executor's
      // ID into the metrics they report.
      conf.set("spark.executor.id", executorId)
      val ms = MetricsSystem.createMetricsSystem("executor", conf, securityManager)
      ms.start()
      ms
    }
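
Sources and sinks usually come from metrics.properties, but the same keys can also be set on the SparkConf with the spark.metrics.conf. prefix; a hedged sketch enabling the built-in console sink:

    import org.apache.spark.SparkConf

    // Equivalent to a metrics.properties entry; ConsoleSink ships with Spark
    val conf = new SparkConf()
      .set("spark.metrics.conf.*.sink.console.class",
        "org.apache.spark.metrics.sink.ConsoleSink")
      .set("spark.metrics.conf.*.sink.console.period", "10") // seconds between reports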

2.1.10 Creating the SparkEnv instance

    val envInstance = new SparkEnv(
      executorId,
      rpcEnv,
      serializer,
      closureSerializer,
      serializerManager,
      mapOutputTracker,
      shuffleManager,
      broadcastManager,
      blockManager,
      securityManager,
      metricsSystem,
      memoryManager,
      outputCommitCoordinator,
      conf)
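
Once created, this instance is published process-wide and is reachable from anywhere; a short sketch:

    import org.apache.spark.SparkEnv

    // On the driver after SparkContext initialization (or inside a task on an executor):
    val env = SparkEnv.get
    println(env.executorId)                  // "driver" on the driver side
    println(env.serializer.getClass.getName) // e.g. org.apache.spark.serializer.JavaSerializer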

2.1.11 Creating the driver temp directory

    // Add a reference to tmp dir created by driver, we will delete this tmp dir when stop() is
    // called, and we only need to do it for driver. Because driver may run as a service, and if we
    // don't delete this tmp dir when sc is stopped, then will create too many tmp dirs.
    if (isDriver) {
      val sparkFilesDir = Utils.createTempDir(Utils.getLocalDir(conf), "userFiles").getAbsolutePath
      envInstance.driverTmpDir = Some(sparkFilesDir)
    }
