spark-streaming的checkpoint机制源码分析
转发请注明原创地址 http://www.cnblogs.com/dongxiao-yang/p/7994357.html
spark-streaming定时对 DStreamGraph
和 JobScheduler
做 Checkpoint,来记录整个 DStreamGraph
的变化和每个 batch 的 job 的完成情况,Checkpoint 发起的间隔默认的是和 batchDuration 一致;即每次 batch 发起、提交了需要运行的 job 后就做 Checkpoint。另外在 job 完成了更新任务状态的时候再次做一下 Checkpoint。
一 checkpoint生成
job生成
- private def generateJobs(time: Time) {
- // Checkpoint all RDDs marked for checkpointing to ensure their lineages are
- // truncated periodically. Otherwise, we may run into stack overflows (SPARK-6847).
- ssc.sparkContext.setLocalProperty(RDD.CHECKPOINT_ALL_MARKED_ANCESTORS, "true")
- Try {
- jobScheduler.receiverTracker.allocateBlocksToBatch(time) // allocate received blocks to batch
- graph.generateJobs(time) // generate jobs using allocated block
- } match {
- case Success(jobs) =>
- val streamIdToInputInfos = jobScheduler.inputInfoTracker.getInfo(time)
- jobScheduler.submitJobSet(JobSet(time, jobs, streamIdToInputInfos))
- case Failure(e) =>
- jobScheduler.reportError("Error generating jobs for time " + time, e)
- PythonDStream.stopStreamingContextIfPythonProcessIsDead(e)
- }
- eventLoop.post(DoCheckpoint(time, clearCheckpointDataLater = false))
- }
job 完成
- private def clearMetadata(time: Time) {
- ssc.graph.clearMetadata(time)
- // If checkpointing is enabled, then checkpoint,
- // else mark batch to be fully processed
- if (shouldCheckpoint) {
- eventLoop.post(DoCheckpoint(time, clearCheckpointDataLater = true))
- } else {
- // If checkpointing is not enabled, then delete metadata information about
- // received blocks (block data not saved in any case). Otherwise, wait for
- // checkpointing of this batch to complete.
- val maxRememberDuration = graph.getMaxInputStreamRememberDuration()
- jobScheduler.receiverTracker.cleanupOldBlocksAndBatches(time - maxRememberDuration)
- jobScheduler.inputInfoTracker.cleanup(time - maxRememberDuration)
- markBatchFullyProcessed(time)
- }
- }
上文里面的eventLoop是JobGenerator内部的一个消息事件队列的封装,eventLoop内部会有一个后台线程不断的去消费事件,所以DoCheckpoint这种类型的事件会经过processEvent ->
doCheckpoint 由checkpointWriter把生成的Checkpoint对象写到外部存储:
- /** Processes all events */
- private def processEvent(event: JobGeneratorEvent) {
- logDebug("Got event " + event)
- event match {
- case GenerateJobs(time) => generateJobs(time)
- case ClearMetadata(time) => clearMetadata(time)
- case DoCheckpoint(time, clearCheckpointDataLater) =>
- doCheckpoint(time, clearCheckpointDataLater)
- case ClearCheckpointData(time) => clearCheckpointData(time)
- }
- }
- /** Perform checkpoint for the give `time`. */
- private def doCheckpoint(time: Time, clearCheckpointDataLater: Boolean) {
- if (shouldCheckpoint && (time - graph.zeroTime).isMultipleOf(ssc.checkpointDuration)) {
- logInfo("Checkpointing graph for time " + time)
- ssc.graph.updateCheckpointData(time)
- checkpointWriter.write(new Checkpoint(ssc, time), clearCheckpointDataLater)
- }
- }
doCheckpoint在调用checkpointWriter写数据到hdfs之前,首先会运行一下ssc.graph.updateCheckpointData(time),这个方法的主要作用是更新DStreamGraph里所有input和output stream对应的checkpointData属性,调用链路为DStreamGraph.updateCheckpointData -> Dstream.updateCheckpointData -> checkpointData.update
- def updateCheckpointData(time: Time) {
- logInfo("Updating checkpoint data for time " + time)
- this.synchronized {
- outputStreams.foreach(_.updateCheckpointData(time))
- }
- logInfo("Updated checkpoint data for time " + time)
- }
- private[streaming] def updateCheckpointData(currentTime: Time) {
- logDebug(s"Updating checkpoint data for time $currentTime")
- checkpointData.update(currentTime)
- dependencies.foreach(_.updateCheckpointData(currentTime))
- logDebug(s"Updated checkpoint data for time $currentTime: $checkpointData")
- }
- private[streaming]
- class DirectKafkaInputDStreamCheckpointData extends DStreamCheckpointData(this) {
- def batchForTime: mutable.HashMap[Time, Array[(String, Int, Long, Long)]] = {
- data.asInstanceOf[mutable.HashMap[Time, Array[OffsetRange.OffsetRangeTuple]]]
- }
- override def update(time: Time): Unit = {
- batchForTime.clear()
- generatedRDDs.foreach { kv =>
- val a = kv._2.asInstanceOf[KafkaRDD[K, V]].offsetRanges.map(_.toTuple).toArray
- batchForTime += kv._1 -> a
- }
- }
- override def cleanup(time: Time): Unit = { }
- override def restore(): Unit = {
- batchForTime.toSeq.sortBy(_._1)(Time.ordering).foreach { case (t, b) =>
- logInfo(s"Restoring KafkaRDD for time $t ${b.mkString("[", ", ", "]")}")
- generatedRDDs += t -> new KafkaRDD[K, V](
- context.sparkContext,
- executorKafkaParams,
- b.map(OffsetRange(_)),
- getPreferredHosts,
- // during restore, it's possible same partition will be consumed from multiple
- // threads, so dont use cache
- false
- )
- }
- }
- }
以DirectKafkaInputDStream为例,代码里重写了checkpointData的update等接口,所以DirectKafkaInputDStream会在checkpoint之前把正在运行的kafkaRDD对应的topic,partition,fromoffset,untiloffset全部存储到checkpointData里面data这个HashMap的属性当中,用于写checkpoint时进行序列化。
一个checkpoint里面包含的对象列表如下:
- class Checkpoint(ssc: StreamingContext, val checkpointTime: Time)
- extends Logging with Serializable {
- val master = ssc.sc.master
- val framework = ssc.sc.appName
- val jars = ssc.sc.jars
- val graph = ssc.graph
- val checkpointDir = ssc.checkpointDir
- val checkpointDuration = ssc.checkpointDuration
- val pendingTimes = ssc.scheduler.getPendingTimes().toArray
- val sparkConfPairs = ssc.conf.getAll
二 从checkpoint恢复服务
spark-streaming启用checkpoint代码里的StreamingContext必须严格按照官方demo实例的架构使用,即所有的streaming逻辑都放在一个返回StreamingContext的createContext方法上,
通过StreamingContext.getOrCreate方法进行初始化,在CheckpointReader.read找到checkpoint文件并且成功反序列化出checkpoint对象后,返回一个包含该checkpoint对象的StreamingContext,这个时候程序里的createContext就不会被调用,反之如果程序是第一次启动会通过createContext初始化StreamingContext
- def getOrCreate(
- checkpointPath: String,
- creatingFunc: () => StreamingContext,
- hadoopConf: Configuration = SparkHadoopUtil.get.conf,
- createOnError: Boolean = false
- ): StreamingContext = {
- val checkpointOption = CheckpointReader.read(
- checkpointPath, new SparkConf(), hadoopConf, createOnError)
- checkpointOption.map(new StreamingContext(null, _, null)).getOrElse(creatingFunc())
- }
- def read(
- checkpointDir: String,
- conf: SparkConf,
- hadoopConf: Configuration,
- ignoreReadError: Boolean = false): Option[Checkpoint] = {
- val checkpointPath = new Path(checkpointDir)
- val fs = checkpointPath.getFileSystem(hadoopConf)
- // Try to find the checkpoint files
- val checkpointFiles = Checkpoint.getCheckpointFiles(checkpointDir, Some(fs)).reverse
- if (checkpointFiles.isEmpty) {
- return None
- }
- // Try to read the checkpoint files in the order
- logInfo(s"Checkpoint files found: ${checkpointFiles.mkString(",")}")
- var readError: Exception = null
- checkpointFiles.foreach { file =>
- logInfo(s"Attempting to load checkpoint from file $file")
- try {
- val fis = fs.open(file)
- val cp = Checkpoint.deserialize(fis, conf)
- logInfo(s"Checkpoint successfully loaded from file $file")
- logInfo(s"Checkpoint was generated at time ${cp.checkpointTime}")
- return Some(cp)
- } catch {
- case e: Exception =>
- readError = e
- logWarning(s"Error reading checkpoint from file $file", e)
- }
- }
- // If none of checkpoint files could be read, then throw exception
- if (!ignoreReadError) {
- throw new SparkException(
- s"Failed to read checkpoint from directory $checkpointPath", readError)
- }
- None
- }
- }
在从checkpoint恢复的过程中DStreamGraph由checkpoint恢复,下文的代码调用路径StreamingContext.graph->DStreamGraph.restoreCheckpointData -> DStream.restoreCheckpointData->checkpointData.restore
- private[streaming] val graph: DStreamGraph = {
- if (isCheckpointPresent) {
- _cp.graph.setContext(this)
- _cp.graph.restoreCheckpointData()
- _cp.graph
- } else {
- require(_batchDur != null, "Batch duration for StreamingContext cannot be null")
- val newGraph = new DStreamGraph()
- newGraph.setBatchDuration(_batchDur)
- newGraph
- }
- }
- def restoreCheckpointData() {
- logInfo("Restoring checkpoint data")
- this.synchronized {
- outputStreams.foreach(_.restoreCheckpointData())
- }
- logInfo("Restored checkpoint data")
- }
- private[streaming] def restoreCheckpointData() {
- if (!restoredFromCheckpointData) {
- // Create RDDs from the checkpoint data
- logInfo("Restoring checkpoint data")
- checkpointData.restore()
- dependencies.foreach(_.restoreCheckpointData())
- restoredFromCheckpointData = true
- logInfo("Restored checkpoint data")
- }
- }
- override def restore(): Unit = {
- batchForTime.toSeq.sortBy(_._1)(Time.ordering).foreach { case (t, b) =>
- logInfo(s"Restoring KafkaRDD for time $t ${b.mkString("[", ", ", "]")}")
- generatedRDDs += t -> new KafkaRDD[K, V](
- context.sparkContext,
- executorKafkaParams,
- b.map(OffsetRange(_)),
- getPreferredHosts,
- // during restore, it's possible same partition will be consumed from multiple
- // threads, so dont use cache
- false
- )
- }
- }
仍然以DirectKafkaInputDStreamCheckpointData为例,这个方法从上文保存的checkpoint.data对象里获取RDD运行时的对应信息恢复出停止时的KafkaRDD。
- private def restart() {
- // If manual clock is being used for testing, then
- // either set the manual clock to the last checkpointed time,
- // or if the property is defined set it to that time
- if (clock.isInstanceOf[ManualClock]) {
- val lastTime = ssc.initialCheckpoint.checkpointTime.milliseconds
- val jumpTime = ssc.sc.conf.getLong("spark.streaming.manualClock.jump", 0)
- clock.asInstanceOf[ManualClock].setTime(lastTime + jumpTime)
- }
- val batchDuration = ssc.graph.batchDuration
- // Batches when the master was down, that is,
- // between the checkpoint and current restart time
- val checkpointTime = ssc.initialCheckpoint.checkpointTime
- val restartTime = new Time(timer.getRestartTime(graph.zeroTime.milliseconds))
- val downTimes = checkpointTime.until(restartTime, batchDuration)
- logInfo("Batches during down time (" + downTimes.size + " batches): "
- + downTimes.mkString(", "))
- // Batches that were unprocessed before failure
- val pendingTimes = ssc.initialCheckpoint.pendingTimes.sorted(Time.ordering)
- logInfo("Batches pending processing (" + pendingTimes.length + " batches): " +
- pendingTimes.mkString(", "))
- // Reschedule jobs for these times
- val timesToReschedule = (pendingTimes ++ downTimes).filter { _ < restartTime }
- .distinct.sorted(Time.ordering)
- logInfo("Batches to reschedule (" + timesToReschedule.length + " batches): " +
- timesToReschedule.mkString(", "))
- timesToReschedule.foreach { time =>
- // Allocate the related blocks when recovering from failure, because some blocks that were
- // added but not allocated, are dangling in the queue after recovering, we have to allocate
- // those blocks to the next batch, which is the batch they were supposed to go.
- jobScheduler.receiverTracker.allocateBlocksToBatch(time) // allocate received blocks to batch
- jobScheduler.submitJobSet(JobSet(time, graph.generateJobs(time)))
- }
- // Restart the timer
- timer.start(restartTime.milliseconds)
- logInfo("Restarted JobGenerator at " + restartTime)
- }
最后,在restart的过程中,JobGenerator通过当前时间和上次程序停止的时间计算出程序重启过程中共有多少batch没有生成,加上上一次程序死掉的过程中有多少正在运行的job,全部
进行Reschedule,补跑任务。
参考文档
2Spark Streaming揭秘 Day33 checkpoint的使用
spark-streaming的checkpoint机制源码分析的更多相关文章
- 《深入理解Spark:核心思想与源码分析》(前言及第1章)
自己牺牲了7个月的周末和下班空闲时间,通过研究Spark源码和原理,总结整理的<深入理解Spark:核心思想与源码分析>一书现在已经正式出版上市,目前亚马逊.京东.当当.天猫等网站均有销售 ...
- 《深入理解Spark:核心思想与源码分析》(第2章)
<深入理解Spark:核心思想与源码分析>一书前言的内容请看链接<深入理解SPARK:核心思想与源码分析>一书正式出版上市 <深入理解Spark:核心思想与源码分析> ...
- 《深入理解Spark:核心思想与源码分析》一书正式出版上市
自己牺牲了7个月的周末和下班空闲时间,通过研究Spark源码和原理,总结整理的<深入理解Spark:核心思想与源码分析>一书现在已经正式出版上市,目前亚马逊.京东.当当.天猫等网站均有销售 ...
- 《深入理解Spark:核心思想与源码分析》正式出版上市
自己牺牲了7个月的周末和下班空闲时间,通过研究Spark源码和原理,总结整理的<深入理解Spark:核心思想与源码分析>一书现在已经正式出版上市,目前亚马逊.京东.当当.天猫等网站均有销售 ...
- Spark Streaming揭秘 Day26 JobGenerator源码图解
Spark Streaming揭秘 Day26 JobGenerator源码图解 今天主要解析一下JobGenerator,它相当于一个转换器,和机器学习的pipeline比较类似,因为最终运行在Sp ...
- 《深入理解Spark:核心思想与源码分析》——SparkContext的初始化(叔篇)——TaskScheduler的启动
<深入理解Spark:核心思想与源码分析>一书前言的内容请看链接<深入理解SPARK:核心思想与源码分析>一书正式出版上市 <深入理解Spark:核心思想与源码分析> ...
- Spark Streaming揭秘 Day22 架构源码图解
Spark Streaming揭秘 Day22 架构源码图解 今天主要是通过图解的方式,对SparkStreaming的架构进行一下回顾. 下面这个是其官方标准的流程描述. SparkStreamin ...
- Springboot学习04-默认错误页面加载机制源码分析
Springboot学习04-默认错误页面加载机制源码分析 前沿 希望通过本文的学习,对错误页面的加载机制有这更神的理解 正文 1-Springboot错误页面展示 2-Springboot默认错误处 ...
- ApplicationEvent事件机制源码分析
<spring扩展点之三:Spring 的监听事件 ApplicationListener 和 ApplicationEvent 用法,在spring启动后做些事情> <服务网关zu ...
随机推荐
- AxureRP7超强部件库打包下载
摘要: 很多刚刚开始学习Axure的朋友都喜欢到网上搜罗各种部件库(组件库)widgets library ,但是网络中真正实用的并且适合你使用的少之又少,最好的办法就是自己制作适合自己工作内容的部件 ...
- 分析成绩 Exercise07_04
import java.util.Scanner; /** * @author 冰樱梦 * 时间:2018年下半年 * 题目:分析成绩 * */ public class Exercise07_04 ...
- React Native使用Navigator组件进行页面导航报this.props....is not a function错误
在push的时候定义回调函数: this.props.navigator.push({ component: nextVC, title: titleName, passProps: { //回调 g ...
- EF 通用数据层父类方法小结
MSSql 数据库 数据层 父类 增删改查: using System;using System.Collections.Generic;using System.Data;using System. ...
- FFmpeg学习起步 —— 环境搭建
下面是我搭建FFmpeg学习环境的步骤. 一.在Ubuntu下 从http://www.ffmpeg.org/download.html下载最新的FFmpeg版本,我的版本是ffmpeg-2.7.2. ...
- Restful Web Service部署到weblogic 12c
介绍一下环境: 首先需要下载一个jaxrs-ri-2.22.2.zip的包 采用Jdeveloper 12c版本,jdk1.8 WebLogic Server 12.2.1版本 Restful项目建立 ...
- Win7用IIS发布网站系统 部署项目
1.首先确保系统上已经安装IIS [控制面板]→[程序]→[程序和功能]→[打开或关闭Windows功能] 选中Internet 信息服务下面的所有选项,点击确定. 2. 获得发布好的程序文件 若没有 ...
- javascript快速入门22--Ajax简介
Ajax是什么? 首先,Ajax是什么?一个很酷的新兴词汇!仅仅是某种早就有了的技术的一种新说法而已! Ajax是指一种创建交互式网页应用的网页开发技术.要谈到网页应用程序,则必须从WEB的历史来讲: ...
- 深入理解AMD和RequireJS!
AMD 基于commonJS规范的nodeJS出来以后,服务端的模块概念已经形成,很自然地,大家就想要客户端模块.而且最好两者能够兼容,一个模块不用修改,在服务器和浏览器都可以运行.但是,由于一个重大 ...
- Spark(四) -- Spark工作机制
一.应用执行机制 一个应用的生命周期即,用户提交自定义的作业之后,Spark框架进行处理的一系列过程. 在这个过程中,不同的时间段里,应用会被拆分为不同的形态来执行. 1.应用执行过程中的基本组件和形 ...