First of all, rdd.checkpoint() itself does not perform any write. It only verifies that checkpointDir has been set and then creates a ReliableRDDCheckpointData object, checkpointData, which is what does most of the actual checkpoint work.

  /**
   * Note: this only creates a ReliableRDDCheckpointData object; nothing is actually written here.
   *
   * Mark this RDD for checkpointing. It will be saved to a file inside the checkpoint
   * directory set with `SparkContext#setCheckpointDir` and all references to its parent
   * RDDs will be removed. This function must be called before any job has been
   * executed on this RDD. It is strongly recommended that this RDD is persisted in
   * memory, otherwise saving it on a file will require recomputation.
   */
  def checkpoint(): Unit = RDDCheckpointData.synchronized {
    // NOTE: we use a global lock here due to complexities downstream with ensuring
    // children RDD partitions point to the correct parent partitions. In the future
    // we should revisit this consideration.
    if (context.checkpointDir.isEmpty) {
      throw new SparkException("Checkpoint directory has not been set in the SparkContext")
    } else if (checkpointData.isEmpty) {
      checkpointData = Some(new ReliableRDDCheckpointData(this))
    }
  }
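For orientation, here is a minimal usage sketch (not part of the Spark source above; the master setting and checkpoint path are illustrative). setCheckpointDir must be called before checkpoint(), otherwise the SparkException above is thrown.

// Minimal usage sketch; master setting and checkpoint path are illustrative.
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("checkpoint-demo").setMaster("local[2]"))
sc.setCheckpointDir("hdfs:///tmp/checkpoints")   // must be set before checkpoint()

val rdd = sc.parallelize(1 to 1000).map(_ + 1)
rdd.cache()        // recommended: avoids recomputing the lineage when the checkpoint job runs
rdd.checkpoint()   // only marks the RDD: creates ReliableRDDCheckpointData, writes nothing yet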

What actually triggers the checkpoint write is the first action executed after checkpoint() has been called on the RDD.

  /**
   * Run a function on a given set of partitions in an RDD and pass the results to the given
   * handler function. This is the main entry point for all actions in Spark.
   */
  def runJob[T, U: ClassTag](
      rdd: RDD[T],
      func: (TaskContext, Iterator[T]) => U,
      partitions: Seq[Int],
      resultHandler: (Int, U) => Unit): Unit = {
    if (stopped.get()) {
      throw new IllegalStateException("SparkContext has been shutdown")
    }
    val callSite = getCallSite
    val cleanedFunc = clean(func)
    logInfo("Starting job: " + callSite.shortForm)
    if (conf.getBoolean("spark.logLineage", false)) {
      logInfo("RDD's recursive dependencies:\n" + rdd.toDebugString)
    }
    dagScheduler.runJob(rdd, cleanedFunc, partitions, callSite, resultHandler, localProperties.get)
    progressBar.foreach(_.finishAll())
    rdd.doCheckpoint()
  }
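Every action funnels into this entry point. For example, RDD.count is essentially a thin wrapper around runJob (paraphrased below; the exact wording may differ between Spark versions), which is why the first action after checkpoint() ends with rdd.doCheckpoint() running on the driver.

// Paraphrased from RDD.scala; treat as an approximation of the real source.
def count(): Long = sc.runJob(this, Utils.getIteratorSize _).sum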

runJob ends by calling rdd.doCheckpoint(), whose implementation is shown below:

  /**
   * Performs the checkpointing of this RDD by saving this. It is called after a job using this RDD
   * has completed (therefore the RDD has been materialized and potentially stored in memory).
   * doCheckpoint() is called recursively on the parent RDDs.
   *
   * checkpointData.get.checkpoint() performs the actual write and is triggered by the action.
   * If this RDD itself was not marked for checkpointing, the call walks up the dependency chain.
   */
  private[spark] def doCheckpoint(): Unit = {
    RDDOperationScope.withScope(sc, "checkpoint", allowNesting = false, ignoreParent = true) {
      if (!doCheckpointCalled) {
        doCheckpointCalled = true
        if (checkpointData.isDefined) {
          if (checkpointAllMarkedAncestors) {
            // TODO We can collect all the RDDs that needs to be checkpointed, and then checkpoint
            // them in parallel.
            // Checkpoint parents first because our lineage will be truncated after we
            // checkpoint ourselves
            dependencies.foreach(_.rdd.doCheckpoint())
          }
          checkpointData.get.checkpoint()
        } else {
          dependencies.foreach(_.rdd.doCheckpoint())
        }
      }
    }
  }

Here checkpointData.get.checkpoint() performs the actual write. doCheckpoint() itself only decides what gets checkpointed: if this RDD has been marked for checkpointing and checkpointAllMarkedAncestors is enabled, it first checkpoints its parents by walking the dependency chain (lineage will be truncated afterwards), then checkpoints itself; if this RDD was never marked, it simply recurses into its dependencies so that any marked ancestors are checkpointed.
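To make that decision concrete, here is a hedged sketch of the default behavior. Whether ancestors are also written is controlled by checkpointAllMarkedAncestors; in the Spark versions I am aware of it is exposed as the local property spark.checkpoint.checkpointAllMarkedAncestors, but treat that name as an assumption rather than something stated in the source quoted above.

// Sketch of the default behavior: only the RDD the action reaches first is written.
// The property name below is an assumption, not taken from the source quoted above.
val parent = sc.parallelize(1 to 100).map(_ * 2)
val child  = parent.filter(_ % 4 == 0)

parent.checkpoint()
child.checkpoint()

// Uncomment to force every marked ancestor to be checkpointed as well:
// sc.setLocalProperty("spark.checkpoint.checkpointAllMarkedAncestors", "true")

child.count()                    // first action: child.doCheckpoint() runs
println(child.isCheckpointed)    // true
println(parent.isCheckpointed)   // false unless the property above was set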

Next, look at the concrete implementation behind checkpointData.get.checkpoint(); its main work is done by ReliableCheckpointRDD.writeRDDToCheckpointDirectory(rdd, cpDir).

  /**
   * Materialize this RDD and write its content to a reliable DFS.
   * This is called immediately after the first action invoked on this RDD has completed.
   *
   * writeRDDToCheckpointDirectory writes the RDD to the configured checkpoint directory.
   */
  protected override def doCheckpoint(): CheckpointRDD[T] = {
    val newRDD = ReliableCheckpointRDD.writeRDDToCheckpointDirectory(rdd, cpDir)

    // Optionally clean our checkpoint files if the reference is out of scope
    if (rdd.conf.getBoolean("spark.cleaner.referenceTracking.cleanCheckpoints", false)) {
      rdd.context.cleaner.foreach { cleaner =>
        cleaner.registerRDDCheckpointDataForCleanup(newRDD, rdd.id)
      }
    }

    logInfo(s"Done checkpointing RDD ${rdd.id} to $cpDir, new parent is RDD ${newRDD.id}")
    newRDD
  }

Below is the implementation of ReliableCheckpointRDD.writeRDDToCheckpointDirectory(rdd, cpDir). It does two main things: write the partition data and write the partitioner.

  /**
   * Write RDD to checkpoint files and return a ReliableCheckpointRDD representing the RDD.
   *
   * Writes the RDD to HDFS: both the partition data and the partitioner.
   */
  def writeRDDToCheckpointDirectory[T: ClassTag](
      originalRDD: RDD[T],
      checkpointDir: String,
      blockSize: Int = -1): ReliableCheckpointRDD[T] = {

    val sc = originalRDD.sparkContext

    // Create the output path for the checkpoint
    val checkpointDirPath = new Path(checkpointDir)
    val fs = checkpointDirPath.getFileSystem(sc.hadoopConfiguration)
    if (!fs.mkdirs(checkpointDirPath)) {
      throw new SparkException(s"Failed to create checkpoint path $checkpointDirPath")
    }

    // Save to file, and reload it as an RDD
    val broadcastedConf = sc.broadcast(
      new SerializableConfiguration(sc.hadoopConfiguration))
    // TODO: This is expensive because it computes the RDD again unnecessarily (SPARK-8582)
    sc.runJob(originalRDD,
      writePartitionToCheckpointFile[T](checkpointDirPath.toString, broadcastedConf) _)

    if (originalRDD.partitioner.nonEmpty) {
      writePartitionerToCheckpointDir(sc, originalRDD.partitioner.get, checkpointDirPath)
    }

    val newRDD = new ReliableCheckpointRDD[T](
      sc, checkpointDirPath.toString, originalRDD.partitioner)
    if (newRDD.partitions.length != originalRDD.partitions.length) {
      throw new SparkException(
        s"Checkpoint RDD $newRDD(${newRDD.partitions.length}) has different " +
          s"number of partitions from original RDD $originalRDD(${originalRDD.partitions.length})")
    }
    newRDD
  }
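After writeRDDToCheckpointDirectory returns, each partition has its own file in the checkpoint directory and the partitioner (if any) sits alongside them. Here is a hedged sketch for inspecting the result; the rdd-&lt;id&gt; subdirectory, the part-NNNNN file names, and the _partitioner file are assumptions based on the helpers referenced in this code (checkpointFileName, checkpointPartitionerFileName), not something quoted above.

// Inspecting the checkpoint directory; the rdd-<id>/part-NNNNN/_partitioner layout
// is an assumption based on the helpers referenced above.
import org.apache.hadoop.fs.Path

val cpRoot = new Path(sc.getCheckpointDir.get)
val fs = cpRoot.getFileSystem(sc.hadoopConfiguration)
fs.listStatus(cpRoot).foreach { rddDir =>
  println(rddDir.getPath)                      // e.g. .../rdd-42
  fs.listStatus(rddDir.getPath).foreach { f =>
    println("  " + f.getPath.getName)          // part-00000, part-00001, ..., _partitioner
  }
}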

Writing the partition data:

sc.runJob(originalRDD,
  writePartitionToCheckpointFile[T](checkpointDirPath.toString, broadcastedConf) _)

  /**
   * Write an RDD partition's data to a checkpoint file.
   */
  def writePartitionToCheckpointFile[T: ClassTag](
      path: String,
      broadcastedConf: Broadcast[SerializableConfiguration],
      blockSize: Int = -1)(ctx: TaskContext, iterator: Iterator[T]) {
    val env = SparkEnv.get
    val outputDir = new Path(path)
    val fs = outputDir.getFileSystem(broadcastedConf.value.value)

    val finalOutputName = ReliableCheckpointRDD.checkpointFileName(ctx.partitionId())
    val finalOutputPath = new Path(outputDir, finalOutputName)
    val tempOutputPath =
      new Path(outputDir, s".$finalOutputName-attempt-${ctx.attemptNumber()}")

    val bufferSize = env.conf.getInt("spark.buffer.size", 65536)

    val fileOutputStream = if (blockSize < 0) {
      fs.create(tempOutputPath, false, bufferSize)
    } else {
      // This is mainly for testing purpose
      fs.create(tempOutputPath, false, bufferSize,
        fs.getDefaultReplication(fs.getWorkingDirectory), blockSize)
    }
    val serializer = env.serializer.newInstance()
    val serializeStream = serializer.serializeStream(fileOutputStream)
    Utils.tryWithSafeFinally {
      serializeStream.writeAll(iterator)
    } {
      serializeStream.close()
    }

    if (!fs.rename(tempOutputPath, finalOutputPath)) {
      if (!fs.exists(finalOutputPath)) {
        logInfo(s"Deleting tempOutputPath $tempOutputPath")
        fs.delete(tempOutputPath, false)
        throw new IOException("Checkpoint failed: failed to save output of task: " +
          s"${ctx.attemptNumber()} and final output path does not exist: $finalOutputPath")
      } else {
        // Some other copy of this task must've finished before us and renamed it
        logInfo(s"Final output path $finalOutputPath already exists; not overwriting it")
        if (!fs.delete(tempOutputPath, false)) {
          logWarning(s"Error deleting ${tempOutputPath}")
        }
      }
    }
  }
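The temp-file-then-rename dance above is what makes concurrent attempts of the same task (speculative execution, retries) safe: each attempt writes to its own hidden temp file, and the rename decides which copy gets published. Distilled into a minimal generic sketch, with illustrative names rather than Spark API:

// Generic publish-by-rename idiom, not Spark API; names are illustrative.
import java.io.IOException
import org.apache.hadoop.fs.{FileSystem, Path}

def publishAtomically(fs: FileSystem, tmp: Path, finalPath: Path): Unit = {
  if (!fs.rename(tmp, finalPath)) {
    if (fs.exists(finalPath)) {
      fs.delete(tmp, false)   // another attempt won the race; discard our copy
    } else {
      fs.delete(tmp, false)
      throw new IOException(s"Failed to publish $tmp to $finalPath")
    }
  }
}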


Writing the partitioner:

  /**
   * Write a partitioner to the given RDD checkpoint directory. This is done on a best-effort
   * basis; any exception while writing the partitioner is caught, logged and ignored.
   */
  private def writePartitionerToCheckpointDir(
      sc: SparkContext, partitioner: Partitioner, checkpointDirPath: Path): Unit = {
    try {
      val partitionerFilePath = new Path(checkpointDirPath, checkpointPartitionerFileName)
      val bufferSize = sc.conf.getInt("spark.buffer.size", 65536)
      val fs = partitionerFilePath.getFileSystem(sc.hadoopConfiguration)
      val fileOutputStream = fs.create(partitionerFilePath, false, bufferSize)
      val serializer = SparkEnv.get.serializer.newInstance()
      val serializeStream = serializer.serializeStream(fileOutputStream)
      Utils.tryWithSafeFinally {
        serializeStream.writeObject(partitioner)
      } {
        serializeStream.close()
      }
      logDebug(s"Written partitioner to $partitionerFilePath")
    } catch {
      case NonFatal(e) =>
        logWarning(s"Error writing partitioner $partitioner to $checkpointDirPath")
    }
  }
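Once the partition files and the partitioner have been written, the effect of checkpointing can be verified from the driver. A short usage sketch follows; the printed values are indicative only.

// Verifying the effect of checkpointing; printed values are indicative only.
rdd.count()                      // first action after checkpoint(): triggers doCheckpoint()

println(rdd.isCheckpointed)      // true
println(rdd.getCheckpointFile)   // Some(.../rdd-<id>)
println(rdd.toDebugString)       // lineage now starts from a ReliableCheckpointRDD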
