spark 2.1.1

Some tasks in a Spark application were extremely slow, running for as long as 10 hours. The log of one such task looks like this:

2019-01-24 21:38:56,024 [dispatcher-event-loop-22] INFO org.apache.spark.executor.CoarseGrainedExecutorBackend - Got assigned task 4031
2019-01-24 21:38:56,024 [Executor task launch worker for task 4031] INFO org.apache.spark.executor.Executor - Running task 11.0 in stage 98.0 (TID 4031)
2019-01-24 21:38:56,050 [Executor task launch worker for task 4031] INFO org.apache.spark.MapOutputTrackerWorker - Don't have map outputs for shuffle 13, fetching them
2019-01-24 21:38:56,050 [Executor task launch worker for task 4031] INFO org.apache.spark.MapOutputTrackerWorker - Doing the fetch; tracker endpoint = NettyRpcEndpointRef(spark://MapOutputTracker@server1:30384)
2019-01-24 21:38:56,052 [Executor task launch worker for task 4031] INFO org.apache.spark.MapOutputTrackerWorker - Got the output locations
2019-01-24 21:38:56,052 [Executor task launch worker for task 4031] INFO org.apache.spark.storage.ShuffleBlockFetcherIterator - Getting 200 non-empty blocks out of 200 blocks
2019-01-24 21:38:56,054 [Executor task launch worker for task 4031] INFO org.apache.spark.storage.ShuffleBlockFetcherIterator - Started 19 remote fetches in 2 ms

2019-01-25 07:07:54,200 [Executor task launch worker for task 4031] INFO org.apache.spark.storage.memory.MemoryStore - Block rdd_108_11 stored as values in memory (estimated size 222.6 MB, free 1893.2 MB)
2019-01-25 07:07:54,546 [Executor task launch worker for task 4031] INFO org.apache.spark.storage.memory.MemoryStore - Block rdd_117_11 stored as values in memory (estimated size 87.5 MB, free 1805.8 MB)
2019-01-25 07:07:54,745 [Executor task launch worker for task 4031] INFO org.apache.spark.storage.memory.MemoryStore - Block rdd_118_11 stored as values in memory (estimated size 87.5 MB, free 1718.3 MB)
2019-01-25 07:07:54,987 [Executor task launch worker for task 4031] INFO org.apache.spark.sql.hive.SparkHiveDynamicPartitionWriterContainer - Sorting complete. Writing out partition files one at a time.
2019-01-25 07:07:57,425 [Executor task launch worker for task 4031] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt_20190124213852_0098_m_000011_0' to hdfs://namenode/user/hive/warehouse/db_name.db/table_name/.hive-staging_hive_2019-01-24_21-38-52_251_7997709482427937209-1/-ext-10000/_temporary/0/task_20190124213852_0098_m_000011
2019-01-25 07:07:57,425 [Executor task launch worker for task 4031] INFO org.apache.spark.mapred.SparkHadoopMapRedUtil - attempt_20190124213852_0098_m_000011_0: Committed
2019-01-25 07:07:57,426 [Executor task launch worker for task 4031] INFO org.apache.spark.executor.Executor - Finished task 11.0 in stage 98.0 (TID 4031). 4259 bytes result sent to driver

There is no log output at all between 2019-01-24 21:38:56 and 2019-01-25 07:07:54. The application has not finished yet and several very slow tasks are still running. A thread dump of the executors running these tasks shows they are stuck in a single thread:

java.lang.Thread.sleep(Native Method)
app.package.AppClass.do(AppClass.scala:228)
org.apache.spark.sql.execution.MapElementsExec$$anonfun$8$$anonfun$apply$1.apply(objects.scala:237)
org.apache.spark.sql.execution.MapElementsExec$$anonfun$8$$anonfun$apply$1.apply(objects.scala:237)
scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:216)
org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1005)
org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:996)
org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:936)
org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:996)
org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:700)
org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334)
org.apache.spark.rdd.RDD.iterator(RDD.scala:285)
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
org.apache.spark.rdd.RDD$$anonfun$8.apply(RDD.scala:336)
org.apache.spark.rdd.RDD$$anonfun$8.apply(RDD.scala:334)
org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1005)
org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:996)
org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:936)
org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:996)
org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:700)
org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334)
org.apache.spark.rdd.RDD.iterator(RDD.scala:285)
org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:105)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
org.apache.spark.rdd.RDD$$anonfun$8.apply(RDD.scala:336)
org.apache.spark.rdd.RDD$$anonfun$8.apply(RDD.scala:334)
org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1005)
org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:996)
org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:936)
org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:996)
org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:700)
org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334)
org.apache.spark.rdd.RDD.iterator(RDD.scala:285)
org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:105)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
org.apache.spark.rdd.RDD$$anonfun$8.apply(RDD.scala:336)
org.apache.spark.rdd.RDD$$anonfun$8.apply(RDD.scala:334)
org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1005)
org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:996)
org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:936)
org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:996)
org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:700)
org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334)
org.apache.spark.rdd.RDD.iterator(RDD.scala:285)
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
org.apache.spark.scheduler.Task.run(Task.scala:99)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)

Here app.package.AppClass.do is a very expensive operation that runs once per element of the RDD. The puzzle is that the RDD was already cached right after this operation, so why is it recomputed when later stages depend on it?

The problem can be simplified as follows:

val mapped = rdd.map(item => doLongTime(item))
mapped.cache()
// takes a long time
println(mapped.count)
// takes a long time again, why?
println(mapped.count)
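
For reference, here is a minimal runnable sketch of the simplified scenario. The dataset, partition count, sleep duration and doLongTime itself are hypothetical placeholders standing in for the real job, not the actual application code:

import org.apache.spark.sql.SparkSession

object CacheRecomputeDemo {
  // Hypothetical stand-in for the expensive per-element operation (AppClass.do in the real job).
  def doLongTime(item: Int): Int = {
    Thread.sleep(10) // simulate slow work
    item * 2
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("cache-recompute-demo").getOrCreate()
    val sc = spark.sparkContext

    val rdd = sc.parallelize(1 to 100000, numSlices = 200)
    val mapped = rdd.map(doLongTime)

    // MEMORY_ONLY: partitions that do not fit in storage memory are not kept
    // and will be recomputed by the next action that needs them.
    mapped.cache()

    println(mapped.count()) // slow: runs doLongTime on every element
    println(mapped.count()) // can be slow again if some partitions were never cached or were evicted

    spark.stop()
  }
}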

Let's look at the Spark source code.

RDD.compute is overridden by each RDD subclass and typically calls RDD.iterator on its parent RDDs (as seen in the stack trace above):

org.apache.spark.rdd.RDD

/**
 * Internal method to this RDD; will read from cache if applicable, or otherwise compute it.
 * This should ''not'' be called by users directly, but is available for implementors of custom
 * subclasses of RDD.
 */
final def iterator(split: Partition, context: TaskContext): Iterator[T] = {
  if (storageLevel != StorageLevel.NONE) {
    getOrCompute(split, context)
  } else {
    computeOrReadCheckpoint(split, context)
  }
}

/**
 * Gets or computes an RDD partition. Used by RDD.iterator() when an RDD is cached.
 */
private[spark] def getOrCompute(partition: Partition, context: TaskContext): Iterator[T] = {
  val blockId = RDDBlockId(id, partition.index)
  var readCachedBlock = true
  // This method is called on executors, so we need call SparkEnv.get instead of sc.env.
  SparkEnv.get.blockManager.getOrElseUpdate(blockId, storageLevel, elementClassTag, () => {
    readCachedBlock = false
    computeOrReadCheckpoint(partition, context)
  }) match {
    case Left(blockResult) =>
      if (readCachedBlock) {
        val existingMetrics = context.taskMetrics().inputMetrics
        existingMetrics.incBytesRead(blockResult.bytes)
        new InterruptibleIterator[T](context, blockResult.data.asInstanceOf[Iterator[T]]) {
          override def next(): T = {
            existingMetrics.incRecordsRead(1)
            delegate.next()
          }
        }
      } else {
        new InterruptibleIterator(context, blockResult.data.asInstanceOf[Iterator[T]])
      }
    case Right(iter) =>
      new InterruptibleIterator(context, iter.asInstanceOf[Iterator[T]])
  }
}

RDD.iterator branches on storageLevel: if no storage level is set, it reads the partition from the checkpoint or computes it (computeOrReadCheckpoint); if the RDD is cached, it goes through RDD.getOrCompute, which fetches the partition from the cache or computes it. RDD.getOrCompute in turn calls BlockManager.getOrElseUpdate:

org.apache.spark.storage.BlockManager

/**
 * Retrieve the given block if it exists, otherwise call the provided `makeIterator` method
 * to compute the block, persist it, and return its values.
 *
 * @return either a BlockResult if the block was successfully cached, or an iterator if the block
 *         could not be cached.
 */
def getOrElseUpdate[T](
    blockId: BlockId,
    level: StorageLevel,
    classTag: ClassTag[T],
    makeIterator: () => Iterator[T]): Either[BlockResult, Iterator[T]] = {
  // Attempt to read the block from local or remote storage. If it's present, then we don't need
  // to go through the local-get-or-put path.
  get[T](blockId)(classTag) match {
    case Some(block) =>
      return Left(block)
    case _ =>
      // Need to compute the block.
  }
  // Initially we hold no locks on this block.
  doPutIterator(blockId, makeIterator, level, classTag, keepReadLock = true) match {
    case None =>
      // doPut() didn't hand work back to us, so the block already existed or was successfully
      // stored. Therefore, we now hold a read lock on the block.
      val blockResult = getLocalValues(blockId).getOrElse {
        // Since we held a read lock between the doPut() and get() calls, the block should not
        // have been evicted, so get() not returning the block indicates some internal error.
        releaseLock(blockId)
        throw new SparkException(s"get() failed for block $blockId even though we held a lock")
      }
      // We already hold a read lock on the block from the doPut() call and getLocalValues()
      // acquires the lock again, so we need to call releaseLock() here so that the net number
      // of lock acquisitions is 1 (since the caller will only call release() once).
      releaseLock(blockId)
      Left(blockResult)
    case Some(iter) =>
      // The put failed, likely because the data was too large to fit in memory and could not be
      // dropped to disk. Therefore, we need to pass the input iterator back to the caller so
      // that they can decide what to do with the values (e.g. process them without caching).
      Right(iter)
  }
}

BlockManager.getOrElseUpdate first tries to get the block from the cache; if it is not there, it calls doPutIterator to compute the block and put it into the cache.

org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:996)
org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:700)
org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334)

So the doPutIterator frames in the thread dump show that the block was not in the cache and had to be recomputed.
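
One way to confirm this kind of eviction on a running application is to compare the number of cached partitions against the total. This is a sketch, assuming access to the driver (e.g. a spark-shell attached to the same SparkContext); getRDDStorageInfo is a DeveloperApi:

// From the driver (e.g. spark-shell, where sc is the SparkContext):
// list RDDs that currently have cached blocks and how many partitions are resident.
sc.getRDDStorageInfo.foreach { info =>
  println(s"rdd=${info.name} id=${info.id} " +
    s"cached=${info.numCachedPartitions}/${info.numPartitions} " +
    s"level=${info.storageLevel} memSize=${info.memSize} diskSize=${info.diskSize}")
}
// For a MEMORY_ONLY RDD, numCachedPartitions < numPartitions means the missing
// partitions were evicted or never fit in memory, and any task that needs them
// goes down the doPutIterator path seen in the stack trace, i.e. recomputes them.

The Storage tab of the web UI shows the same information (Fraction Cached per RDD). Next, look at what storage level cache actually uses: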

org.apache.spark.rdd.RDD

/**
 * Persist this RDD with the default storage level (`MEMORY_ONLY`).
 */
def cache(): this.type = persist()

/**
 * Persist this RDD with the default storage level (`MEMORY_ONLY`).
 */
def persist(): this.type = persist(StorageLevel.MEMORY_ONLY)

cache uses StorageLevel.MEMORY_ONLY, so when there is not enough storage memory some cached partitions may be evicted (or never stored at all); the concrete policy lives in org.apache.spark.storage.memory.MemoryStore.
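
If the cached data is expected to (almost) fit, one option is simply to give executors more storage memory. A rough sketch of the relevant knobs, with illustrative values only (not recommendations); the defaults quoted in the comments are the Spark 2.1 defaults:

import org.apache.spark.SparkConf

// Illustrative values only, tune per workload.
val conf = new SparkConf()
  .set("spark.executor.memory", "8g")          // executor JVM heap
  .set("spark.memory.fraction", "0.6")         // share of (heap - 300MB) usable for execution + storage, default 0.6
  .set("spark.memory.storageFraction", "0.5")  // share of the above protected from eviction by execution, default 0.5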

Now let's look at StorageLevel:

org.apache.spark.storage.StorageLevel

/**
 * :: DeveloperApi ::
 * Flags for controlling the storage of an RDD. Each StorageLevel records whether to use memory,
 * or ExternalBlockStore, whether to drop the RDD to disk if it falls out of memory or
 * ExternalBlockStore, whether to keep the data in memory in a serialized format, and whether
 * to replicate the RDD partitions on multiple nodes.
 *
 * The [[org.apache.spark.storage.StorageLevel$]] singleton object contains some static constants
 * for commonly useful storage levels. To create your own storage level object, use the
 * factory method of the singleton object (`StorageLevel(...)`).
 */
@DeveloperApi
class StorageLevel private(
    private var _useDisk: Boolean,
    private var _useMemory: Boolean,
    private var _useOffHeap: Boolean,
    private var _deserialized: Boolean,
    private var _replication: Int = 1)
  extends Externalizable {
  ...

object StorageLevel {
  val NONE = new StorageLevel(false, false, false, false)
  val DISK_ONLY = new StorageLevel(true, false, false, false)
  val DISK_ONLY_2 = new StorageLevel(true, false, false, false, 2)
  val MEMORY_ONLY = new StorageLevel(false, true, false, true)
  val MEMORY_ONLY_2 = new StorageLevel(false, true, false, true, 2)
  val MEMORY_ONLY_SER = new StorageLevel(false, true, false, false)
  val MEMORY_ONLY_SER_2 = new StorageLevel(false, true, false, false, 2)
  val MEMORY_AND_DISK = new StorageLevel(true, true, false, true)
  val MEMORY_AND_DISK_2 = new StorageLevel(true, true, false, true, 2)
  val MEMORY_AND_DISK_SER = new StorageLevel(true, true, false, false)
  val MEMORY_AND_DISK_SER_2 = new StorageLevel(true, true, false, false, 2)
  val OFF_HEAP = new StorageLevel(true, true, true, false, 1)

So after an expensive operation, do not assume that RDD.cache alone will spare you from recomputation: MEMORY_ONLY only keeps the data in memory on a best-effort basis, not as a guarantee. Use RDD.persist(StorageLevel.MEMORY_AND_DISK) instead.
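
A sketch of that fix, reusing the hypothetical doLongTime and rdd from the earlier sketch: persist with MEMORY_AND_DISK so that partitions which cannot stay in memory are spilled to local disk and re-read instead of recomputed.

import org.apache.spark.storage.StorageLevel

val mapped = rdd.map(doLongTime)

// MEMORY_AND_DISK: blocks that do not fit in memory are written to local disk,
// so later actions re-read them instead of recomputing them.
mapped.persist(StorageLevel.MEMORY_AND_DISK)

println(mapped.count()) // pays the cost of doLongTime exactly once
println(mapped.count()) // served from memory or disk, no recomputation

// Release the cached/spilled blocks once the RDD is no longer needed.
mapped.unpersist()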
