http://192.168.2.51:4041

http://hadoop1:8088/proxy/application_1512362707596_0006/executors/

Executors

Summary

 
| | RDD Blocks | Storage Memory | Disk Used | Cores | Active Tasks | Failed Tasks | Complete Tasks | Total Tasks | Task Time (GC Time) | Input | Shuffle Read | Shuffle Write | Blacklisted |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Active(3) | 54 | 1.4 GB / 1.2 GB | 700.1 MB | 2 | 50 | 0 | 22 | 72 | 6.5 min (2 s) | 0.0 B | 0.0 B | 0.0 B | 0 |
| Dead(0) | 0 | 0.0 B / 0.0 B | 0.0 B | 0 | 0 | 0 | 0 | 0 | 0 ms (0 ms) | 0.0 B | 0.0 B | 0.0 B | 0 |
| Total(3) | 54 | 1.4 GB / 1.2 GB | 700.1 MB | 2 | 50 | 0 | 22 | 72 | 6.5 min (2 s) | 0.0 B | 0.0 B | 0.0 B | 0 |
 

Executors

| Executor ID | Address | Status | RDD Blocks | Storage Memory | Disk Used | Cores | Active Tasks | Failed Tasks | Complete Tasks | Total Tasks | Task Time (GC Time) | Input | Shuffle Read | Shuffle Write |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| driver | 192.168.2.51:52491 | Active | 2 | 5.7 KB / 384.1 MB | 0.0 B | 0 | 0 | 0 | 0 | 0 | 0 ms (0 ms) | 0.0 B | 0.0 B | 0.0 B |
| 2 | hadoop2:33018 | Active | 26 | 729.5 MB / 384.1 MB | 348.1 MB | 1 | 25 | 0 | 11 | 36 | 2.6 min (1 s) | 0.0 B | 0.0 B | 0.0 B |
| 1 | hadoop1:53695 | Active | 26 | 700.1 MB / 384.1 MB | 352 MB | 1 | 25 | 0 | 11 | 36 | 3.9 min (0.9 s) | 0.0 B | 0.0 B | 0.0 B |
from pyspark.sql import SparkSession

# Session on YARN; the MongoDB Spark connector reads from and writes to recommendation.article
my_spark = SparkSession \
    .builder \
    .appName("myAppYarn-10g") \
    .master('yarn') \
    .config("spark.mongodb.input.uri", "mongodb://pyspark_admin:admin123@192.168.2.50/recommendation.article") \
    .config("spark.mongodb.output.uri", "mongodb://pyspark_admin:admin123@192.168.2.50/recommendation.article") \
    .getOrCreate()

# collect() pulls every row of the collection back into the driver process
db_rows = my_spark.read.format("com.mongodb.spark.sql.DefaultSource").load().collect()
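
The `collect()` call above asks every task to ship its whole partition back to the driver at once. One possible alternative, sketched here under assumptions, is to consume rows in bounded batches. The helper `batched` and the placeholder `handle` are hypothetical names, not part of the original job; `DataFrame.toLocalIterator()` is a real PySpark method that fetches one partition at a time, but each partition still has to fit in memory, so this is not a guaranteed fix for the failure logged below.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of at most batch_size items from any iterable."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical usage with the session above (needs the cluster, so left as a comment):
# df = my_spark.read.format("com.mongodb.spark.sql.DefaultSource").load()
# for batch in batched(df.toLocalIterator(), 10000):
#     handle(batch)  # handle() is a placeholder for per-batch work
```

The helper is plain Python and works on any iterable, so it can be checked locally with `list(batched(range(5), 2))` before pointing it at a DataFrame.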

Summary

 
| | RDD Blocks | Storage Memory | Disk Used | Cores | Active Tasks | Failed Tasks | Complete Tasks | Total Tasks | Task Time (GC Time) | Input | Shuffle Read | Shuffle Write | Blacklisted |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Active(3) | 31 | 748.4 MB / 1.2 GB | 75.7 MB | 2 | 27 | 0 | 0 | 27 | 0 ms (0 ms) | 0.0 B | 0.0 B | 0.0 B | 0 |
| Dead(2) | 56 | 1.5 GB / 768.2 MB | 790.3 MB | 2 | 0 | 0 | 77 | 77 | 2.7 h (2 s) | 0.0 B | 0.0 B | 0.0 B | 0 |
| Total(5) | 87 | 2.3 GB / 1.9 GB | 865.9 MB | 4 | 27 | 0 | 77 | 104 | 2.7 h (2 s) | 0.0 B | 0.0 B | 0.0 B | 0 |
 

Executors

| Executor ID | Address | Status | RDD Blocks | Storage Memory | Disk Used | Cores | Active Tasks | Failed Tasks | Complete Tasks | Total Tasks | Task Time (GC Time) | Input | Shuffle Read | Shuffle Write |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| driver | 192.168.2.51:52491 | Active | 2 | 5.7 KB / 384.1 MB | 0.0 B | 0 | 0 | 0 | 0 | 0 | 0 ms (0 ms) | 0.0 B | 0.0 B | 0.0 B |
| 4 | hadoop2:34394 | Active | 12 | 315.9 MB / 384.1 MB | 0.0 B | 1 | 11 | 0 | 0 | 11 | 0 ms (0 ms) | 0.0 B | 0.0 B | 0.0 B |
| 3 | hadoop1:39620 | Active | 17 | 432.5 MB / 384.1 MB | 75.7 MB | 1 | 16 | 0 | 0 | 16 | 0 ms (0 ms) | 0.0 B | 0.0 B | 0.0 B |
| 2 | hadoop2:33018 | Dead | 27 | 758.7 MB / 384.1 MB | 390.4 MB | 1 | 0 | 0 | 38 | 38 | 1.3 h (1 s) | 0.0 B | 0.0 B | 0.0 B |
| 1 | hadoop1:53695 | Dead | 29 | 775.9 MB / 384.1 MB | 399.9 MB | 1 | 0 | 0 | 39 | 39 | 1.4 h (0.9 s) | 0.0 B | 0.0 B | 0.0 B |
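
Every executor and the driver report 384.1 MB of storage memory. As a rough check, assuming the Spark 2.x defaults of `spark.memory.fraction = 0.6` and 300 MB of reserved memory (the source does not state which settings were used), the unified memory pool can be related to the JVM heap as below. The function names are illustrative only.

```python
def unified_memory_mb(max_heap_mb, memory_fraction=0.6, reserved_mb=300.0):
    """Unified (storage + execution) pool under the assumed Spark 2.x defaults."""
    return (max_heap_mb - reserved_mb) * memory_fraction


def heap_for_unified_mb(unified_mb, memory_fraction=0.6, reserved_mb=300.0):
    """Inverse of unified_memory_mb: heap size implied by a reported pool size."""
    return unified_mb / memory_fraction + reserved_mb


# Heap implied by the 384.1 MB shown in the Executors table (assumed defaults)
print(round(heap_for_unified_mb(384.1), 1))  # 940.2
```

Under these assumptions the reported pool corresponds to a usable heap of roughly 940 MB, which would be consistent with executors running at about 1 GB of memory; if the job used different memory settings the arithmetic changes accordingly.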
Logs for container_1512362707596_0006_02_000002 http://hadoop1:8042/node/containerlogs/container_1512362707596_0006_02_000002/root/stderr?start=-4096
 
 
 
 

Manager: Dropping block taskresult_48 from memory
17/12/04 13:14:32 INFO storage.BlockManager: Writing block taskresult_48 to disk
17/12/04 13:14:32 INFO memory.MemoryStore: After dropping 1 blocks, free memory is 38.5 MB
17/12/04 13:14:32 INFO memory.MemoryStore: Block taskresult_73 stored as bytes in memory (estimated size 32.5 MB, free 6.1 MB)
17/12/04 13:14:32 INFO executor.Executor: Finished task 72.0 in stage 1.0 (TID 73). 34033291 bytes result sent via BlockManager)
17/12/04 13:14:32 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 74
17/12/04 13:14:32 INFO executor.Executor: Running task 73.0 in stage 1.0 (TID 74)
17/12/04 13:14:38 INFO memory.MemoryStore: 1 blocks selected for dropping (16.0 MB bytes)
17/12/04 13:14:38 INFO storage.BlockManager: Dropping block taskresult_50 from memory
17/12/04 13:14:38 INFO storage.BlockManager: Writing block taskresult_50 to disk
17/12/04 13:14:38 INFO memory.MemoryStore: After dropping 1 blocks, free memory is 22.1 MB
17/12/04 13:14:38 INFO memory.MemoryStore: Block taskresult_74 stored as bytes in memory (estimated size 14.4 MB, free 7.7 MB)
17/12/04 13:14:38 INFO executor.Executor: Finished task 73.0 in stage 1.0 (TID 74). 15083225 bytes result sent via BlockManager)
17/12/04 13:14:38 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 75
17/12/04 13:14:38 INFO executor.Executor: Running task 74.0 in stage 1.0 (TID 75)
17/12/04 13:14:46 INFO memory.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 5.2 KB, free 7.7 MB)
17/12/04 13:14:46 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 433.0 B, free 7.7 MB)
17/12/04 13:14:48 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
17/12/04 13:14:48 ERROR executor.Executor: Exception in task 74.0 in stage 1.0 (TID 75)
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3236)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:118)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
at org.apache.spark.util.ByteBufferOutputStream.write(ByteBufferOutputStream.scala:41)
at java.io.ObjectOutputStream$BlockDataOutputStream.write(ObjectOutputStream.java:1853)
at java.io.ObjectOutputStream.write(ObjectOutputStream.java:709)
at org.apache.spark.util.Utils$.writeByteBuffer(Utils.scala:239)
at org.apache.spark.scheduler.DirectTaskResult$$anonfun$writeExternal$1.apply$mcV$sp(TaskResult.scala:50)
at org.apache.spark.scheduler.DirectTaskResult$$anonfun$writeExternal$1.apply(TaskResult.scala:48)
at org.apache.spark.scheduler.DirectTaskResult$$anonfun$writeExternal$1.apply(TaskResult.scala:48)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1303)
at org.apache.spark.scheduler.DirectTaskResult.writeExternal(TaskResult.scala:48)
at java.io.ObjectOutputStream.writeExternalData(ObjectOutputStream.java:1459)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1430)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:403)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
17/12/04 13:14:48 INFO connection.MongoClientCache: Closing MongoClient: [192.168.2.50:27017]
17/12/04 13:14:48 INFO driver.connection: Closed connection [connectionId{localValue:4, serverValue:42}] to 192.168.2.50:27017 because the pool has been closed.
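
The executor log above records the size of each finished task's result ("... bytes result sent via BlockManager"), and the heap runs out while one such result is being serialized. A small sketch for pulling those sizes out of a saved log follows; the function name `task_result_sizes` and the regular expression are assumptions written for this log format, not part of Spark.

```python
import re
from typing import List, Tuple

# Matches lines such as:
# "... Finished task 72.0 in stage 1.0 (TID 73). 34033291 bytes result sent via BlockManager)"
RESULT_RE = re.compile(
    r"Finished task \S+ in stage \S+ \(TID (\d+)\)\. (\d+) bytes result sent"
)


def task_result_sizes(log_text: str) -> List[Tuple[int, int]]:
    """Return (task id, result size in bytes) for every finished task in the log."""
    return [(int(tid), int(size)) for tid, size in RESULT_RE.findall(log_text)]


sample = """\
17/12/04 13:14:32 INFO executor.Executor: Finished task 72.0 in stage 1.0 (TID 73). 34033291 bytes result sent via BlockManager)
17/12/04 13:14:38 INFO executor.Executor: Finished task 73.0 in stage 1.0 (TID 74). 15083225 bytes result sent via BlockManager)
"""

for tid, size in task_result_sizes(sample):
    print(tid, round(size / 2**20, 1), "MB")
```

On the two sample lines this gives about 32.5 MB and 14.4 MB, matching the `taskresult_73` and `taskresult_74` block sizes in the log. Each result is tens of megabytes, while the MemoryStore reports only a few megabytes free after storing it.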
 
 
 
