问题现象:

2019-03-11 12:30:52,174 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 7653ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=7692ms
2019-03-11 12:31:00,573 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 7899ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=7951ms
2019-03-11 12:31:08,952 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 7878ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=7937ms
2019-03-11 12:31:17,405 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 7951ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=8037ms
2019-03-11 12:31:26,611 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 8705ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=8835ms
2019-03-11 12:31:35,009 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 7897ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=8083ms
2019-03-11 12:31:43,806 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 8296ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=8416ms
2019-03-11 12:31:52,317 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 8010ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=8163ms
2019-03-11 12:32:00,680 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 7862ms

  

gc一段时间后出现:

2019-03-11 12:27:15,820 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.lang.OutOfMemoryError: Java heap space
at java.lang.StringCoding$StringEncoder.encode(StringCoding.java:300)
at java.lang.StringCoding.encode(StringCoding.java:344)
at java.lang.String.getBytes(String.java:918)
at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:242)
at java.io.File.exists(File.java:819)
at sun.misc.URLClassPath$FileLoader.getResource(URLClassPath.java:1282)
at sun.misc.URLClassPath.getResource(URLClassPath.java:239)
at java.net.URLClassLoader$1.run(URLClassLoader.java:365)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.close(JournalSet.java:244)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.close(FSEditLog.java:400)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.close(FSEditLogAsync.java:112)
at org.apache.hadoop.hdfs.server.namenode.FSImage.close(FSImage.java:1408)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1079)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:681)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:666)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:728)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:953)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:932)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1673)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1741)
2019-03-11 12:27:15,827 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: java.lang.OutOfMemoryError: Java heap space
2019-03-11 12:27:15,830 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:

  

或者出现下面的错误:

2019-03-11 11:09:16,124 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.lang.OutOfMemoryError: GC overhead limit exceeded
at com.google.protobuf.CodedInputStream.<init>(CodedInputStream.java:573)
at com.google.protobuf.CodedInputStream.newInstance(CodedInputStream.java:55)
at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:199)
at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeSection$INode.parseDelimitedFrom(FsImageProto.java:10867)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:233)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:250)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:176)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:226)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:937)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:921)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:794)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:724)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:322)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1052)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:681)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:666)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:728)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:953)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:932)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1673)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1741)
2019-03-11 11:09:16,127 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: java.lang.OutOfMemoryError: GC overhead limit exceeded

  

解决:

打开hadoop-env.sh文件,找到HADOOP_HEAPSIZE=  和HADOOP_NAMENODE_INIT_HEAPSIZE=  调整这两个参数,具体调整多少,视情况而定,默认是1000m,也就是一个g,我这里调整如下:

export HADOOP_HEAPSIZE=32000
export HADOOP_NAMENODE_INIT_HEAPSIZE=16000 这两个参数去掉前面的#号,两台namenode节点都要调整

  

接着重新启动hdfs,如果还不行,打开hadoop-env.sh文件,找到HADOOP_NAMENODE_OPTS

export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender}  $HADOOP_NAMENODE_OPTS"    ----这是系统默认值
调整如下:
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} -Xms6000m -Xmx6000m -XX:+UseCompressedOops -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0 -XX:+CMSParallelRemarkEnabled -XX:+DisableExplicitGC -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 -XX:SoftRefLRUPolicyMSPerMB=0 $HADOOP_NAMENODE_OPTS"

  

接着重新启动hdfs,如果还是报上面的错误,那就继续调大上面

HADOOP_HEAPSIZE和
HADOOP_NAMENODE_INIT_HEAPSIZE  的值

重启hdfs集群的时候,报大量的gc问题。的更多相关文章

  1. vivo 万台规模 HDFS 集群升级 HDFS 3.x 实践

    vivo 互联网大数据团队-Lv Jia Hadoop 3.x的第一个稳定版本在2017年底就已经发布了,有很多重大的改进. 在HDFS方面,支持了Erasure Coding.More than 2 ...

  2. 大数据学习之hdfs集群安装部署04

    1-> 集群的准备工作 1)关闭防火墙(进行远程连接) systemctl stop firewalld systemctl -disable firewalld 2)永久修改设置主机名 vi ...

  3. HDFS集群常见报错汇总

    HDFS集群常见报错汇总 作者:尹正杰 版权声明:原创作品,谢绝转载!否则将追究法律责任. 一.DataXceiver error processing WRITE_BLOCK operation 报 ...

  4. 大数据学习笔记03-HDFS-HDFS组件介绍及Java访问HDFS集群

    HDFS组件概述 NameNode 存储数据节点信息及元文件,即:分成了多少数据块,每一个数据块存储在哪一个DataNode中,每一个数据块备份到哪些DataNode中 这个集群有哪些DataNode ...

  5. 马士兵hadoop第二课:hdfs集群集中管理和hadoop文件操作

    马士兵hadoop第一课:虚拟机搭建和安装hadoop及启动 马士兵hadoop第二课:hdfs集群集中管理和hadoop文件操作 马士兵hadoop第三课:java开发hdfs 马士兵hadoop第 ...

  6. 马士兵hadoop第二课:hdfs集群集中管理和hadoop文件操作(转)

    马士兵hadoop第一课:虚拟机搭建和安装hadoop及启动 马士兵hadoop第二课:hdfs集群集中管理和hadoop文件操作 马士兵hadoop第三课:java开发hdfs 马士兵hadoop第 ...

  7. 大数据(2)---HDFS集群搭建

    一.准备工作 1.准备几台机器,我这里使用VMware准备了四台机器,一个name node,三个data node. VMware安装虚拟机:https://www.cnblogs.com/niju ...

  8. HDFS集群balance(2)-- 架构概览

    转载请注明博客地址:http://blog.csdn.net/suileisl HDFS集群balance,对应版本balance design 6 如需word版本,请QQ522173163联系索要 ...

  9. HDFS集群balance(3)-- 架构细节

    转载请注明博客地址:http://blog.csdn.net/suileisl HDFS集群balance,对应版本balance design 6 如需word版本,请QQ522173163联系索要 ...

随机推荐

  1. jQueryUI的widget的Hello World

    为了看懂jQuery-File-Upload里面的代码,所以学习到这里 //main.js //实践自定义jquery widget,风格1 (function($){ //$.widget('命名空 ...

  2. 数据结构与算法--递归(recursion)

    递归的概念 简单的说: 递归就是方法自己调用自己,每次调用时传入不同的变量.递归有助于编程者解决复杂的问题,同时可以让代码变得简洁. 递归调用机制 我列举两个小案例,来帮助大家理解递归 1.打印问题 ...

  3. logback配置文件模板

    <?xml version="1.0" encoding="UTF-8"?> <configuration debug="false ...

  4. arm-none-eabi/bin/ld: build/com.zubax.gnss.elf section `.text' will not fit in region `flash'

    出现如下错误: /arm-none-eabi/bin/ld: build/com.zubax.gnss.elf section `.text' will not fit in region `flas ...

  5. Java软件编码习惯

    1.再删除某个类时候,一定别忘记把对应的import也删除掉: 可以手动删除,也可以 Ctrl+Shift+O快捷键自动删除和导入.

  6. 【SpringMVC】RESTful支持

    一.概述 1.1 什么是RESTful 1.2 URL的RESTful实现 二.演示 2.1 需求 2.2 第一步更改DispatcherServlet配置 2.3 第二步参数通过url传递 2.4 ...

  7. 解决用Xftp向虚拟机VMware传文件速度慢的问题

    在使用Xftp向虚拟机传文件时发现很慢,之后几K,如果这个文件有几十M,这是一个非常让人头疼的问题.网上找过很多设置都试过,都没有效果,偶然发现Windows网络配置中,有个选项Large Send ...

  8. Django session默认配置

    配置 settings.py     SESSION_ENGINE = 'django.contrib.sessions.backends.db'   # 引擎(默认)           SESSI ...

  9. C#一些不太熟悉的类——扩展学习

    Process.CloseMainWindow Method 通过向进程的主窗口发送关闭消息来关闭拥有用户界面的进程. 注解 进程执行时,其消息循环处于等待状态. 每次操作系统将 Windows 消息 ...

  10. STM32 LoRaWAN探索板B-L072Z-LRWAN1入门指南

    UM2159用户手册 基于STM32L0的超低功耗LoRa探索套件入门指南 前言 LoRa 探索套件(B-L072Z-LRWAN1)是一款RF探索开发板,采用了Murata公司的LoRa模块CMWX1 ...