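After I changed the YARN configuration in ambari-server and restarted the service, the ResourceManager (RM) failed to start. The error was a strange one: "Invalid resource request, no resources requested". Details below: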

-- ::, FATAL resourcemanager.ResourceManager (ResourceManager.java:main(1495)) - Error starting ResourceManager
org.apache.hadoop.service.ServiceStateException: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, no resources requested
at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$.run(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$.run(ResourceManager.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:)
Caused by: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, no resources requested
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:)
... more
-- ::, INFO zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x36546c044dc0113 closed
-- ::, INFO zookeeper.ClientCnxn (ClientCnxn.java:run()) - EventThread shut down
-- ::, INFO resourcemanager.ResourceManager (LogAdapter.java:info()) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down ResourceManager at ep-bd01/192.168.58.11
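
The frames under "Caused by" pass through RMAppManager.recoverApplication, which suggests the RM fails while recovering saved application state rather than while handling a new request. A minimal sketch of a check for that pattern in a log excerpt follows. The name failed_during_recovery is a hypothetical helper, not part of Hadoop, and the log path in the usage note is a placeholder.

```shell
# failed_during_recovery: hypothetical helper, not part of Hadoop.
# Reads log text on stdin and succeeds (exit status 0) only if both the
# InvalidResourceRequestException and a recoverApplication stack frame appear.
failed_during_recovery() {
  log="$(cat)"
  # -F matches the strings literally, -q suppresses output.
  printf '%s\n' "$log" | grep -F -q 'InvalidResourceRequestException' &&
    printf '%s\n' "$log" | grep -F -q 'RMAppManager.recoverApplication'
}
```

Usage, assuming the RM log is at /path/to/resourcemanager.log: `failed_during_recovery < /path/to/resourcemanager.log && echo "RM failed while recovering applications"`.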

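Extensive searching online turned up nothing. In the end I gave up, rebooted the host, and restarted all services, and it worked! But when I restarted the RM again it failed, with the same error as above.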

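1. Configured RM HA. This time the RMs started, but both configured RM nodes stayed in standby state! Along the way I edited the configuration files countless more times, to no effect; the error message stayed the same.

2. Manually activated the RM on one host. It failed with the same error.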

[root@ep-bd01 zookeeper]# yarn rmadmin -transitionToActive --forceactive --forcemanual rm1
You have specified the --forcemanual flag. This flag is dangerous, as it can induce a split-brain scenario that WILL CORRUPT your HDFS namespace, possibly irrecoverably. It is recommended not to use this flag, but instead to shut down the cluster and disable automatic failover if you prefer to manually manage your HA state. You may abort safely by answering 'n' or hitting ^C now. Are you sure you want to continue? (Y or N) y
......
......
// :: WARN ha.ActiveStandbyElector: Exception handling the winning of election
org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:)
at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:)
at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:)
Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when transitioning to Active mode
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:)
... more
Caused by: org.apache.hadoop.service.ServiceStateException: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, no resources requested
at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$.run(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$.run(ResourceManager.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:)
... more
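
With HA enabled, the state each RM reports can be queried with `yarn rmadmin -getServiceState <rm-id>`. The sketch below summarizes such states to confirm the "both standby" situation described above. The name summarize_rm_states is a hypothetical helper, and the second RM id (rm2) in the usage note is an assumption, since only rm1 appears in this post.

```shell
# summarize_rm_states: hypothetical helper, not part of YARN.
# Takes one state string per RM (as printed by `yarn rmadmin -getServiceState`)
# and reports how many of them are active.
summarize_rm_states() {
  active=0
  for state in "$@"; do
    if [ "$state" = "active" ]; then
      active=$((active + 1))
    fi
  done
  case "$active" in
    0) echo "no active RM" ;;
    1) echo "ok" ;;
    *) echo "multiple active RMs" ;;
  esac
}
```

Usage, assuming the RM ids are rm1 and rm2: `summarize_rm_states "$(yarn rmadmin -getServiceState rm1)" "$(yarn rmadmin -getServiceState rm2)"`.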

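At Last! After several days of searching online and thinking it over, I concluded that this error may be a message new in HDP 3.0. It resembles an issue I found online, where the symptom was likewise that the RM died right after starting successfully. That report suggested the failure might be caused by the RM recovering application state, so I hurried to try it out.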

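In short: use the ZooKeeper command-line client to delete every child node under /rmstore/ZKRMStateRoot/RMAppRoot. Note that this discards the saved state of those applications, so the RM will not recover them.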

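Then restart the RM. To my surprise, the problem that had plagued me for days was solved just like that. See the output below for the details (allow me a moment to enjoy this first).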

[root@ep-bd03 pg_log]# sudo -u zookeeper /usr/hdp/3.0.0.0-1634/zookeeper/bin/zkCli.sh

Connecting to localhost:
-- ::, - INFO [main:Environment@] - Client environment:zookeeper.version=3.4.---, built on // : GMT
-- ::, - INFO [main:Environment@] - Client environment:host.name=ep-bd03
-- ::, - INFO [main:Environment@] - Client environment:java.version=1.8.0_181
-- ::, - INFO [main:Environment@] - Client environment:java.vendor=Oracle Corporation
-- ::, - INFO [main:Environment@] - Client environment:java.home=/usr/java/jdk1..0_181-amd64/jre
-- ::, - INFO [main:Environment@] - Client environment:java.class.path=/usr/hdp/3.0.0.0-/zookeeper/bin/../build/classes:/usr/hdp/3.0.0.0-/zookeeper/bin/../build/lib/*.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/xercesMinimal-1.9.6.2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-provider-api-2.4.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-shared4-2.4.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-shared-1.0-beta-6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-lightweight-1.0-beta-6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-http-2.4.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/wagon-file-1.0-beta-6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/plexus-utils-3.0.8.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/plexus-interpolation-1.11.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/plexus-container-default-1.0-alpha-9-stable-1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/netty-3.10.5.Final.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/nekohtml-1.9.6.2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-settings-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-repository-metadata-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-project-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-profile-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-plugin-registry-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-model-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-error-diagnostics-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-artifact-manager-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-artifact-2.2.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/maven-ant-tasks-2.1.3.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/log4j-1.2.16.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/jsoup-1.7.1.jar:/usr/hdp/3
.0.0.0-1634/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/commons-logging-1.1.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/commons-io-2.2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/commons-codec-1.6.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/classworlds-1.1-alpha-2.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/backport-util-concurrent-3.1.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/ant-launcher-1.8.0.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../lib/ant-1.8.0.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../zookeeper-3.4.6.3.0.0.0-1634.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../src/java/lib/*.jar:/usr/hdp/3.0.0.0-1634/zookeeper/bin/../conf::/usr/share/zookeeper/*
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:os.name=Linux
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:os.arch=amd64
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:os.version=3.10.0-862.6.3.el7.x86_64
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:user.name=zookeeper
2018-08-29 15:04:02,399 - INFO [main:Environment@100] - Client environment:user.home=/var/lib/zookeeper
2018-08-29 15:04:02,400 - INFO [main:Environment@100] - Client environment:user.dir=/tmp/hsperfdata_zookeeper
2018-08-29 15:04:02,401 - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@6438a396
Welcome to ZooKeeper!
2018-08-29 15:04:02,417 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1019] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2018-08-29 15:04:02,461 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@864] - Socket connection established, initiating session, client: /127.0.0.1:7637, server: localhost/127.0.0.1:2181
[zk: localhost:2181(CONNECTING) 0] 2018-08-29 15:04:02,484 - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1279] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x3658450e5f202da, negotiated timeout = 30000
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /rmstore
[ZKRMStateRoot]
[zk: localhost:2181(CONNECTED) 1] ls /rmstore/ZKRMStateRoot
[ReservationSystemRoot, RMAppRoot, AMRMTokenSecretManagerRoot, EpochNode, RMDTSecretManagerRoot, RMVersionNode]

[zk: localhost:2181(CONNECTED) 6] ls /rmstore/ZKRMStateRoot/RMAppRoot
[application_1534904073745_0001, HIERARCHIES, application_1534904073745_0003, application_1534904073745_0002]

[zk: localhost:2181(CONNECTED) 3] rmr /rmstore/ZKRMStateRoot/RMAppRoot/application_1534904073745_0001
[zk: localhost:2181(CONNECTED) 4] rmr /rmstore/ZKRMStateRoot/RMAppRoot/HIERARCHIES
[zk: localhost:2181(CONNECTED) 5] rmr /rmstore/ZKRMStateRoot/RMAppRoot/application_1534904073745_0003
[zk: localhost:2181(CONNECTED) 5] rmr /rmstore/ZKRMStateRoot/RMAppRoot/application_1534904073745_0002

[zk: localhost:2181(CONNECTED) 7] ls /rmstore/ZKRMStateRoot/RMAppRoot
[]
[zk: localhost:2181(CONNECTED) 8] 
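
With many applications stored, typing one rmr per child gets tedious. The following is a minimal sketch that turns the bracketed output of zkCli's ls into one rmr line per child, so the commands can be reviewed before they are run. The name gen_rmr_commands is a hypothetical helper, not part of ZooKeeper. It assumes the ls output has the `[a, b, c]` form shown in the session above. Newer ZooKeeper releases deprecate rmr in favor of deleteall, so the command name may need adjusting there.

```shell
# gen_rmr_commands: hypothetical helper, not part of ZooKeeper.
# $1 = parent znode path, $2 = output of zkCli `ls` on it, e.g. "[a, b, c]".
# Prints one "rmr <parent>/<child>" line per child; prints nothing for "[]".
gen_rmr_commands() {
  parent="$1"
  listing="$2"
  # Strip the surrounding brackets, put one child per line, trim spaces.
  printf '%s\n' "$listing" \
    | sed -e 's/^\[//' -e 's/]$//' \
    | tr ',' '\n' \
    | sed -e 's/^ *//' -e 's/ *$//' \
    | while IFS= read -r child; do
        if [ -n "$child" ]; then
          printf 'rmr %s/%s\n' "$parent" "$child"
        fi
      done
}
```

Usage: `gen_rmr_commands /rmstore/ZKRMStateRoot/RMAppRoot "[application_1534904073745_0001, HIERARCHIES]"` prints two rmr lines that can be pasted into the zkCli prompt after checking them.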
