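CRS fails to start after a network cable swap: clssnmSendingThread: sending status msg to all nodes
Before my colleague swapped the network cables, I shut node 2 down cleanly. After the swap he told me that node 2 would not come back up no matter what. I went through the logs (excerpt below) and a number of forum posts without solving it. I tried rebooting, unplugging and re-plugging the cable, checking that the storage was healthy, and remounting the storage... One post suggested that the OCR information might have changed, but there was no earlier backup, and I did not dig further in that direction.
In the end I still could not fix it, mainly because of my limited skills: without pinning down the actual problem I did not dare to change things at random...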
20xx-12-16 19:01:05.792: [ CSSD][3786819328]clssnmSendingThread: sending join msg to all nodes
20xx-12-16 19:01:05.792: [ CSSD][3786819328]clssnmSendingThread: sent 5 join msgs to all nodes
20xx-12-16 19:01:06.295: [GIPCHALO][3811858176] gipchaLowerProcessNode: no valid interfaces found to node for 7286464 ms, node 0x7fecd0028450 { host 'myrac1', haName 'CSS_myrac-cluster', srcLuid fac66ea4-f1a960af, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [249 : 249], createTime 7037424, sentRegister 1, localMonitor 1, flags 0x4 }
20xx-12-16 19:01:06.303: [ CSSD][3789973248]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
20xx-12-16 19:01:06.420: [ CSSD][3799754496]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618800, LATS 7286584, lastSeqNo 211618797, uniqueness 1576485880, timestamp 1576494065/8540734
20xx-12-16 19:01:06.435: [ CSSD][3804591872]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618802, LATS 7286594, lastSeqNo 211618799, uniqueness 1576485880, timestamp 1576494066/8541524
20xx-12-16 19:01:07.304: [ CSSD][3789973248]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
20xx-12-16 19:01:07.421: [ CSSD][3799754496]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618803, LATS 7287584, lastSeqNo 211618800, uniqueness 1576485880, timestamp 1576494066/8541734
20xx-12-16 19:01:07.435: [ CSSD][3804591872]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618805, LATS 7287604, lastSeqNo 211618802, uniqueness 1576485880, timestamp 1576494067/8542524
20xx-12-16 19:01:08.304: [ CSSD][3789973248]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
20xx-12-16 19:01:08.422: [ CSSD][3799754496]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618806, LATS 7288584, lastSeqNo 211618803, uniqueness 1576485880, timestamp 1576494067/8542734
20xx-12-16 19:01:08.436: [ CSSD][3804591872]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618808, LATS 7288604, lastSeqNo 211618805, uniqueness 1576485880, timestamp 1576494068/8543524
20xx-12-16 19:01:09.304: [ CSSD][3789973248]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
20xx-12-16 19:01:09.422: [ CSSD][3799754496]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618809, LATS 7289584, lastSeqNo 211618806, uniqueness 1576485880, timestamp 1576494068/8543744
20xx-12-16 19:01:09.437: [ CSSD][3804591872]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618811, LATS 7289604, lastSeqNo 211618808, uniqueness 1576485880, timestamp 1576494069/8544524
20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmRcfgMgrThread: Local Join
20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmLocalJoinEvent: begin on node(2), waittime 193000
20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmLocalJoinEvent: set curtime (7289964) for my node
20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmLocalJoinEvent: scanning 32 nodes
20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmLocalJoinEvent: Node myrac1, number 1, is in an existing cluster with disk state 3
20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmLocalJoinEvent: takeover aborted due to cluster member node found on disk
20xx-12-16 19:01:10.305: [ CSSD][3789973248]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
20xx-12-16 19:01:10.423: [ CSSD][3799754496]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618812, LATS 7290584, lastSeqNo 211618809, uniqueness 1576485880, timestamp 1576494069/8544744
20xx-12-16 19:01:10.437: [ CSSD][3804591872]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618814, LATS 7290604, lastSeqNo 211618811, uniqueness 1576485880, timestamp 1576494070/8545524
20xx-12-16 19:01:10.794: [ CSSD][3786819328]clssnmSendingThread: sending join msg to all nodes
20xx-12-16 19:01:10.794: [ CSSD][3786819328]clssnmSendingThread: sent 5 join msgs to all nodes

20xx-12-16 20:36:02.919: [ CSSD][2756265728]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), commissioner(-1/0)
20xx-12-16 20:36:02.919: [ CSSD][2756265728]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(118), status(0), sendresp(1)
20xx-12-16 20:36:02.920: [ CSSD][2756265728]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(118) msgseq(119), lastupdt<0x7fbb58031e10>, ignoreseq(0)
20xx-12-16 20:36:02.920: [ CSSD][2756265728]clssgmGrockOpTagProcess: Request to commission member(1) using key(1) for grock(CLSN.ONSNETPROC.MASTER)
20xx-12-16 20:36:02.920: [ CSSD][2756265728]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), commissioner(1/1)
20xx-12-16 20:36:02.920: [ CSSD][2756265728]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(119), status(0), sendresp(1)
20xx-12-16 20:36:02.921: [ CSSD][2756265728]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(119) msgseq(120), lastupdt<0x7fbb5804d490>, ignoreseq(0)
20xx-12-16 20:36:02.921: [ CSSD][2756265728]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), private data(2052), incarn(40)
20xx-12-16 20:36:02.921: [ CSSD][2756265728]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(120), status(0), sendresp(1)
20xx-12-16 20:36:02.922: [ CSSD][2756265728]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(120) msgseq(121), lastupdt<0x7fbb5803dee0>, ignoreseq(0)
20xx-12-16 20:36:02.922: [ CSSD][2756265728]clssgmGrockOpTagProcess: Request to commission member(-1) using key(1) for grock(CLSN.ONSNETPROC.MASTER)
20xx-12-16 20:36:02.922: [ CSSD][2756265728]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), commissioner(-1/0)
20xx-12-16 20:36:02.922: [ CSSD][2756265728]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(121), status(0), sendresp(1)
20xx-12-16 20:36:05.064: [ CSSD][2753111808]clssnmSendingThread: sending status msg to all nodes
20xx-12-16 20:36:05.064: [ CSSD][2753111808]clssnmSendingThread: sent 5 status msgs to all nodes
20xx-12-16 20:36:09.065: [ CSSD][2753111808]clssnmSendingThread: sending status msg to all nodes
20xx-12-16 20:36:09.065: [ CSSD][2753111808]clssnmSendingThread: sent 4 status msgs to all nodes
20xx-12-16 20:36:14.066: [ CSSD][2753111808]clssnmSendingThread: sending status msg to all nodes
...

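Can you tell from the logs that the bond configuration had changed? I did not notice it at the time and could not work it out. In the end my colleague said he had changed the bond! Wasn't the plan just to swap one cable and tidy up the cabling? I suggested changing it back, and sure enough, after a restart everything was back to normal.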
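
In hindsight, the first log excerpt does hint at the private interconnect rather than the storage: node 2 keeps seeing node 1's disk heartbeat but no network heartbeat, GIPC reports no valid interfaces to myrac1, and the local join is aborted because a cluster member is found on disk. The following is a minimal Python sketch, not an Oracle tool, that counts these three message patterns in a CSSD log. The function name `summarize_cssd_log` and the labels are made up for illustration; the matched substrings are copied from the excerpt above.

```python
import re
from collections import Counter

# Substrings copied from the log excerpt above, mapped to hypothetical labels.
SYMPTOMS = {
    "has a disk HB, but no network HB": "disk_hb_without_network_hb",
    "no valid interfaces found to node": "no_valid_interconnect_interface",
    "takeover aborted due to cluster member node found on disk":
        "join_blocked_by_existing_member",
}

# Extracts the peer node number and name from the disk-heartbeat message.
NODE_RE = re.compile(r"node (\d+), (\w+), has a disk HB")


def summarize_cssd_log(lines):
    """Count interconnect-related symptoms and collect the peer nodes involved."""
    counts = Counter()
    peers = set()
    for line in lines:
        for needle, label in SYMPTOMS.items():
            if needle in line:
                counts[label] += 1
        match = NODE_RE.search(line)
        if match:
            peers.add((int(match.group(1)), match.group(2)))
    return counts, peers
```

Passing the lines of the log file to `summarize_cssd_log` returns the per-symptom counts and the set of peers. A high count of disk heartbeats without network heartbeats only suggests checking the private network configuration (cables, bonding, interface names) first. It does not prove that the bond was changed.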

I tried a blind restart, and it did not come up...
[root@myrac2 bin]# ./crsctl query crs activeversion
Oracle Cluster Registry initialization failed accessing Oracle Cluster Registry device: PROC-26: Error while accessing the physical storage
ORA-15077: could not locate ASM instance serving a required diskgroup

[root@myrac2 bin]# ./ocrcheck
PROT-602: Failed to retrieve data from the cluster registry
PROC-26: Error while accessing the physical storage
ORA-15077: could not locate ASM instance serving a required diskgroup

[grid@myrac2 ~]$ cd /u01/app/11.2.0/grid/bin/
[grid@myrac2 bin]$ srvctl start nodeapps -n myrac2
PRCR-1070 : Failed to check if resource ora.gsd is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.net1.network is registered
Cannot communicate with crsd
PRCR-1035 : Failed to look up CRS resource myrac2 for ora.cluster_vip.type
PRCR-1068 : Failed to query resources
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.ons is registered
Cannot communicate with crsd

[grid@myrac2 bin]$ srvctl start asm -n myrac2
PRCR-1070 : Failed to check if resource ora.asm is registered
Cannot communicate with crsd

[grid@myrac2 bin]$ srvctl start database -d testdb2
PRCD-1027 : Failed to retrieve database testdb2
PRCR-1115 : Failed to find entities of type resource that match filters ((NAME == ora.testdb2.db) && (TYPE == ora.database.type)) and contain attributes VERSION,ORACLE_HOME,DATABASE_TYPE
Cannot communicate with crsd
[grid@myrac2 bin]$

The modified bond configuration on node 2, clearly different from node 1:
[root@myrac2 11.2.0]# service network status
Configured devices:
lo bond0 bond1 em1 em2 em3 em4
Currently active devices:
lo em1 em2 em3 em4 bond0 bond1
[root@myrac2 11.2.0]#

Node 1:
[root@myrac1 ~]# service network status
Configured devices:
lo bond0 em1 em2 em3 em4 idrac
Currently active devices:
lo em1 em2 em3 bond0
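
The difference between the two listings above is easy to miss by eye. As a rough sketch, the following Python snippet parses the `service network status` output of both nodes and reports devices that appear on only one of them. The helper names `parse_network_status` and `diff_nodes` are hypothetical, and the input strings are the outputs shown above.

```python
def parse_network_status(text):
    """Parse `service network status` output into configured/active device sets."""
    result = {"configured": set(), "active": set()}
    key = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Configured devices"):
            key = "configured"
        elif line.startswith("Currently active devices"):
            key = "active"
        elif line and key and not line.startswith("["):
            # The line after a header holds the space-separated device names.
            result[key].update(line.split())
            key = None
    return result


def diff_nodes(first, second):
    """Return the devices present on only one node, per category."""
    return {
        cat: {
            "only_first": sorted(first[cat] - second[cat]),
            "only_second": sorted(second[cat] - first[cat]),
        }
        for cat in ("configured", "active")
    }


NODE2 = """Configured devices:
lo bond0 bond1 em1 em2 em3 em4
Currently active devices:
lo em1 em2 em3 em4 bond0 bond1"""

NODE1 = """Configured devices:
lo bond0 em1 em2 em3 em4 idrac
Currently active devices:
lo em1 em2 em3 bond0"""

print(diff_nodes(parse_network_status(NODE2), parse_network_status(NODE1)))
```

With these inputs, `bond1` shows up only on node 2, which matches what my colleague later admitted to changing. Running such a comparison before and after any cabling work would have pointed at the bond much sooner.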

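Setting technical skill aside, this incident alone shows that cooperation between colleagues is sometimes more important. One careless moment and you dig a hole for someone else, or fall into one that someone dug for you.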
