Pre-11gR2: "crsctl check crs" command hangs at EVMD check (Doc ID 1578875.1)



APPLIES TO:



Oracle Database - Enterprise Edition - Version 10.2.0.3 to 11.1.0.7 [Release 10.2 to 11.1]

Information in this document applies to any platform.

SYMPTOMS



In a 2-node RAC environment with 11.1.0.7 CRS, the command "crsctl check crs" hangs at the EVMD check, on Node 1 only. The CSS and CRS lines are printed, then the command stops:



[oracle@srv03401 bin]$ ./crsctl check crs

Cluster Synchronization Services appears healthy

Cluster Ready Services appears healthy



On Node 1, the command "crsctl check crs" was traced with strace:



# strace -f -t -o /tmp/crschk.trc crsctl check crs



The generated trace file /tmp/crschk.trc contains the following:



28268 11:47:03 execve("./crsctl", ["./crsctl", "check", "crs"], [/* 23 vars */]) = 0

28268 11:47:03 brk(0)                   = 0x193d2000

28268 11:47:03 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b35b9436000

28268 11:47:03 uname({sys="Linux", node="srv03401.metra.com", ...}) = 0

28268 11:47:03 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)

28268 11:47:03 open("/etc/ld.so.cache", O_RDONLY) = 3

28268 11:47:03 fstat(3, {st_mode=S_IFREG|0644, st_size=92563, ...}) = 0

28268 11:47:03 mmap(NULL, 92563, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2b35b9437000

28268 11:47:03 close(3)                 = 0

28268 11:47:03 open("/lib64/libtermcap.so.2", O_RDONLY) = 3

28268 11:47:03 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`\20\300z2\0\0\0"..., 832) = 832

28268 11:47:03 fstat(3, {st_mode=S_IFREG|0755, st_size=15840, ...}) = 0

28268 11:47:03 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b35b944e000

28268 11:47:03 mmap(0x327ac00000, 2108944, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x327ac00000

28268 11:47:03 mprotect(0x327ac03000, 2093056, PROT_NONE) = 0

28268 11:47:03 mmap(0x327ae02000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x327ae02000

28268 11:47:03 close(3)                 = 0

28268 11:47:03 open("/lib64/libdl.so.2", O_RDONLY) = 3

..

..

28268 11:47:03 close(3)                 = 0

28268 11:47:03 write(1, "Cluster Ready Services appears h"..., 39) = 39

28268 11:47:03 socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 3

28268 11:47:03 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0

28268 11:47:03 bind(3, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0

28268 11:47:03 getsockname(3, {sa_family=AF_INET6, sin6_port=htons(42027), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [140733193388060]) = 0

28268 11:47:03 getpeername(3, 0x7fff5f19e1e0, [140733193388060]) = -1 ENOTCONN (Transport endpoint is not connected)

28268 11:47:03 getsockopt(3, SOL_SOCKET, SO_SNDBUF, [5536382933839118336], [4]) = 0

28268 11:47:03 getsockopt(3, SOL_SOCKET, SO_RCVBUF, [5536382933843050496], [4]) = 0

28268 11:47:03 fcntl(3, F_SETFD, FD_CLOEXEC) = 0

28268 11:47:03 fcntl(3, F_SETFL, O_RDONLY|O_NONBLOCK) = 0

28268 11:47:03 geteuid()                = 700

28268 11:47:03 times({tms_utime=1, tms_stime=2, tms_cutime=0, tms_cstime=0}) = 7422615891

28268 11:47:03 socket(PF_FILE, SOCK_STREAM, 0) = 4

28268 11:47:03 access("/var/tmp/.oracle/sSYSTEM.evm.acceptor.auth", F_OK) = 0

28268 11:47:03 connect(4, {sa_family=AF_FILE, path="/var/tmp/.oracle/sSYSTEM.evm.acceptor.auth"...}, 110
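In a longer trace, the blocking call can be located by checking whether the final line has a return value. The following is a minimal sketch in Python 3, not part of strace or of any Oracle tooling: the helper name `last_unfinished_call` is hypothetical, and it assumes the `strace -f -t` line layout shown above, where a completed call ends in `= <result>`.

```python
import re

# A completed strace line ends with ") = <result>", optionally followed by
# an errno name and description, e.g. ") = -1 ENOENT (No such file ...)".
_COMPLETED = re.compile(r"\)\s+=\s+(-?\d+|0x[0-9a-fA-F]+|\?)(\s+.*)?$")


def last_unfinished_call(trace_lines):
    """Return the last non-empty trace line if it has no return value.

    Returns None when the final call completed, i.e. the trace does not
    end in a blocked system call. (Hypothetical helper for illustration.)
    """
    lines = [ln.rstrip() for ln in trace_lines if ln.strip()]
    if not lines:
        return None
    last = lines[-1]
    return None if _COMPLETED.search(last) else last


# Example use against the trace file generated above:
#   with open("/tmp/crschk.trc") as trace:
#       print(last_unfinished_call(trace))
```

For the trace above, this would return the `connect(4, ...` line, because it is the only line without a result.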

  



CAUSE



The strace output ends in a connect() call on a local (AF_FILE, i.e. Unix domain) socket, and that call never returns:



========

28268 11:47:03 socket(PF_FILE, SOCK_STREAM, 0) = 4

28268 11:47:03 access("/var/tmp/.oracle/sSYSTEM.evm.acceptor.auth", F_OK) = 0

28268 11:47:03 connect(4, {sa_family=AF_FILE, path="/var/tmp/.oracle/sSYSTEM.evm.acceptor.auth"...}, 110   <<<<<<<

========

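This indicates a problem with the EVMD socket file /var/tmp/.oracle/sSYSTEM.evm.acceptor.auth, which is a local Unix domain socket rather than a network socket: the file exists (access() returns 0), but connect() on it never completes.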
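The state of the socket file can be checked without hanging the terminal by connecting to it with a timeout. The sketch below assumes Python 3 is available on the node. The function name `probe_unix_socket` and the status strings are hypothetical, and the interpretation of each status is an assumption about general Unix domain socket behaviour, not something Oracle documents for EVMD.

```python
import socket


def probe_unix_socket(path, timeout=5.0):
    """Try to connect to a Unix domain stream socket without hanging.

    Returns one of: "connected", "missing", "refused", "not-accepting",
    or "error: <message>". (Hypothetical helper for illustration.)
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)  # bound the wait instead of blocking forever
    try:
        sock.connect(path)
        return "connected"
    except FileNotFoundError:
        return "missing"          # socket file does not exist
    except ConnectionRefusedError:
        return "refused"          # file exists, but nothing is listening
    except (socket.timeout, BlockingIOError):
        return "not-accepting"    # connection could not be established in time
    except OSError as exc:
        return "error: %s" % exc
    finally:
        sock.close()


# Example use on the affected node:
#   print(probe_unix_socket("/var/tmp/.oracle/sSYSTEM.evm.acceptor.auth"))
```

A result of "not-accepting" would be consistent with the hang seen in the trace. "connected" would suggest the socket is being served normally.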



SOLUTION



Find the PID of the evmd.bin process and kill it:



$ ps -ef | grep 'd.bin'



oracle   21046 21045  0  2012 ?        00:07:46 /u01/app/ract/crs/bin/evmd.bin

root     21054 15845  0  2012 ?        11:34:47 /u01/app/ract/crs/bin/crsd.bin reboot

oracle   22072 21453  0  2012 ?        05:44:50 /u01/app/ract/crs/bin/ocssd.bin

root     22135     1  0  2012 ?        00:00:00 /u01/app/ract/crs/bin/oclskd.bin

oracle   22410     1  0  2012 ?        00:00:00 /u01/app/ract/crs/bin/oclskd.bin

oracle   29834 27854  0 13:22 pts/8    00:00:00 egrep d.bin



$ kill -9 21046
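The grep pattern used above matches every clusterware daemon as well as the grep process itself, so the PID has to be picked out by eye. When this step is scripted, the selection can be made exact. The following is a sketch only, assuming Python 3 and the `ps -ef` column layout shown above. The helper name `find_pids` is hypothetical.

```python
import os


def find_pids(ps_output, name="evmd.bin"):
    """Return PIDs from `ps -ef` output whose executable basename is `name`.

    (Hypothetical helper for illustration.)
    """
    pids = []
    for line in ps_output.splitlines():
        # ps -ef columns: UID PID PPID C STIME TTY TIME CMD
        fields = line.split(None, 7)
        if len(fields) < 8 or not fields[1].isdigit():
            continue  # header, blank, or wrapped line
        cmd_tokens = fields[7].split()
        if os.path.basename(cmd_tokens[0]) == name:
            pids.append(int(fields[1]))
    return pids


# Example use:
#   import subprocess
#   print(find_pids(subprocess.check_output(["ps", "-ef"]).decode()))
```

Matching on the basename of the executable avoids picking up crsd.bin, ocssd.bin, or the grep command line.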



After the evmd.bin process is killed, the command "crsctl check crs" returns its complete output without hanging:



[oracle@srv03401 bin]$ ./crsctl check crs



CSS appears healthy

CRS appears healthy

EVM appears healthy
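To confirm the fix, or to detect the hang on other nodes without blocking a terminal, the check can be run under a time limit. The sketch below assumes Python 3 and that `crsctl` is on the PATH (the sessions above run it as `./crsctl` from the CRS bin directory). The helper name `run_with_timeout` and the 30-second default are arbitrary choices for illustration.

```python
import subprocess


def run_with_timeout(cmd, timeout=30.0):
    """Run `cmd`; return ("completed", output) or ("hung", partial output).

    (Hypothetical helper for illustration.)
    """
    try:
        done = subprocess.run(cmd, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT,
                              universal_newlines=True, timeout=timeout)
        return "completed", done.stdout
    except subprocess.TimeoutExpired as exc:
        # run() kills the child before raising, so no process is left behind.
        partial = exc.stdout or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return "hung", partial


# Example use:
#   status, output = run_with_timeout(["crsctl", "check", "crs"])
#   print(status)
#   print(output)
```

A status of "hung" means the command did not finish within the time limit, which is the symptom described in this note.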
