ASM device error ORA-27041 ORA-15025 ORA-15081 (Doc ID 1487475.1)

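Summary:
The database alert log showed a large number of ORA-27041, ORA-15025, and ORA-15081 errors. I first checked the state of the ASM disk groups: the status and permissions of the corresponding disks were all normal, and the ASM alert log had no new entries.
At first glance this looked like a disk permission problem. On reflection, though, if the disk permissions were really wrong the database would have gone down, yet the business sessions were all normal. At this point I had no leads. I checked MOS, which in fact already had the answer, but I did not understand what the note meant.
I then followed the rule of communicating within 30 minutes and asked whether there had been any disk-related changes. The uid in the trace file pointed to the operating system user dsg, but the business side had no clear answer either. Looking at that user's processes, I found a program that was being run over and over. Its log showed that its run times matched the times of the database errors, and that it reported the same errors. The final conclusion: the dsg user lacked the group permissions required for the disk group's devices, so the program was denied access to the disks.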

Details:
1.Environment:11.2.0.4
2.Symptoms:
1) Errors
Tue Sep 28 11:08:13 2021
Errors in file /XXX/app/oracle/diag/rdbms/xxx/xxx1/trace/xxx1_ora_9178000.trc:
ORA-15025: could not open disk "/dev/rhdisk12"
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 13: Permission denied
Additional information: 3
Additional information: 4
Additional information: 138444804
Errors in file /XXX/app/oracle/diag/rdbms/xxx/xxx1/trace/xxx1_ora_9178000.trc:
ORA-15025: could not open disk "/dev/rhdisk12"
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 13: Permission denied
Additional information: 3
Additional information: 4
Additional information: 138444804
WARNING: failed to read mirror side 1 of virtual extent 0 logical extent 0 of file 3103 in group [1.374353710] from disk DATA_0002 allocation unit 893192 reason error; if possible, will try another mirror side
Errors in file /XXX/app/oracle/diag/rdbms/xxx/xxx1/trace/xxx1_ora_9178000.trc:
ORA-00202: control file: '+DATA/xxx/controlfile/current.257.1062864093.bak'
ORA-15081: failed to submit an I/O operation to a disk
Tue Sep 28 11:08:14 2021

2) Disk-related checks
SQL> select DISK_NUMBER,B.NAME GROUP_NAME,a.name diskname,a.free_MB/1024 FREE_GB,a.TOTAL_MB/1024 TOTAL_GB,a.free_mb/a.total_mb*100 free_percentage ,a.path,A.STATE FROM V$ASM_DISK A,V$ASM_DISKGROUP B WHERE A.GROUP_NUMBER=B.GROUP_NUMBER order by b.name,DISK_NUMBER ;

DISK_NUMBER GROUP_NAME DISKNAME FREE_GB TOTAL_GB FREE_PERCENTAGE PATH STATE
----------- --------------- --------------- --------- --------- --------------- -------------------- ------------------------
0 DATA DATA_0000 27.23 1024.00 2.66 /dev/rhdisk10 NORMAL
1 DATA DATA_0001 27.12 1024.00 2.65 /dev/rhdisk11 NORMAL
2 DATA DATA_0002 27.05 1024.00 2.64 /dev/rhdisk12 NORMAL
……
0 OCR OCR_0000 9.62 10.00 96.21 /dev/rhdisk2 NORMAL
73 rows selected.

SQL> select group_number, name,state,type,total_MB / 1024 total_GB,free_mb / 1024 FREE_GB,free_mb / total_MB * 100 free_per,(case when free_mb / total_mb * 100 < 15 then '*' else '' end) care from V$ASM_DISKGROUP;

GROUP_NUMBER NAME STATE TYPE TOTAL_GB FREE_GB FREE_PER C
------------ ------------------------------ ----------- ------ ---------- ---------- ---------- -
1 DATA CONNECTED EXTERN 73728 1915.58496 2.59817839 *
2 OCR MOUNTED EXTERN 10 9.62109375 96.2109375

-bash-4.2# ls -l /dev/hdisk1*
brw------- 1 root system 13, 3 Dec 17 2020 /dev/hdisk1
brw------- 1 root system 13, 16 Dec 23 2020 /dev/hdisk10
brw------- 1 root system 13, 14 Dec 23 2020 /dev/hdisk11
brw------- 1 root system 13, 19 Dec 23 2020 /dev/hdisk12

-bash-4.2# ls -l /dev/rhdisk1*
crw------- 1 root system 13, 3 Dec 17 2020 /dev/rhdisk1
crw-rw---- 1 grid asmadmin 13, 16 Sep 28 11:17 /dev/rhdisk10
crw-rw---- 1 grid asmadmin 13, 14 Sep 28 11:17 /dev/rhdisk11
crw-rw---- 1 grid asmadmin 13, 19 Sep 28 11:17 /dev/rhdisk12
crw-rw---- 1 grid asmadmin 13, 8 Sep 28 11:17 /dev/rhdisk13
-bash-4.2# errpt

-bash-4.2# lsattr -El hdisk12 | grep reserve_policy
reserve_policy no_reserve Reserve Policy True
-bash-4.2# lsattr -El hdisk13 | grep reserve_policy
reserve_policy no_reserve Reserve Policy True
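
The ownership check on the raw devices can also be scripted. The following is a minimal sketch, assuming `ls -l` output in the format shown above; the function name `devices_not_group_rw` is made up for illustration and is not part of any Oracle or AIX tool.

```shell
# Hypothetical helper: read `ls -l` output on stdin and print the path of
# every character device that is NOT group-owned by $1 with group read/write.
devices_not_group_rw() {
    awk -v grp="$1" '
        # $1 = mode string, $4 = group, $NF = device path
        $1 ~ /^c/ && !($4 == grp && substr($1, 5, 2) == "rw") { print $NF }
    '
}
```

It could be used as `ls -l /dev/rhdisk* | devices_not_group_rw asmadmin`. A device that appears in the result is not necessarily a problem (for example, a disk that does not belong to ASM); only the paths returned by V$ASM_DISK matter here.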

-bash-4.2# id oracle
uid=1101(oracle) gid=1000(oinstall) groups=1100(asmadmin),1200(dba),1300(asmdba)
-bash-4.2# id grid
uid=1100(grid) gid=1000(oinstall) groups=1100(asmadmin),1200(dba),1300(asmdba),1301(asmoper)
-bash-4.2#

3. Root cause: the trace file shows kfk_debug_get_user_groups: uid:207, euid:1101, gid:1000, egid:1000 (uid 207 is the dsg user, whose group is oinstall and which does not belong to any of the ASM groups)

Trace file /XXX/app/oracle/diag/rdbms/xxx/xxx1/trace/xxx1_ora_9178000.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
ORACLE_HOME = /XXX/app/oracle/product/11.2/db_2
System name: AIX
Node name: hxdg1
Release: 1
Version: 7
Machine: 00F6E6DC4C00
Instance name: xxx1
Redo thread mounted by this instance: 1
Oracle process number: 194
Unix process pid: 9178000, image: oracle@hxdg1 (TNS V1-V3)

*** 2021-09-28 11:08:13.726
*** SESSION ID:(988.19) 2021-09-28 11:08:13.726
*** CLIENT ID:() 2021-09-28 11:08:13.726
*** SERVICE NAME:(SYS$USERS) 2021-09-28 11:08:13.726
*** MODULE NAME:(oxad@hxdg1 (TNS V1-V3)) 2021-09-28 11:08:13.726
*** ACTION NAME:() 2021-09-28 11:08:13.726

WARNING: failed to open a disk[/dev/rhdisk12]
ORA-15025: could not open disk "/dev/rhdisk12"
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 13: Permission denied
Additional information: 3
Additional information: 4
Additional information: 138444804
kfk_debug_get_user_groups: uid:207, euid:1101, gid:1000, egid:1000
WARNING: failed to open a disk[/dev/rhdisk12]
ORA-15025: could not open disk "/dev/rhdisk12"
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 13: Permission denied
Additional information: 3
Additional information: 4
Additional information: 138444804
kfk_debug_get_user_groups: uid:207, euid:1101, gid:1000, egid:1000
WARNING: disk locally closed resulting in I/O error
WARNING: Read Failed. group:1 disk:2 AU:893192 offset:16384 size:16384
path:Unknown disk
incarnation:0x4560e025 synchronous result:'I/O error'
subsys:Unknown library iop:0x110db7ed0 bufp:0x110cfde00 osderr:0x0 osderr1:0x0
WARNING: failed to read mirror side 1 of virtual extent 0 logical extent 0 of file 3103 in group [1.374353710] from disk DATA_0002 allocation unit 893192 reason error; if possible, will try another mirror side
DDE rules only execution for: ORA 202
----- START Event Driven Actions Dump ----
---- END Event Driven Actions Dump ----
----- START DDE Actions Dump -----
Executing SYNC actions
----- START DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (Async) -----
DDE Action 'DB_STRUCTURE_INTEGRITY_CHECK' was flood controlled
----- END DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (FLOOD CONTROLLED, 1 csec) -----
Executing ASYNC actions
----- END DDE Actions Dump (total 0 csec) -----

*** 2021-09-28 11:08:13.731
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=1, mask=0x0)
----- Error Stack Dump -----
ORA-00202: control file: '+DATA/xxx/controlfile/current.257.1062864093.bak'
ORA-15081: failed to submit an I/O operation to a disk
----- Current SQL Statement for this session (sql_id=amdwhucub5mzk) -----
select count(:"SYS_B_0") from v$dataguard_stats
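
When several trace files are involved, the identities recorded by kfk_debug_get_user_groups can be pulled out mechanically. The following is a sketch, assuming the line format shown in the trace above; `extract_kfk_uids` is a hypothetical helper name.

```shell
# Hypothetical helper: read trace text on stdin and print each distinct
# "uid euid" pair found on kfk_debug_get_user_groups lines.
extract_kfk_uids() {
    sed -n 's/^kfk_debug_get_user_groups: uid:\([0-9][0-9]*\), euid:\([0-9][0-9]*\),.*/\1 \2/p' |
        sort -u
}
```

For example, `extract_kfk_uids < /XXX/app/oracle/diag/rdbms/xxx/xxx1/trace/xxx1_ora_9178000.trc`. A uid that differs from the euid (here 207 versus 1101) suggests the server process was started by an OS user other than oracle.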

cat /etc/passwd | grep uid
ps -ef |grep dsg
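
In the first command above, `uid` stands for the numeric uid taken from the trace (207 here). A plain grep for a number can also match gid or home-directory fields, so a slightly stricter lookup compares only the third field of each passwd entry. This is a sketch; `user_for_uid` is a hypothetical name.

```shell
# Hypothetical helper: read passwd-format lines on stdin and print the
# login name (field 1) of every entry whose numeric uid (field 3) equals $1.
user_for_uid() {
    awk -F: -v uid="$1" '$3 == uid { print $1 }'
}
```

Usage: `user_for_uid 207 < /etc/passwd`.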

-bash-4.2# ps -ef | grep dsg
dsg 8455834 2033984 0 09:58:37 - 0:01 sshd: dsg@pts/15
dsg 11405170 11733628 0 12:33:23 - 0:00 /XXX/dsg/aiod/bin/aiod -n 127.0.0.1,45007 -flog /XXX/dsg/aiod/log/log.aiod
root 11536588 13501954 0 13:04:45 pts/28 0:00 grep dsg
root 2033984 3473680 0 09:58:36 - 0:00 sshd: dsg [priv]
dsg 6752538 8455834 0 09:58:37 pts/15 0:00 -ksh
dsg 11733628 1 0 12:33:23 - 0:00 /XXX/dsg/aiod/bin/aiod -n 127.0.0.1,45007 -flog /XXX/dsg/aiod/log/log.aiod

more /XXX/dsg/aiod/log/log.aiod
encrypt_pwd=n # input oracle password is encrypted?
oracle_pdb= # Oracle 12c Instance PDB name
sysdba=n # connect Oracle by SYSDBA?(Y|N)
sysasm=n # connect Oracle by SYSASM?(Y|N)
timeout=6 # recv oxad info timeout minutes. 0 - unused timeout, (0 ~ 254)
standby=A # A - auto check, Y - standby DB, N - not standby DB
[I] 2021-09-28:11:55:49 AOXD XP#5834878 loop startup ...
[I] 2021-09-28:11:55:49 asm#21366664 startup(edn:1,rawofs:0) blen(24:4:18)MB 16777216, 149ms
[I] 2021-09-28:11:55:49 ASM APIs startup success! used ASM APIs, 0.17s
[I] 2021-09-28:11:55:49 ASM#21366664 (0,1,255,0) 0, 0.00M, s(0.003s, )
[E] 2021-09-28:11:55:49 AOXD loop err:OXA-1000 OCI for Oracle error -1 occurred at api/sql/execute.c:112, sid xxx1, tns .
ORA-00204: error in reading (block 1, # blocks 1) of control file
ORA-00202: control file: '+DATA/xxx/controlfile/current.257.1062864093.bak'
ORA-15081: failed to submit an I/O operation to a disk
Error - OCI_ERROR select count(1) from v$dataguard_stats
[I] 2021-09-28:11:55:49 AOXD service shutdown.
[W] 2021-09-28:11:55:49 Task 0, pid 21366664 exit, (normal exit 0) sleep 30s, and restart.

Resolution: after the dsg user's group membership was corrected and confirmed, the errors no longer occurred.
-bash-4.2# id dsg
uid=207(dsg) gid=1000(oinstall) groups=1100(asmadmin),1200(dba),1300(asmdba),1301(asmoper)
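
The trace showed euid 1101 (oracle) but uid 207 and gid 1000, so the server process appears to have carried the dsg user's group set rather than oracle's, which would explain why access to the asmadmin-owned devices was refused. The membership check can be scripted as below. This is a sketch that assumes `id` output in the format shown above; `id_has_group` is a hypothetical name.

```shell
# Hypothetical helper: read `id` output on stdin and exit 0 if group name $1
# appears as the primary group or in the groups= list, else exit 1.
id_has_group() {
    awk -v grp="$1" '
        {
            for (i = 2; i <= NF; i++) {            # field 1 is uid=..., skip it
                n = split($i, parts, /[=,]/)       # "groups=1100(a),1200(b)" -> pieces
                for (j = 2; j <= n; j++) {
                    name = parts[j]
                    sub(/^[0-9]+\(/, "", name)     # drop leading "1100("
                    sub(/\)$/, "", name)           # drop trailing ")"
                    if (name == grp) found = 1
                }
            }
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Usage: `id dsg | id_has_group asmadmin && echo "dsg is in asmadmin"`. A process that was already running before the group change may need to be restarted before it picks up the new membership.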
