2013-06-18

ORA-29701 on HP-UX and Solaris





Description

In an 11.2 environment on Solaris or HP-UX, I/O to an ASM file occasionally fails with ORA-29701 and/or ORA-15032.

Occurrence

This issue occurs in Solaris and HP-UX environments only.
It affects only environments running Grid Infrastructure 11.2.0.1 - 11.2.0.3, and only when ASM is used for datafiles.
It has been seen only when there is a high number of concurrent connections and a high load on the system.
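
The conditions above can be written down as a quick check. The following is a minimal sketch, not an Oracle tool: the function names and the way the platform and version are passed in are assumptions made for illustration, and the load condition (many concurrent connections) is not modelled.

```python
# Hypothetical helper: encodes the platform/version/ASM conditions listed above.
AFFECTED_PLATFORMS = {"solaris", "hp-ux"}
FIRST_AFFECTED = (11, 2, 0, 1)
LAST_AFFECTED = (11, 2, 0, 3)


def parse_version(text):
    """Turn a dotted version such as '11.2.0.3.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_affected(platform, gi_version, uses_asm):
    """Return True when all three documented conditions hold."""
    # Compare only the first four components, so '11.2.0.3.0' counts as 11.2.0.3.
    version = parse_version(gi_version)[:4]
    return (
        platform.lower() in AFFECTED_PLATFORMS
        and uses_asm
        and FIRST_AFFECTED <= version <= LAST_AFFECTED
    )


if __name__ == "__main__":
    print(is_affected("Solaris", "11.2.0.3.0", True))
    print(is_affected("HP-UX", "11.2.0.4.0", True))
```

A True result only means the environment matches the documented scope of this bug. It does not confirm that the bug is the cause of a given ORA-29701.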

Symptoms

In an 11gR2 Grid Infrastructure environment, error ORA-29701 is reported in ASM or in databases (including pre-11.2 databases):
  • ASM/database alert.log
ERROR: unrecoverable error ORA-29701 raised in ASM I/O path; terminating process nnnnn
OR
ERROR: unrecoverable error ORA-15032 raised in ASM I/O path; terminating process nnnnn
  • ASM/database trace file 
2012-07-12 14:59:23.784: [ CSSCLNT]clssscConnect: gipcWait failed with 16 (12)
2012-07-12 14:59:23.795: [ CSSCLNT]clsssInitNative: connect to (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_ssm0115_)) failed, rc 16
kgxgncin: CLSS init failed with status 3
kgxgncin: return status 3 (1311719766 SKGXN not av) from CLSS
NOTE: kfmsInit: ASM failed to initialize group services
Error ORA-29701 signaled at ksedsts()+960<-kgepop()+480<-ksesecl0()+68<-kfmsInit()+228<-kfmsSlvReg()+456<-kfmdSlvOpPriv()+7780<-kfmdWriteSubmitted()+780<-kfk_process_an_ioq()+236<-kfk_submit_io()+44<-kfk_io1()+916<-kfk_transitIO()+2496<-kffPreFormat2()+804<-_$c1A.kffPreFormat()+216<-kffFdAddImap()+3572<-kffFdAddMap()+784<-kffFileResize()+8992<-kffFileCreate()+7264<-kffCreate()+1924<-kfnsFileCreate()+1716<-kfnDispatch()+1580<-opiodr()+1164<-ttcpip()+916<-opitsk()+1640<-opiino()+924<-opiodr()+1164<-opidrv()+1032<-sou2o()+88<-opimai_real()+504<-ssthrdmain()+316<-main()+316<-_start()+380
ERROR: unrecoverable error ORA-29701 raised in ASM I/O path; terminating process 20757
----- Abridged Call Stack Trace -----
ksedsts()+1296<-kfk_io1()+2976<-kfk_transitIO()+2496<-kffPreFormat2()+804<-_$c1A.kffPreFormat()+216<-kffFdAddImap()+3572<-kffFdAddMap()+784<-kffFileResize()+8992<-kffFileCreate()+7264<-kffCreate()+1924<-kfnsFileCreate()+1716<-kfnDispatch()+1580<-opiodr()+1164
<-ttcpip()+916<-opitsk()+1640<-opiino()+924<-opiodr()+1164<-opidrv()+1032<-sou2o()+88<-opimai_real()+504<-ssthrdmain()+316<-main()+316<-_start()+380
----- End of Abridged Call Stack Trace -----
  • $GRID_HOME/log/<node>/cssd/ocssd.log
2012-07-12 14:59:29.862: [GIPCXCPT][5] gipcmodClscCallback: async request failed req 10f8e43d0 [000000000da1c332] { gipcSendRequest : addr '', data 10ef6a9d0, len 48, olen 0, parentEndp 10de74f10, ret gipcretConnectionLost (12), objFlags 0x0, reqFlags 0x224 }, ret gipcretConnectionLost (12)
2012-07-12 14:59:29.863: [GIPCXCPT][5] gipcmodMuxTransferAccept: internal accept request failed endp 10028f650, child 10de74f10, ret gipcretConnectionInvalid (13)
2012-07-12 14:59:29.863: [ GIPCMUX][5] gipcmodMuxTransferAccept: EXCEPTION[ ret gipcretConnectionInvalid (13) ]  error during accept on endp 10028f650
  • High number of concurrent connections
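
To look for the alert.log symptom in bulk, the two ERROR lines quoted above can be matched with a regular expression. This is a rough sketch: the function name is hypothetical, the pattern is built only from the message text shown in this post, and the log path is left to the reader.

```python
import re

# Pattern derived from the alert.log lines quoted in this post.
PATTERN = re.compile(
    r"ERROR: unrecoverable error (ORA-29701|ORA-15032) "
    r"raised in ASM I/O path; terminating process (\d+)"
)


def find_asm_io_errors(lines):
    """Return (line_number, error_code, process_id) for each matching line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = PATTERN.search(line)
        if match:
            hits.append((number, match.group(1), int(match.group(2))))
    return hits


if __name__ == "__main__":
    sample = [
        "NOTE: kfmsInit: ASM failed to initialize group services",
        "ERROR: unrecoverable error ORA-29701 raised in ASM I/O path; terminating process 20757",
    ]
    for hit in find_asm_io_errors(sample):
        print(hit)
```

For a real log, pass an open file object instead of the sample list, for example `find_asm_io_errors(open(path, errors="replace"))`, where `path` points at the ASM or database alert.log.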

Patches

The fix for bug 14332688 raises the CSS real-time priority above that of LMS and other database real-time processes, to reduce client timeouts during CSS registration. It will be included in 11.2.0.4 and 12.1.
   
The solution is to request and download patch 14332688.
NOTE: This bug and its fix affect only Solaris and HP-UX platforms running Grid Infrastructure 11.2.0.1 - 11.2.0.3.
For other possible causes of this error, please see Document 1496329.1.
