2011-05-04

gipchaLowerProcessNode: no valid interfaces found to node

11.2.0.2 Grid Infrastructure upgrade/install on >1 node cluster failing with "gipchaLowerProcessNode: no valid interfaces found to node" in crsd.log (Doc ID 1280234.1)

Symptoms

11.2.0.2 Grid Infrastructure upgrade or install on a cluster with more than one node
rootcrs.pl fails and the following is found in crsd.log:


...
2010-11-29 10:52:38.603: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2614824036 ms, node 111ea99b0 { host 'racdb1', haName '1e0b-174e-37bc-a515', srcLuid 2612fa8e-3db4fcb7, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [55 : 55], createTime 2614768983, flags 0x4 }
2010-11-29 10:52:42.299: [ CRSMAIN][515] Policy Engine is not initialized yet!
2010-11-29 10:52:43.554: [ OCRMAS][3342]proath_connect_master:1: could not yet connect to master retval1 = 203, retval2 = 203
2010-11-29 10:52:43.554: [ OCRMAS][3342]th_master:110': Could not yet connect to new master [1]
2010-11-29 10:52:43.605: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2614829038 ms, node 111ea99b0 { host 'racdb1', haName '1e0b-174e-37bc-a515', srcLuid 2612fa8e-3db4fcb7, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [60 : 60], createTime 2614768983, flags 0x4 }
2010-11-29 10:52:43.754: [ OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2010-11-29 10:52:43.955: [ OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
...
2010-11-29 11:13:49.817: [ OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
2010-11-29 11:13:50.018: [ OCRMAS][3342]proath_master:100b: Polling, connect to master not complete retval1 = 203, retval2 = 203
...


Changes

Upgrade or install of 11.2.0.2 Grid Infrastructure on a cluster with more than one node

Cause

Two causes have been found for this symptom: one is AIX-specific and the other is UNIX-generic.

1) AIX-specific cause

udp_sendspace is at its default of 9216 bytes, which is smaller than the 10240 bytes used by CRS.

# no -o udp_sendspace

shows the current setting.
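
To check the value on every cluster node in one pass, a minimal sketch run as root (node names racdb1 and racdb2 are examples; passwordless ssh between the nodes is assumed):

# for node in racdb1 racdb2; do echo "$node: $(ssh $node 'no -o udp_sendspace')"; done

Any node reporting a value below 10240 confirms this cause.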


2) UNIX-generic cause

Netmask mismatch between the nodes. The private interface must have the same netmask on all nodes; a mismatch between nodes can produce this symptom.



Solution

The two causes have two separate solutions.


1) Solution for AIX-specific cause

Increase udp_sendspace to >= 10240.

# no -o udp_sendspace=65536
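
To confirm the new value is in effect:

# no -o udp_sendspace

Note that without the -p flag the change does not survive a reboot; a persistent variant is sketched after the documentation reference below.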

Note that the 11gR2 documentation recommends setting udp_sendspace to 65536:
Network tuning parameter    Recommended value
ipqmaxlen                   512
rfc1323                     1
sb_max                      4194304
tcp_recvspace               65536
tcp_sendspace               65536
udp_recvspace               655360
udp_sendspace               65536

See the Oracle Grid Infrastructure Installation Guide 11g Release 2 (11.2) for IBM AIX on POWER Systems (64-Bit), section 2.11.7 "Configuring Network Tuning Parameters", for more details:
http://download.oracle.com/docs/cd/E11882_01/install.112/e17210/preaix.htm#CWAIX219
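
Assuming the standard AIX no flags (-p persists a runtime tunable across reboots, -r sets a load-time tunable for the next reboot), a sketch that applies all of the documented values as root:

# no -r -o ipqmaxlen=512      (load-time tunable, takes effect after the next reboot)
# no -p -o rfc1323=1
# no -p -o sb_max=4194304
# no -p -o tcp_recvspace=65536
# no -p -o tcp_sendspace=65536
# no -p -o udp_recvspace=655360
# no -p -o udp_sendspace=65536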

If the problem happens during rootupgrade.sh (usually on the 2nd node), do the following:

1) Increase udp_sendspace to 65536:

# no -o udp_sendspace=65536

2) Stop CRS on both nodes:

# crsctl stop crs -f
# ps -ef | grep d.bin     (to ensure there are no leftover CRS processes)

3) Restart CRS on node 1:

# crsctl start crs

Wait until CRS is fully started on node 1 (see the optional check after these steps).

4) On node 2, rerun rootupgrade.sh:

# rootupgrade.sh

It should complete on node 2 this time.
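
As an optional check before step 4, the following standard clusterware commands can be run on node 1 to confirm the stack is fully online:

# crsctl check crs
# crsctl stat res -t -init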


2) Solution for UNIX-generic cause

Check that the netmask on the private interface is the same on all nodes.


[grid@mynode1 ~]$ ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:19:B9:1E:6D:97
inet addr:192.168.1.110 Bcast:192.168.1.255 Mask:255.255.255.0
...
[grid@mynode2 ~]$ ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:19:B9:1E:6D:97
inet addr:192.168.1.111 Bcast:192.168.1.255 Mask:255.255.255.0
...


In case of a mismatch, the system administrator must correct the netmask on the private interface(s) where it is wrong.
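
To compare the masks from one node, a minimal sketch (node names mynode1/mynode2 and interface eth1 are the examples above; adjust to the actual private interconnect interface):

[grid@mynode1 ~]$ for n in mynode1 mynode2; do echo "$n: $(ssh $n '/sbin/ifconfig eth1' | grep Mask)"; done

The Mask value must be identical on every node.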
