11g R2 RAC: Highly Available IP (HAIP)

In releases before 11.2.0.2, using redundant network connections between the nodes (and so reducing node evictions caused by private NIC failures) required bonding, trunking, teaming, or a similar technology. Oracle Clusterware now provides an integrated solution, “Redundant Interconnect Usage”, which supports IP failover.

Multiple private network adapters can be defined either during installation or afterward using oifcfg. The ora.cluster_interconnect.haip resource picks a highly available virtual IP (the HAIP) from the link-local range (169.254.0.0/16) on Linux/Unix and assigns one to each private network. With HAIP, interconnect traffic is load balanced across all active interconnect interfaces by default. If a private interconnect interface fails or stops communicating, Clusterware transparently moves the corresponding HAIP address to one of the remaining functional interfaces.
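
To make the address range concrete: the link-local block covers every IPv4 address whose first two octets are 169.254. The following is a minimal sketch only; the function name is_link_local is hypothetical and is not part of any Oracle tool.

```shell
# Hypothetical helper: succeeds if the IPv4 address lies in 169.254.0.0/16.
is_link_local() {
  case "$1" in
    169.254.*.*) return 0 ;;
    *)           return 1 ;;
  esac
}

# Addresses taken from the listings in this post.
is_link_local 169.254.4.103 && echo "169.254.4.103 is in the link-local range"
is_link_local 10.0.0.1      || echo "10.0.0.1 is a regular private address"
```

The check is purely textual, which is enough for spotting HAIP addresses in ifconfig output.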

Grid Infrastructure can activate a maximum of four private network adapters at a time, even if more are defined. The number of HAIP addresses is determined by how many private network adapters are active when Grid Infrastructure comes up on the first node in the cluster: with one active private network it creates one HAIP, with two it creates two, and so on. The number of HAIPs does not increase beyond four even if more private network adapters are activated. Clusterware must be restarted on all nodes for newly added adapters to take effect.
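
The rule above amounts to "one HAIP per active private adapter, capped at four". A small sketch of that arithmetic, assuming nothing beyond what is stated above (haip_count is a hypothetical name used only for illustration):

```shell
# Hypothetical helper: number of HAIPs for a given count of active private
# adapters, following the rule described above (one each, at most four).
haip_count() {
  active=$1
  max=4
  if [ "$active" -gt "$max" ]; then
    echo "$max"
  else
    echo "$active"
  fi
}

for n in 1 2 4 6; do
  echo "active adapters: $n -> HAIPs: $(haip_count "$n")"
done
```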

Oracle RAC databases, Oracle Automatic Storage Management (clustered ASM), and Oracle Clusterware components such as CSS, OCR, CRS, CTSS, and EVM employ Redundant Interconnect Usage. Non-Oracle software, and Oracle software not listed above, cannot benefit from this feature.

Let’s demonstrate.

Current configuration:

Cluster name: cluster01
Nodes: host01, host02

Overview

– Check the current network configuration
– Check that a link-local HAIP (eth1:1) has been started for the only private interconnect, eth1, on both the nodes
– Add another network adapter, eth2, to both the nodes
– Assign an IP address to eth2 on both the nodes
– Restart the network service on both the nodes
– Check that eth2 has been activated on both the nodes
– Add eth2 as another private interconnect from one of the nodes
– Check that eth2 has been added to the cluster as another private interconnect
– Check that the HAIP has not been activated yet (Clusterware needs to be restarted)
– Restart CRS on both the nodes
– Check that the resource ora.cluster_interconnect.haip has been restarted on both the nodes
– Check that link-local HAIPs (eth1:1 and eth2:1) have been started for both the private interconnects, eth1 and eth2, on both the nodes, from the subnet 169.254.*.* reserved for HAIP
– Stop the private interconnect eth1 on host01
– Check that eth1 is not active and that the corresponding HAIP has failed over to eth2
– Check that CRS is still up on host01

Implementation

- Check the current network configuration

eth0 is the public network interface
eth1 is the private interconnect

[root@host01 ~]# oifcfg getif
eth0  192.9.201.0  global  public
eth1  10.0.0.0  global  cluster_interconnect
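
When more interfaces are defined, it can help to filter this listing down to the private interconnects. A sketch under the assumption that the last column of each line is the interface type, as in the output above; private_interfaces is a hypothetical helper name.

```shell
# Hypothetical helper: print the interfaces marked cluster_interconnect
# in `oifcfg getif` output read from standard input.
private_interfaces() {
  awk '$NF == "cluster_interconnect" { print $1 }'
}

# Sample input copied from the listing above; on a cluster node you would
# instead run:  oifcfg getif | private_interfaces
printf '%s\n' \
  'eth0  192.9.201.0  global  public' \
  'eth1  10.0.0.0  global  cluster_interconnect' | private_interfaces
```

With the sample input this prints eth1, the only private interconnect at this stage.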

- Check that a link-local HAIP (eth1:1) has been started for the only private interconnect, eth1, on both the nodes

[root@host01 ~]# ifconfig -a

(output trimmed to show only private interconnect)

eth1      Link encap:Ethernet  HWaddr 00:0C:29:69:3E:AA
inet addr:10.0.0.1  Bcast:10.0.0.255  Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe69:3eaa/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:134731 errors:0 dropped:0 overruns:0 frame:0
TX packets:116938 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:75265764 (71.7 MiB)  TX bytes:55228739 (52.6 MiB)
Interrupt:75 Base address:0x20a4

eth1:1    Link encap:Ethernet  HWaddr 00:0C:29:69:3E:AA
inet addr:169.254.4.103  Bcast:169.254.127.255  Mask:255.255.128.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
Interrupt:75 Base address:0x20a4

[root@host02 network-scripts]# ifconfig -a

(output trimmed to show only private interconnects)

eth1      Link encap:Ethernet  HWaddr 00:0C:29:44:67:25
inet addr:10.0.0.2  Bcast:10.0.0.255  Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe44:6725/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:31596 errors:0 dropped:0 overruns:0 frame:0
TX packets:32994 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:16550162 (15.7 MiB)  TX bytes:17683576 (16.8 MiB)
Interrupt:75 Base address:0x20a4

eth1:1    Link encap:Ethernet  HWaddr 00:0C:29:44:67:25
inet addr:169.254.91.243  Bcast:169.254.127.255  Mask:255.255.128.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
Interrupt:75 Base address:0x20a4

- Add another network adapter eth2 to both the nodes

- Assign IP address to eth2 on both the nodes
host01 : 10.0.0.11, subnet mask : 255.255.255.0
host02 : 10.0.0.22, subnet mask : 255.255.255.0
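
The post does not show how the address was assigned. On Red Hat-style systems that use /etc/sysconfig/network-scripts (as the host02 prompt suggests), one common way is an ifcfg-eth2 file. The contents below are an assumption for host01, not the author's actual file; host02 would use 10.0.0.22.

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth2 on host01 (assumed contents)
DEVICE=eth2
BOOTPROTO=static
IPADDR=10.0.0.11
NETMASK=255.255.255.0
ONBOOT=yes
```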

- Restart network service on both the nodes

# service network restart

- Check that eth2 has been activated on both the nodes

[root@host01 ~]# ifconfig -a |  grep eth2
eth2      Link encap:Ethernet  HWaddr 00:0C:29:69:3E:B4

[root@host02 network-scripts]# ifconfig -a |  grep eth2

eth2      Link encap:Ethernet  HWaddr 00:0C:29:44:67:2F

- Add eth2 as another private interconnect from one of the nodes

[root@host01 ~]# oifcfg setif -global eth2/10.0.0.0:cluster_interconnect

- Check that eth2 has been added to the cluster as another private interconnect

[root@host01 ~]# oifcfg getif

eth0  192.9.201.0  global  public
eth1  10.0.0.0  global  cluster_interconnect
eth2  10.0.0.0  global  cluster_interconnect
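
Both private interconnects are now registered on subnet 10.0.0.0, a point that reader comments at the end of this post take up. A sketch for spotting that situation in the listing, assuming the second column is the subnet as shown above (shared_private_subnets is a hypothetical helper name):

```shell
# Hypothetical helper: print each subnet that is listed for more than one
# cluster_interconnect interface in `oifcfg getif` output (read from stdin).
shared_private_subnets() {
  awk '$NF == "cluster_interconnect" { count[$2]++ }
       END { for (s in count) if (count[s] > 1) print s }'
}

# Sample input copied from the listing above.
printf '%s\n' \
  'eth0  192.9.201.0  global  public' \
  'eth1  10.0.0.0  global  cluster_interconnect' \
  'eth2  10.0.0.0  global  cluster_interconnect' | shared_private_subnets
```

For this configuration it prints 10.0.0.0; with interconnects on distinct subnets it prints nothing.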

- Check that the HAIP has not been activated yet (Clusterware needs to be restarted)

[root@host01 ~]# ifconfig -a |  grep eth2
eth2      Link encap:Ethernet  HWaddr 00:0C:29:69:3E:B4

[root@host02 network-scripts]# ifconfig -a |  grep eth2

eth2      Link encap:Ethernet  HWaddr 00:0C:29:44:67:2F

- Restart CRS on both the nodes

[root@host01 ~]# crsctl stop crs
[root@host01 ~]# crsctl start crs

[root@host02 network-scripts]# crsctl stop crs
[root@host02 network-scripts]# crsctl start crs

- Check that the resource ora.cluster_interconnect.haip has been restarted on both the nodes
(Because it is a lower-stack resource, the -init option is used.)

[root@host01 ~]# crsctl stat res ora.cluster_interconnect.haip -init

NAME=ora.cluster_interconnect.haip
TYPE=ora.haip.type
TARGET=ONLINE
STATE=ONLINE on host01
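
If this check needs to be scripted, the KEY=VALUE layout shown above is easy to parse. A minimal sketch, assuming that layout; resource_state is a hypothetical helper name.

```shell
# Hypothetical helper: extract the STATE value from `crsctl stat res` output
# in the KEY=VALUE form shown above (read from standard input).
resource_state() {
  awk -F= '$1 == "STATE" { print $2 }'
}

# Sample input copied from the listing above; on a cluster node:
#   crsctl stat res ora.cluster_interconnect.haip -init | resource_state
printf '%s\n' \
  'NAME=ora.cluster_interconnect.haip' \
  'TYPE=ora.haip.type' \
  'TARGET=ONLINE' \
  'STATE=ONLINE on host01' | resource_state
```

With the sample input this prints "ONLINE on host01".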

- Check that link-local HAIPs (eth1:1 and eth2:1) have been started for both the private interconnects, eth1 and eth2, on both the nodes, from the subnet 169.254.*.* reserved for HAIP

[root@host01 ~]# ifconfig -a

(output trimmed to show only private interconnects)

eth1      Link encap:Ethernet  HWaddr 00:0C:29:69:3E:AA
inet addr:10.0.0.1  Bcast:10.0.0.255  Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe69:3eaa/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:134731 errors:0 dropped:0 overruns:0 frame:0
TX packets:116938 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:75265764 (71.7 MiB)  TX bytes:55228739 (52.6 MiB)
Interrupt:75 Base address:0x20a4

eth1:1    Link encap:Ethernet  HWaddr 00:0C:29:69:3E:AA
inet addr:169.254.4.103  Bcast:169.254.127.255  Mask:255.255.128.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
Interrupt:75 Base address:0x20a4

eth2      Link encap:Ethernet  HWaddr 00:0C:29:69:3E:B4
inet addr:10.0.0.11  Bcast:10.0.0.255  Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe69:3eb4/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:4358 errors:0 dropped:0 overruns:0 frame:0
TX packets:404 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1487549 (1.4 MiB)  TX bytes:76461 (74.6 KiB)
Interrupt:75 Base address:0x2424

eth2:1    Link encap:Ethernet  HWaddr 00:0C:29:69:3E:B4
inet addr:169.254.196.216  Bcast:169.254.255.255  Mask:255.255.128.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
Interrupt:75 Base address:0x2424

[root@host02 network-scripts]# ifconfig -a

(output trimmed to show only private interconnects)

eth1      Link encap:Ethernet  HWaddr 00:0C:29:44:67:25
inet addr:10.0.0.2  Bcast:10.0.0.255  Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe44:6725/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:31596 errors:0 dropped:0 overruns:0 frame:0
TX packets:32994 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:16550162 (15.7 MiB)  TX bytes:17683576 (16.8 MiB)
Interrupt:75 Base address:0x20a4

eth1:1    Link encap:Ethernet  HWaddr 00:0C:29:44:67:25
inet addr:169.254.91.243  Bcast:169.254.127.255  Mask:255.255.128.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
Interrupt:75 Base address:0x20a4

eth2      Link encap:Ethernet  HWaddr 00:0C:29:44:67:2F
inet addr:10.0.0.22  Bcast:10.0.0.255  Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe44:672f/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:7229 errors:0 dropped:0 overruns:0 frame:0
TX packets:2368 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:4288301 (4.0 MiB)  TX bytes:1163296 (1.1 MiB)
Interrupt:75 Base address:0x2424

eth2:1    Link encap:Ethernet  HWaddr 00:0C:29:44:67:2F
inet addr:169.254.174.223  Bcast:169.254.255.255  Mask:255.255.128.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
Interrupt:75 Base address:0x2424
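
In these listings every HAIP has the mask 255.255.128.0, so the link-local range is effectively split into two halves. ANDing each address with its mask shows which half it falls in. A sketch of that arithmetic in plain shell (network_of is a hypothetical helper name):

```shell
# Hypothetical helper: network address = IP AND netmask, octet by octet.
# The body runs in a subshell so the IFS change does not leak out.
network_of() (
  IFS=.
  set -- $1 $2      # now $1..$4 are the IP octets, $5..$8 the mask octets
  echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
)

network_of 169.254.4.103   255.255.128.0   # eth1:1 on host01
network_of 169.254.196.216 255.255.128.0   # eth2:1 on host01
```

The two calls print 169.254.0.0 and 169.254.128.0 respectively. The host02 addresses (169.254.91.243 and 169.254.174.223) fall into the same two networks, so each HAIP network spans both nodes.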

- Stop the private interconnect eth1 on host01

[root@host01 ~]# ifdown eth1

- Check that eth1 is not active and that the corresponding HAIP (169.254.4.103) has failed over to eth2

[root@host01 ~]# ifconfig -a

(output trimmed to show private interconnect only)

eth1      Link encap:Ethernet  HWaddr 00:0C:29:69:3E:AA
BROADCAST MULTICAST  MTU:1500  Metric:1
RX packets:163401 errors:0 dropped:0 overruns:0 frame:0
TX packets:145495 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:89098576 (84.9 MiB)  TX bytes:69881778 (66.6 MiB)
Interrupt:75 Base address:0x20a4

eth2      Link encap:Ethernet  HWaddr 00:0C:29:69:3E:B4
inet addr:10.0.0.11  Bcast:10.0.0.255  Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe69:3eb4/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:11649 errors:0 dropped:0 overruns:0 frame:0
TX packets:4738 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:6370975 (6.0 MiB)  TX bytes:2033237 (1.9 MiB)
Interrupt:75 Base address:0x2424

eth2:1    Link encap:Ethernet  HWaddr 00:0C:29:69:3E:B4
inet addr:169.254.196.216  Bcast:169.254.255.255  Mask:255.255.128.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
Interrupt:75 Base address:0x2424

eth2:2    Link encap:Ethernet  HWaddr 00:0C:29:69:3E:B4
inet addr:169.254.4.103  Bcast:169.254.127.255  Mask:255.255.128.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
Interrupt:75 Base address:0x2424
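
To see at a glance which interface alias currently carries which HAIP, the ifconfig output can be reduced to alias/address pairs. A sketch that assumes the classic "Link encap" and "inet addr:" layout shown in this post (newer ifconfig versions print a different format); haip_map is a hypothetical helper name.

```shell
# Hypothetical helper: print "<interface> <address>" for every 169.254.x.x
# address found in classic ifconfig output read from standard input.
haip_map() {
  awk '
    /Link encap/ { iface = $1 }
    /inet addr:169\.254\./ {
      split($0, parts, "inet addr:")
      split(parts[2], addr, " ")
      print iface, addr[1]
    }
  '
}

# Sample input copied from the listing above; on a node: ifconfig -a | haip_map
printf '%s\n' \
  'eth2:1    Link encap:Ethernet  HWaddr 00:0C:29:69:3E:B4' \
  'inet addr:169.254.196.216  Bcast:169.254.255.255  Mask:255.255.128.0' \
  'eth2:2    Link encap:Ethernet  HWaddr 00:0C:29:69:3E:B4' \
  'inet addr:169.254.4.103  Bcast:169.254.127.255  Mask:255.255.128.0' | haip_map
```

For the post-failover listing this prints "eth2:1 169.254.196.216" and "eth2:2 169.254.4.103", confirming that the address formerly on eth1:1 now lives on eth2.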

- Check that CRS is still up on host01

[root@host01 ~]# crsctl stat res -t
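
The output of this command is not reproduced here. For a quicker scripted check, one option is crsctl check crs, which reports one status line per daemon. The sketch below assumes that each healthy daemon's line contains the words "is online"; all_online is a hypothetical helper name, not an Oracle utility.

```shell
# Hypothetical helper: succeed only if every non-empty line read from stdin
# contains "is online" (and there is at least one such line).
all_online() {
  awk 'NF { total++; if ($0 ~ /is online/) ok++ }
       END { exit !(total > 0 && ok == total) }'
}

# On a cluster node (assumed usage):
#   crsctl check crs | all_online && echo "Clusterware stack is up"
```

Reading from standard input keeps the helper independent of crsctl itself, so it can be tried against saved output first.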

References:

http://www.oracle.com/technetwork/products/clusterware/overview/oracle-clusterware-11grel2-owp-1-129843.pdf

http://ora-ssn.blogspot.in/2011/09/redundant-interconnect-usage-in-11g-r2.html

http://oraschool.tistory.com/38


6 thoughts on “11g R2 RAC: Highly Available IP (HAIP)”

  1. Nice article. Your example is technically correct, but has the two private interfaces for the private interconnect in the same subnet:

    eth1 10.0.0.0 global cluster_interconnect
    eth2 10.0.0.0 global cluster_interconnect

    Both are on “10.0.0.0”, whereas Redundant Interconnect Usage (as the HAIP feature is officially called) says that the networks must be in different subnets.

    The wording has changed from “should be” to “must be in different subnets” after 11.2.0.2.

    Just a thought. Thanks,
    Markus

    1. I would like to point out that in OU’s lab setup provided for practice 4.2 of the course (D59999GC30) Oracle Grid Infrastructure 11g: Manage Clusterware and ASM, both the private interfaces (eth1 and eth2) for the private interconnects have been configured on the same subnet.

      Regards
      Anju Garg

  2. I mean this statement:
    The wording has changed from “should be” to “must be in different subnets” after 11.2.0.2.

  3. It's worth considering the note below; it discusses the issue with using the same subnet for redundant pips

    11gR2 CSS Terminates/Node Eviction After Unplugging one Network Cable in Redundant Interconnect Environment (Doc ID 1481481.1)

    Regards
    Charan

Your comments and suggestions are welcome!