Prior to the first 11g Release 2 patch set, when certain subcomponents required by Oracle RAC failed (e.g. the private interconnect or a voting disk), Oracle Clusterware tried to prevent a split-brain by rebooting the affected server(s) immediately, without waiting for ongoing I/O operations to finish or for the file systems to be synchronized. As a result, non-cluster-aware applications were forcibly shut down. Moreover, during a reboot, resources need to be re-mastered across the surviving nodes. In a large cluster with many nodes, this can be a very expensive operation.

This mechanism changed with the first 11g Release 2 patch set.

After deciding which node to evict,

– Clusterware attempts to clean up the failure within the cluster by killing only the offending process(es) on that node. In particular, processes that generate I/O are killed.

– If all Oracle resources/processes can be stopped and all I/O-generating processes can be killed,

  • Clusterware resources are stopped on the node.
  • The Oracle High Availability Services Daemon (OHASD) keeps trying to restart the Cluster Ready Services (CRS) stack.
  • Once the conditions for starting the CRS stack are re-established, all relevant cluster resources on that node start automatically.

– If, for some reason, not all resources can be stopped, or the I/O-generating processes cannot be stopped completely (e.g. they hang in kernel mode or in the I/O path),

  • Oracle Clusterware still reboots the node, or uses IPMI to forcibly evict it from the cluster, as before.
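
To make the decision flow above concrete, here is a minimal Python sketch. It is only a toy model of the rules described in this post, not Oracle's implementation: the names `NodeState`, `Outcome` and `fence`, and the `rebootless_supported` flag, are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    """Possible results of fencing a node in this toy model."""
    STACK_RESTART = "cluster stack stopped on the node; OHASD keeps retrying CRS start"
    NODE_REBOOT = "node rebooted or forcibly evicted via IPMI"


@dataclass
class NodeState:
    """Hypothetical snapshot of the node chosen for eviction (not an Oracle structure)."""
    resources_stopped: bool    # all Oracle resources/processes could be stopped
    io_processes_killed: bool  # all I/O-generating processes could be killed


def fence(node: NodeState, rebootless_supported: bool = True) -> Outcome:
    """Return the fencing outcome for a node that is being evicted.

    rebootless_supported=False models the older behavior, where the
    node is always rebooted.
    """
    if not rebootless_supported:
        return Outcome.NODE_REBOOT
    # Reboot-less path: only possible if the clean-up fully succeeded.
    if node.resources_stopped and node.io_processes_killed:
        return Outcome.STACK_RESTART
    # Clean-up incomplete (e.g. a process hanging in the I/O path):
    # fall back to a reboot or IPMI eviction, as before.
    return Outcome.NODE_REBOOT


if __name__ == "__main__":
    for state in (NodeState(True, True), NodeState(True, False)):
        print(state, "->", fence(state).value)
```

The point the sketch is meant to show is that the reboot is now a fallback rather than the first response: it is taken only when the clean-up on the node cannot be completed.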

This change is particularly useful for non-cluster-aware applications: the data is protected by shutting down only the cluster stack on the node, without rebooting the node itself.

I will demonstrate this functionality in two scenarios:

– Failure of the network heartbeat
– Failure of the disk heartbeat




Related Links:

11g R2 RAC Index
11g R2 RAC: Node Eviction Due To Missing Network Heartbeat
11g R2 RAC: Reboot-less Fencing With Missing Network Heartbeat
11g R2 RAC: Reboot-less Fencing With Missing Disk Heartbeat




Your comments and suggestions are welcome!