11G R2 RAC: HOW TO IDENTIFY THE MASTER NODE IN RAC

 
                          
In this post, I will demonstrate three methods to identify the Oracle Clusterware master node. Please note that the Clusterware master is different from the resource master in an Oracle database instance. To learn how to find the resource master, see the related post "11g R2 RAC: How To Find The Resource Master?".
Importance of the master node in a cluster:

- The master node has the lowest node ID in the cluster. Node IDs are assigned to the nodes in the order in which they join the cluster. Hence, the node that joins the cluster first is normally the master node.

  • The CRSd process on the master node is responsible for initiating the OCR backup as per the backup policy.
  • The master node is also responsible for syncing the OCR cache across the nodes.
  • The CRSd process on the master node reads from and writes to the OCR on disk.
  • In case of node eviction, the cluster is divided into two sub-clusters. The sub-cluster containing fewer nodes is evicted. If both sub-clusters have the same number of nodes, the sub-cluster containing the master node survives and the other sub-cluster is evicted.
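
The survival rule in the last point can be sketched as a small shell function. This is only an illustration of the rule as I have stated it above, not Oracle's implementation; the function name surviving_subcluster and its arguments are hypothetical.

```shell
# Sketch of the survival rule described above (hypothetical helper, not an Oracle tool).
# $1, $2: space-separated node numbers of the two sub-clusters
# $3:     node number of the current master
surviving_subcluster() {
    sub_a=$1; sub_b=$2; master=$3
    # Count the nodes in each sub-cluster by word-splitting the lists.
    set -- $sub_a; size_a=$#
    set -- $sub_b; size_b=$#
    if [ "$size_a" -gt "$size_b" ]; then
        echo "$sub_a"                      # larger sub-cluster survives
    elif [ "$size_b" -gt "$size_a" ]; then
        echo "$sub_b"
    else
        # Equal sizes: the sub-cluster containing the master survives.
        case " $sub_a " in
            *" $master "*) echo "$sub_a" ;;
            *)             echo "$sub_b" ;;
        esac
    fi
}
```

For example, surviving_subcluster "1 2" "3" 3 prints "1 2" because that sub-cluster is larger, whereas surviving_subcluster "1 2" "3 4" 3 prints "3 4" because the sizes are equal and node 3 is the master.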


The Oracle Clusterware master can be identified:

  • by scanning the ocssd logs from the various nodes
  • by scanning the crsd logs from the various nodes
  • by identifying the node which takes the backup of the OCR

If the master node is evicted or rebooted, another node becomes the master.

 
I have a 3-node setup. I check the ocssd logs on the three nodes for the string 'master node' and note that node 3 is the master node.
 
 
[grid@host01 root]$ cat $ORACLE_HOME/log/host01/cssd/ocssd.log | grep 'master node' | tail -1
2012-11-23 10:14:36.949: [    CSSD][2778696592]clssgmCMReconfig: reconfiguration successful, incarnation 248954981 with 3 nodes, local node number 1, master node number 3
 
[root@host02 cssd]# cat $ORACLE_HOME/log/host02/cssd/ocssd.log | grep 'master node' | tail -1
2012-11-23 10:14:36.953: [    CSSD][778696592]clssgmCMReconfig: reconfiguration successful, incarnation 248954981 with 3 nodes, local node number 2, master node number 3
 
[root@host03 ~]# cat $ORACLE_HOME/log/host03/cssd/ocssd.log | grep 'master node' | tail -1
2012-11-23 10:14:37.001: [    CSSD][778700688]clssgmCMReconfig: reconfiguration successful, incarnation 248954981 with 3 nodes, local node number 3, master node number 3
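
The same check can be wrapped in a small function that prints only the master node number. This is a minimal sketch that assumes the ocssd.log line format shown above; the function name css_master_from_log is my own and is not part of any Oracle tool.

```shell
# Print the master node number from the most recent reconfiguration
# message in an ocssd.log (hypothetical helper; assumes the line format above).
# $1: path to an ocssd.log file
css_master_from_log() {
    grep 'master node number' "$1" | tail -1 |
        sed -n 's/.*master node number \([0-9][0-9]*\).*/\1/p'
}
```

Run against the host01 log shown above, for example css_master_from_log $ORACLE_HOME/log/host01/cssd/ocssd.log, it should print 3.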
 
If I take the OCR backup right now, it will be taken by node3 (master node).
 
 [root@host02 cssd]# ocrconfig -manualbackup
 
host03     2012/11/24 09:54:48     /u01/app/11.2.0/grid/cdata/cluster01/backup_20121124_095448.ocr

Let us check the crsd logs of the various nodes, looking for the string 'OCR MASTER'. Note that node3 is presently the master node.

 
[grid@host01 crsd]$ cat /u01/app/11.2.0/grid/log/host01/crsd/crsd.log | grep 'OCR MASTER' | tail -1
 
2012-11-23 10:15:01.403: [  OCRMAS][2877356944]th_master: NEW OCR MASTER IS 3
 
[root@host02 crsd]# cat /u01/app/11.2.0/grid/log/host02/crsd/crsd.log | grep 'OCR MASTER' | tail -1
 
2012-11-23 10:15:03.561: [  OCRMAS][876976016]th_master: NEW OCR MASTER IS 3
 
[root@host03 crsd]# cat /u01/app/11.2.0/grid/log/host03/crsd/crsd.log | grep 'OCR MASTER' | tail -3
 
2012-11-23 10:11:18.499: [  OCRMAS][877467536]th_master:13: I AM THE NEW OCR MASTER at incar 44. Node Number 3
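
Note that the crsd log uses two different messages: the master itself logs "I AM THE NEW OCR MASTER ... Node Number N", while the other nodes log "NEW OCR MASTER IS N". A minimal sketch that handles both forms is given below, assuming the messages look exactly like the ones above; the function name ocr_master_from_log is hypothetical.

```shell
# Print the OCR master node number from the last 'OCR MASTER' message
# in a crsd.log (hypothetical helper; assumes the two message forms above).
# $1: path to a crsd.log file
ocr_master_from_log() {
    grep 'OCR MASTER' "$1" | tail -1 |
        sed -n -e 's/.*NEW OCR MASTER IS \([0-9][0-9]*\).*/\1/p' \
               -e 's/.*I AM THE NEW OCR MASTER.*Node Number \([0-9][0-9]*\).*/\1/p'
}
```

Called with any of the three crsd.log files above, it should print 3.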
 
Let me reboot node3 and check which node is assigned the mastership now.
 

[root@host03 ~]# init 6

 
I check the ocssd logs on the remaining two nodes (node1 and node2) for the string 'master node' and note that node 1 is now the master node.
[grid@host01 root]$ cat $ORACLE_HOME/log/host01/cssd/ocssd.log | grep 'master node' | tail -1
 
2012-11-24 10:09:23.522: [    CSSD][2778696592]clssgmCMReconfig: reconfiguration successful, incarnation 248954982 with 2 nodes, local node number 1, master node number 1
 
[root@host02 cssd]# cat $ORACLE_HOME/log/host02/cssd/ocssd.log | grep 'master node' | tail -1
 
2012-11-24 10:09:23.502: [    CSSD][778696592]clssgmCMReconfig: reconfiguration successful, incarnation 248954982 with 2 nodes, local node number 2, master node number 1
 
As can be seen from ocssd logs of the remaining two nodes, node1 has become the master now.
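
To see the change of mastership over time rather than only the latest state, every reconfiguration message in one ocssd.log can be summarised. The following is a sketch under the assumption that all such messages follow the format shown above; css_master_history is a hypothetical name.

```shell
# List every successful reconfiguration recorded in an ocssd.log as
# "<timestamp> incarnation=<n> nodes=<n> master=<n>"
# (hypothetical helper; assumes the clssgmCMReconfig line format above).
# $1: path to an ocssd.log file
css_master_history() {
    sed -n 's/^\([0-9-]* [0-9:.]*\): .*incarnation \([0-9][0-9]*\) with \([0-9][0-9]*\) nodes.*master node number \([0-9][0-9]*\).*/\1 incarnation=\2 nodes=\3 master=\4/p' "$1"
}
```

For the host01 log, the last two lines of the output should show incarnation 248954981 with master=3 followed by incarnation 248954982 with master=1.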
 
If I take the OCR backup now, it is taken by node1, whereas the earlier backup was taken by node3, which was the master at that time.
 
[root@host02 cssd]# ocrconfig -manualbackup
 
host01     2012/11/24 10:12:29     /u01/app/11.2.0/grid/cdata/cluster01/backup_20121124_101229.ocr
 
host03     2012/11/24 09:54:48     /u01/app/11.2.0/grid/cdata/cluster01/backup_20121124_095448.ocr
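
The node that took the most recent backup can also be picked out of such a listing automatically. The sketch below reads lines in the "host date time file" layout shown above from standard input; latest_backup_host is a hypothetical name, and I am assuming that the listing printed by ocrconfig -showbackup uses the same layout.

```shell
# Read backup listing lines ("host  YYYY/MM/DD HH:MM:SS  file") on stdin and
# print the host that took the most recent backup (hypothetical helper).
latest_backup_host() {
    awk '$2 ~ /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]$/ {
             # Date and time in this layout sort correctly as plain strings.
             stamp = $2 " " $3
             if (stamp > latest) { latest = stamp; host = $1 }
         }
         END { if (host != "") print host }'
}
```

Fed the two lines above, it prints host01, i.e. the current master.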
 
Let us check the crsd logs of the various nodes again, looking for the string 'OCR MASTER':
 
[grid@host01 crsd]$ cat /u01/app/11.2.0/grid/log/host01/crsd/crsd.log | grep 'OCR MASTER' | tail -1
2012-11-24 10:08:45.884: [  OCRMAS][877356944]th_master:13: I AM THE NEW OCR MASTER at incar 47. Node Number 1
 
[root@host02 crsd]# cat /u01/app/11.2.0/grid/log/host02/crsd/crsd.log | grep 'OCR MASTER' | tail -1
2012-11-24 10:08:45.364: [  OCRMAS][876976016]th_master: NEW OCR MASTER IS 1
 
[root@host03 crsd]# cat /u01/app/11.2.0/grid/log/host03/crsd/crsd.log | grep 'OCR MASTER' | tail -1
2012-11-24 10:12:20.282: [  OCRMAS][877422480]th_master: NEW OCR MASTER IS 1
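
As a final cross-check, the crsd logs from all the nodes can be compared to confirm that every node reports the same OCR master. This is only a sketch: it assumes the crsd.log files have been copied to (or are readable from) one host, and ocr_master_consensus is a hypothetical name.

```shell
# Given one or more crsd.log files, print the OCR master if all of them
# agree on it; otherwise report the conflicting values and return 1
# (hypothetical helper; assumes the 'OCR MASTER' message forms shown above).
ocr_master_consensus() {
    masters=$(
        for log in "$@"; do
            grep 'OCR MASTER' "$log" | tail -1 |
                sed -n -e 's/.*NEW OCR MASTER IS \([0-9][0-9]*\).*/\1/p' \
                       -e 's/.*I AM THE NEW OCR MASTER.*Node Number \([0-9][0-9]*\).*/\1/p'
        done | sort -u
    )
    # Count the distinct master numbers that were found.
    count=$(printf '%s\n' "$masters" | awk 'NF { n++ } END { print n + 0 }')
    if [ "$count" -eq 1 ]; then
        echo "$masters"
    else
        echo "no single OCR master found:" $masters >&2
        return 1
    fi
}
```

With the three logs above it should print 1, since all nodes now report node 1 as the OCR master.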
 
Hope you enjoyed reading !!
Keep visiting the blog ….
 
Regards
——————————————————————————————–
Related links:

11g R2 RAC: How To Find The Resource Master?


 
                                         ——————
 
 

13 thoughts on “11G R2 RAC: HOW TO IDENTIFY THE MASTER NODE IN RAC”

  1. Hi Anju,

    I frequently read your blog. It's really interesting and helpful. I have a doubt about the description above, i.e. “In case of node eviction, the cluster is divided into two sub-clusters. The sub-cluster containing fewer nodes is evicted. But, in case both the sub-clusters have the same number of nodes, the sub-cluster having the master node survives whereas the other sub-cluster is evicted.”
    My understanding, however, is: “An odd number of voting disks is used to avoid split brain. When the nodes in a cluster can’t talk to each other, they race to lock the voting disks, and whichever locks more disks survives. If the number of disks is even, a node might lock exactly 50% of the disks (2 out of 4), so how is it decided which node to evict?
    When the number is odd, one side will hold more disks than the other, and it is easy for the cluster to evict the node holding fewer.”
    Can you please clarify my doubt, as the two seem to contradict each other?

    Thanks & Regards
    Syed Safi

  2. Hi Anju, your efforts are priceless. Keep going.
    Please let me know why we keep one spfile for the whole cluster rather than a specific one for each instance.

    1. Thanks Sumit.

      If you want an instance-specific parameter file, you can use client-side parameter files (PFILEs), but in that case Oracle does not preserve parameter changes across instance shutdown and startup. If you are using an SPFILE, all instances in the cluster database must use the same SPFILE.

      Regards
      Anju Garg
