In this post, I will demonstrate three methods to identify the Oracle Clusterware master node. Please note that the clusterware master is different from the resource master in an Oracle database instance. To learn how to find the resource master, see the related link at the end of this post.
Importance of master node in a cluster:
- The master node has the lowest node ID in the cluster. Node IDs are assigned in the order in which the nodes join the cluster. Hence, the node that joins the cluster first is normally the master node.
- The CRSd process on the master node is responsible for initiating the OCR backup as per the backup policy.
- The master node is also responsible for syncing the OCR cache across the nodes.
- The CRSd process on the master node reads from and writes to the OCR on disk.
- In case of node eviction, the cluster is divided into two sub-clusters. The sub-cluster containing fewer nodes is evicted. If both sub-clusters have the same number of nodes, the sub-cluster containing the master node survives and the other sub-cluster is evicted.
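The eviction rule in the last point can be sketched as a small shell function. This is only an illustration of the rule as described above, not Oracle's actual implementation; the function names count_words and surviving_subcluster are hypothetical, and the sketch assumes the master node belongs to one of the two sub-clusters.

```shell
# Illustration only: the survival rule as described in this post.
# count_words and surviving_subcluster are hypothetical helper names.

# Print the number of space-separated items in $1.
count_words() {
    set -- $1
    echo "$#"
}

# Usage: surviving_subcluster "<nodes in A>" "<nodes in B>" <master node number>
# Prints A or B, the sub-cluster that survives.
surviving_subcluster() {
    a_nodes=$1
    b_nodes=$2
    master=$3
    a_count=$(count_words "$a_nodes")
    b_count=$(count_words "$b_nodes")
    if [ "$a_count" -gt "$b_count" ]; then
        echo A
    elif [ "$b_count" -gt "$a_count" ]; then
        echo B
    else
        # Equal sizes: the sub-cluster that contains the master survives.
        for node in $a_nodes; do
            if [ "$node" = "$master" ]; then
                echo A
                return 0
            fi
        done
        # Assumption: the master is then in sub-cluster B.
        echo B
    fi
}
```

For example, surviving_subcluster "1 2" "3" 3 picks the larger sub-cluster even though the master is on the other side, while surviving_subcluster "1" "3" 3 falls through to the tie-break and picks the side holding the master.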
The Oracle Clusterware master can be identified
- by scanning the ocssd logs from the various nodes.
- by scanning the crsd logs from the various nodes.
- by identifying the node which takes the backup of the OCR.
If the master node gets evicted or rebooted, another node becomes the master.
I have a 3-node setup. I check the ocssd logs on the three nodes for the string ‘master node’ and note that node 3 is the master node.
[grid@host01 root]$ cat $ORACLE_HOME/log/host01/cssd/ocssd.log | grep 'master node' | tail -1
2012-11-23 10:14:36.949: [ CSSD][2778696592]clssgmCMReconfig: reconfiguration successful, incarnation 248954981 with 3 nodes, local node number 1, master node number 3
[root@host02 cssd]# cat $ORACLE_HOME/log/host02/cssd/ocssd.log | grep 'master node' | tail -1
2012-11-23 10:14:36.953: [ CSSD][778696592]clssgmCMReconfig: reconfiguration successful, incarnation 248954981 with 3 nodes, local node number 2, master node number 3
[root@host03 ~]# cat $ORACLE_HOME/log/host03/cssd/ocssd.log | grep 'master node' | tail -1
2012-11-23 10:14:37.001: [ CSSD][778700688]clssgmCMReconfig: reconfiguration successful, incarnation 248954981 with 3 nodes, local node number 3, master node number 3
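If only the node number is of interest, the same check can be wrapped in a small helper. This is a minimal sketch that assumes the reconfiguration line has the layout shown in the output above; the function name master_from_ocssd is hypothetical.

```shell
# Sketch: print the master node number from the most recent
# "master node number N" line read on stdin.
# Assumes the ocssd.log line layout shown in the output above.
master_from_ocssd() {
    grep 'master node number' | tail -n 1 |
        sed -n 's/.*master node number \([0-9][0-9]*\).*/\1/p'
}

# Example (path as used in this post):
#   master_from_ocssd < "$ORACLE_HOME/log/host01/cssd/ocssd.log"
```

The function prints nothing if the log contains no reconfiguration line.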
If I take the OCR backup right now, it will be taken by node3 (master node).
[root@host02 cssd]# ocrconfig -manualbackup
host03 2012/11/24 09:54:48 /u01/app/11.2.0/grid/cdata/cluster01/backup_20121124_095448.ocr
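The node that took the most recent backup can also be picked out of such a listing automatically. The sketch below assumes each backup line has the host, date, time, path layout shown above and ignores any other lines; the function name latest_backup_node is hypothetical.

```shell
# Sketch: print the host that took the most recent OCR backup.
# Reads a backup listing on stdin; assumes lines of the form
#   <host> <YYYY/MM/DD> <HH:MM:SS> <path>
# and skips anything else (blank lines, messages).
latest_backup_node() {
    awk '$2 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ { print $2, $3, $1 }' |
        sort | tail -n 1 | awk '{ print $3 }'
}

# Example, assuming the listing comes from ocrconfig -showbackup:
#   ocrconfig -showbackup | latest_backup_node
```

Sorting on date and time first means the result does not depend on the order in which the backups are listed.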
Let us check the crsd logs of the various nodes, looking for the string OCR MASTER. Note that node 3 is presently the master node.
[grid@host01 crsd]$ cat /u01/app/11.2.0/grid/log/host01/crsd/crsd.log | grep 'OCR MASTER' | tail -1
2012-11-23 10:15:01.403: [ OCRMAS][2877356944]th_master: NEW OCR MASTER IS 3
[root@host02 crsd]# cat /u01/app/11.2.0/grid/log/host02/crsd/crsd.log | grep 'OCR MASTER' | tail -1
2012-11-23 10:15:03.561: [ OCRMAS][876976016]th_master: NEW OCR MASTER IS 3
[root@host03 crsd]# cat /u01/app/11.2.0/grid/log/host03/crsd/crsd.log | grep 'OCR MASTER' | tail -3
2012-11-23 10:11:18.499: [ OCRMAS][877467536]th_master:13: I AM THE NEW OCR MASTER at incar 44. Node Number 3
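The crsd log reports the master in two different forms: the master itself logs "I AM THE NEW OCR MASTER ... Node Number N", while the other nodes log "NEW OCR MASTER IS N". A minimal sketch that handles both forms, assuming the line layouts shown above (the function name ocr_master_from_crsd is hypothetical):

```shell
# Sketch: print the OCR master node number from the most recent
# 'OCR MASTER' line read on stdin. Handles both message forms
# seen in the crsd.log output above.
ocr_master_from_crsd() {
    grep 'OCR MASTER' | tail -n 1 |
        sed -n \
            -e 's/.*NEW OCR MASTER IS \([0-9][0-9]*\).*/\1/p' \
            -e 's/.*I AM THE NEW OCR MASTER.*Node Number \([0-9][0-9]*\).*/\1/p'
}

# Example (path as used in this post):
#   ocr_master_from_crsd < /u01/app/11.2.0/grid/log/host01/crsd/crsd.log
```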
Let me reboot node3 and check which node is assigned the mastership now.
[root@host03 ~]# init 6
I check the ocssd logs on the remaining two nodes (node1 and node2) for the string ‘master node’ and note that node 1 is now the master node.
[grid@host01 root]$ cat $ORACLE_HOME/log/host01/cssd/ocssd.log | grep 'master node' | tail -1
2012-11-24 10:09:23.522: [ CSSD][2778696592]clssgmCMReconfig: reconfiguration successful, incarnation 248954982 with 2 nodes, local node number 1, master node number 1
[root@host02 cssd]# cat $ORACLE_HOME/log/host02/cssd/ocssd.log | grep 'master node' | tail -1
2012-11-24 10:09:23.502: [ CSSD][778696592]clssgmCMReconfig: reconfiguration successful, incarnation 248954982 with 2 nodes, local node number 2, master node number 1
As can be seen from ocssd logs of the remaining two nodes, node1 has become the master now.
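Since each node logs its own view of the reconfiguration, it can be useful to confirm that all surviving nodes report the same master. The sketch below takes the last ‘master node’ line collected from each node and checks that they agree; the function name agreed_master is hypothetical, and how the lines are gathered from the nodes is left open.

```shell
# Sketch: read one ocssd reconfiguration line per node on stdin.
# Prints the master node number and returns 0 if every line names
# the same master; prints nothing and returns 1 otherwise.
agreed_master() {
    masters=$(sed -n 's/.*master node number \([0-9][0-9]*\).*/\1/p' | sort -u)
    # Count the distinct master numbers that were reported.
    distinct=$(printf '%s\n' "$masters" | grep -c . || true)
    if [ "$distinct" -eq 1 ]; then
        echo "$masters"
    else
        return 1
    fi
}
```

Fed the two lines shown above, the function prints 1.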
Now if I take an OCR backup, it is taken by node1, whereas the earlier backup was taken by node3, which was the master at that time.
[root@host02 cssd]# ocrconfig -manualbackup
host01 2012/11/24 10:12:29 /u01/app/11.2.0/grid/cdata/cluster01/backup_20121124_101229.ocr
host03 2012/11/24 09:54:48 /u01/app/11.2.0/grid/cdata/cluster01/backup_20121124_095448.ocr
Let us check the crsd logs of the various nodes again, looking for the string OCR MASTER.
[grid@host01 crsd]$ cat /u01/app/11.2.0/grid/log/host01/crsd/crsd.log | grep 'OCR MASTER' | tail -1
2012-11-24 10:08:45.884: [ OCRMAS][877356944]th_master:13: I AM THE NEW OCR MASTER at incar 47. Node Number 1
[root@host02 crsd]# cat /u01/app/11.2.0/grid/log/host02/crsd/crsd.log | grep 'OCR MASTER' | tail -1
2012-11-24 10:08:45.364: [ OCRMAS][876976016]th_master: NEW OCR MASTER IS 1
[root@host03 crsd]# cat /u01/app/11.2.0/grid/log/host03/crsd/crsd.log | grep 'OCR MASTER' | tail -1
2012-11-24 10:12:20.282: [ OCRMAS][877422480]th_master: NEW OCR MASTER IS 1
Hope you enjoyed reading!
Keep visiting the blog.
Regards
Related links:
11g R2 RAC: How To Find The Resource Master?
——————
Nice one.
–Jamsher
Hi Anju,
I frequently read your blog; it's really interesting and helpful. I have a doubt about the description above, i.e. “In case of node eviction, the cluster is divided into two sub-clusters. The sub-cluster containing fewer nodes is evicted. But, in case both the sub-clusters have the same number of nodes, the sub-cluster having the master node survives whereas the other sub-cluster is evicted.”
Whereas my understanding is: “An odd number of voting disks is used to avoid split brain. When nodes in the cluster can’t talk to each other, they race to lock the voting disks, and whichever locks more disks survives. If the number of disks is even, there is a chance that a node locks 50% of the disks (2 out of 4); then how is it decided which node to evict?
Whereas when the number is odd, one will have more than the other, and it is easy for the cluster to evict the node with fewer.”
Can you please clarify my doubt, as the two seem to contradict each other?
Thanks & Regards
Syed Safi
It evicts the node with the higher node number and keeps the node with the smaller node number.
You can use olsnodes -n to find the node numbers.
Hi Anju, your efforts are priceless. Keep going.
Please let me know why we keep one spfile for the cluster, rather than one specific to each instance.
Thanks Sumit.
If you want an instance-specific parameter file, you can use client-side parameter files (PFILEs), but in this case Oracle does not preserve parameter changes across instance shutdown and startup. If you are using an SPFILE, all instances in the cluster database must use the same SPFILE.
Regards
Anju Garg
A very informative site and very well-written articles on all aspects of Oracle DB.
Thank you very much for the information shared… God bless you my friend.
Thanks Ajay!
Your comments and suggestions are always welcome!
Regards
Anju Garg
Nice blog
Classic. Explained all things, thanks Anju.
Hi Akash
Thanks for your time and feedback. Your comments and suggestions are always welcome!
regards
Anju Garg
super nice example
Thanks for your time and feedback.
Your comments and suggestions are always welcome.
regards
Anju Garg