11g R2 RAC: Dynamic Remastering

In this post, I will demonstrate dynamic remastering of resources in RAC.
In RAC, every data block is mastered by an instance. Mastering a block simply means that the master instance keeps track of the state of the block until the next reconfiguration event. When an instance departs the cluster, the GRD (Global Resource Directory) portion of that instance needs to be redistributed to the surviving nodes. Similarly, when a new instance enters the cluster, the GRD portions of the existing instances must be redistributed to create the GRD portion of the new instance. This is called dynamic resource reconfiguration.
In addition to dynamic resource reconfiguration, the master of a resource can also change while cluster membership stays the same. This is called dynamic remastering. The basic idea is to master a buffer cache resource on the instance where it is accessed most. To determine whether dynamic remastering is necessary, the GCS (Global Cache Service) keeps track of the number of GCS requests on a per-instance and per-object basis. If one instance accesses blocks of an object much more heavily than the other instances, the GCS can decide to dynamically migrate all of that object's resources to that instance. The LMON, LMD and LMS processes are responsible for dynamic remastering.
– Remastering can be triggered as a result of:
    – Manual remastering
    – Resource affinity
    – Instance crash
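To picture the instance crash case, where the GRD portion of a departed instance is handed to the survivors, here is a toy Python model. It is not Oracle's algorithm: the round-robin assignment and the names `redistribute_on_departure` and `grd` are assumptions made purely for illustration. It shows only the idea that the resources mastered by the departed instance need new masters.

```python
def redistribute_on_departure(grd, departed, survivors):
    """Return a new resource -> master map after `departed` leaves.

    grd maps a resource name to its master instance number (0-based).
    Only resources mastered by the departed instance get a new master;
    they are dealt out round-robin to the survivors (a toy assumption).
    """
    if not survivors:
        raise ValueError("at least one surviving instance is required")
    new_grd = {}
    orphan_index = 0
    for resource in sorted(grd):
        master = grd[resource]
        if master == departed:
            master = survivors[orphan_index % len(survivors)]
            orphan_index += 1
        new_grd[resource] = master
    return new_grd


# Hypothetical directory: four block resources spread over three instances.
grd = {"151,4,BL": 2, "152,4,BL": 0, "153,4,BL": 2, "154,4,BL": 1}
after = redistribute_on_departure(grd, departed=2, survivors=[0, 1])
print(after)
```

Resources that were already mastered by a surviving instance keep their master in this model; only the two resources of instance 2 move.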
– CURRENT SCENARIO –
– Database version : 11.2.0.1
– 3 node setup
– Name of the database : racdb
– SETUP –
– Get data_object_id for scott.emp
SYS> col owner for a10
     col data_object_id for 9999999
     col object_name for a15
     select owner, data_object_id, object_name
       from dba_objects
      where owner = 'SCOTT'
        and object_name = 'EMP';
OWNER      DATA_OBJECT_ID OBJECT_NAME
---------- -------------- ---------------
SCOTT               73181 EMP
– Get the relative file number and block number for two rows of the emp table
SQL> select empno, dbms_rowid.rowid_relative_fno(rowid),
            dbms_rowid.rowid_block_number(rowid)
       from scott.emp
      where empno in (7788, 7369);
     EMPNO DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID) DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
---------- ------------------------------------ ------------------------------------
      7369                                    4                                  151
      7788                                    4                                  151
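dbms_rowid does this decoding inside the database. As a rough illustration of what it extracts, the sketch below decodes an extended ROWID string in Python. It assumes the usual 18-character layout (6 characters for the data object number, 3 for the relative file number, 6 for the block number, 3 for the row number, each encoded in base 64 with the alphabet A-Z, a-z, 0-9, +, /). The ROWID value used here is constructed for this example so that it matches the numbers above; it was not taken from the original session, and `decode_rowid` is a hypothetical helper name.

```python
import string

# Base-64 alphabet assumed for extended ROWIDs: A-Z, a-z, 0-9, +, /
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def b64_to_int(chunk):
    """Convert a base-64 encoded chunk of a ROWID to an integer."""
    value = 0
    for ch in chunk:
        value = value * 64 + ALPHABET.index(ch)
    return value


def decode_rowid(rowid):
    """Split an 18-character extended ROWID into its four components."""
    if len(rowid) != 18:
        raise ValueError("expected an 18-character extended ROWID")
    return {
        "data_object_id": b64_to_int(rowid[0:6]),
        "relative_fno": b64_to_int(rowid[6:9]),
        "block_number": b64_to_int(rowid[9:15]),
        "row_number": b64_to_int(rowid[15:18]),
    }


# Constructed example value (not from the original session).
decoded = decode_rowid("AAAR3dAAEAAAACXAAA")
print(decoded)
```

For the constructed value, the decoded components are data object 73181, relative file 4, block 151 and row 0, which are the same identifiers used in the queries that follow.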
– MANUAL REMASTERING –
You can manually remaster an object with the oradebug command:
oradebug lkdebug -m pkey <data_object_id>
– NODE1 – shut down the database and restart it
[oracle@host01 ~]$ srvctl stop database -d racdb 
                  srvctl start database -d racdb
                  srvctl status database -d racdb
– Issue a select on the object from NODE2
SCOTT@NODE2> select * from  emp;

 

– Find the GCS resource name to be used in the queries below
   x$kjbl.kjblname  = resource name in hexadecimal format ([id1][id2],[type])
   x$kjbl.kjblname2 = resource name in decimal format
   The hexadecimal name is used to query the resource in the v$gc_element and v$dlm_ress views
SYS@NODE2> col hexname for a25
           col resource_name for a15
           select b.kjblname hexname, b.kjblname2 resource_name,
                  b.kjblgrant, b.kjblrole, b.kjblrequest
             from x$le a, x$kjbl b
            where a.le_kjbl = b.kjbllockp
              and a.le_addr = (select le_addr
                                 from x$bh
                                where dbablk = 151
                                  and obj    = 73181
                                  and class  = 1
                                  and state <> 3);
HEXNAME                   RESOURCE_NAME   KJBLGRANT   KJBLROLE KJBLREQUE
------------------------- --------------- --------- ---------- ---------
[0x97][0x4],[BL]          151,4,BL        KJUSERPR           0 KJUSERNL
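The two name formats in this output carry the same information: 0x97 is 151 and 0x4 is 4. The small Python sketch below converts between them. The function names are hypothetical, and the pattern is inferred only from the single row shown above, where the first id matches the block number and the second the file number.

```python
import re

# Pattern inferred from the sample row: [0x<id1>][0x<id2>],[<type>]
HEXNAME_RE = re.compile(r"\[0x([0-9a-fA-F]+)\]\[0x([0-9a-fA-F]+)\],\[(\w+)\]")


def hexname_to_decimal(hexname):
    """Convert '[0x97][0x4],[BL]' to the decimal form '151,4,BL'."""
    match = HEXNAME_RE.fullmatch(hexname.strip())
    if match is None:
        raise ValueError(f"unrecognized resource name: {hexname!r}")
    id1, id2, res_type = match.groups()
    return f"{int(id1, 16)},{int(id2, 16)},{res_type}"


def block_hexname(block_number, file_number):
    """Build the hexadecimal BL resource name for a block and file number."""
    return f"[0x{block_number:x}][0x{file_number:x}],[BL]"


print(hexname_to_decimal("[0x97][0x4],[BL]"))
print(block_hexname(151, 4))
```

A helper like `block_hexname` can be handy for building the kjblname literal used in later queries from the block and file numbers returned by dbms_rowid.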
– Check the current master of the block –
– Note that the current master of scott.emp is node1 (CURRENT_MASTER = 0; numbering starts from 0)
– PREVIOUS_MASTER = 32767 is a placeholder indicating that the prior master
   is not known, meaning that this is the first remastering of the object.
– REMASTER_CNT = 1 indicates that the object has been remastered only once
SYS> select o.object_name, m.CURRENT_MASTER,
            m.PREVIOUS_MASTER, m.REMASTER_CNT
       from dba_objects o, v$gcspfmaster_info m
      where o.data_object_id = 73181
        and m.data_object_id = 73181;
OBJECT CURRENT_MASTER PREVIOUS_MASTER REMASTER_CNT
------ -------------- --------------- ------------
EMP                 0           32767            1
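Because the master columns are 0-based and 32767 is a placeholder, rows from v$gcspfmaster_info are easy to misread. The following Python sketch is a hypothetical helper (not part of any Oracle tooling) that translates one row into words, using only the conventions described in the notes above.

```python
UNKNOWN_MASTER = 32767  # placeholder seen in PREVIOUS_MASTER when no prior master is known


def describe_master(value):
    """Translate a 0-based master number into an instance label."""
    if value == UNKNOWN_MASTER:
        return "unknown (no earlier master recorded)"
    return f"instance {value + 1}"


def describe_row(object_name, current_master, previous_master, remaster_cnt):
    """Summarize one row of the master information query output."""
    return (
        f"{object_name}: current master = {describe_master(current_master)}, "
        f"previous master = {describe_master(previous_master)}, "
        f"remastered {remaster_cnt} time(s)"
    )


# Values taken from the query output above.
summary = describe_row("EMP", 0, 32767, 1)
print(summary)
```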
– Use the following SQL to show the master and the owner of the block.
   This SQL joins x$kjbl with x$le to retrieve the resource name.
– Note that the current master is node1 (KJBLMASTER = 0) and the current owner of the block is
   node2 (KJBLOWNER = 1)
SYS@NODE2> select kj.kjblname, kj.kjblname2, kj.kjblowner,
                  kj.kjblmaster
             from (select kjblname, kjblname2, kjblowner,
                          kjblmaster, kjbllockp
                     from x$kjbl
                    where kjblname = '[0x97][0x4],[BL]'
                  ) kj, x$le le
            where le.le_kjbl = kj.kjbllockp
            order by le.le_addr;
KJBLNAME                       KJBLNAME2                       KJBLOWNER KJBLMASTER
------------------------------ ------------------------------ ---------- ----------
[0x97][0x4],[BL]               151,4,BL                                1          0
– Manually master the EMP table to node2 –
SYS@NODE2> oradebug lkdebug -m pkey 73181
– Check that the current master of the block has changed to node2 (CURRENT_MASTER = 1; numbering starts from 0)
– PREVIOUS_MASTER = 0 (node1)
– REMASTER_CNT = 2 indicates that the object has been remastered twice
SYS> select o.object_name, m.CURRENT_MASTER,
            m.PREVIOUS_MASTER, m.REMASTER_CNT
       from dba_objects o, v$gcspfmaster_info m
      where o.data_object_id = 73181
        and m.data_object_id = 73181;
OBJECT CURRENT_MASTER PREVIOUS_MASTER REMASTER_CNT
------ -------------- --------------- ------------
EMP                 1               0            2
– Find the master and the owner of the block.
– Note that the current owner of the block is node2 (KJBLOWNER = 1),
   from where the query was issued
– The current master of the block has changed to node2 (KJBLMASTER = 1)
SYS> select kj.kjblname, kj.kjblname2, kj.kjblowner,
            kj.kjblmaster
       from (select kjblname, kjblname2, kjblowner,
                    kjblmaster, kjbllockp
               from x$kjbl
              where kjblname = '[0x97][0x4],[BL]'
            ) kj, x$le le
      where le.le_kjbl = kj.kjbllockp
      order by le.le_addr;
KJBLNAME                       KJBLNAME2                       KJBLOWNER KJBLMASTER
------------------------------ ------------------------------ ---------- ----------
[0x97][0x4],[BL]               151,4,BL                                1          1
----------------------------------------------------------------------
– REMASTERING DUE TO RESOURCE AFFINITY –

As described above, the GCS tries to master a buffer cache resource on the instance where it is accessed most. It keeps track of the number of GCS requests on a per-instance and per-object basis, and if one instance accesses blocks of an object much more heavily than the others, the GCS can decide to migrate all of that object's resources to that instance.
x$object_policy_statistics maintains statistics about objects and the OPENs
on those objects. The LCK0 process maintains these object affinity statistics.
The following parameters affect dynamic remastering due to resource affinity:
_gc_policy_limit : If an instance performs 50 more opens on an object than another instance (the threshold is controlled by the _gc_policy_limit parameter), that object becomes a candidate for remastering. The object is queued; LMD0 reads the queue and initiates a GRD freeze. LMON then performs the reconfiguration of buffer cache locks, working with the LMS processes. All of this is visible in the LMD0/LMON trace files.
_gc_policy_time : Controls how often the queue is checked to see whether remastering must be triggered. The default value is 10 minutes.
_gc_policy_minimum : This parameter is defined as “minimum amount of dynamic affinity activity per minute” for an object to be a candidate for remastering. On this 11.2.0.1 database its initial value is 1500, which I think is low for a busy environment.
To effectively disable automatic DRM, set _gc_policy_limit and _gc_policy_minimum to a much higher value, say 10 million. Setting _gc_policy_time to 0 disables DRM completely, but that also means that you cannot manually remaster objects. Further, x$object_policy_statistics is not maintained if DRM is disabled.
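The interplay of the limit and minimum parameters can be sketched as a small decision function. This is a simplified reading of the description above, not Oracle's actual implementation: the function name, the 0-based instance numbering and the exact comparison rules are assumptions, and the real decision is made inside LMD0/LMON using statistics that are not modelled here.

```python
def remaster_candidate(opens_per_minute, current_master, policy_limit, policy_minimum):
    """Return the instance an object would move to, or None to leave it alone.

    opens_per_minute maps a 0-based instance number to its opens on one object.
    Assumed rules of this toy model:
      - the busiest instance must not already be the master,
      - its activity must reach policy_minimum,
      - it must lead the next busiest instance by at least policy_limit opens.
    """
    if not opens_per_minute:
        return None
    ranked = sorted(opens_per_minute.items(), key=lambda item: item[1], reverse=True)
    top_instance, top_opens = ranked[0]
    runner_up_opens = ranked[1][1] if len(ranked) > 1 else 0
    if top_instance == current_master:
        return None
    if top_opens < policy_minimum:
        return None
    if top_opens - runner_up_opens < policy_limit:
        return None
    return top_instance


# Hypothetical activity: instance 2 (node3) dominates access, node1 is the master.
busy = remaster_candidate({0: 5, 1: 0, 2: 400}, current_master=0,
                          policy_limit=50, policy_minimum=20)
# Hypothetical activity: the lead of instance 2 is below the limit.
close = remaster_candidate({0: 100, 2: 120}, current_master=0,
                           policy_limit=50, policy_minimum=20)
print(busy, close)
```

In this model, raising both thresholds to very large values means no object ever qualifies, which mirrors the idea of switching off affinity-driven remastering by setting the parameters very high.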
– SETUP –
SYS> drop table scott.test purge;
     create table scott.test as select * from sh.sales;
     insert into scott.test select * from scott.test;
     commit;
     insert into scott.test select * from scott.test;
     commit;
     insert into scott.test select * from scott.test;
     commit;
     insert into scott.test select * from scott.test;
     commit;
– Get data_object_id for scott.test
SYS> col data_object_id for 9999999
     col object_name for a15
     select owner, data_object_id, object_name, object_id
       from dba_objects
      where owner = 'SCOTT'
        and object_name = 'TEST';
OWNER                          DATA_OBJECT_ID OBJECT_NAME      OBJECT_ID
------------------------------ -------------- --------------- ----------
SCOTT                                   74626 TEST                 74626
– Check the initial values of the parameters _gc_policy_minimum and _gc_policy_time
– Enter name of the parameter when prompted
SYS> 
 SET linesize 235 
 col Parameter FOR a20 
 col Instance FOR a10 
 col Description FOR a40 word_wrapped 

 SELECT a.ksppinm  "Parameter", 
       c.ksppstvl "Instance", 
        a.ksppdesc "Description" 
 FROM x$ksppi a, x$ksppcv b, x$ksppsv c, v$parameter p 
 WHERE a.indx = b.indx AND a.indx = c.indx 
   AND p.name(+) = a.ksppinm 
   AND UPPER(a.ksppinm) LIKE UPPER('%&parameter%') 
 ORDER BY a.ksppinm; 

 Enter value for parameter: gc_policy 
 old  11:   AND UPPER(a.ksppinm) LIKE UPPER('%&parameter%') 
 new  11:   AND UPPER(a.ksppinm) LIKE UPPER('%gc_policy%')
Parameter            Instance   Description
-------------------- ---------- ----------------------------------------
_gc_policy_minimum   1500       dynamic object policy minimum activity
                                per minute
_gc_policy_time      10         how often to make object policy
                                decisions in minutes
– Set _gc_policy_minimum and _gc_policy_time to very small values
   so that we can demonstrate remastering
SYS>alter system set "_gc_policy_minimum" = 10 scope=spfile; 
          alter system set "_gc_policy_time" = 1 scope=spfile;
– NODE1 – shut down the database and restart it
[oracle@host01 ~]$ srvctl stop database -d racdb 
                   srvctl start database -d racdb 
                   srvctl status database -d racdb
– Check the parameter values after the restart. Note that they are not the values
   we specified (10 and 1): Oracle has set them to the minimum values it allows (20 and 4)
– Enter name of the parameter when prompted
SYS>
 SET linesize 235
 col Parameter FOR a20
 col Instance FOR a10
 col Description FOR a40 word_wrapped

 SELECT a.ksppinm  "Parameter",
        c.ksppstvl "Instance",
        a.ksppdesc "Description"
 FROM x$ksppi a, x$ksppcv b, x$ksppsv c, v$parameter p
 WHERE a.indx = b.indx AND a.indx = c.indx
   AND p.name(+) = a.ksppinm
   AND UPPER(a.ksppinm) LIKE UPPER('%&parameter%')
 ORDER BY a.ksppinm;

 Enter value for parameter: gc_policy
 old  11:   AND UPPER(a.ksppinm) LIKE UPPER('%&parameter%')
 new  11:   AND UPPER(a.ksppinm) LIKE UPPER('%gc_policy%')
Parameter            Instance   Description
-------------------- ---------- ----------------------------------------
_gc_policy_minimum   20         dynamic object policy minimum activity
                                per minute
_gc_policy_time      4          how often to make object policy
                                decisions in minutes
– Assign TEST to node1 manually
– Issue a select on scott.test from node1 –
SYS@NODE1> oradebug lkdebug -m pkey 74626
SCOTT@NODE1> select * from scott.test;
– Check the current master of scott.test –
– Note that the current master of scott.test is node1 (CURRENT_MASTER = 0; numbering starts from 0)
– PREVIOUS_MASTER = 2 (node3)
– REMASTER_CNT = 3 because remastering had already been initiated twice earlier while I was preparing this demonstration.
SYS@NODE1> select o.object_name, m.CURRENT_MASTER,
                  m.PREVIOUS_MASTER, m.REMASTER_CNT
             from dba_objects o, v$gcspfmaster_info m
            where o.data_object_id = 74626
              and m.data_object_id = 74626;
OBJECT_NAME     CURRENT_MASTER PREVIOUS_MASTER REMASTER_CNT
--------------- -------------- --------------- ------------
TEST                         0               2            3
– Issue an insert statement on scott.test from node3 so that scott.test will be remastered to node3
SCOTT@NODE3>insert into scott.test select * from test;
– check repeatedly that opens are increasing on scott.test with time
SYS@NODE1> select inst_id, sopens, xopens
             from x$object_policy_statistics
            where object = 74626;
   INST_ID     SOPENS     XOPENS
---------- ---------- ----------
         1       3664          0
SYS@NODE1> /
   INST_ID     SOPENS     XOPENS
---------- ---------- ----------
         1       7585       1305
            .
            .
            .
SYS@NODE1> /
   INST_ID     SOPENS     XOPENS
---------- ---------- ----------
         1      12788      17000
SYS@NODE1> /
   INST_ID     SOPENS     XOPENS
---------- ---------- ----------
         1      35052      39297
– Check repeatedly whether remastering has been initiated –
– Note that after some time
    . the current master changes from node1 (CURRENT_MASTER = 0) to node3 (CURRENT_MASTER = 2)
    . the previous master changes from node3 (PREVIOUS_MASTER = 2) to node1 (PREVIOUS_MASTER = 0)
    . the remaster count increases from 3 to 4
SYS@NODE2> select o.object_name, m.CURRENT_MASTER,
                  m.PREVIOUS_MASTER, m.REMASTER_CNT
             from dba_objects o, v$gcspfmaster_info m
            where o.data_object_id = 74626
              and m.data_object_id = 74626;
16:09:16 SYS@NODE2> /
OBJECT_NAME     CURRENT_MASTER PREVIOUS_MASTER REMASTER_CNT
--------------- -------------- --------------- ------------
TEST                         0               2            3
            .
            .
            .
            .
16:12:24 SYS@NODE2> /
OBJECT_NAME     CURRENT_MASTER PREVIOUS_MASTER REMASTER_CNT
--------------- -------------- --------------- ------------
TEST                         2               0            4
– REMASTERING DUE TO INSTANCE CRASH –
At present, node3 is the master of SCOTT.TEST.
Let us crash node3 (here by rebooting the host) and monitor the remastering process.
root@node3# init 6
– Check repeatedly whether remastering has been initiated –
– Note that scott.test has been remastered to node2 (CURRENT_MASTER = 1)
– PREVIOUS_MASTER = 2 (node3) and REMASTER_CNT has increased from 4 to 5
SYS@NODE2> select o.object_name, m.CURRENT_MASTER,
                  m.PREVIOUS_MASTER, m.REMASTER_CNT
             from dba_objects o, v$gcspfmaster_info m
            where o.data_object_id = 74626
              and m.data_object_id = 74626;
OBJECT_NAME     CURRENT_MASTER PREVIOUS_MASTER REMASTER_CNT
--------------- -------------- --------------- ------------
TEST                         1               2            5
– CLEANUP –
SYS@NODE1> drop table scott.test purge;
SYS@NODE1> alter system reset "_gc_policy_minimum" scope=spfile sid='*';
           alter system reset "_gc_policy_time" scope=spfile sid='*';

[oracle@host01 ~]$ srvctl stop database -d racdb
                   srvctl start database -d racdb
                   srvctl status database -d racdb
7 thoughts on “11g R2RAC Dynamic remastering”

  1. Hi,
    Thanks for sharing this information. I was wondering if you came across a method to estimate how long a reconfiguration would take post shutdown of a RAC instance?

    Thanks,

    Chris.

    1. Thanks Chris for your time.

      I am sorry that I have not come across a method to estimate how long a reconfiguration would take post shutdown of a RAC instance.

      Regards
      Anju Garg

  2. Thanks Anju. Do you have a query which calculates how many blocks per instance would need remastering in the event of an instance crash?

    Thanks,

    Chris.

  3. Thanks Anju, very nice article and good scripts. Was able to create simulation in one of my in-house environment with some additional tweaks. Mind if I use the scripts in my blog :-)?

  4. Hi Ma'am,
    One query regarding the GRD. Let's say there are 4 nodes and each node holds 1/4th of the GRD data.

    Suppose block A and block B are mastered by node 1. If node 1 crashes, how will the other nodes know about the resources held by node 1?

Your comments and suggestions are welcome!