11g R2 RAC : DELETE A NODE

11g R2 RAC has greatly simplified the process of deleting a node from the cluster as compared to 10g RAC or 11g R1 RAC. To know the procedure to delete a node in 10g RAC or 11g R1 RAC, please click here.
Now, after the introduction of SCAN and GPNP, we need to follow very simple steps.
Current scenario:
I have three nodes in the cluster presently.
Host names :
- host01.example.com
- host02.example.com
- host03.example.com
Node to be deleted : 
- host03.example.com
Procedure:
– Log in as root on the node to be deleted (host03) and navigate to <gridhome>/crs/install
 
[root@host03]#cd /u01/app/11.2.0/grid/crs/install
 
– Disable the Oracle Clusterware applications and daemons on the node (host03)
 
[root@host03]#./rootcrs.pl -deconfig  -force
 
– From a node that will remain (host01), delete the node
 
[root@host01]#crsctl delete node -n host03
 
– On a node that will remain, log in as the grid user, navigate to <gridhome>/oui/bin and update the node list
   
[grid@host01]$cd /u01/app/11.2.0/grid/oui/bin
 
[grid@host01 bin]$./runInstaller -updateNodeList ORACLE_HOME=/u01/app/11.2.0/grid "CLUSTER_NODES={host01,host02}"
 
 
– Check that only host01 and host02 remain in the cluster
 
$crsctl stat res -t
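– Optionally, verify the node deletion with cluvfy from a remaining node (the standard post-nodedel check; this assumes cluvfy is in the grid user's PATH):

[grid@host01]$ cluvfy stage -post nodedel -n host03 -verbose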
 
11g R2 RAC: ADD A NODE

11g R2 RAC has greatly simplified the process of adding a node to the cluster as compared to 10g RAC or 11g R1 RAC. To know the procedure to add a node in 10g RAC or 11g R1 RAC, please click here.
Now, after the introduction of SCAN and GPNP, we need to follow very simple steps.
Current scenario:
I have two nodes in the cluster presently.
Host names :
- host01.example.com
- host02.example.com
Node to be added : 
- host03.example.com
Procedure:
————————————
 Prepare the machine for third node
————————————
 — Set kernel parameters
 — Install required RPMs
 — Create users/groups
 — Configure oracleasm

[root@host03]# oracleasm configure -i
[root@host03]# oracleasm exit
[root@host03]# oracleasm init
[root@host03]# oracleasm scandisks
[root@host03]# oracleasm listdisks
– All ASM disks will be listed
– Configure ssh connectivity for the grid user among all 3 nodes
– On node3 as the grid user:

[grid@host03 ~]$ ssh-keygen -t rsa
[grid@host03 ~]$ ssh-keygen -t dsa
[grid@host03 ~]$ cd /home/grid/.ssh
[grid@host03 .ssh]$ cat *.pub > host03
[grid@host03 .ssh]$ scp host03 host01:/home/grid/.ssh/
[grid@host03 .ssh]$ ssh host01
– Enter password
[grid@host01 ~]$ cd /home/grid/.ssh
[grid@host01 .ssh]$ cat host03 >> authorized_keys
[grid@host01 .ssh]$ scp authorized_keys host02:/home/grid/.ssh/
[grid@host01 .ssh]$ scp authorized_keys host03:/home/grid/.ssh/
– Test ssh connectivity on all 3 nodes as the grid user
– Run the following on all 3 nodes twice as the grid user (the first run accepts the host keys; the second should complete without any prompts):
echo ssh host01 hostname >> a.sh
echo ssh host02 hostname >> a.sh
echo ssh host03 hostname >> a.sh
echo ssh host01-priv hostname >> a.sh
echo ssh host02-priv hostname >> a.sh
echo ssh host03-priv hostname  >> a.sh
 
chmod +x a.sh
 
./a.sh
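– On the second run, every command should print the remote hostname immediately, with no password or host-key prompt; the output should look similar to this (hostnames as per the setup above):

host01.example.com
host02.example.com
host03.example.com
host01.example.com
host02.example.com
host03.example.com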
– Run cluster verify to check that host03 can be added as a node –
[grid@host01]$ cluvfy stage -pre crsinst -n host03 -verbose
– In case of a time synchronization problem, restart the ntpd service on each node
– The error that grid is not a member of the dba group can be ignored
[grid@host01 ~]$ . oraenv
ORACLE_SID = [grid] ? +ASM1
[grid@host01 ~]$ cd /u01/app/11.2.0/grid/oui/bin
– Add node 
[grid@host01 bin]$./addNode.sh -silent "CLUSTER_NEW_NODES={host03}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={host03-vip}"
– Execute orainstRoot.sh and root.sh on node3 as root –
[root@host03]#/u01/app/oraInventory/orainstRoot.sh
[root@host03]#/u01/app/11.2.0/grid/root.sh
– Check from host01 that the node has been added –
[grid@host01]$ crsctl stat res -t
– Start any resources if they are not up already –
[grid@host01]$ crsctl start resource <resource name>
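– Optionally, verify the node addition with cluvfy from an existing node (the standard post-nodeadd check; again assuming cluvfy is in the grid user's PATH):

[grid@host01]$ cluvfy stage -post nodeadd -n host03 -verbose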

After you have added a node, in case you want to clone the database home on that node, please click here.


11g R2 RAC: CLONE DATABASE HOME

In this article, I am going to discuss the procedure to clone a database home in RAC.
CURRENT SCENARIO:
- Two-node RAC setup.
- Both the current nodes have a non-shared database home.
- The current two-node setup is not able to handle the increased workload.
- We want to add a third node to the setup.
Need to:
- Install Oracle Clusterware on the third node
- Install the database software on the third node (since the database home is non-shared)
Here is the detailed procedure which needs to be implemented:
————————————–
Preparing to Clone Oracle RAC
————————————–
On one of the existing nodes (say, node1):
Step 1   Create a backup of the source home
Create a copy of the Oracle RAC home. You will use this file to copy the Oracle RAC home to each node in the cluster.
When creating the backup (tar) file, the best practice is to include the release number in the name of the file. For example:
– Go to ORACLE_HOME –
[root@node1 root]#  cd /u01/app/oracle/product/11.2.0/dbhome_1
– Create the backup (tar) file in /home/oracle with the name db1120.tgz
[root@node1 dbhome_1]#  tar -zcvf /home/oracle/db1120.tgz .
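– Before copying the archive to the new node, it is worth a quick sanity check; for example, listing the first few entries of the tar file:

[root@node1 dbhome_1]#  tar -ztvf /home/oracle/db1120.tgz | head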
Step 2  Install and start Oracle Clusterware (Use addnode.sh to add new node in existing cluster)
Before you can use cloning to create a new Oracle RAC home, you must first install and start Oracle Clusterware on the node on which you want to copy a cloned Oracle RAC home. In other words, you configure an Oracle RAC home that you cloned from a source cluster onto the nodes in a target cluster in the same order that you installed the Oracle Clusterware and Oracle RAC software components on the original nodes.
————————————————
Deploying Oracle RAC Database Homes
————————————————
Deploying the Oracle RAC database home to a cluster is a multiple-step process.
This section provides step-by-step instructions that describe how to:
  1.  Prepare the new cluster node
  2.  Deploy the Oracle RAC database software
  3.  Run the clone.pl script on each node
  4.  Run the $ORACLE_HOME/root.sh script on each node
Step 1   Prepare the new cluster nodes
Perform the Oracle RAC preinstallation steps, including such things as:
- Specify the kernel parameters.
- Use short, nondomain-qualified names for all names in the Hosts file.
- Verify that you can ping the public and interconnect names.
- Ensure Oracle Clusterware is active.
- Ensure that Oracle ASM is active and at least one Oracle ASM disk group exists and is mounted.
See your platform-specific Oracle RAC installation guide for a complete preinstallation checklist.
Step 2   Deploy the Oracle RAC database software on third node
To deploy the Oracle RAC software, you need to:
1.  Restore the Oracle home to the third node.
    [root@node1 root]# scp /home/oracle/db1120.tgz node3:/home/oracle/db1120.tgz
2.  Create the directory for the Oracle home
     [root@node3 root]# mkdir -p /u01/app/oracle/product/11.2.0/dbhome_1
3.  Change to the Oracle home directory
     [root@node3 root]# cd /u01/app/oracle/product/11.2.0/dbhome_1
4.  Extract the contents of the tar file
     [root@node3 dbhome_1]# tar -zxvf /home/oracle/db1120.tgz
The home location can be in the same directory path as the source home that you used to create the tar, or in a different one.
5.  Change the ownership of all files to the oracle and oinstall group. For example:
      [root@node3 dbhome_1]# chown -R oracle:oinstall /u01/app/oracle/product/11.2.0/dbhome_1
Step 3 Create a script file start.sh as the oracle user on node3, in the folder $ORACLE_HOME/clone/bin, with the following environment variables.
[oracle@node3 dbhome_1]$cd $ORACLE_HOME/clone/bin
[oracle@node3 bin]$vi start.sh
ORACLE_BASE=/u01/app/oracle
ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
cd $ORACLE_HOME/clone
THISNODE=`hostname -s`
E01=ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
E02=ORACLE_HOME_NAME=OraDBRAC
E03=ORACLE_BASE=/u01/app/oracle
C01="-O'CLUSTER_NODES={node1, node2, node3}'"
C02="-O'LOCAL_NODE=$THISNODE'"
perl $ORACLE_HOME/clone/bin/clone.pl $E01 $E02 $E03 $C01 $C02
THISNODE picks up the short hostname of the node being added (node3 here), since the script is run on that node.
Step 4 Run the start.sh script as the oracle user (owner of the database home) on node3. The script clones the database home onto the node named in the THISNODE variable and updates the inventories on all cluster nodes.
[oracle@node3 bin]$chmod u+x start.sh
[oracle@node3 bin]$./start.sh
Step 5   Run the $ORACLE_HOME/root.sh script on each node
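For example, on node3 (assuming the same database home path as above):

[root@node3]#/u01/app/oracle/product/11.2.0/dbhome_1/root.sh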

- Check the contents of the inventory.xml file; we will see that it now has information about the 11g Oracle home

[oracle@node3 dbhome_1]$cd /u01/app/oracle/oraInventory/ContentsXML
[oracle@node3 ContentsXML]$vi inventory.xml
 
- Check the contents of the inventory.xml file on the other nodes. If an entry for the newly added node does not exist for the database home, add it manually
 
[oracle@node1 dbhome_1]$vi /u01/app/oracle/oraInventory/ContentsXML/inventory.xml
[oracle@node2 dbhome_1]$vi /u01/app/oracle/oraInventory/ContentsXML/inventory.xml
 

11g R2 RAC : NODE EVICTION DUE TO MEMBER KILL ESCALATION

If Oracle Clusterware itself is working perfectly but one of the RAC instances is hanging, the database LMON process will request a member kill escalation and ask the CSS process to remove the hanging database instance from the cluster.
The following example will demonstrate it in a cluster consisting of two nodes:
SQL> col host_name for a20
SQL> select instance_name, host_name from gv$instance;
INSTANCE_NAME    HOST_NAME
—————- ——————–
orcl1            host01.example.com
orcl2            host02.example.com
– On the host02 server we will stop the execution of all RDBMS processes (by sending the STOP signal)
– Find out the current database processes
[root@host02 ~]#  ps -ef | grep ora_ | grep orcl2
oracle    6215     1  0 11:20 ?        00:00:00 ora_pmon_orcl2
oracle    6217     1  0 11:20 ?        00:00:00 ora_vktm_orcl2
oracle    6221     1  0 11:20 ?        00:00:00 ora_gen0_orcl2
oracle    6223     1  0 11:20 ?        00:00:00 ora_diag_orcl2
oracle    6225     1  0 11:20 ?        00:00:00 ora_dbrm_orcl2
oracle    6227     1  0 11:20 ?        00:00:00 ora_ping_orcl2
oracle    6229     1  0 11:20 ?        00:00:00 ora_psp0_orcl2
oracle    6231     1  0 11:20 ?        00:00:00 ora_acms_orcl2
oracle    6233     1  0 11:20 ?        00:00:00 ora_dia0_orcl2
oracle    6235     1  0 11:20 ?        00:00:00 ora_lmon_orcl2
oracle    6237     1  0 11:20 ?        00:00:02 ora_lmd0_orcl2
……
– stop the execution of all rdbms processes (by sending the STOP signal)
[root@host02 ~]#  ps -ef | grep ora_ | grep orcl2 | awk '{print $2}' | while read PID
                               do
                               kill -STOP $PID
                               done
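– (To abort the demo and resume the stopped instance manually instead of waiting for the eviction, the same pipeline with kill -CONT would do; a sketch, assuming the processes are still in the stopped state:)

[root@host02 ~]#  ps -ef | grep ora_ | grep orcl2 | awk '{print $2}' | xargs kill -CONT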
– From the client point of view, the Real Application Clusters database is now hanging on both nodes. No queries or DML are possible. Try to execute a query; it will hang.
SQL> select instance_name, host_name from gv$instance;
– no output, the query hangs …
– Due to missing heartbeats, the healthy RAC instance on node host01 will remove the hanging RAC instance by requesting a member kill escalation.
– Check the database alert log file on host01: the LMS process issues a request to CSSD to reboot the node.
The node is evicted and the instance is restarted after the node rejoins the cluster.
[root@host01 trace]# tailf /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/alert_orcl1.log
LMS0 (ospid: 31771) has detected no messaging activity from instance 2
LMS0 (ospid: 31771) issues an IMR to resolve the situation
Please check LMS0 trace file for more detail.
Fri Nov 09 11:15:04 2012
Remote instance kill is issued with system inc 30
Remote instance kill map (size 1) : 2
LMON received an instance eviction notification from instance 1
The instance eviction reason is 0x20000000
The instance eviction map is 2
Fri Nov 09 11:15:13 2012
IPC Send timeout detected. Sender: ospid 6308 [oracle@host01.example.com (PZ97)]
Receiver: inst 2 binc 429420846 ospid 6251
Waiting for instances to leave:
2
Reconfiguration started (old inc 4, new inc 8)
List of instances:
 1 (myinst: 1)
 …..  Recovery of instance 2 starts
Global Resource Directory frozen
….
All grantable enqueues granted
 Post SMON to start 1st pass IR
—-
Instance recovery: looking for dead threads
Beginning instance recovery of 1 threads
Started redo scan
IPC Send timeout to 2.0 inc 4 for msg type 12 from opid 42
Completed redo scan
 read 93 KB redo, 55 data blocks need recovery
Started redo application at
 Thread 2: logseq 9, block 42
Recovery of Online Redo Log: Thread 2 Group 3 Seq 9 Reading mem 0
  Mem# 0: +DATA/orcl/onlinelog/group_3.266.798828557
  Mem# 1: +FRA/orcl/onlinelog/group_3.259.798828561
Completed redo application of 0.05MB
Completed instance recovery at
 Thread 2: logseq 9, block 228, scn 1069404
 52 data blocks read, 90 data blocks written, 93 redo k-bytes read
Thread 2 advanced to log sequence 10 (thread recovery)
Fri Nov 09 12:18:55 2012
….
– Check the clusterware alert log of host01 –
– The node is evicted and rebooted to join the cluster
[grid@host01 host01]$ tailf /u01/app/11.2.0/grid/log/host01/alerthost01.log
[cssd(14493)]CRS-1607:Node host02 is being evicted in cluster incarnation 247848838; details at (:CSSNM00007:) in /u01/app/11.2.0/grid/log/host01/cssd/ocssd.log.
2012-11-09 11:15:56.140
[ohasd(12412)]CRS-8011:reboot advisory message from host: host02, component: mo103324, with time stamp: L-2012-11-09-11:15:56.580
[ohasd(12412)]CRS-8013:reboot advisory message text: clsnomon_status: need to reboot, unexpected failure 8 received from CSS
2012-11-09 11:16:17.365
[cssd(14493)]CRS-1601:CSSD Reconfiguration complete. Active nodes are host01 .
2012-11-09 11:16:17.400
[crsd(14820)]CRS-5504:Node down event reported for node 'host02'.
…… Node 2 joins the cluster
[cssd(14493)]CRS-1601:CSSD Reconfiguration complete. Active nodes are host01 host02 .
2012-11-09 12:18:52.713
[crsd(14820)]CRS-2772:Server 'host02' has been assigned to pool 'Generic'.
2012-11-09 12:18:52.713
[crsd(14820)]CRS-2772:Server 'host02' has been assigned to pool 'ora.orcl'.
– After the node rejoins the cluster and the instance is restarted, re-execute the query; it succeeds
SQL> conn sys/oracle@orcl as sysdba
SQL> col host_name for a20
SQL> select instance_name, host_name from gv$instance;
INSTANCE_NAME    HOST_NAME
—————- ——————–
orcl1            host01.example.com
orcl2            host02.example.com
——————————————————————————————————-

11g R2 RAC: NODE EVICTION DUE TO CSSDAGENT STOPPING

In addition to the ocssd.bin process, which is responsible, among other things, for the network and disk heartbeats, Oracle Clusterware 11g Release 2 uses two new monitoring processes, cssdagent and cssdmonitor, which run with the highest real-time scheduler priority and are also able to fence a server.
– Find out PID for cssdagent
[root@host02 lastgasp]# ps -ef |grep cssd |grep -v grep
root      5085     1  0 09:45 ?        00:00:00 /u01/app/11.2.0/grid/bin/cssdmonitor
root      5106     1  0 09:45 ?        00:00:00 /u01/app/11.2.0/grid/bin/cssdagent
grid      5136     1  0 09:45 ?        00:00:02 /u01/app/11.2.0/grid/bin/ocssd.bin
– Find out the scheduling priority of cssdagent
[root@host02 lastgasp]# chrt -p 5106
pid 5106's current scheduling policy: SCHED_RR
pid 5106's current scheduling priority: 99
Since cssdagent and cssdmonitor have a scheduling priority of 99, stopping them can reset a server in case:
- there is some problem with the ocssd.bin process
- there is some problem with the OS scheduler
- of CPU starvation
- the OS is locked up in a driver or hardware (e.g. an I/O call)
Both of them are also associated with an undocumented timeout: if the execution of these processes stops for more than 28 seconds, the node will be evicted.
– Let us stop the execution of cssdagent for 40 seconds
[root@host02 ~]# kill -STOP 5106; sleep 40; kill -CONT 5106
– Check the alert log of host01 –
– Node2 is rebooted
[grid@host01 host01]$ tailf /u01/app/11.2.0/grid/log/host01/alerthost01.log
[ohasd(12412)]CRS-8011:reboot advisory message from host: host02, component: ag100946, with time stamp: L-2012-11-09-10:21:28.040
[ohasd(12412)]CRS-8013:reboot advisory message text: Rebooting after limit 28100 exceeded; disk timeout 28100, network timeout 27880, last heartbeat from CSSD at epoch seconds 352436647.013, 34280 milliseconds ago based on invariant clock value of 294678040
– Node 2 is rebooted and the network connection with it breaks
2012-11-09 10:21:45.671
[cssd(14493)]CRS-1612:Network communication with node host02 (2) missing for 50% of timeout interval.  Removal of this node from cluster in 14.330 seconds
2012-11-09 10:21:53.923
[cssd(14493)]CRS-1611:Network communication with node host02 (2) missing for 75% of timeout interval.  Removal of this node from cluster in 7.310 seconds
2012-11-09 10:21:59.845
[cssd(14493)]CRS-1610:Network communication with node host02 (2) missing for 90% of timeout interval.  Removal of this node from cluster in 2.300 seconds
2012-11-09 10:22:02.587
[cssd(14493)]CRS-1632:Node host02 is being removed from the cluster in cluster incarnation 247848834
2012-11-09 10:22:02.717
[cssd(14493)]CRS-1601:CSSD Reconfiguration complete. Active nodes are host01 .
2012-11-09 10:22:02.748
[crsd(14820)]CRS-5504:Node down event reported for node 'host02'.
2012-11-09 10:22:10.086
[crsd(14820)]CRS-2773:Server 'host02' has been removed from pool 'Generic'.
2012-11-09 10:22:10.086
[crsd(14820)]CRS-2773:Server 'host02' has been removed from pool 'ora.orcl'.

Related links:

Home

11G R2 RAC Index

Node Eviction Due To Missing Network Heartbeat
Node Eviction Due To Missing Disk Heartbeat
Node Eviction Due To Member Kill Escalation

11g R2 RAC: Reboot-less Fencing With Missing Network Heartbeat

 

                                                          ——————- 

 

11g DATAGUARD SETUP USING ACTIVE DATAGUARD

In this blog we'll see how to configure 11g Data Guard using Active Data Guard, i.e. the standby database will be created from the running primary database. We do not need to take a backup of the database or the controlfile.

Primary database dg01 on node1.oracle.com (192.9.201.192)

Standby database dg02 on node2.oracle.com (192.9.201.193)

Listeners :

primary : on port 1521 in grid home

on port 1522 in database home

standby : on port 1521 in grid home

on port 1522 in database home

TNS entries

primary : dg01  on node1.oracle.com on ports 1521, 1522

registered with both the listeners

dg02 on node2.oracle.com on port 1522

standby : dg02  on node2.oracle.com on ports 1521, 1522

registered with both the listeners

dg01 on node1.oracle.com on port 1522

Configuration on primary site:-

——————————————-

- As root assign

.  IP address 192.9.201.192

. hostname node1.oracle.com

#service network restart

-  Make entries in /etc/hosts

192.9.201.192 node1.oracle.com node1

192.9.201.193 node2.oracle.com node2

– Reboot machine (mandatory as hostname has been changed)

– Create the dg01 database in archivelog mode, with sample schemas, and with FRA as the default recovery area

– Invoke netmgr from the db home

- make service dg01 available on port 1522
- create a listener in the database home on port 1522
- statically register dg01 with the listener on port 1522
- create a service for dg02 running on node2.oracle.com on port 1522
- delete the orcl database

Configuration on standby site:-

——————————————-

- As root assign

. IP address 192.9.201.193

. hostname node2.oracle.com

#service network restart

Reboot machine (mandatory as hostname has been changed)

- Make entries in /etc/hosts

192.9.201.192 node1.oracle.com node1

192.9.201.193 node2.oracle.com node2

– Test

#ping node1
#ping node2

– Invoke netmgr from db home

- create a service for dg01 running on node1.oracle.com on port 1522
- create a service for dg02 running on node2.oracle.com on port 1522
- create a listener in the database home on port 1522
- statically register dg02 with it

—————————–
On primary node01

—————————–

— Configure the database dg01 in archive log mode and enable the force logging.

SQL>SHUTDOWN IMMEDIATE;
    STARTUP MOUNT;
    ALTER DATABASE ARCHIVELOG;
    ALTER SYSTEM SET DB_RECOVERY_FILE_DEST_SIZE=6G;
    ALTER DATABASE FORCE LOGGING;
    select force_logging from v$database;
    ALTER DATABASE OPEN;
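– A quick confirmation of the archivelog mode and force logging before proceeding (standard SQL*Plus checks):

SQL> archive log list
SQL> select log_mode, force_logging from v$database;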

– Configure standby redo log files to enable fast failover and real-time apply. (Assume we have three redo log groups.)

– Find out no. and size of current redo log groups

SQL>select group#, bytes/1024/1024 MB   from v$log;

– Add standby log files –

Note: The number of standby redo log groups must be at least one more than the number of online redo log groups, and the size of each standby logfile must be the same as the size of the online redo log files.

SQL>  ALTER DATABASE ADD STANDBY LOGFILE GROUP 4 '/u01/app/oracle/oradata/dg01/sredo04.log' size 50m;
      ALTER DATABASE ADD STANDBY LOGFILE GROUP 5 '/u01/app/oracle/oradata/dg01/sredo05.log' size 50m;
      ALTER DATABASE ADD STANDBY LOGFILE GROUP 6 '/u01/app/oracle/oradata/dg01/sredo06.log' size 50m;
      ALTER DATABASE ADD STANDBY LOGFILE GROUP 7 '/u01/app/oracle/oradata/dg01/sredo07.log' size 50m;

– check that standby logfiles have been created

SQL> col member for a30
SQL> select group#, TYPE, MEMBER from v$logfile where type = 'STANDBY';

    GROUP# TYPE    MEMBER
---------- ------- ------------------------------
         4 STANDBY /u01/app/oracle/oradata/dg01/sredo04.log
         5 STANDBY /u01/app/oracle/oradata/dg01/sredo05.log
         6 STANDBY /u01/app/oracle/oradata/dg01/sredo06.log
         7 STANDBY /u01/app/oracle/oradata/dg01/sredo07.log

SQL> select group#, bytes/1024/1024 MB from v$standby_log;

    GROUP#         MB
---------- ----------
         4         50
         5         50
         6         50
         7         50

– Make a staging folder

oracle@node1$mkdir /home/oracle/stage

— CREATE PFILE FROM SPFILE.

PRI>create pfile='/home/oracle/stage/initdg01.ora' from spfile;

– Now configure the following parameters for the primary database by adding these lines to initdg01.ora:

oracle@node1$vi /home/oracle/stage/initdg01.ora

LOG_ARCHIVE_CONFIG='DG_CONFIG=(dg01,dg02)'
DB_UNIQUE_NAME='dg01'
SERVICE_NAMES='dg01'
DB_FILE_NAME_CONVERT='/u01/app/oracle/oradata/dg02','/u01/app/oracle/oradata/dg01'
LOG_FILE_NAME_CONVERT='/u01/app/oracle/oradata/dg02','/u01/app/oracle/oradata/dg01'
LOG_ARCHIVE_DEST_1='location=USE_DB_RECOVERY_FILE_DEST'
LOG_ARCHIVE_DEST_STATE_1=enable
LOG_ARCHIVE_DEST_2='service=dg02 VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=dg02'
LOG_ARCHIVE_DEST_STATE_2=enable
STANDBY_FILE_MANAGEMENT=auto
FAL_CLIENT='dg01'
FAL_SERVER='dg02'
LOG_ARCHIVE_FORMAT='%t_%s_%r.arc'

Note:

In case the source and destination databases are on ASM, the parameters db_file_name_convert and log_file_name_convert may be defined as follows:

db_file_name_convert = ("+DATA/dg01","+DATA/dg02")

log_file_name_convert = ("+DATA/dg01","+DATA/dg02","+FRA/dg01","+FRA/dg02")

– create spfile from pfile and start dg01 using new parameters

PRI>shu immediate;
    create spfile from pfile='/home/oracle/stage/initdg01.ora';
    startup;
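– Optionally confirm that the new Data Guard parameters took effect:

SQL> show parameter log_archive_config
SQL> show parameter log_archive_dest_2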

—– Take the entire database backup to the preconfigured staging directory (optional with active dataguard)

RMAN> CONFIGURE CONTROLFILE AUTOBACKUP ON;
      BACKUP FORMAT '/home/oracle/stage/%U' DATABASE PLUS ARCHIVELOG;
      BACKUP FORMAT '/home/oracle/stage/%U' CURRENT CONTROLFILE FOR STANDBY;
      sql "alter system archive log current";

– create staging folder on node2

oracle@node2$mkdir /home/oracle/stage

—- Copy the following items from node1 to node2.

PASSWORD FILE

[oracle@node1 ~]$scp /u01/app/oracle/product/11.2.0/db_1/dbs/orapwdg01 node2:/u01/app/oracle/product/11.2.0/db_1/dbs/orapwdg02

PARAMETER FILE and rman backups(optional with active dataguard)

[oracle@node1 ~]$ scp /home/oracle/stage/* node2:/home/oracle/stage/

Note: The RMAN backups must be copied to the same directory on machine2, i.e. /home/oracle/stage

– Configure the net service for dg01 (primary database) on node1 at port 1522, as well as for dg02 (standby database) on node2 at port 1522

– Set the environment to dg01 and invoke netmgr

DG02 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = node2.oracle.com)(PORT = 1522))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = dg02)
    )
  )

DG01 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = node1.oracle.com)(PORT = 1522))
    )
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = dg01)
    )
  )

– Create a listener in database home on port 1522 and statically register dg01 with it.
—————————-

On standby node02

—————————–

- Edit the pfile for the standby database:
oracle@node2$mv /home/oracle/stage/initdg01.ora /home/oracle/stage/initdg02.ora
oracle@node2$vi /home/oracle/stage/initdg02.ora

— Final pfile should look like this

dg02.__db_cache_size=440401920
dg02.__java_pool_size=4194304
dg02.__large_pool_size=4194304
dg02.__shared_pool_size=163577856
dg02.__streams_pool_size=0
*.audit_file_dest='/u01/app/oracle/admin/dg02/adump'
*.control_files='/u01/app/oracle/oradata/dg02/control01.ctl'
*.core_dump_dest='/u01/app/oracle/admin/dg02/cdump'
*.db_block_size=8192
*.db_domain=''
*.db_file_multiblock_read_count=16
*.db_name='dg01'
*.db_recovery_file_dest='/u01/app/oracle/flash_recovery_area'
*.db_recovery_file_dest_size=6G
*.dispatchers='(PROTOCOL=TCP) (SERVICE=dg02XDB)'
*.job_queue_processes=10
*.log_archive_format='%t_%s_%r.dbf'
*.open_cursors=300
*.pga_aggregate_target=204472320
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.sga_target=613416960
*.undo_management='AUTO'
*.undo_tablespace='UNDOTBS1'
*.user_dump_dest='/u01/app/oracle/admin/dg02/udump'
LOG_ARCHIVE_CONFIG='DG_CONFIG=(dg01,dg02)'
DB_UNIQUE_NAME='dg02'
SERVICE_NAMES='dg02'
DB_FILE_NAME_CONVERT='/u01/app/oracle/oradata/dg01','/u01/app/oracle/oradata/dg02'
LOG_FILE_NAME_CONVERT='/u01/app/oracle/oradata/dg01','/u01/app/oracle/oradata/dg02'
LOG_ARCHIVE_DEST_1='location=USE_DB_RECOVERY_FILE_DEST'
LOG_ARCHIVE_DEST_STATE_1=enable
LOG_ARCHIVE_DEST_2='service=dg01 VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=dg01'
LOG_ARCHIVE_DEST_STATE_2=enable
STANDBY_FILE_MANAGEMENT=auto
FAL_CLIENT='dg02'
FAL_SERVER='dg01'
LOG_ARCHIVE_FORMAT='%t_%s_%r.arc'

(Note: on the standby, the convert parameters map the primary paths to the standby paths, i.e. the reverse of the primary's settings.)
@node2: Create the required directory for standby database

[oracle@node2 ~]$ mkdir -p  /u01/app/oracle/admin/dg02/adump
                  mkdir -p  /u01/app/oracle/oradata/dg02/
                  mkdir -p  /u01/app/oracle/flash_recovery_area/dg02

@node2: Configure the net service for dg01 (primary database) on node1 at port 1522, as well as for dg02 (standby database) on node2 at port 1522

– Set ORACLE_HOME to the database home and start netmgr
DG02 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = node2.oracle.com)(PORT = 1522))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = dg02)
    )
  )

DG01 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = node1.oracle.com)(PORT = 1522))
    )
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = dg01)
    )
  )

@node2 – Create a listener on port 1522 in the database home and statically register the dg02 service with it

@node2 : Start dg02 at nomount stage

 

[oracle@node2 ~]$ export ORACLE_SID=dg02
[oracle@node2 ~]$ sqlplus / as sysdba
SBY> CREATE SPFILE FROM PFILE='/home/oracle/stage/initdg02.ora';
SBY> STARTUP NOMOUNT;
SBY> EXIT

@node2: Now create the standby database using the RMAN DUPLICATE command

[oracle@node2 ~]$ export ORACLE_SID=dg02
[oracle@node2 ~]$ rman target sys/oracle@dg01
RMAN> connect auxiliary sys/oracle@dg02
RMAN> duplicate target database for standby from active database;
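– Once the duplicate completes, a quick check on the standby should report its new role (the instance is left mounted at the end of the duplicate):

SBY> select database_role, open_mode from v$database;
– expected: PHYSICAL STANDBY / MOUNTED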

– Switch logs on primary

PRI>alter system switch logfile;

– start redo apply on standby

SBY>alter database recover managed standby database using current logfile disconnect;
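– You can watch the recovery (MRP) process on the standby with a standard query against v$managed_standby:

SBY> select process, status, sequence# from v$managed_standby;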

– Verify that standby database is performing correctly

SBY>select sequence#, first_time, next_time from v$archived_log order by sequence#;

– Perform a few log switches on primary

PRI>alter system switch logfile;
alter system switch logfile;
alter system switch logfile;
alter system switch logfile;

– On the standby database, verify that the new redo data was received, archived and applied

SBY>select sequence#, first_time, next_time,applied from v$archived_log order by sequence#;

——————————

Related links:

Home
11g Dataguard Index

11g Dataguard: Automatic Block Media Recovery (BMR)

11g Dataguard: Automatic Client Failover

11g Dataguard: Automatic Job Failover

11g Dataguard: Flashback  Standby After  Resetlogs On Primary

ORA-12528: TNS:listener: all appropriate instances are blocking new connections

Recover Standby Datafile From Primary

Warning: ORA-16792: configurable property value is inconsistent with database setting

                                                   ————

HOW DOES 11G R2 CLUSTERWARE START ASM WHEN ASM SPFILE IS STORED ON ASM ITSELF?

Beginning with the version 11g Release 2, the ASM spfile is stored automatically in the first disk group created during Grid Infrastructure installation.
Since the voting disk and OCR are stored on ASM, ASM needs to be started on the node. To start up ASM, its spfile is needed, but the spfile itself is located on an ASM diskgroup. How does the clusterware resolve this chicken-and-egg situation?
- When a node of an Oracle Clusterware cluster restarts, OHASD is started by platform-specific means. OHASD accesses the OLR (Oracle Local Registry) stored on the local file system to get the data needed to complete OHASD initialization.
- OHASD brings up GPNPD and CSSD. CSSD accesses the GPnP profile stored on the local file system, which contains the following vital bootstrap data:
a. the ASM_DISKSTRING parameter (if specified), used to locate the disks on which ASM disks are configured
b. the ASM spfile location: the name of the diskgroup containing the ASM spfile
c. the location of the voting files: ASM

– CSSD scans the headers of all ASM disks (as indicated by ASM_DISKSTRING in the GPnP profile) to identify the disk containing the voting file. Using the pointers in the ASM disk headers, the voting file locations on the ASM disks are accessed by CSSD, and CSSD is able to complete initialization and start or join an existing cluster.

To read the ASM spfile during the ASM instance startup, it is not necessary to open the disk group. All information necessary to access the data is stored in the device’s header. OHASD reads the header of the ASM disk containing ASM SPfile (as read from GPnP profile) and using the pointers in disk header, contents of ASM spfile are read. Thereafter, ASM instance is started.

–  With an ASM instance operating and its Diskgroups mounted, access to Clusterware’s OCR is available to CRSD.

–  OHASD starts CRSD with access to the OCR in an ASM Diskgroup.

–  Clusterware completes initialization and brings up other services under its control.

Demonstration :

In my environment, the ASM disk group DATA created with EXTERNAL  redundancy is used exclusively for ASM spfile, voting and OCR files:
- Let us read the GPnP profile to find out the location of the ASM spfile
[grid@host01 peer]$ cd /u01/app/11.2.0/grid/gpnp/host01/profiles/peer
[grid@host01 peer]$ gpnptool getpval -asm_spf
Warning: some command line parameters were defaulted. Resulting command line:
         /u01/app/11.2.0/grid/bin/gpnptool.bin getpval -asm_spf -p=profile.xml -o-
+DATA/cluster01/asmparameterfile/registry.253.793721441
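– The same location can be cross-checked with asmcmd once ASM is up; spget reports the spfile location registered in the GPnP profile:

[grid@host01 peer]$ asmcmd spget
+DATA/cluster01/asmparameterfile/registry.253.793721441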
– Let us find out the disks in DATA diskgroup
[grid@host01 peer]$ asmcmd lsdsk -G DATA
Path
ORCL:ASMDISK01
ORCL:ASMDISK010
ORCL:ASMDISK02
ORCL:ASMDISK03
ORCL:ASMDISK04
ORCL:ASMDISK09
– Let us find out which ASM disk maps to which partition.
   Note down the major/minor device numbers of the disks in the DATA diskgroup:
[root@host01 ~]# ls -lr /dev/oracleasm/disks/*
brw-rw—- 1 grid asmadmin 8, 26 Nov  8 09:35 /dev/oracleasm/disks/ASMDISK09
brw-rw—- 1 grid asmadmin 8, 25 Nov  8 09:35 /dev/oracleasm/disks/ASMDISK08
brw-rw—- 1 grid asmadmin 8, 24 Nov  8 09:35 /dev/oracleasm/disks/ASMDISK07
brw-rw—- 1 grid asmadmin 8, 23 Nov  8 09:35 /dev/oracleasm/disks/ASMDISK06
brw-rw—- 1 grid asmadmin 8, 22 Nov  8 09:35 /dev/oracleasm/disks/ASMDISK05
brw-rw—- 1 grid asmadmin 8, 21 Nov  8 09:35 /dev/oracleasm/disks/ASMDISK04
brw-rw—- 1 grid asmadmin 8, 19 Nov  8 09:35 /dev/oracleasm/disks/ASMDISK03
brw-rw—- 1 grid asmadmin 8, 18 Nov  8 09:35 /dev/oracleasm/disks/ASMDISK02
brw-rw—- 1 grid asmadmin 8, 31 Nov  8 09:35 /dev/oracleasm/disks/ASMDISK014
brw-rw—- 1 grid asmadmin 8, 30 Nov  8 09:35 /dev/oracleasm/disks/ASMDISK013
brw-rw—- 1 grid asmadmin 8, 29 Nov  8 09:35 /dev/oracleasm/disks/ASMDISK012
brw-rw—- 1 grid asmadmin 8, 28 Nov  8 09:35 /dev/oracleasm/disks/ASMDISK011
brw-rw—- 1 grid asmadmin 8, 27 Nov  8 09:35 /dev/oracleasm/disks/ASMDISK010
brw-rw—- 1 grid asmadmin 8, 17 Nov  8 09:35 /dev/oracleasm/disks/ASMDISK01
- Let us find out the major/minor device numbers of various disk partitions
[root@host01 ~]# ls -lr /dev/sdb*
brw-r—– 1 root disk 8, 25 Nov  8 09:35 /dev/sdb9
brw-r—– 1 root disk 8, 24 Nov  8 09:35 /dev/sdb8
brw-r—– 1 root disk 8, 23 Nov  8 09:35 /dev/sdb7
brw-r—– 1 root disk 8, 22 Nov  8 09:35 /dev/sdb6
brw-r—– 1 root disk 8, 21 Nov  8 09:35 /dev/sdb5
brw-r—– 1 root disk 8, 20 Nov  8 09:35 /dev/sdb4
brw-r—– 1 root disk 8, 19 Nov  8 09:35 /dev/sdb3
brw-r—– 1 root disk 8, 18 Nov  8 09:35 /dev/sdb2
brw-r—– 1 root disk 8, 31 Nov  8 09:35 /dev/sdb15
brw-r—– 1 root disk 8, 30 Nov  8 09:35 /dev/sdb14
brw-r—– 1 root disk 8, 29 Nov  8 09:35 /dev/sdb13
brw-r—– 1 root disk 8, 28 Nov  8 09:35 /dev/sdb12
brw-r—– 1 root disk 8, 27 Nov  8 09:35 /dev/sdb11
brw-r—– 1 root disk 8, 26 Nov  8 09:35 /dev/sdb10
brw-r—– 1 root disk 8, 17 Nov  8 09:35 /dev/sdb1
brw-r—– 1 root disk 8, 16 Nov  8 09:35 /dev/sdb
– Now we can find out the partitions mapping to the various ASM disks by matching their major/minor device numbers:

 ASMDISK01     8,17     /dev/sdb1
 ASMDISK02     8,18     /dev/sdb2
 ASMDISK03     8,19     /dev/sdb3
 ASMDISK04     8,21     /dev/sdb5
 ASMDISK09     8,26     /dev/sdb10
 ASMDISK010    8,27     /dev/sdb11
– Let’s scan the headers of those devices:
[root@host01 ~]#  kfed read /dev/sdb1 | grep -E 'spf|ausize'
kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000
kfdhdb.spfile:                        0 ; 0x0f4: 0x00000000
kfdhdb.spfflg:                        0 ; 0x0f8: 0x00000000
[root@host01 ~]#  kfed read /dev/sdb2 | grep -E 'spf|ausize'
kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000
kfdhdb.spfile:                        0 ; 0x0f4: 0x00000000
kfdhdb.spfflg:                        0 ; 0x0f8: 0x00000000
[root@host01 ~]#  kfed read /dev/sdb3 | grep -E 'spf|ausize'
kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000
kfdhdb.spfile:                       16 ; 0x0f4: 0x00000010
kfdhdb.spfflg:                        1 ; 0x0f8: 0x00000001
[root@host01 ~]#  kfed read /dev/sdb5 | grep -E 'spf|ausize'
kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000
kfdhdb.spfile:                        0 ; 0x0f4: 0x00000000
kfdhdb.spfflg:                        0 ; 0x0f8: 0x00000000
[root@host01 ~]#  kfed read /dev/sdb10 | grep -E 'spf|ausize'
kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000
kfdhdb.spfile:                        0 ; 0x0f4: 0x00000000
kfdhdb.spfflg:                        0 ; 0x0f8: 0x00000000
[root@host01 ~]#  kfed read /dev/sdb11 | grep -E 'spf|ausize'
kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000
kfdhdb.spfile:                        0 ; 0x0f4: 0x00000000
kfdhdb.spfflg:                        0 ; 0x0f8: 0x00000000
In the output above, we see that the device /dev/sdb3 contains a copy of the ASM spfile (spfflg=1), and that the spfile starts at allocation unit 16 on the disk (spfile=16).
Considering the allocation unit size (kfdhdb.ausize = 1M), let's dump the ASM spfile from the device:
[root@host01 ~]#  dd if=/dev/sdb3 of=spfileASM_Copy2.ora skip=16  bs=1M count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.170611 seconds, 6.1 MB/s
[root@host01 ~]# strings spfileASM_Copy2.ora
+ASM1.__oracle_base=’/u01/app/grid’#ORACLE_BASE set from in memory value
+ASM2.__oracle_base=’/u01/app/grid’#ORACLE_BASE set from in memory value
+ASM3.__oracle_base=’/u01/app/grid’#ORACLE_BASE set from in memory value
+ASM3.asm_diskgroups=’FRA’#Manual Mount
+ASM2.asm_diskgroups=’FRA’#Manual Mount
+ASM1.asm_diskgroups=’FRA’#Manual Mount
*.asm_power_limit=1
*.diagnostic_dest=’/u01/app/grid’
*.instance_type=’asm’
*.large_pool_size=12M
*.remote_login_passwordfile=’EXCLUSIVE’
a:O/
The same technique is used to access the clusterware voting files, which are also stored in an ASM disk group. In this case, Clusterware does not need a running ASM instance to access the cluster voting files.
– Let's check the location of the voting disk:
[grid@host01 peer]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name        Disk group
--  -----    -----------------                ---------        ----------
 1. ONLINE   243ec3b2a3cf4fbbbfed6f20a1ef4319 (ORCL:ASMDISK01) [DATA]
Located 1 voting disk(s).
– Since the above query shows that the voting disk is stored on ASMDISK01, which maps to /dev/sdb1, we will scan the header of /dev/sdb1:
[root@host01 ~]#  kfed read /dev/sdb1 | grep vf
kfdhdb.vfstart:                      96 ; 0x0ec: 0x00000060
kfdhdb.vfend:                       128 ; 0x0f0: 0x00000080
The non-zero kfdhdb.vfstart and kfdhdb.vfend values confirm that the voting file resides on /dev/sdb1.
Related links:

Home

11G R2 RAC Index

11g R2 RAC: GPNP Profile Demystified

11g R2 RAC: Let’s Edit GPNP Profile

11g R2 RAC: Let’s Lose GPNP Profile

11g R2 RAC : OCR Demystified

11g R2 RAC : OLR  Demystified

11g R2 RAC : Repair OCR

11g R2 RAC : Voting Disk Demystified

Need For VIP In RAC

 

 

 
                                              ——————