11G DATAGUARD : AUTOMATIC CLIENT FAILOVER

In this post, I will demonstrate automatic client failover, a new feature introduced in 11g Data Guard.
Prior to 11g, one or more user-written database triggers were required to automate client failover, depending upon the configuration.

Data Guard 11g Release 2 simplifies configuration significantly by eliminating the need for user-written triggers to automate client failover. Role transitions managed by the Data Guard broker can automatically:

– fail over the database,

– start the appropriate services on the new primary database,

– disconnect clients from the failed database, and

– redirect them to the new primary database; no manual intervention is required.

OVERVIEW:
- Create a service orcls for the database and start it
- Create a net service name for the service which points to both the primary and the standby database
- Create a trigger which starts the service on the primary database
- Connect to the primary database (orcl) using the service orcls
- Issue a long-running query
- While the query is running, perform a switchover to the standby
- After the switchover, check that the query has failed over to the new primary
IMPLEMENTATION
– CHECK THE SERVICE_NAMES PARAMETER ON PRIMARY
PRI>sho parameter service_names

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
service_names                        string      orcl
– CREATE A SERVICE ORCLS 

PRI>exec dbms_service.create_service( -
       service_name     => 'orcls', -
       network_name     => 'orcls', -
       failover_method  => 'BASIC', -
       failover_type    => 'SELECT', -
       failover_retries => 180, -
       failover_delay   => 1);
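To confirm the service was created with the intended TAF attributes, they can be read back from the data dictionary; a quick check (column names per the standard DBA_SERVICES view, values per the call above):

```sql
-- verify the TAF attributes recorded for the new service
select name, failover_method, failover_type, failover_retries, failover_delay
from   dba_services
where  name = 'orcls';
```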

– CHECK THAT LISTENER DOES NOT KNOW ABOUT THE SERVICE AS IT HAS NOT BEEN STARTED YET
$ lsnrctl services |grep orcls
– START THE SERVICE
PRI>exec dbms_service.start_service('orcls');
– CHECK THAT SERVICE APPEARS IN PARAMETER SERVICE_NAMES
PRI>sho parameter service_names

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
service_names                        string      orcls
– CHECK THAT SERVICE IS AUTOMATICALLY REGISTERED WITH LISTENER
PRI>ho lsnrctl services |grep orcls
Service "orcls" has 1 instance(s).
– CREATE AN AFTER STARTUP TRIGGER WHICH STARTS THE SERVICE ON THE PRIMARY DATABASE AND STOPS IT ON THE STANDBY DATABASE
PRI>create or replace trigger manage_service after startup on database
       declare
          role varchar2(30);
       begin
          select database_role into role from v$database;
          IF role = 'PRIMARY' THEN
             DBMS_SERVICE.START_SERVICE('orcls');
          ELSE
             DBMS_SERVICE.STOP_SERVICE('orcls');
          END IF;
       end;
       /
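After a role transition, you can confirm on each side whether the trigger did its job; a minimal check (V$ACTIVE_SERVICES lists only services currently started on that instance):

```sql
-- returns a row only on the database where the service is running
select name from v$active_services where name = 'orcls';
```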

 ——————————-

Please note that both the primary and standby databases should be registered statically with the listener in the database home. Here, the database home listener is running on port 1521 for both the standby and primary databases.
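For reference, the static registration in the database home listener.ora might look like the sketch below. The ORACLE_HOME path is an assumption, and the _DGMGRL suffix follows the usual 11.2 broker convention (the broker needs these static entries to restart instances during role transitions); adapt both to your environment:

```
# sketch only: ORACLE_HOME is an assumed path; GLOBAL_DBNAME uses db_unique_name_DGMGRL
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = orcl_DGMGRL)
      (ORACLE_HOME = /u01/app/oracle/product/11.2.0/dbhome_1)
      (SID_NAME = orcl)
    )
    (SID_DESC =
      (GLOBAL_DBNAME = sby_DGMGRL)
      (ORACLE_HOME = /u01/app/oracle/product/11.2.0/dbhome_1)
      (SID_NAME = sby)
    )
  )
```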
IF BOTH PRIMARY AND STANDBY DATABASES ARE ON SAME MACHINE
    . create net service name orcls
     – service name = orcls
     – add address on host1 port 1521 (representing both primary database orcl and standby database sby)
– NOW WHEN THE SERVICE IS AVAILABLE ON ORCL (PRIMARY DATABASE), THE LISTENER WILL CONNECT CLIENTS TO THE ORCL DATABASE
– AND WHEN THE SERVICE IS AVAILABLE ON SBY (STANDBY DATABASE FOLLOWING FAILOVER/SWITCHOVER), THE LISTENER WILL CONNECT CLIENTS TO THE SBY DATABASE
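A tnsnames.ora entry matching this single-host layout could look like the following sketch (the host name host1 is an assumption):

```
ORCLS =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = host1)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = orcls))
  )
```

Because both databases register with the same listener, whichever database currently runs the orcls service receives the connection.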
———————————–
IF  PRIMARY AND STANDBY DATABASES ARE ON DIFFERENT MACHINES
– USING NETMGR
   . create net service name orcls
     – service name = orcls
     – add address on host1 port 1521 (representing primary database orcl)
     – add address on host2 port 1521 (representing standby database sby)
– NOW WHEN THE SERVICE IS RUNNING ON THE ORCL PRIMARY DATABASE, IT WILL BE REGISTERED AUTOMATICALLY WITH THE LISTENER ON PORT 1521 ON HOST1, AND THE APPLICATION WILL CONNECT TO THE ORCL DATABASE.
– AND WHEN THE SERVICE IS RUNNING ON THE SBY STANDBY DATABASE, IT WILL BE REGISTERED AUTOMATICALLY WITH THE LISTENER ON PORT 1521 ON HOST2, AND THE APPLICATION WILL CONNECT TO THE SBY DATABASE.
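For the two-host case, the net service name would list both addresses, so the client tries each in turn until it finds the listener where orcls is registered; a sketch (host names host1/host2 are assumptions):

```
ORCLS =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = host1)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = host2)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = orcls))
  )
```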
———————————
– CREATE A USER DGTEST
PRI>create user dgtest identified by dgtest;
PRI>grant create session to dgtest;
PRI>grant select_catalog_role to dgtest;
PRESENTLY, I HAVE BOTH PRIMARY AND STANDBY ON SAME MACHINE
– CHECK CURRENT CONFIGURATION
   Primary – orcl
   Standby – sby
DGMGRL>show configuration;
– CHECK THAT  SERVICE IS REGISTERED WITH LISTENER (1521) AND POINTS TO ORCL (PRIMARY DATABASE)
$lsnrctl services
– CONNECT USING SERVICE ORCLS AND CHECK THAT DB_UNIQUE_NAME=ORCL (PRIMARY)
PRI>connect dgtest/dgtest@orcls
        sho parameter db_unique_name
– CHECK THAT THE SESSION HAS NOT FAILED OVER YET (failed_over = NO)
PRI>select failover_method, failover_type, failed_over
       from v$session
       where username='DGTEST';
– ISSUE A LONG-RUNNING QUERY IN THIS SESSION AND SIMULTANEOUSLY SWITCH OVER TO SBY IN ANOTHER DGMGRL SESSION
PRI>select * from dba_objects, dba_tables;
DGMGRL>switchover to sby;
– WHILE SWITCHOVER IS GOING ON CHECK THAT QUERY IS TEMPORARILY SUSPENDED
– AFTER THE SBY DATABASE IS STARTED, CHECK THAT THE SERVICE ORCLS NOW POINTS TO SBY (NEW PRIMARY)
$lsnrctl services
– CHECK THAT QUERY IS RESUMED AFTER SWITCHOVER
– CHECK CURRENT CONFIGURATION
DGMGRL>show configuration;
– CHECK THAT THE SESSION IS NOW CONNECTED TO THE SBY DATABASE
PRI>sho parameter db_unique_name
– CHECK THAT THE SESSION HAS FAILED OVER (failed_over = YES)
PRI>select failover_method, failover_type, failed_over
       from v$session
       where username='DGTEST';
– RESTORE THE EARLIER CONFIGURATION
DGMGRL>switchover to orcl;
We can also use the client failover feature of 11g Data Guard to automatically fail over a running job to the standby in case of a switchover/failover.
—————————————————————————–
Related links:

11g Dataguard Index

                                                        ———————

5 thoughts on "11G DATAGUARD : AUTOMATIC CLIENT FAILOVER"

  1. Do you know if there are problems with the database trigger / Active Data Guard for version 11.2.0.4.4 when switching over to the standby and then back?

  2. I realize that this is an older thread, but the core concept still exists (so thanks for keeping this out there!). I followed the directions against an 11.2.0.4 database (on Solaris 10) and my clients don't actually switch over during a switchover event (the trigger works, though!). Basically, my C++ retry database connect code errors out with an ORA-12514, since that endpoint is "stopped" on the new "standby"; well, it keeps trying over and over again. The only thing I can think of is that I'll have to put high availability event handling code in my application (OCI TAF); does that sound right?

  3. Hi,

    I'm sorry, but this article describes an old-fashioned way to do it. You said that prior to 11g you need to create a trigger, so in 11g you don't need it. You shouldn't create a trigger; everything is handled by srvctl.

    Best Regards,
    B.

Your comments and suggestions are welcome!