In RAC, if a node crashes, a jobs running on that node can automatically failover to another node provided the job has been mapped to a service by means of job class.
Let’s demonstrate this concept.
Current scenario:
Name of the cluster : cluster01.example.com
No. of nodes : 2 (host01, host02)
Name of the database : orcl
– Create demo table hr.test
SQL> create table hr.test(dt date);
– Create a database service job_serv with
. preferred instance : orcl1
. available instance : orcl2
[oracle@host01 ~]$ srvctl add service -s job_srv -d orcl -a orcl2 -r orcl1 [oracle@host01 ~]$ srvctl status service -d orcl Service job_serv is running on instance(s) orcl1 Service job_srv is not running. preferred node : host01, available node host02
– Create tnsnames.ora entry for service job_serv
job_srv = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = cluster01-scan.cluster01.example.com)(PORT = 1521))(LOAD_BALANCE = YES)(CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = job_srv))
– Start the service and check that service is running on host01 (preferred instance)
[oracle@host01 ~]$ srvctl start service -d orcl
CRS-5702: Resource 'ora.orcl.job_serv.svc' is already running on 'host01'
[oracle@host01 ~]$ srvctl status service -d orcl
Service job_srv is running on instance(s) orcl1
– Create job class JC1 which maps to the service job_serv
SQL>BEGIN
sys.dbms_scheduler.create_job_class(
logging_level => DBMS_SCHEDULER.LOGGING_RUNS,
service => 'job_srv',
job_class_name => '"JC1"');
END;
/
– Create a job JOB1 belonging to job class JC1 which inserts a row in table hr.test every 3 seconds
SQL>BEGIN
sys.dbms_scheduler.create_job(
job_name => '"SYS"."JOB1"',
job_type => 'PLSQL_BLOCK',
job_action => 'begin
insert into hr.test values (sysdate);
end;',
repeat_interval => 'FREQ=SECONDLY;INTERVAL=3',
start_date => systimestamp at time zone 'Asia/Calcutta',
job_class => '"JC1"',
auto_drop => FALSE,
enabled => TRUE);
END;
/
– Check that job is running i.e. rows are getting inserted in table hr.test
SQL> select to_char(dt, 'dd-mon-yyyy hh24:mi:ss') from hr.test; TO_CHAR(DT,'DD-MON-YYYYHH24:M ----------------------------- 11-jan-2014 04:12:57 11-jan-2014 04:12:47 11-jan-2014 04:12:49 ... 11-jan-2014 04:13:27 11-jan-2014 04:13:30 11-jan-2014 04:13:33
– Check that job is running the on same node as the service job_serv i.e. host01
SQL> col job_name for a7 col status for a15 select JOB_NAME, status , INSTANCE_ID from ALL_SCHEDULER_JOB_RUN_DETAILS where job_name = 'JOB1'; JOB_NAM STATUS INSTANCE_ID ------- --------------- ----------- JOB1 SUCCEEDED 1 JOB1 SUCCEEDED 1 JOB1 SUCCEEDED 1 ..
– To simulate the crash of host01, kill the database instance on host01
[root@host01 ~]# ps -ef |grep pmon_
grid 5493 1 0 03:43 ? 00:00:00 asm_pmon_+ASM1
oracle 6755 1 0 03:47 ? 00:00:00 ora_pmon_orcl1
root 25051 6336 0 04:17 pts/1 00:00:00 grep pmon_
[root@host01 ~]# kill -9 6755
– Check that job has failed over to host02
[oracle@host01 ~]$ srvctl status service -d orcl Service job_srv is not running. [oracle@host01 ~]$ srvctl status service -d orcl Service job_srv is running on instance(s) orcl2 select JOB_NAME, status , INSTANCE_ID from ALL_SCHEDULER_JOB_RUN_DETAILS where job_name = 'JOB1' ; JOB_NAM STATUS INSTANCE_ID ------- --------------- ----------- JOB1 SUCCEEDED 1 JOB1 SUCCEEDED 1 JOB1 SUCCEEDED 1 ... JOB1 SUCCEEDED 2 JOB1 SUCCEEDED 2 JOB1 SUCCEEDED 2
– Check that the job is still running i.e. records are still getting inserted in the table hr.test
SQL> select to_char(dt, 'dd-mon-yyyy hh24:mi:ss') from hr.test; TO_CHAR(DT,'DD-MON-YYYYHH24:M ----------------------------- 11-jan-2014 04:12:57 11-jan-2014 04:12:47 11-jan-2014 04:12:49 ... 11-jan-2014 04:15:27 11-jan-2014 04:15:30 11-jan-2014 04:15:33
—————————-
Good demo…..
Thanx Nimai!!
Regards
Anju Garg
Excellent
Is a job class required for failover to occur? I have a job that hangs with DBMS_JOB so I implemented DBMS_SCHEDULER with max duration. My sniper job worked in test but in prod the sniper fails with ORA-16639: specified instance inactive or currently unavailable. The job remains “running”. If I log onto the physical node the job was submitted under, Oracle says the request has been sent but the job never stops.
Hello Patrick
As far as I know, job class is needed for a job to fail over. When you map the job class to the service and the node hosting the job/service crashes, service fails over to another node and in the process all the jobs belonging to the corresponding job class also fail over along with the service.
Hope it helps
Anju
getting error ORA-12514: TNS:listener does not currently know of service requested in connect descriptor when connecting through client to service “job_srv”.
Is my service entry in tnsnames.ora is correct?
[oracle@racdev1 admin]$ vi tnsnames.ora
job_srv.world =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan.dev.com)(PORT = 1521))
(CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = job_srv))
)
[oracle@racdev1 admin]$ srvctl config service -d dev -s job_srv
Service name: job_srv
Service is enabled
Server pool: dev_job_srv
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition:
Preferred instances: dev1
Available instances: dev2
Hi Naveed,
What is the output of lsnrvtl services command?
Also show complete command in response to which you are getting ORA-12514.
Regards
Anju
Thanks Anju for the quick response. There was an issue with my tnsnames.ora entry i need to use domain dev.com. After this the services started working fine.
Thanks a lot