In this post, I will explain the concept of services and their benefits in 11g R2 RAC.

What is a service?

A service is a logical grouping of sessions performing similar kind of work. They  enable you to group database workloads and route work to the optimal instances that are assigned to offer the service.It is recommended that all the users who connect using a service have the same service-level requirements. We can use services to manage workloads or a group of applications. For example , all on-line users can use one service whereas batch users can use another service and reporting can use yet another service.

Benefits of services:

– Services make the work more manageable, measurable, tunable and recoverable.

– Services provide an abstraction layer and permit clients and middle tiers to access required data from the database independent of where the instances reside.

– Dynamic: As the load increases/reduces, the number of instances supporting a service can be increased/decreased accordingly.

- High availability : In case an instance crashes, the services supported by it failover to another surviving instance.

- Tuning and Monitoring Aid

. Tracing can be enabled for all the sessions who connect using a service: This can be used to identify the culprit SQL statements issued by the users connected via a service.  . Performance statistics are collected and aggregated by service: This can be used to find out the performance of all the sessions belonging to a service.

. Two Performance thresholds can be specified for a service: SERVICE_ELAPSED_TIME and SERVICE_CPU_TIME: The AWR monitors the service time and CPU time and publishes alerts when the performance esceeds the thresholds. In response to these alerts, you can change the priority of a job, stop overloaded process , or relocate, expand, shrink , start or stop a service.

- Resource Management Aid : We can bind resource consumer groups to services. The users who connect using a service will  get the resources as per the allocation specified for the respective consumer group in the active resource plan. For example, if OLTP and BATCH services are defined to run on the same instance and our requirement is as follows:

. During day time OLTP users should get more resources whereas BATCH users should get less resources

  . During night, BATCH users should get more resources whereas OLTP users should get less resources

We can have two consumer groups : OLTP_GRP mapped to OLTP service and

                                                                          BATCH_GRP mapped to BATCH service
so that all users who connect using OLTP service are assigned OLTP_GRP and all users who connect using BATCH service are assigned BATCH_GRP .

We can have two resource plans :

  . DAY_PLAN in which OLTP_GRP is given more resources and BATCH_GRP is given less resources.
  . NIGHT_PLAN in which BATCH_GRP is given more resources and OLTP_GRP is given less resources.
- Job Control Aid : We can associate jobs with a job class and job class can be mapped to service.

– If service is mapped to a consumer group, the jobs will get the resources as per the allocation specified for the respective consumer group in the active resource plan. If  jobs of a job class are taking more time than desired, we can change the consumer group mapping of the respective service to one with more resources.

– If the instance running the jobs of a job class crashes, the services on the crashed instance fail over to another instance and all the jobs of the job class mapped to thst service also failover along with it.

       Job1  —–+
       Job2  ——|——– Jobclass1 —— Service1
       Job3  ——+                                  |
                                                  Consumer group
                                                 Resource Plan

– Using Transparent Application Failover:  When a node crashes, all the services running on the instance hosted by that node fail over to another instance. From now onwards, the users who try to connect using failed over services will connect to the instances supporting that service. But what happens to the sessions who were already connected using that service at the time of crash? That is decided by the TAF (Transparent application Failover) policy which can be defined for the service. We can have 3 types of TAF policies : Basic, Preconnect and None.

BASIC failover. In this approach, the application connects to a backup node only after the primary connection fails. This approach has low overhead, but the end user experiences a delay while the new connection is created.

PRECONNECT failover. In this approach, the application simultaneously connects to both a primary and a backup node. This offers faster failover, because a pre-spawned connection is ready to use. But the extra connection adds everyday overhead by duplicating connections.

  We can also specify FAILOVER_TYPE for a service i.e. what kind of activity going on in the connected sessions will be resumed after the failover .  It can be Select or Session.

SELECT failover. With SELECT failover, Oracle Net keeps track of all SELECT statements issued during the transaction, tracking how many rows have been fetched back to the client for each cursor associated with a SELECT statement. If the connection to the instance is lost, Oracle Net establishes a connection to another Oracle RAC node and re-executes the SELECT statements, repositioning the cursors so the client can continue fetching rows as if nothing has happened. The SELECT failover approach is best for data warehouse systems that perform complex and time-consuming transactions.

 SESSION failover. When the connection to an instance is lost, SESSION failover results only in the establishment of a new connection to another Oracle RAC node; any work in progress is lost. SESSION failover is ideal for online transaction processing (OLTP) systems, where transactions are small.
The following types of transactions do not automatically fail over and must be restarted by TAF:
  • Transactional statements. Transactions involving INSERT, UPDATE, or DELETE statements are not supported by TAF.
  •  ALTER SESSION statements. ALTER SESSION and SQL*Plus SET statements do not fail over.
  • Transactions using temporary segments in the TEMP tablespace and global temporary tables do not fail over.
  • PL/SQL package states. PL/SQL package states are lost during failover.
– Connection Load balancing : If we have multiple instances supporting a service, We can spread user connections across all those instances. For each service, you can define the method that you want the listener to use for load balancing by setting the connection load balancing goal, CLB_GOAL. It can have two values : CLB_GOAL_LONG(default) and CLB_GOAL_SHORT.
Where’s the configuration of services stored?
Data Dictionary (Bulk of service configuration)
Client-side (in tnsnames.ora)
Special entries for PRECONNECT services
Resource profile in Cluster Registry (OCR)
Start/stop script
Restart policy
Instance list
Preferred/available instances
TAF policy
Initialization Parameters
Related links:

11G R2 RAC Index
Instance Recovery In RAC
Job Failover In RAC
Service Created Using EM Does Not Update DD or OCR

Your comments and suggestions are welcome!