WORKLOAD MANAGEMENT IN 11g R2 RAC : LOAD BALANCING

In this post, I will discuss about various types of load balancing techniques that can be configured in 11g R2 RAC for workload management.
In RAC we have multiple instances of a database running on multiple servers.
Workload management involves :
   - Failover : If connection to an instance fails, client should automatically connect to another instance
   - Load balancing : Workload should spread across various instances to obtain maximum performance/throughput
Failover can be
  - Connect Time Connection Failover (CTCF) while making initial connection ()
  - Run time connection failover (RTCF) after connection has been established or (transparent application failover (TAF)
     FAILOVER_METHOD = NONE, BASIC, PRECONNECT
     FAILOVER TYPE   = SESSION, SELECT
Load balancing can be :
  - Connect time load balancing (CTLB)
    . On client side
    . On server side
  –  Run time load balancing (RTLB)
TO configure failover/load balancing, tnsnames.ora should contain multiple listener addresses to connect to multiple instances e.g.
Multiple Listener addresses within a description: i.e. User is trying to connect to a service which is supported by multiple instances.
RAC=
(DESCRIPTION=
   (ADDRESS_LIST=
     (ADDRESS= (PROTOCOL=TCP) (HOST=node1-vip) (PORT=1521))
     (ADDRESS= (PROTOCOL=TCP) (HOST=node2-vip) (PORT=1521))
     (ADDRESS= (PROTOCOL=TCP) (HOST=node3-vip) (PORT=1521))
     (ADDRESS= (PROTOCOL=TCP) (HOST=node4-vip) (PORT=1521))
                      )
    (CONNECT_DATA= (SERVICE_NAME= RAC))
       )
In case SCAN is used, SCAN name is used in the address which resolves to 3 SCAN listeners.
e.g.
RAC=
(DESCRIPTION=
   (ADDRESS_LIST=
     (ADDRESS= (PROTOCOL=TCP) (HOST=cluster01-scan) (PORT=1521))
                      )
    (CONNECT_DATA= (SERVICE_NAME= RAC))
       )
In this post, I will discuss in detail about Load Balancing. To know more about failover , please click here.
LOAD BALANCING 
————–
Load balancing in RAC implies distributing the workload over multiple instances accessing the same physical database. Two kinds load balancing can be configured
  – Connect time load balancing (CTLB)
    . On client side
    . On server side
  – On server side – Run time load balancing (RTLB)
Connect time load balancing (CTLB) : This enables user to connect to one of the instances supporting the service. The connection stays with the same instance until the user disconnects orthe session is killed. It can be configured on the client side and/or server side.
- Connect time load balancing on client side: When a user session attempts to connect to the database, Oracle Net chooses an address specified in tnsnames.ora to connect to in a random order rather than sequential order. This has the effect of clients connecting through addresses which are picked up at random and no one address is overloaded. Its configuration is quite simple. You just need to set the parameter LOAD_BALANCE=ON in the client connection definition in tnsnames.ora. For example :
RAC=
(DESCRIPTION=
  (ADDRESS_LIST=
                       (LOAD_BALANCE=ON)
    (ADDRESS= (PROTOCOL=TCP) (HOST=node1-vip) (PORT=1521))
                       (ADDRESS= (PROTOCOL=TCP) (HOST=node2-vip) (PORT=1521))
    (ADDRESS= (PROTOCOL=TCP) (HOST=node3-vip) (PORT=1521))
    (ADDRESS= (PROTOCOL=TCP) (HOST=node4-vip) (PORT=1521))
                      )
   (CONNECT_DATA= (SERVICE_NAME= RAC)
       )
LOAD_BALANCE parameter is set to ON by default. When this parameter is set to ON, Oracle Net Services progresses through the list of listener addresses in a random sequence, balancing the load on several listeners. When it is set to OFF, the addresses are tried out sequentially until one succeeds.
When using SCAN in the connection definition, Oracle database randomly connects to one of the available SCAN listeners in a round robin fashion and balances the connections on the three scan listeners. Here is a sample tnsnames.ora for a RAC database using SCAN.
RAC=
(DESCRIPTION=
    (LOAD_BALANCE=ON)
    (ADDRESS= (PROTOCOL=TCP) (HOST=SCAN-HOSTNAME) (PORT=1521))
                  )
   (CONNECT_DATA= (SERVICE_NAME= RAC)
       )
Limitation of connect time load balancing : The connection stays with the same instance for the life of a session. If connection lasts a long time, it might be possible that load of current instance increases and some other less loaded instance might be preferable. In that case we would like the connection to switch the other more appropriate instance. This can be achieved by using Run time Load Balancing (RTLB).
- Connect time load balancing on server side: After a listener receives the connection request, it can forward the request to another instance based on the connect time load balancing goal (CLB_GOAL)specified for the service. CLB_GOAL can be :
LONG(Default) - used for application connections that are connected for a long period such as third party connection pools and SQL*Forms applications. In this case,
   . the listener will load balance on number of sessions
   . Run time load balancing goal will not be used in this case
SHORT - used for application connections that are short in duration. This should be used with connection pools integrated with the load balancing advisory. In this case, listener uses Load Balancing Advisory (LBA) to make the connection based on CPU utilization on the node.
Limitation of Connect time load balancing on client side : The listener has no idea if the session has been assigned to an endpoint whose corresponding database server is already overloaded. Hence, timeouts can occur if the node is heavily loaded and unable to respond quickly.Hence, to overcome this problem, server side connect time load balancing needs to be configured. It is useful to spread initial connection load among all listeners inthe cluster. Client may then be redirected based on server side load balancing.
Run Time (Server side) load Balancing (RTLB): In this case, the listener routes incoming client connections according to policies and based on the current service level provided by the database instances. The listener determines the connection distribution depending upon profile statistics that  are dynamically updated by PMON. The higher the load on the node, the more frequently PMON updates the load profile.Thus connections may be switched depending upon changes in cluster configuration, application wrokload  overworked nodes or hangs.
   The core of server side laod balancing id Dynamic service registration so that a services are registered with all the listeners. Since PMON on each node sends load profile to all the listeners with which the service is registered,  all the  listeners come to know about load profile of all the instances and hence the connection is forwarded to the most appropriate listener depending upon the goal of the run time load balancing.
  Run time load balancing is achieved using connection pools. Work requests are automatically balanced across the pool of connections.The connection allocation is based on the current performance level provided by the database instances as indicated by the LBA FAN events. This provides load balancing at the transaction level instead of load balancing at the time of initial connection. 
With server-side load balancing, the listener directs a connection request to the best instance currently providing the service by using the load balancing advisory.
Load Balancing Advisory
- is an advisory for balancing work across RAC instances
- Monitors workload activity for a service across all instances in the cluster
- Analyzes the service level for each instance based on defined metric goal
    Metric: service time (GOAL_SERVICE_TIME)
    Metric: throughput (GOAL_THROUGHPUT)
- Publishes FAN events recommending amount of work to be sent to each instance and data quality  flag
- Default is Off.
-  Directs work to where services are executing well and resources are available
- Adjusts distribution for different power nodes, different priority and shape workloads, changing demand
- Stops sending work to slow, hung, failed nodes early
How to configure server side load balancing:
1. set parameters LOCAL_LISTENER and REMOTE_LISTENER
2. set CLB_GOAL = SHORT for the service
3. set RTLB_GOAL for the service
1. set parameters LOCAL_LISTENER and REMOTE_LISTENER
LOCAL_LISTENER parameter should be set to name of the listener defined in the same node
REMOTE_LISTENER should be set to names of the listeners running on other nodes
For example in a 3 node setup
Host01 running instance orcl1
Host02 running instance orcl2
Host03 running instance orcl3
For host01,
local_listener  - (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP) HOST=host01-vip)(PORT=1521))))
remote_listener  - (DESCRIPTION=(ADDRESS_LIST=
                                 (ADDRESS=(PROTOCOL=TCP) HOST=host02-vip)(PORT=1521)
                                 (ADDRESS=(PROTOCOL=TCP) HOST=host03-vip)(PORT=1521))))
For host02,
local_listener  - (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP) HOST=host02-vip)(PORT=1521))))
remote_listener  - (DESCRIPTION=(ADDRESS_LIST=
                                 (ADDRESS=(PROTOCOL=TCP) HOST=host01-vip)(PORT=1521)
                                 (ADDRESS=(PROTOCOL=TCP) HOST=host03-vip)(PORT=1521))))
For host03,
local_listener  - (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP) HOST=host03-vip)(PORT=1521))))
remote_listener  - (DESCRIPTION=(ADDRESS_LIST=
                                 (ADDRESS=(PROTOCOL=TCP) HOST=host02-vip)(PORT=1521)
                                 (ADDRESS=(PROTOCOL=TCP) HOST=host01-vip)(PORT=1521))))
When we start the three instances the corresponding PMON processes get dynamically registered with all the listeners and start feeding listeners with load profile information. Now all the listeners come to know about load profile of all the instances and hence the connection is forwarded to the listener of the least loaded node.
When SCAN is used, remote_listener parameter should be set to SCAN name on all the nodes i.e.
remote_listener – myrac-cluster-scan:1521
So that PMON process of each instance registers the database services with the default listener on the local node and with each SCAN listener, which is specified by the REMOTE_LISTENER database parameter.
2. set CLB_GOAL = SHORT for the service
   EXECUTE DBMS_SERVICE.MODIFY_SERVICE
   (service_name => ‘sjob’ -
   , clb_goal => DBMS_SERVICE.CLB_GOAL_SHORT);
OR
   srvctl modify service
      -s orcl_serv database -d orcl
      -j SHORT   // connection load balancing goal {long|short}
3. set RTLB_GOAL for the service
A request for a connection is serviced by selecting a connection based on the service goal as determined by the Load Balancing Advisory. The service goal determines whether the connection provides best service quality, that is, how efficiently a single transaction completes, or best throughput, that is, how efficiently an entire job or long-running query completes.
For RTLB, we can define service level goal which will be used only if CLB_GOAL= SHORT
There are 3 options available
NONE – Default setting, you are not taking advantage of this feature
THROUGHPUT – Work requests are directed based on throughput.  THROUGHPUT should be used when the work in a service completes at homogenous rates.  An example is a trading system where work requests are similar lengths. Attempts to direct work requests according to throughput. The load balancing advisory analyzes the service level for each instance based on the service time and is based on the rate that work is completed in the service plus available bandwidth to the service. For example for the use of THROUGHPUT is for workloads such as batch processes,trading system work requests have similar lengths and  next job starts when the last job completes:
EXECUTE DBMS_SERVICE.MODIFY_SERVICE
 (service_name => ‘sjob’ -
  , goal => DBMS_SERVICE.GOAL_THROUGHPUT );
OR
srvctl modify  service
      -s orcl_serv database -d orcl
      -B throughput      // runtime connection load balancing goal { service_time|throughput | none}
SERVICE_TIME – Work requests are directed based on response time. SERVICE_TIME should be used when the work in a service completes at various rates.  In this case, Load balancing advisory data is based on elapsed time for work done in the service plus available bandwidth to the service. An example for the use of SERVICE_TIME is for workloads such as internet shopping where the rate of demand changes and work requests are of differing various lengths.:
EXECUTE DBMS_SERVICE.MODIFY_SERVICE
(service_name => ‘OE’ -
, goal => DBMS_SERVICE.GOAL_SERVICE_TIME -
);
OR
srvctl modify  service
      -s orcl_serv database -d orcl
      -B service_time      // runtime connection load balancing goal { service_time|throughput | none}
 You can see the goal settings for a service in the data dictionary and in the DBA_SERVICES, V$SERVICES, and V$ACTIVE_SERVICES views.
SUMMARY:
Workload management involves :
   - Failover : If connection to an instance fails, client should automatically connect to another instance
   - Load balancing : Workload should spread across various instances to obtain maximum performance/throughput
Load balancing can be :
  – Connect time load balancing (CTLB)
    . On client side
    . On server side
  –  Run time load balancing (RTLB)
TO configure failover/load balancing, tnsnames.ora should contain multiple listener addresses to connect to multiple instances
Connect time load balancing (CTLB) : This enables user to connect to one of the instances supporting the service. The connection stays with the same instance until the user disconnects orthe session is killed. It can be configured on the client side and/or server side.
- Connect time load balancing on client side: When a user session attempts to connect to the database, Oracle Net chooses an address specified in tnsnames.ora to connect to in a random order rather than sequential order. This has the effect of clients connecting through addresses which are picked up at random and no one address is overloaded. Its configuration is quite simple. You just need to set the parameter LOAD_BALANCE=ON in the client connection definition in tnsnames.ora.
Limitation of Connect time load balancing (CTLB) on client side : The listener has no idea if the session has been assigned to an endpoint whose corresponding database server is already overloaded.
- Connect time load balancing (CTLB) on server side: After a listener receives the connection request, it can forward the request to another instance based on the connect time load balancing goal (CLB_GOAL)specified for the service. CLB_GOAL can be : LONG or SHORT
LONG(Default) – In this case,
   . the listener will load balance on number of sessions
   . Run time load balancing goal will not be used in this case
SHORT - used for application connections that are short in duration. This should be used with connection pools integrated with the load balancing advisory. In this case, listener uses Load Balancing Advisory (LBA) to make the connection based on CPU utilization on the node.
Limitation of connect time load balancing : The connection stays with the same instance for the life of a session. If connection lasts a long time, it might be possible that load of current instance increases and some other less loaded instance might be preferable.
Run Time (Server side) load Balancing (RTLB): In this case, the listener routes incoming client connections according to policies and based on the current service level provided by the database instances. With server-side load balancing, the listener directs a connection request to the best instance currently providing the service by using the load balancing advisory.
To configure server side load balancing:
1. set parameters LOCAL_LISTENER and REMOTE_LISTENER
2. set CLB_GOAL = SHORT for the service
3. set RTLB_GOAL for the service
———————————————————————————

7 thoughts on “WORKLOAD MANAGEMENT IN 11g R2 RAC : LOAD BALANCING

  1. Hey, thanks for this wonderful post. I was just not able to understand something reading over and over again the Oracle docs. Keep us the good work.

  2. Hello Thanks for this great Article, I have following questing and would be much happy if you provide me the answer, In case if you don’t have any defined service in your RAC setup and you have only the default database service for example RACDB then in this case would it be possible to configure Run-Time failover and Run Time connection load balancing since the connection load balancing goal for default service is LONG. Thanks

Your comments and suggestions are welcome!