Category Archives: Database

CONSISTENT READS IN ORACLE : PART – I

 
  In this post I will discuss and demonstrate how oracle creates CR (consistent read) blocks.
 

Read consistency implementation in Oracle

 
    Whenever a user queries data, he is to be displayed the data as of the time/SCN# when he queried it. 
If the desired block is not in the buffer,
    The block is read from the disk (status=xcur) 
If  clone(s) of the block are already there in buffer
      The clone just later than the desired time is selected.
      Undo is applied to the clone so as to make it consistent with the time/SCN# of the query (status=cr).
If the block is in the buffer and is dirty and no clones are there
    A copy(clone) of the current  block is made.
    Undo is applied to the clone so as to make it consistent with the time/SCN# of the query (status=cr)
 
DML is always performed on the current copy of the block (Status=xcur). 
Whenever an update is made to the block, 
  If the current block is in the buffer (status = xcur)
    A clone of the current block is made (status=cr)
    Update is performed on the current block (status=xcur)
Else (the desired block is not in the buffer)
    The block is read from the disk (status=xcur) 
     A clone of the block is also made (status=cr)
    Update is performed on the current block (status=xcur)
 
As multiple updates/queries are issued for the same block, multiple CR blocks of the same block may be created. Since all the CR clones of the same block are on the same chain, the processes searching for the block will take long time to scan the chain and there will be contention for CBC latch. Hence Oracle has limited the maximum no. of CR clones of a block  to 6 (one current (xcur) and 5 CR clones). But it is a soft limit in the sense that if more than 5 processes are simultaneously updating different rows in the same block, more than 6 CR blocks may be created. The maximum no. of CR clones of a block can be modified by an undocumented parameter _db_block_max_cr_dba.
 
Let us see a practical demonstration of the above concept:
 
– Create simple table whose one row occupies one block
HR1> drop table t1 purge;
    create table t1
    (c1 int, c2 char(2000), c3 char(2000), c4 char(2000));
– Insert one row
HR1> insert into t1 values(1, ‘x’, ‘x’, ‘x’);
     commit;
 
– Get the block address and browse X$BH view for that block.
HR> select dbms_rowid.rowid_relative_fno(rowid) as file#,
           dbms_rowid.rowid_block_number(rowid) as block#
    from t1;
File#         Block#
——        ——
    4        550
 
– Check in x$bh that We have one CURRENT block(state=xcur). CURRENT block is always one and only one.
– Enter file#/block# found above when prompted
SYS>select b.dbarfil, b.dbablk, b.class,
  decode(state,0,’free’,1,’xcur’,2,’scur’,3,’cr’, 4,’read’,5,’mrec’,6,’irec’,7,’write’,8,’pi’, 9,’memory’,10,’mwrite’,11,’donated’) as state,
  cr_scn_bas, cr_scn_wrp, cr_uba_fil, cr_uba_blk, cr_uba_seq,
  (select object_name from dba_objects where object_id = b.obj) as object_name
from sys.x$bh b
where dbarfil = &file_no and
      dbablk = &block_no;
 
Enter value for file_no: 4
old   6: where dbarfil = &file_no and
new   6: where dbarfil = 4 and
Enter value for block_no: 550
old   7:       dbablk = &block_no
new   7:       dbablk = 550
 
   DBARFIL     DBABLK      CLASS STATE      CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME
———- ———- ———- ———- ———- ———- ———- ———- ———- ——————–
         4        550          1 xcur                0          0          0          0          0 T1
 
Now, let’s see how consecutive DML statements generate CR blocks.
– update and monitor cr block
– 1st update
HR1> update t1 set c2 = ‘y’, c3 = ‘y’, c4 = ‘y’ where c1 = 1;
– Check in x$bh that We have one CURRENT block(state=xcur) and on cr clone (status=cr)
SYS>/
Enter value for file_no: 4
old   6: where dbarfil = &file_no and
new   6: where dbarfil = 4 and
Enter value for block_no: 550
old   7:       dbablk = &block_no
new   7:       dbablk = 550
 
   DBARFIL     DBABLK      CLASS STATE      CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME
———- ———- ———- ———- ———- ———- ———- ———- ———- ——————–
         4        550          1 xcur                0          0          0          0          0 T1
         4        550          1 cr           24244872          0          0          0          0 T1
 
– 2nd update
HR> update t1 set c2 = ‘y’, c3 = ‘y’, c4 = ‘y’ where c1 = 1;
 
– Note that another cr clone has been created. Each clone has a different SCN# as of the time when it was created.
 
SYS>/
Enter value for file_no: 4
old   6: where dbarfil = &file_no and
new   6: where dbarfil = 4 and
Enter value for block_no: 550
old   7:       dbablk = &block_no
new   7:       dbablk = 550
 
 DBARFIL     DBABLK      CLASS STATE      CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME
———- ———- ———- ———- ———- ———- ———- ———- ———- ——————–
         4        550          1 xcur                0          0          0          0          0 T1
         4        550          1 cr           24244981          0          0          0          0 T1
         4        550          1 cr           24244872          0          0          0          0 T
 
– 3rd update
HR1> update t1 set c2 = ‘y’, c3 = ‘y’, c4 = ‘y’ where c1 = 1;
 
– Note that another cr clone has been created. Each clone has a different SCN# as of the time when it was created.
 
SYS>/
Enter value for file_no: 4
old   6: where dbarfil = &file_no and
new   6: where dbarfil = 4 and
Enter value for block_no: 550
old   7:       dbablk = &block_no
new   7:       dbablk = 550
 
  DBARFIL     DBABLK      CLASS STATE      CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME
———- ———- ———- ———- ———- ———- ———- ———- ———- ——————–
         4        550          1 xcur                0          0          0          0          0 T1
         4        550          1 cr           24245051          0          0          0          0 T1
         4        550          1 cr           24244981          0          0          0          0 T1
         4        550          1 cr           24244872          0          0          0          0 T1
 
– 4th update
HR1> update t1 set c2 = ‘y’, c3 = ‘y’, c4 = ‘y’ where c1 = 1;
 
– Note that another cr clone has been created. Each clone has a different SCN# as of the time when it was created.
 
SYS>/
Enter value for file_no: 4
old   6: where dbarfil = &file_no and
new   6: where dbarfil = 4 and
Enter value for block_no: 550
old   7:       dbablk = &block_no
new   7:       dbablk = 550
 
   DBARFIL     DBABLK      CLASS STATE      CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME
———- ———- ———- ———- ———- ———- ———- ———- ———- ——————–
         4        550          1 xcur                0          0          0          0          0 T1
         4        550          1 cr           24245082          0          0          0          0 T1
         4        550          1 cr           24245051          0          0          0          0 T1
         4        550          1 cr           24244981          0          0          0          0 T1
         4        550          1 cr           24244872          0          0          0          0 T1
 
– 5th update
HR> update t1 set c2 = ‘y’, c3 = ‘y’, c4 = ‘y’ where c1 = 1;
 
– Note that another cr clone has been created. Each clone has a different SCN# as of the time when it was created.
 
SYS>/
Enter value for file_no: 4
old   6: where dbarfil = &file_no and
new   6: where dbarfil = 4 and
Enter value for block_no: 550
old   7:       dbablk = &block_no
new   7:       dbablk = 550
 
   DBARFIL     DBABLK      CLASS STATE      CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME
———- ———- ———- ———- ———- ———- ———- ———- ———- ——————–
         4        550          1 xcur                0          0          0          0          0 T1
         4        550          1 cr           24245122          0          0          0          0 T1
         4        550          1 cr           24245082          0          0          0          0 T1
         4        550          1 cr           24245051          0          0          0          0 T1
         4        550          1 cr           24244981          0          0          0          0 T1
         4        550          1 cr           24244872          0          0          0          0 T1
 
6 rows selected.
 
– 6th update
HR1> update t1 set c2 = ‘y’, c3 = ‘y’, c4 = ‘y’ where c1 = 1;
 
– Note that new cr clone has not been created because the limit of the 6 (1 xcur and 5 cr) has been reached.
 
SYS>/
Enter value for file_no: 4
old   6: where dbarfil = &file_no and
new   6: where dbarfil = 4 and
Enter value for block_no: 550
old   7:       dbablk = &block_no
new   7:       dbablk = 550
 
   DBARFIL     DBABLK      CLASS STATE      CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME
———- ———- ———- ———- ———- ———- ———- ———- ———- ——————–
         4        550          1 xcur                0          0          0          0          0 T1
         4        550          1 cr           24245135          0          0          0          0 T1
         4        550          1 cr           24245122          0          0          0          0 T1
         4        550          1 cr           24245082          0          0          0          0 T1
         4        550          1 cr           24245051          0          0          0          0 T1
         4        550          1 cr           24244981          0          0          0          0 T1
 
6 rows selected.
 
– Check the value of the parameter _db_block_max_cr_dba
 
SELECT 
  a.ksppinm  “Parameter”,
  decode(p.isses_modifiable,’FALSE’,NULL,NULL,NULL,b.ksppstvl) “Session”,
  c.ksppstvl “Instance”,
  decode(p.isses_modifiable,’FALSE’,’F’,’TRUE’,’T’) “S”,
  decode(p.issys_modifiable,’FALSE’,’F’,’TRUE’,’T’,’IMMEDIATE’,’I’,’DEFERRED’,’D’) “I”,
  decode(p.isdefault,’FALSE’,’F’,’TRUE’,’T’) “D”,
  a.ksppdesc “Description”
FROM x$ksppi a, x$ksppcv b, x$ksppsv c, v$parameter p
WHERE a.indx = b.indx AND a.indx = c.indx
  AND p.name(+) = a.ksppinm
  AND UPPER(a.ksppinm) LIKE UPPER(‘%&1%’)
ORDER BY a.ksppinm;
 
Enter value for 1: _db_block_max_cr_dba
old  12:   AND UPPER(a.ksppinm) LIKE UPPER(‘%&1%’)
new  12:   AND UPPER(a.ksppinm) LIKE UPPER(‘%_db_block_max_cr_dba%’)
 
Parameter                                          Session                      Instance                                    S I D Description
————————————————– —————————- ——————————————————- – – – ——————————————————————————————
_db_block_max_cr_dba                                                            6                                           F F F Maximum Allowed Number of CR buffers per dba
 
– Increase the value of the parameter to 8
 
SYS>alter system set “_db_block_max_cr_dba”= 8 scope=spfile;
    shu immediate;
    startup;
 
– Issue update seven times and check that 2 more clones have been created
 
HR1> update t1 set c2 = ‘y’, c3 = ‘y’, c4 = ‘y’ where c1 = 1;
        update t1 set c2 = ‘y’, c3 = ‘y’, c4 = ‘y’ where c1 = 1;
  update t1 set c2 = ‘y’, c3 = ‘y’, c4 = ‘y’ where c1 = 1;
  update t1 set c2 = ‘y’, c3 = ‘y’, c4 = ‘y’ where c1 = 1;
      update t1 set c2 = ‘y’, c3 = ‘y’, c4 = ‘y’ where c1 = 1;
update t1 set c2 = ‘y’, c3 = ‘y’, c4 = ‘y’ where c1 = 1;
    update t1 set c2 = ‘y’, c3 = ‘y’, c4 = ‘y’ where c1 = 1;
 
SYS>/
Enter value for file_no: 4
old   6: where dbarfil = &file_no and
new   6: where dbarfil = 4 and
Enter value for block_no: 550
old   7:       dbablk = &block_no
new   7:       dbablk = 550
 
   DBARFIL     DBABLK      CLASS STATE      CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME
———- ———- ———- ———- ———- ———- ———- ———- ———- ——————–
         4        550          1 xcur                0          0          0          0          0 T1
         4        550          1 cr           24246329          0          0          0          0 T1
         4        550          1 cr           24246328          0          0          0          0 T1
         4        550          1 cr           24246327          0          0          0          0 T1
         4        550          1 cr           24246325          0          0          0          0 T1
         4        550          1 cr           24246324          0          0          0          0 T1
         4        550          1 cr           24246323          0          0          0          0 T1
         4        550          1 cr           24246322          0          0          0          0 T1
 
8 rows selected.
 
– Issue a query from another session of HR
 
HR2>select * from t1;
 
– Check that a new clone has been created and an earlier one with 
   scn#=24246322 has been overwritten by scn#=24246435 and undo tablespace has been read to create this one. (uba_file=3,uba_blk=174754)
 
SYS>/
Enter value for file_no: 4
old   6: where dbarfil = &file_no and
new   6: where dbarfil = 4 and
Enter value for block_no: 550
old   7:       dbablk = &block_no
new   7:       dbablk = 550
 
 
  DBARFIL     DBABLK      CLASS STATE      CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME
———- ———- ———- ———- ———- ———- ———- ———- ———- ——————–
         4        550          1 cr           24246435          0          3     174754       1022 T1
         4        550          1 xcur                0          0          0          0          0 T1
         4        550          1 cr           24246329          0          0          0          0 T1
         4        550          1 cr           24246328          0          0          0          0 T1
         4        550          1 cr           24246327          0          0          0          0 T1
         4        550          1 cr           24246325          0          0          0          0 T1
         4        550          1 cr           24246324          0          0          0          0 T1
         4        550          1 cr           24246323          0          0          0          0 T1
 
—————————————————————————–

In my next post , I will demonstrate that _db_block_max_cr_dba is a soft limit and CR clones more than its value can be created.
—————————————————————————————————-
Related links:

 

———————–

CLUSTERING FACTOR DEMYSTIFIED : PART – III

 

How to resolve the performance issues due to high clustering factor?

In my earlier post, Clustering Factor Demystified : Part – I, I had discussed that to improve the Clustering Factor,  the table must be rebuilt (and reordered). The data retrieval can be considerably speeded up by physically sequencing the rows in the same order as the key column. If we can group together the rows for a key value,  we can get all of the row with a single block read because the rows are together.  To achieve this goal, various methods may be used. In the post Clustering Factor Demystified : Part -II, I had demonstrated Manual Row Re-sequenciung (CTAS with order by) which pre-orders data to avoid expensive disk sorts after retrieval. In this post, I will demonstrate the use of  Single table hash clusters  and  Single table index clusters which  clusters related rows together onto the same data block .

Overview: 

  •  Create a table organized which contains two columns  -  id(number) and txt (char)- Populate the table insert 34 records for each value of id where id ranges from 1 to 100- In this case as records are added sequentially,  records for a key value are stored together
  •  Create another table unorganized which is a replica of the ‘organized’ table but records are inserted in a random manner so that records for a key value may be scattered across different blocks .
  •  Create a single table index cluster  table from ‘unorganized’ table using CTAS.
  •  Create a single table hash cluster table from ‘unorganized’  table using CTAS
  •  Trace  the query using exact match  on three tables and verify that hash cluster table gives the best performance .
  •  Trace the query using range scan  on three tables and verify that index cluster table gives the best performance .
  •  Verify that index and hash cluster tables have better clustering factor .
 Implementation:
- Create a table organized which contains two columns  –  id(number) and txt (char)
– Populate the table insert 34 records for each value of id where id ranges from 1 to 100
– In this case as records are added sequentially,  records for a key value are stored together
SQL> drop table organized purge;
           create table organized (id number(3), txt char(900));
           begin
           for i in 1..100 loop
               insert into organized select i, lpad(‘x’, 900, ‘x’)
               from    dba_objects where rownum < 35;
           end loop;
          end;
          /
-  create another table unorganized which is a replica of the ‘organized’ table but records are inserted in a random manner so that records for a key value may be scattered across different blocks (order by dbms_random.random).
SQL> drop table unorganized purge;
           create table unorganized as select * from organized order by dbms_random.random;

           create index unorganized_idx on unorganized(id);

        exec dbms_stats.gather_table_stats(USER, ‘unorganized’, estimate_percent => 100, method_opt=> ‘for all indexed columns size 254′);

– Create a single table index cluster  table from ‘unorganized’ table using CTAS.

— Create a cluster with size = blocksize = 8k and index it

SQL> drop cluster index_cluster including tables;
           create cluster index_cluster
            ( id number(3) )
          size 8192;


          create index index_cluster_idx
           on cluster index_cluster;
        drop table index_cluster_tab purge;       
       create table index_cluster_tab 
           cluster index_cluster( id )
           as select * from unorganized ;
 – Create a single table hash cluster table from ‘unorganized’  table using CTAS 
 SQL>drop tablespace mssm including contents and datafiles;  
        Create tablespace mssm datafile ‘/u01/app/oracle/oradata/orcl/mssm01.dbf’ size 100m     segment space management manual;   
         drop cluster hash_cluster including tables;
           create cluster Hash_cluster
           ( id number(3) )
             size 8192 single table hash is id hashkeys 100 tablespace mssm; 
        drop table hash_cluster_tab purge;     
        create table hash_cluster_tab          cluster hash_cluster(id)
           as select * from unorganized;
begin
    dbms_stats.gather_table_stats
    ( user, ‘UNORGANIZED’, cascade=>true );    dbms_stats.gather_table_stats
    ( user, ‘INDEX_CLUSTER_TAB’, cascade=>true );    dbms_stats.gather_table_stats
    ( user, ‘HASH_CLUSTER_TAB’, cascade=>true );
End;
/
- Find out no. of blocks across which records of a key value are spread in the two tables.- Note that in ‘unorganized’  table,  records  for an id are scattered across more than 30 blocks whereas in index_cluster_tab and hash_cluster_tab tables, records for each id are clustered i.e. records for each key value are spread across 5 blocks only.
SQL> select unorg.id id, unorg.cnt unorganized_blocks,
           idx_tab.cnt index_cluster_blocks, hash_tab.cnt hash_cluster_blocks
         from
             ( select id,  count(distinct(dbms_rowid.ROWID_BLOCK_NUMBER(rowid))) cnt
             from unorganized     
              group by id) unorg,
           ( select id,  count(distinct(dbms_rowid.ROWID_BLOCK_NUMBER(rowid))) cnt
             from index_cluster_tab     
              group by id) idx_tab,
           ( select id,  count(distinct(dbms_rowid.ROWID_BLOCK_NUMBER(rowid))) cnt
             from hash_cluster_tab     
              group by id) hash_tab 
        where idx_tab.id = unorg.id
          and hash_tab.id = unorg.id
        order by id;
ID              UNORGANIZED_BLOCKS INDEX_CLUSTER_BLOCKS HASH_CLUSTER_BLOCKS
———-             ——————                 ——————–                                     ——————-

1                 33                    5                   5

2                 34                    5                   5

3                 33                    5                   5

4                 33                    5                   5

5                 33                    5                   5

6                 32                    5                   5

7                 34                    5                   5

8                 34                    5                   5

9                 33                    5                   5

10                 32                    5                   5

90                 34                    5                   5

91                 34                    5                   5

92                 30                    5                   5

93                 34                    5                   5

94                 34                    5                   5

95                 34                    5                   5

96                 34                    5                   5

97                 33                    5                   5

98                 33                    5                   5

99                 34                    5                   5

100               34                    5                   5

- Trace  the query using exact match  on three tables and verify that hash cluster table gives the best performance .
– Let’s compare the statistics when rows for all the id’s are retrieved in succession from the three tables
conn / as sysdba
alter session set tracefile_identifier = ‘cluster_factor';
alter session set sql_trace=true;
declare
    type tab_row is table of unorganized%rowtype;
    tab_rows tab_row;
    type id_val is table of unorganized.id%type;
    id_vals id_val;   
begin
    select distinct id bulk collect into id_vals
      from unorganized;   for k in id_vals.first .. id_vals.last
    loop
        select * bulk collect into tab_rows
          from unorganized
         where id = k;
    end loop;
end;
/
declare
    type tab_row is table of index_cluster_tab%rowtype;
    tab_rows tab_row;
    type id_val is table of index_cluster_tab.id%type;
    id_vals id_val;    begin
    select distinct id bulk collect into id_vals
      from index_cluster_tab;    for k in id_vals.first .. id_vals.last
    loop
        select * bulk collect into tab_rows
          from index_cluster_tab
         where id = k;
    end loop;
end;
/declare
    type tab_row is table of hash_cluster_tab%rowtype;
    tab_rows tab_row;
    type id_val is table of hash_cluster_tab.id%type;
    id_vals id_val;    begin
    select distinct id bulk collect into id_vals
      from hash_cluster_tab;    for k in id_vals.first .. id_vals.last
    loop
        select * bulk collect into tab_rows
          from hash_cluster_tab
         where id = k;
    end loop;
end;
/
– Find out the name of trace file generatedSQL> col trace_file for a100
           select  value trace_file from v$diag_info
           where upper(name) like ‘%TRACE FILE%';
TRACE_FILE
—————————————————————————————————
/u01/app/oracle/diag/rdbms/orcl1/orcl1/trace/orcl1_ora_24963_cluster_factor.trc– Run tkprof utility on the trace file generated $cd /u01/app/oracle/diag/rdbms/orcl/orcl/trace
   rm cluster_factor.out   tkprof /u01/app/oracle/diag/rdbms/orcl1/orcl1/trace/orcl1_ora_8403_cluster_factor.trc cluster_factor.out

   vi cluster_factor.out

********************************************************************************
Here are the contents of the trace file:

In case of unorganized table, it can be seen that no. of blocks visited (3517) is approaches the number of rows (3400) in the table as rows for an id  are scattered across a large no. of blocks.

SQL ID: 0npa78p7jkfa5

Plan Hash: 1120857569

SELECT *
FROM
UNORGANIZED WHERE ID = :B1

call     count       cpu    elapsed       disk      query    current        rows
——- ——  ——– ———- ———- ———- ———-  ———-
Parse        1      0.00       0.00          0          0          0           0
Execute    100      0.00       0.00          0          0          0           0
Fetch      100      0.01       0.02          0       3517          0        3400
——- ——  ——– ———- ———- ———- ———-  ———-
total      201      0.01       0.02          0       3517          0        3400

*******************************************************************************
In case of single table index cluster, 
Total  I/O’s  = I/O’s against the table + I/O’s against the table

i/O’s against the table = no. of table blocks across which various records for different id’s are stored

Since we saw earlier that records for each key value are scattered across 5 blocks,

I/O’s against the table =  no. of distinct key values (id’s) * no. of blocks occupied by records for an id
= 100 * 5
= 500

Rest 100 I/O’s are made against against the index ( one I/O for each key value)

Hence total I/O’s = 100 + 500 = 600
********************************************************************************

SQL ID: 6qy378ww4729s
Plan Hash: 3651720007
SELECT *
FROM
INDEX_CLUSTER_TAB WHERE ID = :B1

call     count       cpu    elapsed       disk      query    current        rows
——- ——  ——– ———- ———- ———- ———-  ———-
Parse        1      0.00       0.00          0          0          0           0
Execute    100      0.00       0.00          0          0          0           0
Fetch      100      0.00       0.00          0        600          0        3400
——- ——  ——– ———- ———- ———- ———-  ———-
total      201      0.00       0.00          0        600          0        3400

*******************************************************************************
In case of single table hash cluster, as index access is not needed,
Total  I/O’s  = I/O’s against the table
=  no. of distinct key values (id’s) * no. of blocks occupied by records for an id
= 100 * 5
= 500
********************************************************************************

SQL ID: ctnu91v20p2x2

Plan Hash: 3860562250

SELECT *
FROM
HASH_CLUSTER_TAB WHERE ID = :B1

call     count       cpu    elapsed       disk      query    current        rows
——- ——  ——– ———- ———- ———- ———-  ———-
Parse        1      0.00       0.00          0          0          0           0
Execute    100      0.00       0.00          0          0          0           0
Fetch      100      0.00       0.00          0        500          0        3400
——- ——  ——– ———- ———- ———- ———-  ———-
total      201      0.00       0.00          0        500          0        3400

*******************************************************************************

Summarizing the above results :

unorganized          index_cluster_tab      hash_cluster_tab
total  CPU       0.01                             0.00                   0.00

elapsed            0.01                              0.02                   0.00
time

I/O’s               3517                              600                    500

Hence, it can be concluded that for exact match queries hash clusters give the best performance since least no. of I/O’s are made.

- Trace the query using range scan  on three tables and verify that index cluster table gives the best performance .

– Let’s compare the statistics when rows for entire range of id’s are retrieved from the three tables

conn / as sysdba
alter session set
tracefile_identifier = ‘cluster_factor';
alter session set
sql_trace=true;
declare
    type tab_row is table of
unorganized%rowtype;
    tab_rows tab_row;
    type id_val is table of
unorganized.id%type;
    id_vals id_val;
    begin
    select distinct id bulk collect into id_vals
     from unorganized;
    for j in id_vals.first..id_vals.first loop
     for k in id_vals.last .. id_vals.last loop
       select * bulk collect into tab_rows
       from unorganized
       where id >= j and id <= k;
     end loop;
    end loop;
end;
/
  
declare
    type tab_row is table of index_cluster_tab%rowtype;
    tab_rows tab_row;
    type id_val is table of index_cluster_tab.id%type;
    id_vals id_val;
    begin
    select distinct id bulk collect into id_vals
    from index_cluster_tab;
  
    for j in id_vals.first..id_vals.first loop
     for k in id_vals.last .. id_vals.last loop
        select * bulk collect into tab_rows
          from index_cluster_tab   
         where id >= j and id <= k;
      end loop;
    end loop;
end;
/
declare
    type tab_row is table of hash_cluster_tab%rowtype;
    tab_rows tab_row;
    type id_val is table of hash_cluster_tab.id%type;
    id_vals id_val;
    begin
    select distinct id bulk collect into id_vals
      from hash_cluster_tab;
    for j in id_vals.first..id_vals.first loop
     for k in id_vals.last .. id_vals.last loop
        select * bulk collect into tab_rows
          from hash_cluster_tab
      where id >= j and id <= k;
     end loop;
    end loop;
end;
/
  
– Find out the name of trace file generated
 SQL> col
trace_file for a100
           select  value trace_file from v$diag_info
           where upper(name) like ‘%TRACE
FILE%';
TRACE_FILE
—————————————————————————————————-
/u01/app/oracle/diag/rdbms/orcl1/orcl1/trace/orcl1_ora_24963_cluster_factor.trc
– Run tkprof utility on the trace file generated
$cd /u01/app/oracle/diag/rdbms/orcl/orcl/trace
   rm cluster_factor.out
   tkprof /u01/app/oracle/diag/rdbms/orcl1/orcl1/trace  /orcl1_ora_8403_cluster_factor.trc  cluster_factor.out
   vi cluster_factor.out
 Here are the contents of the trace file:

In case of unorganized table, it can be seen that Full table scan is done and

total I/O’s = Physical I/O’s + logical I/O’s

= 486 + 489 = 975

CPU usage = 0.01

elapsed time = 0.01
cost             = 127
********************************************************************************
SQL ID:
dpg9s5v7jannv
Plan Hash:
3859503019
SELECT *
FROM
 UNORGANIZED WHERE ID >= :B2 AND ID <=
:B1
  
call     count
cpu    elapsed       disk
query    current        rows
——-
——  ——– ———- ———-
———- ———-  ———-
Parse        1
0.00       0.00          0          0          0           0
Execute      1
0.00       0.00          0          0          0           0
Fetch        1
0.01       0.01        486        489          0        3400
——-
——  ——– ———- ———-
———- ———-  ———-
total        3
0.01       0.01        486        489          0        3400
Rows     Row Source Operation
——-
—————————————————
   3400
FILTER  (cr=489 pr=486 pw=0
time=3524 us)
   3400
TABLE ACCESS FULL UNORGANIZED (cr=489 pr=486 pw=0 time=1888 us cost=127
size=3073600 card=3400)
********************************************************************************
 In case of index cluster table, it can be seen that index range scan is performed on cluster index followed by table access cluster.

total I/O’s = Physical I/O’s + logical I/O’s

= 385 + (501 + 1)

= 887

CPU usage = 0.01

elapsed time = 0.01
cost             =  101
*******************************************************************************
SQL ID:
22k91ut1b18nj
Plan Hash:
533030663
SELECT *
FROM
 INDEX_CLUSTER_TAB WHERE ID >= :B2 AND ID
<= :B1
call     count
cpu    elapsed       disk
query    current        rows
——-
——  ——– ———- ———-
———- ———-  ———-
Parse        1
0.00       0.00          0          0          0           0
Execute      1
0.00       0.00          0          0          0           0
Fetch        1
0.01       0.01        385        501          0        3400
——-
——  ——– ———- ———-
———- ———-  ———-
total        3      0.01
0.01        385        501          0        3400
Misses in library
cache during parse: 1
Optimizer mode:
ALL_ROWS
Parsing user id:
SYS   (recursive depth: 1)
Rows     Row Source Operation
——-  —————————————————
   3400
FILTER  (cr=501 pr=385 pw=0
time=9441 us)
   3400
TABLE ACCESS CLUSTER INDEX_CLUSTER_TAB (cr=501 pr=385 pw=0 time=7679 us
cost=101 size=3073600 card=3400)
    100
INDEX RANGE SCAN INDEX_CLUSTER_IDX (cr=1 pr=0 pw=0 time=99 us cost=1
size=0 card=1)(object id 80458)
********************************************************************************
 In case of hash cluster table, it can be seen that hash access is made to the table.

total I/O’s = Physical I/O’s + logical I/O’s

= 503 + 506

= 1009

CPU usage = 0.04

elapsed time = 0.04
cost = 132
*******************************************************************************
 SQL ID:
c2www0m7npkqp
Plan Hash:
4115468836
SELECT *
FROM
 HASH_CLUSTER_TAB WHERE ID >= :B2 AND ID
<= :B1
call     count
cpu    elapsed       disk
query    current        rows
——-
——  ——– ———- ———-
———- ———-  ———-
Parse        1
0.00       0.00          0          0          0           0
Execute      1
0.00       0.00          0          0          0           0
Fetch        1
0.04       0.04        503        506          0        3400
——-
——  ——– ———- ———-
———- ———-  ———-
total        3
0.04       0.04        503        506          0        3400
Misses in library
cache during parse: 1
Optimizer mode:
ALL_ROWS
Parsing user id:
SYS   (recursive depth: 1)
Rows     Row Source Operation
——-
—————————————————
   3400
FILTER  (cr=506 pr=503 pw=0
time=39906 us)
   3400
TABLE ACCESS FULL HASH_CLUSTER_TAB (cr=506 pr=503 pw=0 time=37389 us
cost=132 size=3073600 card=3400)
********************************************************************************
 Summarizing above results :
                                unorganized          index_cluster_tab      hash_cluster_tab

total I/O’s                      975                     887                        1009

CPU usage                    0.01                    0.01                        0.04

elapsed time                  0.01                     0.01                        0.04
cost                              127                      101                         132
 
Hence, to search a range of values,  single table index cluster is the best choice.
CPU usage, I/O’s  and cost is the maximum in case of single table hash cluster table.
 
- Verify that index and hash cluster tables have better clustering factor .

Let’s compare clustering factor of indexes on the three tables.

Tables unorganized and index_cluster_tab already have index.

– Let’s create index on hash_cluster_tab and gather statistics .

SQL>create index hash_cluster_idx on hash_cluster_tab(id);

          exec dbms_stats.gather_index_stats(USER, ‘HASH_CLUSTER_IDX’);

          exec dbms_stats.gather_index_stats(USER, ‘INDEX_CLUSTER_IDX’);   

– Find out clustering factor of the three tables.

SQL> select index_name, clustering_factor

           from  user_indexes

          where index_name in (‘UNORGANIZED_IDX’, ‘INDEX_CLUSTER_IDX’, ‘HASH_CLUSTER_IDX’);

INDEX_NAME                     CLUSTERING_FACTOR

—————————— —————–

HASH_CLUSTER_IDX                             500

INDEX_CLUSTER_IDX                            100

UNORGANIZED_IDX                             3311

– Note that

– clustering factor of index on unorganized table approaches no. of rows in the table (3400).

– clustering factor of index on hash_cluster_tab table  = 500 . As entries for each id are spread across 5 blocks, 500 blocks need to be accessed to get all the rows and index is aware of this information.

– clustering factor of index on index_cluster_tab table = 100 as there are 100 entries (one for each id) in the index. Here also 500 table blocks need to be accessed to get all the rows but index contains information about only the first(or may be  the last) data block for an id. Rest 4 blocks containing records for that id are chained to it and index does not have that information and clustering factor of an index is computed on the basis of the information available in the index. That’s why clustering factor in this case = no. of index entries.

SUMMARY:
  •  Clustered tables cannot be truncated.
  •  Choosing the Key :Choosing the correct cluster key is dependent on the most common types of queries issued against the clustered tables. The cluster key should be on the column against which queries are most commonly issued.
 
HASH CLUSTERS
A hash cluster stores related rows together in the same data blocks. Rows in a hash cluster are stored together based on their hash value.
  • – Hash clusters are a great way to reduce IO on some tables, but they have their downside.

   *If too little space is reserved for each
key (small SIZE value)
, or if the cluster is created with too few hash keys (small HASHKEYS), then each key will split across multiple blocks negating the benefits of the cluster.When creating a hash cluster, it is important to choose the cluster key correctly and set the HASH IS, SIZE, and HASHKEYS parameters so that performance and space use are optimal.

   * If too much space is reserved for each key (large SIZE value), or if the cluster is created with too many hash keys (large
HASHKEYS), then the cluster will contain thousands of empty blocks that slow down full table scans .
 A SIZE value much larger results in wasted space.

  •  Hash clusters reduce contention and I/O since index is not accessed .When you use an index range scan + table access by index rowid, the root index block becomes  a “hot block” causing  contention for the cache buffers chains (cbc) latch and hence an increase in CPU usage.
  • Hashing works best for exact match searches i.e. SELECT … WHERE cluster_key = …;  

              A  properly sized hash cluster for a lookup table gives pretty much a SINGLE IO for a keyed lookup.

  •  Hash clusters should only really be used for tables which are static in size so that you can determine the number of rows and amount of space required for the tables in the cluster. If tables in a hash cluster require more space than the initial allocation for the cluster, performance degradation can be substantial because overflow blocks are required.
  • Hash clusters should only really be used for tables which have mostly read-only data. The hash  cluster will take marginally longer
    to insert into since the data now has a “place” to go and maintaining this structure will take longer then maintaining a HEAP table .Updates do not provide much overhead unless the hashkey is being updated.
  •  Hash clusters should not be used in  applications where most queries on the table retrieve rows over a range of cluster key values where a hash function cannot be used to determine the location of specific hash keys and  instead, the equivalent of a full table scan must be done to fetch the rows for the query:
  •  Hash clusters should not be used in applications where hash key is updated. The hashing values can not be recalculated and thus serious overflow can result.
  •  Hash clusters should not be used for tables which are not static and continually growing. If a table grows without limit, the space required over the life of the table (its cluster) cannot be predetermined.
  •   Hash clusters should not be used for when you cannot afford to pre-allocate the space that the hash cluster will eventually
    need.
  •  Hash clusters allocate all the storage for all the hash buckets when the cluster is created, so they may waste space.
  •  Full scans on single table hash clusters will cost as much as they would in a heap table.
INDEX CLUSTERS
In an indexed cluster, Oracle stores together rows having the same cluster key value. Each distinct cluster key value is stored only once in each data block, regardless of the number of tables and rows in which it occurs. This saves disk space and improves performance for many operations.
  •   Index clusters should be used for the apllications where most queries on the table retrieve rows over a range of cluster key values. For example, in full table scans or queries such as the following:

            SELECT . . . WHERE cluster_key < . . . ;

  • With an index, key values are ordered in the index, so cluster key values that satisfy the WHERE clause of a query can be found with relatively few I/Os.
  •   Index clusters should be used for the tables which are not static, but instead are continually growing and the space required over the life of the table (its cluster) cannot be predetermined.
  •  Index clusters should be used for applications which frequently perform full-table scans on the table and the table is sparsely populated. A full-table scan in this situation takes longer under hashing.
  •  Cluster index has one entry per cluster key and not for each row. Therefore, the index is smaller and less costly to access for finding multiple rows.

 References:

————————————————————–

Related links:

Home

Tuning Index

Buffer Cache Wait Events

Clustering Factor Demystified Part-I 

Clustering Factor Demystified Part-II

Direct Read Enhancements in 11g

Single Table Index Clusters

SingleTable Hash Clusters 

SQL Profile Demystified : Part – I

SQL Profile Demystified Part – II

Undocumented Parameters in Oracle 11g

 

———————–

AUTOMATIC DEGREE OF PARALLELISM (DOP) – PART – II

    

 In my last article , I discussed Automatic DOP computation in Oracle 11g Release 2.  In this  article, I will demonstrate the oracle behaviour for various values of PARALLEL_DEGREE_POLICY i.e. Manual, Limited and Auto.
—–      PARALLEL_DEGREE_POLICY= MANUAL —————-
SQL>conn / as sysdba
          alter system set parallel_adaptive_multi_user=false;
          conn sh/sh
         alter session set parallel_degree_policy=manual;
         alter session set parallel_min_time_threshold=auto;
— CREATE TABLES —
— P5 – DICTIONARY DOP = 5
– PDEF – DICTIONARY DOP = DEFAULT
SQL>drop table p5 purge;
          create table p5 parallel 5
         as select * from sales
        where rownum < 5000;
SQL>drop table pdef purge;
   
         create table pdef parallel
        as select * from sales
        where rownum < 5000;
SQL> select table_name,degree
from user_tables
          where table_name in (‘P5′, ‘PDEF’);
TABLE_NAME                    DEGREE
———————————————————————-
P5                                             5
PDEF                                DEFAULT
– CHECK THAT PARALLELISM USED = DICTIONARY DOP IN BOTH THE TABLES –
SQL> select count(*) from p5;
 
          select * from v$pq_sesstat
          where statistic = ‘Allocation Height';
STATISTIC                     LAST_QUERY SESSION_TOTAL
—————————————- ————-
Allocation Height                      5            0
SQL> select  count(*) from pdef;
 
     select * from v$pq_sesstat
     where statistic = ‘Allocation Height';
STATISTIC                     LAST_QUERY SESSION_TOTAL
—————————————- ————-
Allocation Height                      2            0
– PRIOR TO 11G R2, TO DECIDE PARALLELISM, THE DBA COULD SPECIFY DIFFERENT
– DICTIONARY DOP’S FOR DIFFERENT OBJECTS
– TO OVERRIDE THE DICTIONARY DOP, HE COULD USE HINTS AT OBJECT LEVEL –
SQL> select /*+ parallel (p5 8) */ count(*) from p5;
 
     select * from v$pq_sesstat
     where statistic = ‘Allocation Height';
STATISTIC                     LAST_QUERY SESSION_TOTAL
—————————————- ————-
Allocation Height                      8            0
SQL> select /*+ parallel (pdef 8) */ count(*) from pdef;
 
     select * from v$pq_sesstat
     where statistic = ‘Allocation Height';
STATISTIC                     LAST_QUERY SESSION_TOTAL
—————————————- ————-
Allocation Height                      8            0
—–      PARALLEL_DEGREE_POLICY= LIMITED —————-
SQL> alter session set parallel_degree_policy=limited;
– CHECK THAT DOP IS NOT COMPUTED FOR TABLES WITH NON DEFAULT –
– DICTIONARY DOP – FOR TABLE P5, DOP USED = DICTIONARY DOP = 5
SQL> select count(*) from p5;
 
     select * from v$pq_sesstat
     where statistic = ‘Allocation Height';
STATISTIC                     LAST_QUERY SESSION_TOTAL
—————————————- ————-
Allocation Height                    5            0
– CHECK THAT DOP IS  COMPUTED FOR TABLES WITH DEFAULT –
– DICTIONARY DOP – FOR TABLE PDEF, DOP IS COMPUTED TO BE 0 I.E. PARALLELIZATION
– IS NOT REQUIRED CONSIDERING THE SMALL SIZE OF THE OBJECT
SQL> select  count(*) from pdef;
 
     select * from v$pq_sesstat
     where statistic = ‘Allocation Height';
STATISTIC                     LAST_QUERY SESSION_TOTAL
—————————————- ————-
Allocation Height                      0            0
—–      PARALLEL_DEGREE_POLICY= AUTO —————-
SQL> alter session set parallel_degree_policy=auto;
– EXECUTE QUERY ON P5
– NOTE THAT DOP USED = 0 I.E. QUERY IS EXECUTED SERIALLY
– BECAUSE WITH CURRENT SIZE OF THE TABLE, SERIAL EXECUTION
– CAN EXECUTE THE STATEMENT WITHIN THE TIME SPECIFIED BY
– PARALLEL_MIN_TIME_THRESHOLD

SQL> select count(*) from p5;

 
     select * from v$pq_sesstat
     where statistic = ‘Allocation Height';
STATISTIC                    LAST_QUERY SESSION_TOTAL
—————————————- ————-
Allocation Height                      0            0
– RECREATE THE TABLE WITH MORE ROWS SO THAT PARALLELIZATION
– IS NEEDED TO EXECUTE THE QUERY
SQL>drop table p5 purge;
    create table p5 parallel 5
    as select * from sh.sales;
– REEXECUTE THE QUERY AND CHECK THAT PARALLELIZATION
– IS USED AND DOP USED IS NOT 5 (DICTIONARY DOP) I.E. AUTOMATIC
– DOP IS COMPUTED EVEN IN CASE OF TABLE WHERE DICTIONARY DOP IS
– NOT DEFAULT
     select count(*) from p5;
 
     select * from v$pq_sesstat
     where statistic = ‘Allocation Height';
STATISTIC                     LAST_QUERY SESSION_TOTAL
—————————————- ————-
Allocation Height                      2            0
– MODIFY
PARALLEL_MIN_TIME_THRESHOLD TO 20 SECS –
SQL>alter session set parallel_min_time_threshold=20;
EXECUTE THE SAME QUERY AGAIN. CHECK THAT THE QUERY
– IS EXECUTED SERIALLY AS THE TIME REQUIRED TO EXECUTE
– IT SERIALLY IS LESS THAN 20 SECS AND HENCE PARALLELIZATION
– IS NEEDED

SQL>select count(*) from p5;

 
     select * from v$pq_sesstat
     where statistic = ‘Allocation Height';
STATISTIC                     LAST_QUERY SESSION_TOTAL
—————————————- ————-
Allocation Height                      0            0
SQL>alter session set parallel_min_time_threshold=auto;
– DOP IS COMPUTED COMPUTED FOR THOSE TABLES ALSO FOR WHICH
– DICTIONARY DOP IS DEFAULT. NOTE THAT THE FOLLOWING STATEMENT
– EEXCUTES SERIALLY AS DUE TO SMALL SIZE OF THE TABLE, PARALLELIZATION
– IS NOT REQUIRED
SQL>select  count(*) from pdef;
 
     select * from v$pq_sesstat
     where statistic = ‘Allocation Height’;
STATISTIC                     LAST_QUERY SESSION_TOTAL
—————————————- ————-
Allocation Height                      0            0
– WE CAN OVERRIDE AUTOMATIC COMPUTATION OF DOP BY
GIVING OBJECT LEVEL HINT
SQL>select /*+ parallel (p5 8) */ count(*) from p5;
 
     select * from v$pq_sesstat
     where statistic = ‘Allocation Height';
STATISTIC                     LAST_QUERY SESSION_TOTAL
—————————————- ————-
Allocation Height                      8            0
– IF TABLE SIZE IS SO SMALL THAT PARALLELISM IS NOT REQUIRED
– EVEN SPECIFIED DOP IN THE HINT IS OVERRIDDEN. IF WE EXECUTE
– QUERY ON PDEF WHOSE SIZE IS SMALL, THE QUERY EXECUTES SERIALLY
– EVEN THOUGH WE HAVE SPECIFIED DOP = 8

SQL>select /*+ parallel (pdef 8) */ count(*) from pdef;

 
     select * from v$pq_sesstat
     where statistic = ‘Allocation Height';
STATISTIC                     LAST_QUERY SESSION_TOTAL
—————————————- ————-
Allocation Height                      0            0
References:
———————————————————————————————

Related links :

Home

Database Index
Automatic Degree Of Paralellism – Part-I
Oracle 11g : Automatic DOP – Parallel Threshold
Oracle 11g :  Parallel Statement Queueing
Parallel_Adaptive_Multi_User
Parallel_Min_Percent
Parallel_Min_Servers

                                                                              ——————

AUTOMATIC DEGREE OF PARALLELISM (DOP) – PART – I

In a serial – non-parallel – execution environment, a single process or thread undertakes  the operations required to process your SQL statement and each action must complete before  the succeeding action can commence.Parallel SQL allows a SQL statement to be processed by multiple threads or processes simultaneously.Oracle supports parallel processing for a wide range of operations, including queries, DDL and DML:
  •  Queries that involve table or index range scans.
  •  Bulk insert, update or delete operations.
  •  Table and index creation.
In earlier versions of the Oracle Database, we had to determine the DOP more or less  manually, either with a parallel hint or by setting a parallel degree with alter table. 
There was an automatic computation of the DOP available for the objects with dictionary DOP of default, derived from the simple formula:
 CPU_COUNT * PARALLEL_THREADS_PER_CPU. 
If there were insufficient parallel servers to satisfy the requested DOP, one of three things could occur:
  • The SQL would be run at a reduced DOP (be downgraded)
  • The SQL would run in serial mode (be serialized)
  • If PARALLEL_MIN_PERCENT was specified and less than the nominated percentage of the DOP was achievable, then the the SQL statement might terminate with “ORA-12827: insufficient parallel query slaves available”.
Oracle 11g release 2 introduced Automatic computation of DOP to parallelize
SQL. Automatic DOP whichis – as the name says – a way of having Oracle determine the degree of parallelism depending on the nature of the operations to be performed and the sizes of the objects involved. The decision is made based on various initialization parameter settings.
1. PARALLEL_DEGREE_POLICY. It can have 3 values : MANUAL, LIMITED and AUTO.
  • MANUAL – This is the default. Disables Auto DOP, statement queuing and in-memoryparallel execution. It reverts the behavior of parallel execution to what it was previous to Oracle Database 11g, Release 2 (11.2)..
  • LIMITED- will only assign a system computed  degree of parallelism (DOP) for tables decorated with a parallel degree of
  • DEFAULT
  1. Tables and indexes that have a specific DOP specified use that specified DOP.   
  2.  Parallel statement queuing and in-memory parallel execution are disabled.
  3. The optimizer computes the optimal degree-of-parallelism value based on estimated  execution statistics..
  • AUTO
  1. will consider to assign a system computed DOP to all tables
  2. Enables parallel statement queuing :If the requested or required DOP is not possible because parallel servers are busy, then Oracle will defer statement execution rather than downgrading or  serializing the SQL until enough parallel slaves become available.
  3. Enables in-memory parallel execution.

2. PARALLEL_MIN_TIME_THRESHOLD : Oracle 11gR2 will ascertain if the query’s estimated execution time is likely to run longer than the acceptable value (in seconds) for PARALLEL_MIN_TIME_THRESHOLD and, if sufficient resources for parallel execution exist right now, it will allow the query to execute; otherwise, it will delay its execution until sufficient resources exist. This helps prevent a single parallel query from consuming excessive resources at the cost of other non-parallelizable operations.The default of this parameter is 10 seconds.

3. PARALLEL_ADAPTIVE_MULTI_USER : If it is set to TRUE, then Oracle will adjust the degree of parallel based on the overall load on the system. When the system is more heavily loaded, then the degree of parallelism will be reduced.

4. If PARALLEL_IO_CAP is set to TRUE in 11g or higher, then Oracle will limit the Degree of  Parallelism to that which the IO subsystem can support. The IO subsystem limits can be calculated by using the procedure DBMS_RESOURCE_MANAGER.CALIBRATE_IO.

5. PARALLEL_MAX_SERVERS : To avoid an arbitrary number of parallel processes to be running on a system, which may overload that system, the parameter parallel_max_servers provides a hard upper boundary. Regardless of any other setting, the degree of parallelism cannot exceed that which can be supported by PARALLEL_MAX_SERVERS. For most SQL statements, the number of servers required will be twice the Degree of Parallelism.

6. PARALLEL_DEGREE_LIMIT : It is the maximum DOP that can be used. It can take various values :

  •  CPU (Default) : the total CPU count of the system
  •  IO : the limit depends on the I/O load on the database host
  •  any number greater than  0.

Note that even if you increase the parameter value, the automatically computed degree of parallelism may not increase, because the database may determine that the computed degree of parallelism is sufficient.The DOP that we run the statement with is the minimum value of the computed DOP (or ideal DOP) and that parallel_degree_limit parameter.

• A degree of parallelism can be specified at the table or index level by using the PARALLEL clause of CREATE TABLE, CREATE INDEX, ALTER TABLE or ALTER INDEX.

• The PARALLEL hint can be used to specify the degree of parallelism for a specific table within a query.

In my next article in this series , I will practically demonstrate oracle behaviour for various values of the parameter PARALLEL_DEGREE_POLICY i.e. MANUAL, LIMITED AND AUTO

References:
http://searchitchannel.techtarget.com/feature/Using-parallel-SQL-to-improve-Oracle-database-performance
http://docs.oracle.com/cd/E11882_01/server.112/e25523/parallel002.htm

http://kerryosborne.oracle-guy.com/2014/01/does-parallel_degree_limit-work-with-parallel_degree_policymanual/
http://www.databasejournal.com/features/oracle/oracle-parallel-processing-new-and-improved.html
http://www.pythian.com/blog/secrets-of-oracles-automatic-degree-of-parallelism/
http://hemantoracledba.blogspot.in/2015/02/parallel-execution-1-parallel-hint-and.html?utm_source=feedburner&utm_medium=email&utm_campaign=Feed:+blogspot/JcnHK+(Hemant%27s+Oracle+DBA+Blog)
https://hourim.wordpress.com/2015/02/24/parallel-query-broadcast-distribution-and-temp-space/
http://www.oracle.com/technetwork/articles/datawarehouse/twp-parallel-execution-fundamentals-133639.pdf
https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1657271217898
What Is Auto DOP?
Configuring And Controlling Auto DOP
https://blogs.oracle.com/datawarehousing/entry/parallel_degree_limit_parallel_max
In Memory Parallel execution in Oracle Database 11g R2
Parameter Changes For Parallel Execution In Oracel Database 12cR2

———————————————————————————————

Related links :


Home

Database Index
Automatic Degree Of Paralellism – Part-II
Oracle 11g : Automatic DOP – Parallel Threshold
Oracle 11g :  Parallel Statement Queueing
Parallel_Adaptive_Multi_User
Parallel_Min_Percent
Parallel_Min_Servers

                                           —————-

SINGLE TABLE HASH CLUSTERS DEMYSTIFIED


In this post, I will discuss single table hash clusters which we can fetch a row requiring only a single block get most of the time. Hash clusters are quite similar to index clusters, except that the cluster index is missing. The data in the cluster itself acts like an index.
First, let us see how is the data stored in a hash cluster.
Let’s say we have a hash cluster table with two columns id (number) and text (char) where  id is the key column. On  inserting a record with id = 1 into the table- The key column value (1 here) is hashed and  converted into a diskaddress.- The record is placed at the disk address returned above.Now, if  another record is inserted with id = 2, , the key column value 2 is hashed to return a different address. Thus records having different key values will be placed at different disk addresses whereas  all the records with same key value will go to the same address on the disk.
Now, let’s see how the data is retrieved from  a hash cluster as compared to a conventional indexed cluster  table.In a conventional indexed heap table , if there is a unique index on the column being queried, the algorithm followed is :- Perform unique scan of the index to locate the key and get the rowid (i.e. at least 2 I/O’s – one for index root block, one for index leaf block and may be more as blevel of the index increases)

– Using the rowid obtained above, access the table by rowid (i.e. one I/O to get the data)

whereas, the procedure followed in  a hash cluster is :

– Hash the primary key to get the physical location of the record

– Perform single I/O to read the block.

Thus, hash clusters can  reduce I/O on  tables.

How to create a single table hash cluster?

create cluster <cluster_name> (cluster_key <datatype>)

size <size_number> single table hashkeys <hash_keys_number> hash is
<expr>;

Let’s understand various clauses in create cluster command:

CLUSTER_KEY <datatype>

The cluster key should have a single column containing only integers.

Hash clusters having composite cluster keys or cluster keys made up of non integer columns use the internal hash function.

If a non-integer cluster key value is supplied and internal hash function is bypassed , the operation (INSERT or UPDATE statement) is rolled back and an error is returned.

SIZE <size_number>

Specifies the amount of space in bytes reserved in a block to store all rows having the same cluster key value or the same hash value. This space decides the maximum number of cluster or hash values stored in a data block.

SINGLE TABLE

SINGLE TABLE indicates that this  hash cluster can contain only one
table.. However, you may drop the table and create another  table in the
same cluster.

HASHKEYS <hash_keys_number>

Specify the HASHKEYS clause to create a hash cluster and specify the number of hash values for the hash cluster.The HASHKEYS value specifies and limits the number of unique hash values that can be generated by the hash function used by the cluster. (how many distinct values you anticipate for the cluster key over time)Oracle Database rounds up the HASHKEYS value to the nearest prime number to obtain the actual number of hash values. The minimum value for this parameter
is 2. If you omit the HASHKEYS parameter, the database creates an indexed cluster by default.

When you create a hash cluster, the database immediately allocates space for the cluster based on the values of the SIZE and HASHKEYS parameters.

It allocates a hash table to hold HASHKEY number of cluster keys of SIZE bytes each.

HASH IS <expr>

Specify an expression to be used as the hash function for the hash cluster. Must evaluate to a positive value

If you omit the HASH IS clause, then Oracle Database uses an internal hash function for the hash cluster.

—- Overview –

– create two single table hash clusters with
size = 8k , hashkeys = 4

cluster HSH_CLUSTER_ROWSIZ_8K and HSH_CLUSTER_ROWSIZ_4K

— Create  table hash_cluster_tab_ROWSIZE_8k in cluster HSH_CLUSTER_ROWSIZ_8K  with row size such that only one record fits one block

— Create  table hash_cluster_tab_ROWSIZE_4k in cluster HSH_CLUSTER_ROWSIZ_4K  with row size such that two records fit one block

— insert records for 5 distinct key values to both the tables

— Each record goes to a different block as hashkeys has been set to 5 (next prime no.) and each block can have records with only one hash key.

— Try to access record with a key value – single I/O

— Insert records for 6th key values although provision has made for 5 keys only and check that hash collision takes place as id = 6 is mapped to one of the already existing hash values (id=1)

— Add another record for id = 1 and check that multiple blocks containing rows with same hash value are chained together.

– Create 3 clusters with size = 2K, hashkeys = 4  

HSH_CLUSTER_SIZ_2K_ROW_1K,  HSH_CLUSTER_SIZ_2K_ROW_2K and HSH_CLUSTER_SIZ_2K_ROW_4K

— Create a table HSH_TAB_SIZ_2K_ROWSIZ_1K in cluster HSH_CLUSTER_SIZ_2K_ROW_1K with row size such that row size =1/2( specified size) = 1/2(2K) = 1K

— Create a table HSH_TAB_SIZ_2K_ROWSIZ_2K in cluster HSH_CLUSTER_SIZ_2K_ROW_2Kwith row size such that row size =  specified size 2K

— Create a table HSH_TAB_SIZ_2K_ROWSIZ_4K in cluster HSH_CLUSTER_SIZ_2K_ROW_4K with row size such that row size = 2( specified size) = 2(2K) = 4K

— Insert records for 5 distinct key values in the three tables

— Check that each block contains records for a maximum of 3 hashkeys.
— Insert 3 records with id = 1 in all the 3 tables

— Add records for id’s = 6 to 9 i.e. four keys more than what we have defined the cluster for (5).

— Note that it takes time to add these records

— Check that there is collision for hash keys – Multiple key values correspond to the same  hash key – the hash chain for a hash key becomes longer and contains records with different key values.

— Check that overallocation takes place i.e. a block holds rows for hashkeys more than it is expected to hold (3).
– Implementation –

– create two single table hash clusters with
size = 8k , hashkeys = 4

cluster HSH_CLUSTER_ROWSIZ_8K and HSH_CLUSTER_ROWSIZ_4K

– The cluster key column for the clusters is id. The column in table in this cluster does not have to be called ID, but it must be NUMBER(2), to match this definition.

– Also we have specified a SIZE 8K option which means that we expect about 8K bytes of data to be associated with each cluster key value. Oracle will use it to compute the maximum number of cluster keys that could fit per block. Here, maximum no. of hash keys per block = Block size/SIZE = 8K/8K = 1 i.e. each block can contain records having only one hash key.

– The value of HASHKEYS limits the number of unique hash values that can be generated by the hash function used for the cluster. Oracle rounds the number you specify for HASHKEYS to the nearest prime number (5 here as we have set  HASHKEYS to 4 and 5 is the next prime number) i.e.  for any cluster key value, the hash function generates a maximum of 5 values.

SQL>Create tablespace mssm datafile
‘/u01/app/oracle/oradata/orcl/mssm01.dbf’ size 100m

segment space management manual;

    alter system set deferred_segment_creation=FALSE;

    drop  cluster  HSH_CLUSTER_ROWSIZ_8K including tables;

  create cluster HSH_CLUSTER_ROWSIZ_8K     ( id number(2) )
size 8192 single table hash is id hashkeys 4 tablespace mssm;

      drop cluster HSH_CLUSTER_ROWSIZ_4K including tables;

create cluster HSH_CLUSTER_ROWSIZ_4K ( id number(2) )
size 8192 single table hash is id hashkeys 4 tablespace mssm;

Let’s create  tables in  these clusters ..

– Create  table
hash_cluster_tab_ROWSIZE_8k in cluster HSH_CLUSTER_ROWSIZ_8K  with row
size such that only one record fits one block

drop table hash_cluster_tab_rowsize_8k purge;

CREATE TABLE HASH_CLUSTER_TAB_ROWSIZE_8K
(id          number(2) ,
txt1        char(2000),
txt2        char(2000),
txt3        char(2000)
 ) CLUSTER HSH_CLUSTER_ROWSIZ_8K( id );

– Create  table
hash_cluster_tab_ROWSIZE_4k in cluster HSH_CLUSTER_ROWSIZ_4K  with row
size such that two records fit one block

drop table hash_cluster_tab_rowsize_4k purge; 

CREATE TABLE hash_cluster_tab_rowsize_4k
(id          number(2) ,
txt1        char(2000),
txt2        char(1000)CLUSTER HSH_CLUSTER_ROWSIZ_4K( id );

– check the blocks assigned to the clusters –

SQL> col segment_name for a30
select segment_name, extent_id, block_id, blocks  
 from dba_extents where segment_name like ‘%HASH_CLUSTER%';

SEGMENT_NAME    EXTENT_ID   BLOCK_ID     BLOCKS

—————————— ———- ———- ———-

HSH_CLUSTER_ROWSIZ_8K        0       128          8

HSH_CLUSTER_ROWSIZ_4K       0       136          8

– Insert 5 records each with a  distinct
key value to both the tables

SQL>Begin
    for i in 1..5 loop
     insert into hash_cluster_tab_rowsize_8k values (i,  ‘x’, ‘x’, ‘x’);
     insert into hash_cluster_tab_rowsize_4k values (i,  ‘x’, ‘x’);
       end loop;
     commit;
    end;
/

Let’s find out the blocks the records have
gone to ..

– Each record goes to a different block as hashkeys has been set to 5 (next prime no.)
First block assigned to the cluster(128, 136) is used to store header info.

Records occupy blocks second block onwards

Note that record for id=5 goes to the first block as mod (id, hashkeys) decides the block no. for a key and mod (5,5) = 0

0 denotes the first block to accommodate records for the table

Note that records for id= 1  to 4  go to the subsequent  blocks as mod (1,5) = 1, mod(2,5) = 2 …

In hash_cluster_tab_rowsize_4k, even though two rows can fit one block, still each row has gone to a different block   as one block has been assigned to a hash value. and

SQL>select ‘HASH_CLUSTER_TABLE_8K’
TAB_NAME , id,    dbms_rowid.rowid_block_number(t1.rowid) hash_cluster_block

from hash_cluster_tab_rowsize_8k t1 
    union
select ‘HASH_CLUSTER_TABLE_4K’, id,
dbms_rowid.rowid_block_number(t2.rowid) hash_cluster_block
from hash_cluster_tab_rowsize_4k t2
     order by 1,2,3;

TAB_NAME
ID HASH_CLUSTER_BLOCK

——————— ———- ——————

HASH_CLUSTER_TABLE_4K
1
138

HASH_CLUSTER_TABLE_4K
2
139

HASH_CLUSTER_TABLE_4K
3
140

HASH_CLUSTER_TABLE_4K
4
141

HASH_CLUSTER_TABLE_4K
5
137

HASH_CLUSTER_TABLE_8K
1
130

HASH_CLUSTER_TABLE_8K
2
131

HASH_CLUSTER_TABLE_8K
3
132

HASH_CLUSTER_TABLE_8K
4
133

HASH_CLUSTER_TABLE_8K
5
129

– Try to access record with a key value –
single I/O is needed (cr=1) as hash access is used

SQL>conn / as sysdba


alter session set tracefile_identifier=’hash_cluster';


alter session set sql_trace=true;

       


select * from hash_cluster_tab_rowsize_8k where id = 1;


select * from hash_cluster_tab_rowsize_4k where id = 1;


alter session set sql_trace=false;


col value for a75


select value from v$diag_info where upper(name) like ‘%TRACE FILE%';

VALUE

—————————————————————————

/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_14452_hash_cluster.trc

ho
rm hash_cluster.out


ho tkprof
/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_14452_hash_cluster.trc
hash_cluster.out


ho vi hash_cluster.out

********************************************************************************

select * from

hash_cluster_tab_rowsize_8k where id = 1

call     count
cpu    elapsed
disk      query
current        rows

——- ——  ——– ———- ———- ———-
———-  ———-

Parse
1      0.00
0.00
0          2
0           0

Execute      1
0.00
0.00
0
0
0           0

Fetch
2      0.00
0.00
0         
0           1

——- ——  ——– ———- ———- ———-
———-  ———-

total
4      0.00
0.01
0          3
0           1

Rows     Row Source Operation

——-  —————————————————

1  TABLE ACCESS HASH
HASH_CLUSTER_TAB_ROWSIZE_8K (cr=1 pr=0
pw=0 time=0 us)

********************************************************************************

select * from

hash_cluster_tab_rowsize_4k where id = 1

call     count
cpu    elapsed
disk      query
current        rows

——- ——  ——– ———- ———- ———-
———-  ———-

Parse
1      0.00
0.00
0          2
0           0

Execute      1      0.00
0.00
0
0
0           0

Fetch
2      0.00
0.00
0          1  
0           1

——- ——  ——– ———- ———- ———-
———-  ———-

total
4      0.00       0.00
0
3
0           1

Rows     Row Source Operation

——-  —————————————————

1  TABLE ACCESS HASH
HASH_CLUSTER_TAB_ROWSIZE_4K (cr=1 pr=0
pw=0 time=0 us)

********************************************************************************

– Insert records for 6th key values although
provision has made for 5 keys only

SQL> insert into
hash_cluster_tab_rowsize_8k values (6,  ‘x’, ‘x’, ‘x’);


insert into hash_cluster_tab_rowsize_4k values (6,
‘x’, ‘x’);


commit;

– Let’s find out the blocks the records have
gone to .

–    In hash_cluster_tab_rowsize_4k, id = 6 is hashed
to one of the existing hash values and record is placed in the same block (138)
as the earlier containing record with same hash key as two records can fit one
block. Hence block 138 becomes overflow block as it contains more hashvalues
than it is configured for.

Here id = 6 has been mappped to same hash
value as id = 1 and has occupied the same block as id = 1

–    In hash_cluster_tab_rowsize_8k, since one row can
fit one block, Newly added rows have  gone to a new  block (147) as
the earlier blocks can’t accommodate new rows.

But it has again been mapped to one of the
already existing hash values and is chained to the blocks containing those key
values.

SQL>select ‘HASH_CLUSTER_TABLE_8K’
TAB_NAME , id, dbms_rowid.rowid_block_number(t1.rowid) hash_cluster_block


from hash_cluster_tab_rowsize_8k t1

    union

select ‘HASH_CLUSTER_TABLE_4K’, id,
dbms_rowid.rowid_block_number(t2.rowid) hash_cluster_block


from hash_cluster_tab_rowsize_4k t2

     order by 1,2,3;

TAB_NAME
ID HASH_CLUSTER_BLOCK

——————— ———- ——————

HASH_CLUSTER_TABLE_4K        
1
138

HASH_CLUSTER_TABLE_4K
2
139

HASH_CLUSTER_TABLE_4K
3
140

HASH_CLUSTER_TABLE_4K
4
141

HASH_CLUSTER_TABLE_4K
5
137

HASH_CLUSTER_TABLE_4K        
6
138

HASH_CLUSTER_TABLE_8K
1
130

HASH_CLUSTER_TABLE_8K
2
131

HASH_CLUSTER_TABLE_8K
3
132

HASH_CLUSTER_TABLE_8K
4
133

HASH_CLUSTER_TABLE_8K
5
129

HASH_CLUSTER_TABLE_8K        
6
147

– Let’s add another record for id = 1

In hash_cluster_tab_rowsize_2k,  although we have space for another row
in blocks containing id = 2,3,4,5 , the new record goes to a new block(142) as
one block can contain only one hashvalue.  From now onwards, a block which
contains id = 1 may also contain an entry for id=6 and vice versa since both of
them hash to the same value.

In hash_cluster_tab_rowsize_8k also , the new  record goes to a new
block (135) as a block can contain only one row. Presuming that id = 6 and 1
hash to the same value, blocks containing id = 1 (130 and 135) have been
chained to block containing id=6 (135).

From now onwards, when we search for id =1 or id = 6 we will have to scan a
larger no. of blocks.Hence, if actual no. of cluster keys exceeds the specified
value for HASHKEYS,  the likelihood of a collision (two cluster
key values having the same hash value) increases and performance degrades.

SQL>

insert into
hash_cluster_tab_rowsize_8k values (1,  ‘x’, ‘x’, ‘x’);

     insert into
hash_cluster_tab_rowsize_4k values (1,  ‘x’, ‘x’);

     commit;

   

  select ‘HASH_CLUSTER_TABLE_8K’
TAB_NAME , id, dbms_rowid.rowid_block_number(t1.rowid) hash_cluster_block


from hash_cluster_tab_rowsize_8k t1

    union all

select ‘HASH_CLUSTER_TABLE_4K’, id,
dbms_rowid.rowid_block_number(t2.rowid) hash_cluster_block


from hash_cluster_tab_rowsize_4k t2

     order by 1,2,3;

TAB_NAME
ID HASH_CLUSTER_BLOCK

——————— ———- ——————

HASH_CLUSTER_TABLE_4K          1
138

HASH_CLUSTER_TABLE_4K          1
142

HASH_CLUSTER_TABLE_4K
2
139

HASH_CLUSTER_TABLE_4K
3
140

HASH_CLUSTER_TABLE_4K
4
141

HASH_CLUSTER_TABLE_4K
5
137

HASH_CLUSTER_TABLE_4K          6
138

HASH_CLUSTER_TABLE_8K          1
130

HASH_CLUSTER_TABLE_8K          1
135

HASH_CLUSTER_TABLE_8K
2
131

HASH_CLUSTER_TABLE_8K
3
132

HASH_CLUSTER_TABLE_8K
4
133

HASH_CLUSTER_TABLE_8K
5
129

HASH_CLUSTER_TABLE_8K        
6
134

Therefore, the distribution of rows in a hash cluster is directly
controlled by the value set for the HASHKEYS parameter. With a larger number of
hash keys for a given number of rows, the likelihood of
a collision (two cluster key values having the same hash value)
decreases. Minimizing the number of collisions is important because overflow
blocks (thus extra I/O) might be necessary to store rows with hash values that
collide.

Now let’s play around with SIZE clause...

The maximum number of hash keys assigned per data block is determined by the
SIZE parameter of the CREATE CLUSTER command. SIZE is an estimate of the total
amount of space in bytes required to store the average number of rows associated
with each hash value. For example, if the available free space per data block
is 1700 bytes and SIZE is set to 500 bytes, three hash keys (round(1700/500))
are assigned per data block.

Note: The importance of the SIZE parameter of hash clusters is analogous
to that of the SIZE parameter for index clusters. However, with index clusters,
SIZE applies to rows with the same cluster key value instead of the same hash
value.

Although the maximum number of hash key values per data block is determined
by SIZE, Oracle does not actually reserve space for each hash key value in the
block. For example, if SIZE determines that three hash key values are allowed
per block, this does not prevent rows for one hash key value from taking up all
of the available space in the block. If there are more rows for a given hash
key value than can fit in a single block, the block is chained, as necessary.

– Create 3 clusters with size = 2K, hashkeys
= 4  

HSH_CLUSTER_SIZ_2K_ROW_1K,  HSH_CLUSTER_SIZ_2K_ROW_2Kand
HSH_CLUSTER_SIZ_2K_ROW_4K

— Available free space is slightly less that blocksize of 8K

— No. of hashkeys that can be stored in the block < 4 (blocksize/size =
8k/2K)

= 3

SQL>  drop cluster HSH_CLUSTER_SIZ_2K_ROW_1K
including tables;


create cluster HSH_CLUSTER_SIZ_2K_ROW_1K


( id number(2) )


size 2K single table hash is id hashkeys 4 tablespace mssm;

      drop cluster
HSH_CLUSTER_SIZ_2K_ROW_2Kincluding tables;


create cluster hash_cluster_size_2k_row_2k


( id number(2) )


size 2K single table hash is id hashkeys 4 tablespace mssm;

     drop cluster
HSH_CLUSTER_SIZ_2K_ROW_4K including tables;


create cluster HSH_CLUSTER_SIZ_2K_ROW_4K


( id number(2) )


size 2K single table hash is id hashkeys 4 tablespace mssm;

– Create a table HSH_TAB_SIZ_2K_ROWSIZ_1K in
cluster HSH_CLUSTER_SIZ_2K_ROW_1K with row size such that row size =1/2(
specified size) = 1/2(2K) = 1K 

SQL>drop table HSH_TAB_SIZ_2K_ROWSIZ_1K
purge;

CREATE TABLE HSH_TAB_SIZ_2K_ROWSIZ_1K

(
id          number(2) ,


txt1        char(400),


txt2        char(400)

)

CLUSTER HSH_CLUSTER_SIZ_2K_ROW_1K( id );

– Create a table HSH_TAB_SIZ_2K_ROWSIZ_2K in
cluster HSH_CLUSTER_SIZ_2K_ROW_2Kwith row size such that row size =
specified size 2K  

SQL>drop table HSH_TAB_SIZ_2K_ROWSIZ_2K
purge;

CREATE TABLE HSH_TAB_SIZ_2K_ROWSIZ_2K

(
id          number(2) ,


txt1        char(900),


txt2        char(900)

)

CLUSTER HASH_CLUSTER_SIZE_2K_row_2k( id );
– Create a table HSH_TAB_SIZ_2K_ROWSIZ_4K in
cluster HSH_CLUSTER_SIZ_2K_ROW_4K with row size such that row size = 2(
specified size) = 2(2K) = 4K 

SQL>drop table HSH_TAB_SIZ_2K_ROWSIZ_4K
purge;

CREATE TABLE HSH_TAB_SIZ_2K_ROWSIZ_4K

(
id          number(2) ,


txt1        char(1900),


txt2        char(1900)

)

CLUSTER HSH_CLUSTER_SIZ_2K_ROW_4K( id );

– Insert records for 5 distinct key values in
the three tables

SQL>Begin

    for i in 1..5 loop

      insert into
HSH_TAB_SIZ_2K_ROWSIZ_1K values (i,  ‘x’, ‘x’);

      insert into
HSH_TAB_SIZ_2K_ROWSIZ_2K values (i,  ‘x’, ‘x’);

      insert into
HSH_TAB_SIZ_2K_ROWSIZ_4K values (i,  ‘x’, ‘x’);

       end
loop;

     commit;

    end;

/

– Check that each block contains records for
a maximum of 3 hashkeys.   

For row size = 1K

8 rows can fit one block

Max no. of hash keys per block = 3

Max no. of records of distinct key values per block = 3

Records for id = 1,2 and 5 (3 hash keys) go to block 161

Records for id = 3 and 4 (only records left) go to block
162

 – For row size = 2K

3 rows can fit one block

Max no. of hash keys per block = 3

Max no. of records of distinct key values per block = 3

Records for id = 1,2 and 5 (3 hash keys) go to block 169

Records for id = 3 and 4 (only records left) go to block
170

 – For row size = 4K

1 row can fit one block

Max no. of hash keys per block = 3

Max no. of records possible per block = 1

Records for each id  go to different block

SQL>select
‘HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K’ TABLE_NAME, id,
dbms_rowid.rowid_block_number(t1.rowid) hash_cluster_block


from HSH_TAB_SIZ_2K_ROWSIZ_1K t1

    union

    select
‘HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K’ TABLE_NAME, id,
dbms_rowid.rowid_block_number(t2.rowid) hash_cluster_block


from HSH_TAB_SIZ_2K_ROWSIZ_2K t2

    union

    select
‘HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K’ TABLE_NAME, id,
dbms_rowid.rowid_block_number(t3.rowid) hash_cluster_block


from HSH_TAB_SIZ_2K_ROWSIZ_4K t3  

     order by 1,2,3;

TABLE_NAME
ID HASH_CLUSTER_BLOCK

————————————- ———- ——————

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K          1
161

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K         
2
161

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K
3
162

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K         
4
162

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K        
5
161

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K
1
169

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K        
2
169

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K       
3
170

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K
4
170

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K         
5
169

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
1
177

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
2
179

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
3
178

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
4
180

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
5
181

 — Let’s insert 3 records with id = 1 in
all the 3 tables 

- For row size = 1K

No. of rows already there = 3 in block 161 containing id = 1

Since 8 rows can fit one block, all the 3 rows inserted go to
block 161

Now block 161 has 6 rows

i.e. although the maximum number of hash key values per data
block as determined by SIZE is 3, Oracle does not actually reserve space for
each hash key value in the block. Rows for one hash key value(id=1) can take up
all of the available space in the block.

 – For row size = 2K

3 rows can fit one block

Block 169 containing id=1 already has 3 rows

Although 1 more row can fit in block 170 , newly added
rows for id=1 go to anothet block 172

i.e. when key value is inserted for the first time, its
blockmate keys are decided or let’s say that it is     decided
records for which hash values will stay together. From then onwards, records
for its blockmate keys will always reside on the same block. Hence, once hash
values occupying a block have been grouped (1,2,5 and 3,4) further records will
occupy only the blocks along with their blockmates.

Block 172 is chained to block 169 as both contain id=1.

 – For row size = 4K

1 row can fit one block

Each newly added record goes to new block

All the blocks containing id=1 (177, 184, 185, 186) will
be chained

SQL>select
‘HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K’ TABLE_NAME, id,
dbms_rowid.rowid_block_number(t1.rowid) hash_cluster_block, count(*)


from HSH_TAB_SIZ_2K_ROWSIZ_1K t1

    group by id,
dbms_rowid.rowid_block_number(t1.rowid)

    union

    select
‘HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K’ TABLE_NAME, id,
dbms_rowid.rowid_block_number(t2.rowid) hash_cluster_block, count(*)


from HSH_TAB_SIZ_2K_ROWSIZ_2K t2

    group by id,
dbms_rowid.rowid_block_number(t2.rowid)

    union

    select
‘HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K’ TABLE_NAME, id,
dbms_rowid.rowid_block_number(t3.rowid) hash_cluster_block, count(*)


from HSH_TAB_SIZ_2K_ROWSIZ_4K t3 

    group by id,
dbms_rowid.rowid_block_number(t3.rowid)

     order by 1,3,2;

TABLE_NAME
ID HASH_CLUSTER_BLOCK   COUNT(*)

————————————- ———- ——————
———-

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K
1
161          4

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K
2
161          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K         
5
161          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K
3
162          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K
4
162          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K
1
169          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K
2
169          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K
5
169          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K
3
170          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K
4
170          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K         
1
172          3

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
1
177          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
3
178          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
2
179          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
4
180          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
5
181          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K         
1
184          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K         
1
185          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K         
1
186          1

– Add records for id’s = 6 to 9 i.e. four
keys more than what we have defined the cluster for (5).

— Note that it takes time to add these records

SQL>Begin

    for i in 6..9 loop

      insert into
HSH_TAB_SIZ_2K_ROWSIZ_1K values (i,  ‘x’, ‘x’);

      insert into
HSH_TAB_SIZ_2K_ROWSIZ_2K values (i,  ‘x’, ‘x’);

      insert into
HSH_TAB_SIZ_2K_ROWSIZ_4K values (i,  ‘x’, ‘x’);

       end
loop;

     commit;

    end;

/

– Check that there is collision for hash keys
- Multiple key values correspond to the same  hash key – the
hash chain for a hash key becomes longer and contains records with different
key values.

– Check that overallocation takes place i.e.
a block holds rows for hashkeys more than it is expected to hold (3).

 – For row size = 1K

In block 161, no. of rows already there = 6  containing id
= 1

Since 8 rows can fit one block, rows for id = 6 and 7 go to block
161 i.e. now block 161 contains records for id = 1,2,5,6,7

i.e. it contains 2 hashkeys more than it has configured to hold
. This is called overallocation and block 161 is called overflow block.

From now onwards whenever a record is added with any of id =
1,2,5,6,7, it will be chained to block 161   .

All the blocks chained to block 161  will be searched whenever any of
these id’s is searched.

 – For row size = 2K

3 rows can fit one block

Block 169 containing id=1 already has 3 rows with id =
1,2,5

Block 172 has 3 rows with id = 1 – It is already full –
it is chained to block 169

Block 170 has 2 rows with id = 3,4

Record for id=8 goes to block 170 i.e. from now onwards all
the blocks containing of id=3,4,8 will be chained to block 170.

All the blocks chained to block 170  will be
searched whenever any of these id’s (3,4,8) is searched.

Rows with id’s 7 and 9  go to block 173 . This block
will be chained with block 172 or 170

 – For row size = 4K

1 row can fit one block

Each newly added record goes to new block

Blocks with id’s > 5 will be chained with any of the
earlier blocks with matching hash value.

e.g. mod(1,5) = mod (6,5) = 1

Hence block containing id = 6 i.e. block 183  will
chain with the all the blocks containing id = 1 (177, 184, 185, 186)

i.e. to search for id = 6, 5 blocks will be searched.

– With a larger number of hash keys for a given number of rows, the
likelihood of a collision (two cluster key values having the same
hash value) decreases.

SQL>select
‘HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K’ TABLE_NAME, id,
dbms_rowid.rowid_block_number(t1.rowid) hash_cluster_block, count(*)


from HSH_TAB_SIZ_2K_ROWSIZ_1K t1

    group by id,
dbms_rowid.rowid_block_number(t1.rowid)

    union

    select
‘HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K’ TABLE_NAME, id,
dbms_rowid.rowid_block_number(t2.rowid) hash_cluster_block, count(*)


from HSH_TAB_SIZ_2K_ROWSIZ_2K t2

    group by id,
dbms_rowid.rowid_block_number(t2.rowid)

    union

    select
‘HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K’ TABLE_NAME, id,
dbms_rowid.rowid_block_number(t3.rowid) hash_cluster_block, count(*)


from HSH_TAB_SIZ_2K_ROWSIZ_4K t3 

    group by id,
dbms_rowid.rowid_block_number(t3.rowid)

     order by 1,3,2;

TABLE_NAME
ID HASH_CLUSTER_BLOCK   COUNT(*)

————————————- ———- ——————
———-

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K
1
161          4

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K
2
161          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K
5
161          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K         
6
161          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K
7
161          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K
3
162          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K
4
162          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K
8
162          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_1K         
9
162          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K
1
169          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K
2
169          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K
5
169          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K
3
170          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K
4
170          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K         
8
170          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K
6
171          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K
1
172          3

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K
7
173          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_2K
9
173          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
1
177          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
3
178          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
2
179          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
4
180          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
5
181          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
9
182          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
6
183          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
1
184          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
1
185          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
1
186          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K
7
187          1

HASH_CLUSTER_TABLE_SIZE_2K_ROWSIZE_4K         
8
188          1

Specifying a User-Defined Hash Function

You can also specify any SQL expression as the hash function for a hash
cluster. If your cluster key values are not evenly distributed among the
cluster, you should consider creating your own hash function that more
efficiently distributes cluster rows among the hash values.

For example, if you have a hash cluster containing employee information and
the cluster key is the employee’s home area code, it is likely that many
employees will hash to the same hash value. To alleviate this problem, you can
place the following expression in the HASH IS clause of the CREATE CLUSTER
command:

MOD((emp.home_area_code + emp.home_prefix + emp.home_suffix), 101)

The expression takes the area code column and adds the phone prefix and
suffix columns, divides by the number of hash values (in this case 101), and
then uses the remainder as the hash value. The result is cluster rows more
evenly distributed among the various hash values.

CREATE CLUSTER address

(postal_code NUMBER, country_id CHAR(2))

HASHKEYS 20

HASH IS MOD(postal_code + country_id, 101);

Summary:

— Hash Cluster tables are appropriate for data that is read frequently via
an equality comparison on the key. If an index scan is used for a key
value, as more no. of users search for the same record, they hit the same index
block which becomes the “hot block” leading to more contention for
the cache buffers chains (cbc) latch. Replacing indexed tables with hash cluster
tables in this case can resolve the problem of contention for CBC latches.

— you cannot range scan a table in a hash cluster without adding a
conventional index to the table.  In an index cluster, the query for range of values will  be able to make use of the cluster key index to find these rows. In a hash cluster, this query would result in a full table scan unless you had an index on the key column. Only exact equality searches (including in lists and subqueries) may be made on the hash key without using an index that supports range scans.

— When you create a hash cluster  table, you must determine in advance,  the number of hash keys your table will ever have. The number of HASHKEYs in a hash cluster is a fixed size. You cannot change the size of the hash table without a rebuild of the cluster. This does not in any way limit the amount of data you can store in this cluster; it simply limits the number of unique hash keys that can be generated for this cluster. That may affect performance due to unintended hash collisions if the value was set too low. Getting the size of the HASHKEYs and SIZE parameters right is crucial to avoid a rebuild.

— With a hash cluster, the tables will start out big and will take longer
to create, as Oracle must initialize each block, an action that normally takes
place as data is added to the table. They have the potential to have data in
their first block and their last block, with nothing in between. Full scanning
a virtually empty hash cluster will take as long as full scanning a full hash cluster.

• The hash cluster is allocated right from the beginning. Oracle will take
your HASHKEYS/ trunc(blocksize/SIZE) and allocate and format that space right away. As soon as the first table is put in that cluster, any full scan will hit every allocated block. This is different from every other table in this respect.

• Updates to hash cluster tables do not introduce significant overhead,
unless you update the HASHKEY, which would not be a good idea, as it would
cause the row to migrate

. Hash clusters allocate all the storage for all the hash buckets when the cluster is created, so they may waste space.. Full scans on single table hash clusters will cost as much as they would in a heap table.

In my post Clustering Factor Demystified : Part – III, I have demonstrated the use of single table index and hash clusters to improve the clustering factor of an unorganized table.

——————————–

References :

http://www.relationaldbdesign.com/extended-database-features/module3/oracle-hash-cluster.php

http://docs.oracle.com/cd/B19306_01/server.102/b14231/hash.htm

http://simpleoracledba.blogspot.in/2009/05/oracle-single-table-hash-clusters-are.html

http://jonathanlewis.wordpress.com/2009/10/05/hash-clusters-2/

http://www.iselfschooling.com/mc4articles/mc4cluster.htm

http://dbaforums.org/oracle/index.php?showtopic=18947

http://www.orafaq.com/tuningguide/bad%20hash%20cluster.html

http://www.riddle.ru/mirrors/oracledocs/server/scn73/ch507.html

http://www.dba-oracle.com/t_hash_tables_clusters.htm

http://docs.oracle.com/cd/E11882_01/server.112/e25789/tablecls.htm

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:707440500346653259

http://stackoverflow.com/questions/2714099/oracle-hash-cluster-overflow-blocks

http://www.orafaq.com/forum/t/141142/2/

https://forums.oracle.com/forums/thread.jspa?threadID=2512488&start=0&tstart=0

———————————————————————————————————–

Related links :

Home

Database Index

Tuning Index

Single Table Index Clusters

Clustering Factor Demystified : Part – I

Clustering Factor Demystified : Part – II

Clustering Factor Demystified : Part – III

 

 

——————-

 

FLUSH BUFFER CACHE

                   

  In this post I will demonstrate that flushing the buffer cache causes dirty blocks to be written to disk .
– Create simple table whose one row occupies one block
HR1> drop table t1 purge;
    create table t1
    (c1 int, c2 char(2000), c3 char(2000), c4 char(2000));
– Insert one row
HR1> insert into t1 values(1, ‘pre update’, ‘x’, ‘x’);
     commit;
– Get the block address and browse X$BH view for that block.
HR1> select dbms_rowid.rowid_relative_fno(rowid) as file#,
           dbms_rowid.rowid_block_number(rowid) as block#
    from t1;
 
File#         Block#
——        ——
    4        550
– Check in x$bh that We have one CURRENT block(state=xcur).
   Enter file#/block# found above when prompted
SYS1>select b.dbarfil, b.dbablk, b.class,
  decode(state,0,’free’,1,’xcur’,2,’scur’,3,’cr’,
4,’read’,5,’mrec’,6,’irec’,7,’write’,8,’pi’, 9,’memory’,10,’mwrite’,11,’donated’) as state,
  cr_scn_bas, cr_scn_wrp, cr_uba_fil, cr_uba_blk, cr_uba_seq,
  (select object_name from dba_objects where object_id = b.obj) as object_name
from sys.x$bh b
where dbarfil = &file_no and
      dbablk = &block_no;
Enter value for file_no: 4

old   6: where dbarfil = &file_no and

new   6: where dbarfil = 4 and
Enter value for block_no: 550
old   7:       dbablk = &block_no
new   7:       dbablk = 550
   DBARFIL     DBABLK      CLASS STATE      CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK
CR_UBA_SEQ OBJECT_NAME
———- ———- ———- ———- ———- ———- ———- ———-
———- ——————–
         4        550          1 xcur                0          0          0          0
   0 T1
– Update the table so that the block becomes dirty.
   Do not commit the transaction so that info about the change is not written to redo logs.
HR1>update t1 set c2=’updated';
– Check that two versions(current and cr clone) of the block are there in buffer cache
SYS1>/
Enter value for file_no: 4
old   6: where dbarfil = &file_no and
new   6: where dbarfil = 4 and
Enter value for block_no: 550
old   7:       dbablk = &block_no
new   7:       dbablk = 550
  DBARFIL     DBABLK      CLASS STATE      CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK
CR_UBA_SEQ OBJECT_NAME
———- ———- ———- ———- ———- ———- ———- ———-
———- ——————–
         4        550          1 xcur                0          0          0          0
   0 T1
         4        550          1 cr           24279860          0          0          0
   0 T1
– Flush the buffer cache –
SYS2>alter system flush buffer_cache;
– Check that there are not any versions of the block (state=free) in buffer cache
SYS1>/
Enter value for file_no: 4
old   6: where dbarfil = &file_no and
new   6: where dbarfil = 4 and
Enter value for block_no: 550
old   7:       dbablk = &block_no
new   7:       dbablk = 550
   DBARFIL     DBABLK      CLASS STATE      CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK
CR_UBA_SEQ OBJECT_NAME
———- ———- ———- ———- ———- ———- ———- ———-
———- ——————–
         4        550          1 free                0          0          0          0
   0 T1
         4        550          1 free                0          0          0          0
   0 T1
– Query the table from the same session from where update was issued
   Autotrace the query to verify that blocks were read from the disk (physical reads > 0)
   Note that the updated row is displayed.
HR1>set autotrace on
    col c2 for a20;
    select c2 from t1;
C2
——————–
updated
Statistics
———————————————————-
          0  recursive calls
          0  db block gets
          7  consistent gets
          6  physical reads
          0  redo size
       2423  bytes sent via SQL*Net to client
        419  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
– Check that current version of the block (state=xcur) has been read into the buffer cache
SYS1>/
Enter value for file_no: 4
old   6: where dbarfil = &file_no and
new   6: where dbarfil = 4 and
Enter value for block_no: 550
old   7:       dbablk = &block_no
new   7:       dbablk = 550
   DBARFIL     DBABLK      CLASS STATE      CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK
CR_UBA_SEQ OBJECT_NAME
———- ———- ———- ———- ———- ———- ———- ———-
———- ——————–
         4        550          1 xcur                0          0          0          0
   0 T1
         4        550          1 free                0          0          0          0
   0 T1
– Query the table from another session. This session should see the pre update value of the
column.
   Check that physical reads = 0 as current block is already there in buffer cache
              Consistent reads > 0 as undo tablespace has been read to construct cr block
HR2>set autotrace on
    col c2 for a20
    set line 80
    select c2 from t1;
C2
——————–
pre update
Statistics
———————————————————-
          0  recursive calls
          0  db block gets
          9  consistent gets
          0  physical reads
        108  redo size
       2423  bytes sent via SQL*Net to client
        419  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
– Check that cr version of the block has been created in buffer cache (state=cr)
   after reading the undo blocks (CR_UBA_FIL = 3, CR_UBA_BLK = 189980)
SYS1>/
Enter value for file_no: 4
old   6: where dbarfil = &file_no and
new   6: where dbarfil = 4 and
Enter value for block_no: 550
old   7:       dbablk = &block_no
new   7:       dbablk = 550
   DBARFIL     DBABLK      CLASS STATE      CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK
CR_UBA_SEQ OBJECT_NAME
———- ———- ———- ———- ———- ———- ———- ———-
———- ——————–
         4        550          1 cr           24294089          0          3     189980
 941 T1
         4        550          1 xcur                0          0          0          0
   0 T1
Conclusion:
     Flushing the buffer cache causes dirty blocks to be written to disk .
———————————————————————————————————-
Related links:

 

CURSOR SHARING DEMYSTIFIED

As I discussed in my earlier post on parent and child cursors, multiple child cursors may be created for the same parent cursor if bind variables have different values.
In this post I will discuss about the parameter CURSOR_SHARING which controls the sharing of child cursors if bind variables have different values.

The parameter CURSOR_SHARING can take 3 values :

  •  – EXACT
  •  – SIMILAR
  •  – FORCE

Let’s see the impact of different values :

CURSOR_SHARING = EXACT

– In this case when the same statement is issued with different literals, multiple parent cursors will be created.

— create a test table with

1 record with id1 = id2 = 1

1000 records with id1 = id2 = 2

2000 records with id1 = id2= 3

— create an index on the table

HR> drop table test purge;
create table test (id1 number, id2 number, txt char(1000));
insert into test values (1,1, ‘one’);

begin
for i in 1..1000 loop
insert into test values (2,2, ‘two’);
insert into test values (3,3, ‘three’);
end loop;
end;
/

    insert into test select * from test where id1=3;
commit;

    create index test_idx1 on test(id1);
create index test_idx2 on test(id2);

    select id1,id2, count(*)
from test
group by id1,id2;


CURSOR_SHARING=EXACT 

 

Parent   Parent  Parent
       |               |             |
 Child     Child  Child
 

— Flush the shared pool

Set cursor_sharing=exact

SYS>alter system set CURSOR_SHARING=’EXACT';
alter system flush shared_pool;
sho parameter CURSOR_SHARING

NAME                                 TYPE        VALUE
—–                                        —-        —–

cursor_sharing                       string      EXACT
— Issue identical statements with different values of literals

HR>conn hr/hr
select count(*) from test where id1=1;
select count(*) from test where id1=2;
select count(*) from test where id1=3;

— Check that the 3 parent cursors have been created

— Note that   there is one record for each statement in v$sqlarea as   one parent cursor is created for each sql statement since  each of these statements differ in their text.

  •     Each statement has different SQL_ID/HASH_VALUE
  •    There is one child per parent cursor (version_count=1)
  •     Execution plans for id = 2,3 is same (full table scan) (same PLAN_HASH_VALUE)
  •     Execution plan for id = 1 is different (indexed access)

SYS>col sql_text for a30 word_wrapped

         SELECT SQL_TEXT , SQL_ID, VERSION_COUNT, HASH_VALUE,PLAN_HASH_VALUE
FROM V$SQLAREA
WHERE LOWER(SQL_TEXT) LIKE ‘select count(*) from test%’
AND LOWER(SQL_TEXT) NOT LIKE ‘%HASH%';

SQL_TEXT                       SQL_ID        VERSION_COUNT HASH_VALUE    PLAN_HASH_VALUE
—————————— ————- ————- ———-               —————

select count(*) from test      1n09m564gh0q3          1  2297955011       4192825871
where id1=3

select count(*) from test      20nhaap8uxf7s             1   1370405112       3507950989
where id1=2

select count(*) from test      bavqx2mw26wg0         1  4163072480    3507950989
where id1=1

— Note that 3 child cursors have been created for the 3 statements

SYS>col child_number for 99
         SELECT SQL_TEXT, SQL_ID, CHILD_NUMBER CHILD#, HASH_VALUE, 
                        PLAN_HASH_VALUE
         FROM V$SQL
WHERE LOWER(SQL_TEXT) LIKE ‘select count(*) from test%’
AND LOWER(SQL_TEXT) NOT LIKE ‘%HASH%';

SQL_TEXT                       SQL_ID            CHILD# HASH_VALUE    PLAN_HASH_VALUE

—————————— ————- ———- ———-   —————

select count(*) from test      1n09m564gh0q3          0 2297955011   4192825871
where id1=3

select count(*) from test      20nhaap8uxf7s          0 1370405112       3507950989
where id1=2

select count(*) from test      bavqx2mw26wg0          0 4163072480   3507950989
where id1=1

— We can see that in all 6 cursors have been created :
– 3 parent cursors and
– 3 child cursors

Each of the cursor occupies memory. Parent cursors contain sql text whereas
child cursor contains execution plan, execution statistics and execution
environment. If we replace literal with a bind variable, all the 3 statements
will be identical and hence only parent cursor needs to be created. Multiple
child cursors can be created for different values of the bind variables.

That’s what CURSOR_SHARING=SIMILAR does. It replaces literals in the otherwise
identical SQL statements with bind variables and only one parent cursor is
created.

If histogram on a column is created with only one bucket,i.e. it does not know about the skew
in data, only one child cursor will be created.

If histogram is created on a column with >1 buckets i.e. it knows about skew in data in that
column, it  will create one child cursor for each statement even of the execution plan is same.

Thus CURSOR_SHARING=SIMILAR reduces the no. parent cursors.

If there is skew in data
If histogram on the column containing skewed data is there
multiple child cursors may be created – one for each value of the bind variable
else (histogram is not available)
only one child cursor will be created.
else (Data is not skewed)
only one child cursor will be created.

Now, since there is identical skewed data in id1 and id2 , we will create histogram  on id1
with one bucket and on id2 with 4 buckets and see the difference.

CURSOR_SHARING=SIMILAR  WITHOUT HISTOGRAM

Parent  
  |        
 Child    

— create histogram only on id1 with one bucket so that optimizer does not
know about the skew —

HR>exec dbms_stats.gather_table_stats(OWNNAME => ‘HR’,-
TABNAME => ‘TEST’,-
ESTIMATE_PERCENT =>null,-
METHOD_OPT => ‘FOR COLUMNS SIZE 1 ID1′);

— Set cursor_sharing = similar —

— Flush the shared pool

SYS>alter system set CURSOR_SHARING=’SIMILAR';
alter system flush shared_pool;
sho parameter CURSOR_SHARING

— Issue identical statements with different values of literals for the column on which histogram is not there (id1)

HR>conn hr/hr
select count(*) from test where id1=1;
select count(*) from test where id1=2;
select count(*) from test where id1=3;

— Check that the only 1 parent cursor has been created and literal has been replaced by bind variable. ( 1 record in v$SQLAREA)

.There is only one child  cursor (version_count=1) since the optimizer does not know about skew in data

SYS>col sql_text for a30 word_wrapped

         SELECT SQL_TEXT , SQL_ID, VERSION_COUNT, HASH_VALUE,PLAN_HASH_VALUE
FROM V$SQLAREA
WHERE LOWER(SQL_TEXT) LIKE ‘select count(*) from test%’
AND LOWER(SQL_TEXT) NOT LIKE ‘%HASH%';

SQL_TEXT                       SQL_ID        VERSION_COUNT HASH_VALUE    PLAN_HASH_VALUE

—————————— ————- ————- ———-  —————

select count(*) from test      07tpk6bm7j4qm             1   3866661587   3507950989
where id1=:”SYS_B_0″

— Note there is only one child cursor created i.e. same execution plan will be used for different values of the bind variable

SYS>col child_number for 99
         SELECT SQL_TEXT, SQL_ID, CHILD_NUMBER CHILD#, HASH_VALUE,
                       PLAN_HASH_VALUE
         FROM V$SQL
WHERE LOWER(SQL_TEXT) LIKE ‘select count(*) from test%’
AND LOWER(SQL_TEXT) NOT LIKE ‘%HASH%';

SQL_TEXT                       SQL_ID            CHILD# HASH_VALUE  PLAN_HASH_VALUE

—————————— ————- ———- ———-  —————

select count(*) from test      07tpk6bm7j4qm          0   3866661587    3507950989
where id1=:”SYS_B_0″

CURSOR_SHARING=SIMILAR  WITH HISTOGRAM

               Parent

                   +
   +—- —+——–+
   |                |               |
  Child    Child    Child

— create histogram  on id2 with  4 buckets so that optimizer knows about  the skew in data —

HR>exec dbms_stats.gather_table_stats(OWNNAME => ‘HR’,-
TABNAME => ‘TEST’,-
ESTIMATE_PERCENT =>null,-
CASCADE => TRUE,-
METHOD_OPT => ‘FOR COLUMNS SIZE 4 ID2′); 

— Issue identical statements with different values of literals for the column  on which histogram is there (id2)

SYS>alter system flush shared_pool;

HR>conn hr/hr
select count(*) from test where id2=1;
select count(*) from test where id2=2;
select count(*) from test where id2=3;

— Check that the only 1 parent cursor has been created and literal has been replaced by bind variable. ( 1 record in v$SQLAREA)
.There are 3 child cursors (version_count=3)

SYS>col sql_text for a30 word_wrapped

         SELECT SQL_TEXT , SQL_ID, VERSION_COUNT, HASH_VALUE,PLAN_HASH_VALUE
FROM V$SQLAREA
WHERE LOWER(SQL_TEXT) LIKE ‘select count(*) from test%’
AND LOWER(SQL_TEXT) NOT LIKE ‘%HASH%';

SQL_TEXT                       SQL_ID        VERSION_COUNT HASH_VALUE  PLAN_HASH_VALUE

—————————— ————- ————- ———-  ————–

select count(*) from test      3tcujqmqnqs8t             3   3981140249  2432738936
where id2=:”SYS_B_0″

— Note that 3 child cursors have been created as optimizer realizes that data is skewed and different execution plans will be more efficient for different values of the bind variable.
—  2 children have same execution plan (PLAN_HASH_VALUE)      (for id=2 and 3 (Full table scan )

SYS>col child_number for 99
    SELECT SQL_TEXT, SQL_ID, CHILD_NUMBER CHILD#, HASH_VALUE,  
                   PLAN_HASH_VALUE
     FROM V$SQL
WHERE LOWER(SQL_TEXT) LIKE ‘select count(*) from test%’
AND LOWER(SQL_TEXT) NOT LIKE ‘%HASH%';

SQL_TEXT                       SQL_ID            CHILD# HASH_VALUE  PLAN_HASH_VALUE

—————————— ————- ———- ———-   ————–

select count(*) from test      3tcujqmqnqs8t          0   3981140249   2432738936
where id2=:”SYS_B_0″

select count(*) from test      3tcujqmqnqs8t          1   3981140249   2432738936
where id2=:”SYS_B_0″

select count(*) from test      3tcujqmqnqs8t          2 3981140249   1489241381
where id2=:”SYS_B_0″

Hence, it can be seen that setting CURSOR_SHARING=SIMILAR
– replaces literals with bind variables in otherwise identical sql statements

  • - Only one child cursor is created if optimizer does not know about skew in   data
  • - If optimizer is aware of the skew in data, Multiple child cursors are created   for each distinct value of the bind   variable even if they have the same   executiion plan.

Ideally we would like one child cursor to be created if execution plan is same for different values of the bind variable.

Setting CURSOR_SHARING=FORCE IN 11G does precisely this but only if the optimizer is

aware about the skew in the data. Let’s see:

CURSOR_SHARING=FORCE IN 11G WITHOUT HISTOGRAM

Parent  

     |        
 Child    
– Flush the shared pool and issue query using the column without histogram on
it so that optimizer is not aware of the skew.
SYS>alter system set CURSOR_SHARING=’FORCE';

          alter system flush shared_pool;

HR>conn hr/hr
select count(*) from test where id1=1;
select count(*) from test where id1=2;
select count(*) from test where id1=3;

— Note that only one parent cursor is created

One child cursor has been created (version_count=1)

SYS>col sql_text for a30 word_wrapped
         SELECT SQL_TEXT , SQL_ID, VERSION_COUNT, HASH_VALUE,PLAN_HASH_VALUE
         FROM V$SQLAREA
WHERE LOWER(SQL_TEXT) LIKE ‘select count(*) from test%’
AND LOWER(SQL_TEXT) NOT LIKE ‘%HASH%';

SQL_TEXT                       SQL_ID        VERSION_COUNT HASH_VALUE   PLAN_HASH_VALUE

—————————— ————- ————- ———-    ————–

select count(*) from test      07tpk6bm7j4qm             1   3866661587   3507950989
where id1=:”SYS_B_0″

— Note that 1 child cursor has been created

SYS>col child_number for 99
    SELECT SQL_TEXT, SQL_ID, CHILD_NUMBER CHILD#, HASH_VALUE,
                   PLAN_HASH_VALUE
     FROM V$SQL
WHERE LOWER(SQL_TEXT) LIKE ‘select count(*) from test%’
AND LOWER(SQL_TEXT) NOT LIKE ‘%HASH%';

SQL_TEXT                       SQL_ID            CHILD# HASH_VALUE   PLAN_HASH_VALUE

—————————— ————- ———- ———-   —————

select count(*) from test      07tpk6bm7j4qm          0   3866661587    3507950989
where id1=:”SYS_B_0″

CURSOR_SHARING=FORCE IN 11G WITH HISTOGRAM
      Parent

           |

   +—+—-+
   |              |
  Child    Child

– Flush the shared pool and issue query using the column with histogram on it so that optimizer is aware of the skew.

SYS> alter system flush shared_pool;

HR>conn hr/hr
select count(*) from test where id2=1;
select count(*) from test where id2=2;
select count(*) from test where id2=3;

— Note that only one parent cursor is created

Two child cursors have been created (version_count=2)

SYS>col sql_text for a30 word_wrapped

         SELECT SQL_TEXT , SQL_ID, VERSION_COUNT, HASH_VALUE,PLAN_HASH_VALUE
FROM V$SQLAREA
WHERE LOWER(SQL_TEXT) LIKE ‘select count(*) from test%’
AND LOWER(SQL_TEXT) NOT LIKE ‘%HASH%';

SQL_TEXT                       SQL_ID        VERSION_COUNT HASH_VALUE   PLAN_HASH_VALUE
—————————— ————- ————- ———-   —————

select count(*) from test      3tcujqmqnqs8t             2   3981140249   2432738936
where id2=:”SYS_B_0″

— Note that 2 child cursors have been created and    each child has a distinct execution plan (PLAN_HASH_VALUE)

SYS>col child_number for 99
         SELECT SQL_TEXT, SQL_ID, CHILD_NUMBER CHILD#, HASH_VALUE,  
                        PLAN_HASH_VALUE
         FROM V$SQL
WHERE LOWER(SQL_TEXT) LIKE ‘select count(*) from test%’
AND LOWER(SQL_TEXT) NOT LIKE ‘%HASH%';

SQL_TEXT                       SQL_ID            CHILD# HASH_VALUE  PLAN_HASH_VALUE

—————————— ————- ———- ———-  —————

select count(*) from test      3tcujqmqnqs8t          0  3981140249   2432738936
where id2=:”SYS_B_0″

select count(*) from test      3tcujqmqnqs8t          1  3981140249  1489241381
where id2=:”SYS_B_0″

Hence, setting CURSOR_SHARING=FORCE in 11g will use the same child cursor if   execution plan is same for different values of the bind variables which means  saving in memory in the shared pool and saving in the time for scanning the  hash chains in the library cache . This new feature of 11g is called ADAPTIVE CURSOR SHARING.

Note: The behaviour of CURSOR_SHARING=FORCE in 11g is different from 9i/10g. Earlier, it would peek the value of the bind variable during the first execution and decide on the eexcution plan. On subsequent execution of the same statement with different values of the bind variable, it would reuse the same plan irrespective of the skew in the data.

CURSOR_SHARING=FORCE IN 10G WITH/WITHOUT HISTOGRAM

Parent  

  |        

 Child    
  Let’s demonstrate this by simulating 10g optimizer by setting the parameter optimizer_geatures_enable to 10.2.0.0.

SYS> alter system set optimizer_features_enable=’10.2.0.3′;

— Flush the shared pool and issue query using the column with histogram on

it so that optimizer is aware of the skew.

SYS> alter system flush shared_pool;

HR>conn hr/hr
select count(*) from test where id2=1;
select count(*) from test where id2=2;
select count(*) from test where id2=3;

— Note that only one parent cursor is created

Only child cursor has been created (version_count=1)

SYS>col sql_text for a30 word_wrapped

    SELECT SQL_TEXT , SQL_ID, VERSION_COUNT, HASH_VALUE,PLAN_HASH_VALUE
FROM V$SQLAREA
WHERE LOWER(SQL_TEXT) LIKE ‘select count(*) from test%’
AND LOWER(SQL_TEXT) NOT LIKE ‘%HASH%';

SQL_TEXT                       SQL_ID        VERSION_COUNT HASH_VALUE PLAN_HASH_VALUE

—————————— ————- ————- ———- —————

select count(*) from test      3tcujqmqnqs8t             1 3981140249      2432738936

where id2=:”SYS_B_0″

— Note that 1 child cursor has been created

SYS>col child_number for 99

    SELECT SQL_TEXT, SQL_ID, CHILD_NUMBER CHILD#, HASH_VALUE, PLAN_HASH_VALUE
FROM V$SQL
WHERE LOWER(SQL_TEXT) LIKE ‘select count(*) from test%’
AND LOWER(SQL_TEXT) NOT LIKE ‘%HASH%';

SQL_TEXT                       SQL_ID            CHILD# HASH_VALUE PLAN_HASH_VALUE

—————————— ————- ———- ———- —————

select count(*) from test      3tcujqmqnqs8t          0 3981140249      2432738936

where id2=:”SYS_B_0″

– cleanup –

SYS>alter system set optimizer_features_enable=’11.2.0.1′;

         drop table hr.test purge;


CONCLUSION:

CURSOR_SHARING = EXACT

– Causes maximum memory usage in library cache as two cursors – one parent and one child cursor are created for each distinct value of the bind variable.

– Gives best performance as optimizer creates different execution plan for each value of the bind variable.

CURSOR_SHARING = SIMILAR
- Reduces memory usage in library cache as only one parent cursor is created .
- If data is not skewed or the optimizer is not aware of the skew, optimizer peeks at the value of the bind variable on the first execution of the statement and that plan is used for all the  values of the bind variable. Thus only one child cursor is created resulting in minimum memory usage by child cursors. In this case performance will be affected if there is skew in the data.
- If data is skewed and the optimizer is aware of the skew, multiple child cursor are created – one for each distinct value of the bind variable. In this case performance will be the best as optimizer creates different execution plan for each value of the bind variable. But in this case we will have multiple child cursors created for the same execution plan.
CURSOR_SHARING = FORCE IN 10g
- Causes minimum memory usage in library cache as only one parent cursor and only one child cursor are created .
- In this case performance will be affected if there is skew in the data.
CURSOR_SHARING = FORCE IN 11g (ADAPTIVE CURSOR SHARING)
- Reduces  memory usage in library cache as only one parent cursor and only one child cursor are created .
- If data is not skewed or the optimizer is not aware of the skew, optimizer peeks at the value of the bind variable on the first execution of the statement and that plan is used for all the  values of the bind variable. Thus only one child cursor is created resulting in minimum memory usage by child cursors. In this case performance will be affected if there is skew in the data. (same scenario as cursor_sharing=similar )
- If data is skewed and the optimizer is aware of the skew, multiple child cursor are created for different values of the bind variable – one for each distinct execution plan . In this case performance will be the best as optimizer creates different execution plans for different values of the bind variable. But in this case we will have only child cursor created for the same execution plan thereby resulting in optimum memory usage by child cursors.In my next post on Tuning Shared Pool , I will demonstrate how can we reduce hard parsing by- replacing literals with bind variables- setting cursor_sharing = similar

————————————————————————————
 
Related links: