Statistics for Partitioned Tables
Gathering statistics on a partitioned table means gathering statistics at both the table (global) level and the partition level. Prior to Oracle Database 11g, whenever a new partition was added, the entire table had to be scanned to refresh the table-level statistics, which could be very expensive depending on the size of the table.
Incremental Global Statistics
With the introduction of incremental global statistics in 11g, the database can derive global statistics from the partition-level statistics instead of performing a full table scan. Some statistics, for example the number of rows, can be derived accurately by aggregating the values from the partition statistics. However, the number of distinct values (NDV) of a column cannot be derived by aggregating partition-level NDVs, because the same value may occur in more than one partition. Hence, the database maintains a structure called a synopsis for each column at the partition level, which can be viewed as a sample of the column's distinct values. The database merges the synopses of the various partitions to accurately derive the NDV for each column.
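To see why partition-level NDVs cannot simply be added up, here is a minimal, self-contained sketch. The inline rows in part_rows are for illustration only (they mirror the JAN and FEB rows used in the demonstration later in this post), and the names part_rows and per_part are made up. The query only imitates the idea of merging distinct values; it does not show how the database stores or merges synopses internally.

```sql
-- Illustrative rows only: two "partitions" that share the ID values 2 and 3
with part_rows as (
  select 'PJAN' part_name, 1 id from dual union all
  select 'PJAN', 2 from dual union all
  select 'PJAN', 3 from dual union all
  select 'PFEB', 2 from dual union all
  select 'PFEB', 3 from dual union all
  select 'PFEB', 4 from dual
),
per_part as (
  -- NDV of ID computed separately for each partition
  select part_name, count(distinct id) ndv
  from part_rows
  group by part_name
)
select (select sum(ndv) from per_part)            summed_partition_ndv,
       (select count(distinct id) from part_rows) true_global_ndv
from dual;
```

Summing the per-partition NDVs gives 6, while the rows really contain only 4 distinct IDs (1, 2, 3 and 4), because 2 and 3 are counted once per partition. Merging the distinct values before counting avoids this double counting.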
Hence, when a new partition is added to a table, the database
- gathers statistics and creates synopses for the newly added partition,
- retrieves synopses for the existing partitions of the table and
- aggregates the partition-level statistics and synopses to create global statistics.
Thus, the need to scan the entire table to gather table level statistics on adding a new partition has been eliminated.
However, in 11g, when a partition exchange load is performed, statistics still need to be gathered for the partition after the exchange to obtain its synopsis, even if statistics for the source table are available.
Enhancements in Oracle 12c
Oracle Database 12c enhances the maintenance of incremental statistics: DBMS_STATS can now create a synopsis on a non-partitioned table as well. As a result, when you use partition exchange loads, the statistics and synopsis of the source table become the partition-level statistics and synopsis after the load, so the database can maintain incremental statistics without explicitly gathering statistics on the partition after the exchange.
Let’s demonstrate …
Overview:
Source non-partitioned table : HR.SRC_TAB
Destination partitioned table: HR.PART_TAB
Destination partition : PMAR
-- Create a partitioned table HR.PART_TAB with 3 partitions
- only 2 partitions contain data initially
- set preference INCREMENTAL = TRUE
- gather stats for the table: gathers partition-level statistics and synopses
-- Create a non-partitioned table HR.SRC_TAB, which will be used to load the 3rd partition using partition exchange
- set table preferences for HR.SRC_TAB
- INCREMENTAL = TRUE
- INCREMENTAL_LEVEL = TABLE
- gather stats for the source table: DBMS_STATS also gathers a table-level synopsis for it
-- Perform the partition exchange
-- After the exchange, the new partition has both statistics and a synopsis
-- Gather statistics for PART_TAB: partition-level statistics and synopses are used to derive global statistics
Implementation
-- Create and populate partitioned table part_tab with 3 partitions:
PJAN, PFEB and PMAR
SQL>conn hr/hr
drop table part_tab purge;
create table part_tab
(MNTH char(3),
ID number,
txt char(10))
partition by list (mnth)
(partition PJAN values ('JAN'),
partition PFEB values ('FEB'),
partition PMAR values ('MAR'));
insert into part_tab values ('JAN', 1, 'JAN1');
insert into part_tab values ('JAN', 2, 'JAN2');
insert into part_tab values ('JAN', 3, 'JAN3');
insert into part_tab values ('FEB', 2, 'FEB2');
insert into part_tab values ('FEB', 3, 'FEB3');
insert into part_tab values ('FEB', 4, 'FEB4');
commit;
-- Note that
- partition PMAR does not have any data
- there are 4 distinct values in column ID i.e. 1,2,3 and 4
select 'PJAN' Partition, mnth, id from part_tab partition (PJAN)
union
select 'PFEB' Partition, mnth, id from part_tab partition (PFEB)
union
select 'PMAR' Partition, mnth, id from part_tab partition (PMAR)
order by 1 desc;
PART MNT ID
---- --- ----------
PJAN JAN 1
PJAN JAN 2
PJAN JAN 3
PFEB FEB 2
PFEB FEB 3
PFEB FEB 4
-- Set preference INCREMENTAL to TRUE for the table part_tab
SQL>begin
dbms_stats.set_table_prefs ('HR','PART_TAB','INCREMENTAL','TRUE');
end;
/
select dbms_stats.get_prefs ('INCREMENTAL','HR','PART_TAB') from dual;
DBMS_STATS.GET_PREFS('INCREMENTAL','HR','PART_TAB')
----------------------------------------------------
TRUE
-- Gather statistics for part_tab
SQL> exec dbms_stats.gather_table_stats('HR','PART_TAB');
-- Note that global statistics have been gathered and the table has been analyzed at 16:02:31
SQL>alter session set nls_date_format='dd-mon-yyyy hh24:mi:ss';
col table_name for a12
select table_name, num_rows, last_analyzed from user_tables
where table_name='PART_TAB';
TABLE_NAME NUM_ROWS LAST_ANALYZED
------------ ---------- --------------------
PART_TAB 6 17-nov-2014 16:02:31
-- A full table scan was performed and stats were gathered for each of the partitions.
All the partitions have been analyzed at the same time as the table, i.e. at 16:02:31
SQL> col partition_name for a15
select partition_name, num_rows,last_analyzed
from user_tab_partitions
where table_name = 'PART_TAB' order by partition_position;
PARTITION_NAME NUM_ROWS LAST_ANALYZED
--------------- ---------- --------------------
PJAN 3 17-nov-2014 16:02:31
PFEB 3 17-nov-2014 16:02:31
PMAR 0 17-nov-2014 16:02:31
-- NUM_DISTINCT correctly reflects that there are 4 distinct values in column ID
SQL> col column_name for a15
select TABLE_NAME, COLUMN_NAME, NUM_DISTINCT
from user_tab_col_statistics
where table_name = 'PART_TAB' and column_name = 'ID';
TABLE_NAME COLUMN_NAME NUM_DISTINCT
------------ --------------- ------------
PART_TAB ID 4
-- Create source non-partitioned table SRC_TAB
-- Populate SRC_TAB with records for mnth = MAR
and introduce two new values for column ID, i.e. 0 and 5
SQL>drop table src_tab purge;
create table src_tab
(MNTH char(3),
ID number,
txt char(10));
insert into src_tab values ('MAR', 0, 'MAR0');
insert into src_tab values ('MAR', 2, 'MAR2');
insert into src_tab values ('MAR', 3, 'MAR3');
insert into src_tab values ('MAR', 5, 'MAR5');
commit;
-- Set preferences for table src_tab
- INCREMENTAL = TRUE
- INCREMENTAL_LEVEL = TABLE
SQL>begin
dbms_stats.set_table_prefs ('HR','SRC_TAB','INCREMENTAL','TRUE');
dbms_stats.set_table_prefs ('HR','SRC_TAB','INCREMENTAL_LEVEL','TABLE');
end;
/
col incremental for a15
col incremental_level for a30
select dbms_stats.get_prefs ('INCREMENTAL','HR','SRC_TAB') incremental,
dbms_stats.get_prefs ('INCREMENTAL_LEVEL','HR','SRC_TAB') incremental_level
from dual;
INCREMENTAL INCREMENTAL_LEVEL
--------------- ------------------------------
TRUE TABLE
-- Gather stats and synopsis for table SRC_TAB and note that the table is analyzed at 16:06:33
SQL>exec dbms_stats.gather_table_stats('HR','SRC_TAB');
col table_name for a12
select table_name,num_rows, last_analyzed from user_tables
where table_name='SRC_TAB';
TABLE_NAME NUM_ROWS LAST_ANALYZED
------------ ---------- --------------------
SRC_TAB 4 17-nov-2014 16:06:33
-- Exchange partition
SQL>alter table part_tab exchange partition PMAR with table SRC_TAB;
-- Note that table-level stats for part_tab are unchanged,
as stats have not been gathered for it after the partition exchange
SQL> col table_name for a12
select table_name, num_rows, last_analyzed from user_tables
where table_name='PART_TAB';
TABLE_NAME NUM_ROWS LAST_ANALYZED
------------ ---------- --------------------
PART_TAB 6 17-nov-2014 16:02:31
-- NDV for column ID is still the same as earlier, i.e. 4, as stats
have not been gathered for the table after the partition exchange
SQL> col column_name for a15
select TABLE_NAME, COLUMN_NAME, NUM_DISTINCT
from user_tab_col_statistics
where table_name = 'PART_TAB' and column_name = 'ID';
TABLE_NAME COLUMN_NAME NUM_DISTINCT
------------ --------------- ------------
PART_TAB ID 4
-- Note that stats for partition PMAR have been copied from
src_tab. The LAST_ANALYZED column for PMAR has been updated
and shows the same value as for table src_tab, i.e. 16:06:33.
Also, num_rows is shown as 4
SQL> col partition_name for a15
select partition_name, num_rows,last_analyzed
from user_tab_partitions
where table_name = 'PART_TAB' order by partition_position;
PARTITION_NAME NUM_ROWS LAST_ANALYZED
--------------- ---------- --------------------
PJAN 3 17-nov-2014 16:02:31
PFEB 3 17-nov-2014 16:02:31
PMAR 4 17-nov-2014 16:06:33
-- Gather stats for table part_tab
SQL>exec dbms_stats.gather_table_stats('HR','PART_TAB');
-- While gathering stats for the table, the partitions were not
scanned, as indicated by the unchanged values in column LAST_ANALYZED.
SQL> col partition_name for a15
select partition_name, num_rows,last_analyzed
from user_tab_partitions
where table_name = 'PART_TAB' order by partition_position;
PARTITION_NAME NUM_ROWS LAST_ANALYZED
--------------- ---------- --------------------
PJAN 3 17-nov-2014 16:02:31
PFEB 3 17-nov-2014 16:02:31
PMAR 4 17-nov-2014 16:06:33
-- Note that num_rows for the table part_tab has been updated by adding up the partition-level values (3 + 3 + 4 = 10).
Column LAST_ANALYZED has also been updated for the table.
SQL> col table_name for a12
select table_name, num_rows, last_analyzed from user_tables
where table_name='PART_TAB';
TABLE_NAME NUM_ROWS LAST_ANALYZED
------------ ---------- --------------------
PART_TAB 10 17-nov-2014 16:11:26
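As a quick check of this aggregation, here is a sketch to run as HR against the tables above. It sums NUM_ROWS over the partitions of PART_TAB; the alias total_partition_rows is made up for illustration. Given the partition listing shown earlier, the sum should match the table-level NUM_ROWS of 10.

```sql
-- Sum of partition-level row counts for PART_TAB
select sum(num_rows) total_partition_rows
from user_tab_partitions
where table_name = 'PART_TAB';
```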
-- NDV for column ID has been updated to 6 (the distinct values are now 0, 1, 2, 3, 4 and 5) using the synopsis for partition PMAR as copied from table src_tab
SQL> col column_name for a15
select TABLE_NAME, COLUMN_NAME, NUM_DISTINCT
from user_tab_col_statistics
where table_name = 'PART_TAB' and column_name = 'ID';
TABLE_NAME COLUMN_NAME NUM_DISTINCT
------------ --------------- ------------
PART_TAB ID 6
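As a hedged cross-check, the partition-level NDVs can be listed from USER_PART_COL_STATISTICS while still connected as HR. This query was not part of the original test run, so no output is shown. The partition-level values for ID should add up to more than the global NDV of 6, since the values 2 and 3 occur in more than one partition. That is why the global figure has to come from merged synopses and not from a simple sum.

```sql
-- Partition-level NDV of column ID for each partition of PART_TAB
col partition_name for a15
select partition_name, num_distinct
from user_part_col_statistics
where table_name = 'PART_TAB'
and column_name = 'ID'
order by partition_name;
```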
-- We can also confirm that incremental statistics were really used by querying the dictionary table sys.HIST_HEAD$, which should have an entry for each column of the PART_TAB table.
SQL>conn / as sysdba
col tabname for a15
col colname for a15
col incremental for a15
select o.name tabname, c.name colname,
decode (bitand (h.spare2, 8), 8, 'yes','no') incremental
from sys.hist_head$ h, sys.obj$ o, sys.col$ c, sys.user$ u
where h.obj# = o.obj#
and o.obj# = c.obj#
and h.intcol# = c.intcol#
and o.owner# = u.user#
and u.name = 'HR'
and o.name = 'PART_TAB'
and o.subname is null;
TABNAME COLNAME INCREMENTAL
--------------- --------------- ---------------
PART_TAB MNTH yes
PART_TAB ID yes
PART_TAB TXT yes
I hope this post was useful.
Your comments and suggestions are always welcome.