The optimizer can make use of explicitly defined Not NULL constraints to take advantage
of an index in order to avoid a full table scan since a B-tree index stores only not NULL values .
When count (constant) or count(*) is queried, we want to count no. of rows in the table. Hence , if there is a column which is defined as not NULL and has an index on it, the number of index entries in the index are bound to be same as the number of rows. The query optimizer uses the index to count no. of rows in the table.
Similarly, when a count (not-nullable-column) is queried, we want to count the no. of rows having not null values in the column. Since the column has a not NULL constraint on it, every row in the table will have a not null value in it and count(not-nullable-column) is same as count(*). As a result, the query optimizer can use the index on the column to process the query.
In fact, in both the cases above, any B-tree containing at least a not-nullable column can serve the purpose.
When a count (nullable-column) is queried, we want to count the no. of rows having not null values in the column. If we have an index on the column, the index will store only not NULL values and hence can be effectively used by the query optimizer to give the result.
In fact, the optimizer can use any index containing the nullable column for this purpose.
To demonstrate the above functionality, I have created a table HR.TEST with two columns – NOTNULL having not NULL constraint
– NULLABLE
. having same data as column NOTNULL but has not been declared not NULL
. has a B-tree index on it
SQL>drop table hr.test purge; create table hr.test (notnull number not null, nullable number); insert into hr.test select rownum, rownum from all_tables; create index hr.test_idx on hr.test(nullable); exec dbms_stats.gather_table_stats ('HR','TEST', cascade => true);
Now I will query count for various arguments and check if optimizer can use the index on NULLABLE column.
Note that to process count(*), count(1) and count(notnull), the query optimizer uses Full Table Scan. Although the column NULLABLE has non-null values in all the rows but since it has not been explicitly declared not null , the optimizer does not know that no. of entries in index reflect the count correctly and hence does not use the index .
SQL>select count(*) from hr.test; select * from table(dbms_xplan.display_cursor); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------- SQL_ID 1mat065c25crk, child number 0 ------------------------------------- select count(*) from hr.test Plan hash value: 1950795681 ------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | 3 (100)| | | 1 | SORT AGGREGATE | | 1 | | | | 2 | TABLE ACCESS FULL| TEST | 108 | 3 (0)| 00:00:01 | ------------------------------------------------------------------- SQL>select count(1) from hr.test; select * from table(dbms_xplan.display_cursor); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------- SQL_ID gzpsn7ff3ncmc, child number 0 ------------------------------------- select count(1) from hr.test Plan hash value: 1950795681 ------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | 3 (100)| | | 1 | SORT AGGREGATE | | 1 | | | | 2 | TABLE ACCESS FULL| TEST | 108 | 3 (0)| 00:00:01 | ------------------------------------------------------------------- SQL>select count(notnull) from hr.test; select * from table(dbms_xplan.display_cursor); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------- SQL_ID 6kxdzxbac62b4, child number 0 ------------------------------------- select count(notnull) from hr.test Plan hash value: 1950795681 ------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | 3 (100)| | | 1 | SORT AGGREGATE | | 1 | | | | 2 | TABLE ACCESS FULL| TEST | 108 | 3 (0)| 00:00:01 | -------------------------------------------------------------------
To process count(nullable), the optimizer uses index on column NULLABLE because we want to count not null values in column nullable and Btree index stores only not null values.
SQL> select count(nullable) from hr.test; select * from table(dbms_xplan.display_cursor); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------- SQL_ID bz8rxw5rmmv8g, child number 0 ------------------------------------- select count(nullable) from hr.test Plan hash value: 2983066704 -------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 3 (100)| | | 1 | SORT AGGREGATE | | 1 | 4 | | | | 2 | INDEX FAST FULL SCAN| TEST_IDX | 2784 | 11136 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------
Now I will declare not NULL constraint on column NULLABLE.
SQL> alter table hr.test modify (nullable not null);
Now if query count(*), count(1), count(notnull) and count(nullable), the optimizer is able to avoid Full Table Index by making use of the index on NULLABLE column in all the cases . Since the column NULLABLE having index has been declared not null and optimizer knows that entries in the index represent all the rows of the table.
SQL>select count(*) from hr.test; select * from table(dbms_xplan.display_cursor); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------- SQL_ID 1mat065c25crk, child number 0 ------------------------------------- select count(*) from hr.test Plan hash value: 2983066704 ------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | 3 (100)| | | 1 | SORT AGGREGATE | | 1 | | | | 2 | INDEX FAST FULL SCAN| TEST_IDX | 2784 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------- SQL>select count(1) from hr.test; select * from table(dbms_xplan.display_cursor); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------- SQL_ID gzpsn7ff3ncmc, child number 0 ------------------------------------- select count(1) from hr.test Plan hash value: 2983066704 ------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | 3 (100)| | | 1 | SORT AGGREGATE | | 1 | | | | 2 | INDEX FAST FULL SCAN| TEST_IDX | 2784 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------- SQL>select count(notnull) from hr.test; select * from table(dbms_xplan.display_cursor); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------- SQL_ID 6kxdzxbac62b4, child number 0 ------------------------------------- select count(notnull) from hr.test Plan hash value: 2983066704 ------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | 3 (100)| | | 1 | SORT AGGREGATE | | 1 | | | | 2 | INDEX FAST FULL SCAN| TEST_IDX | 2784 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------- SQL> select count(nullable) from hr.test; select * from table(dbms_xplan.display_cursor); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------- SQL_ID bz8rxw5rmmv8g, child number 0 ------------------------------------- select count(nullable) from hr.test Plan hash value: 2983066704 ------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | 3 (100)| | | 1 | SORT AGGREGATE | | 1 | | | | 2 | INDEX FAST FULL SCAN| TEST_IDX | 2784 | 3 (0)| 00:00:01 | ---------------------------------------------------------------------
A naughty trick
Since null values are not stored in the single column indexes, a single column index cannot be used to search NULL values. To demonstrate this, I will remove non NULL constraint from column NULLABLE .
SQL>alter table hr.test modify (nullable null);
Now when I query the rows having NULL values in column NULLABLE, we can see that Full table scan is used as the index on column NULLABLE does not contain NULL values .
SQL>select * from hr.test where nullable is null; select * from table(dbms_xplan.display_cursor); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------- SQL_ID 4w1kz5f6uh43x, child number 0 ------------------------------------- select * from hr.test where nullable is null Plan hash value: 1357081020 ------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 3 (100)| | |* 1 | TABLE ACCESS FULL| TEST | 1 | 8 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("NULLABLE" IS NULL)
Now I will use the fact that even NULL values are stored in a composite index when at least one of the columns isn’t nullable. I am creating a multi column index on column NULLABLE and a dummy second column containing a not null value, so that the index will store even NULL values in column NULLABLE. To keep the size of the index small, I will assign the value zero to the dummy column.
SQL>drop index hr.test_idx; create index hr.test_idx on hr.test(nullable, 0);
Now when I query the rows having NULL values in column NULLABLE, we can see that the multi column index is used as it stores NULL values in column NULLABLE.
SQL>select * from hr.test where nullable is null; select * from table(dbms_xplan.display_cursor); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------- SQL_ID 4w1kz5f6uh43x, child number 0 ------------------------------------- select * from hr.test where nullable is null Plan hash value: 1389866015 ------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 1 (100)| | | 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 | 8 | 1 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | TEST_IDX | 1 | | 1 (0)| 00:00:01 | ------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("NULLABLE" IS NULL)
Conclusion:
To enable the optimizer to choose index access in relevant cases,
– declare NOT NULL constraint on relevant columns or
– create a multi column index on nullable column and a dummy second column
References:
Troubleshooting Oracle Performance (second edition ) by Christian Antognini
—————————————————————————————————————
Related links:
————————-