Conditions Based On Inequalities Can’t Use Indexes – How To Resolve?

Conditions based on inequalities (!=, <>) cannot make use of index(es). I will illustrate this limitation and show you how to optimize SQL statements hitting it.

For the demonstration, I have  a table  students table having a column named result that  can contain the values – ‘Pass’, ‘Fail’, ‘To be evaluated’. The column is characterized by a very non-uniform distribution having most of the rows  set to value Passed (P). Here’s the example:

SQL>drop table students purge;
    create table students (id , result )
    as
    select rownum, decode (mod(rownum, 30), 0, 'F', 1, 'T',  'P')
    from  all_tables;

    create index students_idx on students (result);
    exec dbms_stats.gather_table_stats (USER, 'STUDENTS', cascade => TRUE);

     SELECT result , count(*)
     FROM students
     GROUP BY result;
RESULT COUNT(*)
---------- ----------
P              100
T                4
F                3

Let’s execute the  query to select all students who have not passed (result = ‘T’ or ‘F’). Even though the query has a very strong selectivity and the result column is indexed, the query optimizer chooses a full table scan for reading 7 rows as the predicate involves inequality.

SQL>select * from students where result <> 'P';
    select * from table(dbms_xplan.display_cursor);

ID RESULT
---------- ----------
1 T
30 F

....

7 rows selected.

PLAN_TABLE_OUTPUT
---------------------------------------------------------
SQL_ID f2wkxqy3b6b5h, child number 0
-------------------------------------
select * from students where result <> 'P'

Plan hash value: 4078133427
---------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3 (100)| |
|* 1 | TABLE ACCESS FULL| STUDENTS | 71 | 355 | 3 (0)| 00:00:01 |
---------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RESULT"<>'P')

In a case like this, where the inequality condition has a strong selectivity, we can advantage of an index using following four techniques :

First, the inequality condition can be rewritten into an IN condition. This is an option only when the number of values to be selected is known and the number is limited. For example, if the query is modified as shown, index range scan is employed.

SQL>select * from students where result in ('F', 'T');
select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
---------------------------------------------------------
SQL_ID 672mnj9pggkq7, child number 0
-------------------------------------
select * from students where result in ('F', 'T')

Plan hash value: 2871222462
---------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2 (100)| |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| STUDENTS | 71 | 355 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | STUDENTS_IDX | 71 | | 1 (0)| 00:00:01 |
---------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
3 - access(("RESULT"='F' OR "RESULT"='T'))

Second,   manually rewrite the query to make sure that both component queries can take advantage of an index range scan. This technique  can be applied if the values are unknown or the number of values to be specified is too high.   Hence, if  the query is rewritten as shown, it will be able to to take advantage of the or expansion query transformation:

SQL>select * from students where result < 'P'
    union all
    select * from students where result > 'P' ;
    select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
--------------------------------------------------------- 
SQL_ID gqrp063y9c5a5, child number 0
-------------------------------------
select * from students where result < 'P' union all select * from
students where result > 'P'

Plan hash value: 2171568329
--------------------------------------------------------- 
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------- 
| 0 | SELECT STATEMENT | | | | 4 (100)| |
| 1 | UNION-ALL | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| STUDENTS | 76 | 380 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | STUDENTS_IDX | 76 | | 1 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| STUDENTS | 36 | 180 | 2 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | STUDENTS_IDX | 36 | | 1 (0)| 00:00:01 |
--------------------------------------------------------- 

Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("RESULT"<'P')
5 - access("RESULT">'P')

– Third technique: We can create function based index using decode / case and restructure our query to use decode / case 

In this technique, we will create a function based index using decode or case which will return two distinct values when the the condition is satisfied / not satisfied. Subsequently, we will restructure or query to use decode / case.

Using decode

SQL>create index students_decode_idx  
    on students (decode (result, 'P', 'Passed', 'Not Passed'));
    select * from students 
    where decode (result, 'P', 'Passed', 'Not Passed') = 'Not Passed';

select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
SQL_ID 2ddr5p1t2q7ax, child number 0
-------------------------------------
select * from students where decode (result, 'P', 'Passed', 'Not
Passed') = 'Not Passed'

Plan hash value: 3076565877
-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2 (100)| |
| 1 | TABLE ACCESS BY INDEX ROWID| STUDENTS | 1 | 12 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | STUDENTS_DECODE_IDX  | 1 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("STUDENTS"."SYS_NC00004$"='Not Passed')

Using Case

SQL>create index students_case_IDX 
      on students (CASE result
                    WHEN 'P' THEN 'Passed'
                    ELSE 'Not Passed' END);

select * from students
where (CASE result
         WHEN 'P' THEN 'Passed'
          ELSE 'Not Passed' END) = 'Not Passed';

select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------SQL_ID 6pqgppzp47tzb, child number 0
-------------------------------------
select * from students where (CASE result WHEN 'P' THEN
'Passed' ELSE 'Not Passed' END) = 'Not Passed'

Plan hash value: 2622858639
-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2 (100)| |
| 1 | TABLE ACCESS BY INDEX ROWID| STUDENTS | 1 | 12 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | STUDENTS_CASE_IDX | 1 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("STUDENTS"."SYS_NC00004$"='Not Passed')

The fourth technique simply forces an index full scan with, for example, the index hint. From a performance point of view, it’s not optimal,as, for a query with very strong selectivity, full index has to be scanned.

SQL>SELECT /*+ index(students) */ * FROM students where result != 'P';
select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
--------------------------------------------------------- 
SQL_ID 2hyrf6n7kb8pr, child number 0
-------------------------------------
SELECT /*+ index(students) */ * FROM students where result != 'P'

Plan hash value: 635752001
---------------------------------------------------------  
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------- 
| 0 | SELECT STATEMENT | | | | 2 (100)| |
| 1 | TABLE ACCESS BY INDEX ROWID| STUDENTS | 71 | 355 | 2 (0)| 00:00:01|
|* 2 | INDEX FULL SCAN | STUDENTS_IDX | 71 | | 1 (0)| 00:00:01 |
--------------------------------------------------------- 

Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("RESULT"<>'P')

Conclusion:

In cases where the inequality condition having a strong selectivity is notable to make use of an index, we can advantage of an index using following three techniques : 

  • First, the inequality condition can be rewritten into an IN condition. This is an option only when the number of values to be selected is known and the number is limited.
  • Second,   manually rewrite the query to make sure that both component queries can take advantage of an index range scan. This technique  can be applied if the values are unknown or the number of values to be specified is too high.
  • Third ,  create function based index using decode / case and restructure the query to use decode / case 
  • The fourth technique simply forces an index full scan with, for example, the index hint. From a performance point of view, it’s not optimal,as, for a query with very strong selectivity, full index has to be scanned.

References:

—————————————————————————————————————

Related links:

Home
Tuning Index

————————-

 

4 thoughts on “Conditions Based On Inequalities Can’t Use Indexes – How To Resolve?

  1. hello ma’am i did the same exercise,queries are using indexes now by all solution queries but cpu cost increased ,physical read also increase and also bytes sent via sql*net to client also increase.By using inequlities operator although it was not using index but cpu cost,physical reads and bytes sent to client was lesser than when it was using indexes.? Any suggesitons

    1. Hello Jai,

      Since size of table is small, benefit of using indexes is not visible. When FTS is used all the blocks below HWM are visited. When is index is accessed, index blocks also need to be accessed in addition to the table block(s) having data. When index is accessed for the first, it has to be read from disk, hence physical reads. Subsequent accesses to index are from buffer cache which accounts increase in logical reads.

      Try the same exercise with large table. You will definitely observer improvement in logical / physical I/O and CPU usage when index is used.

      Hope it helps.

      Regards
      Anju Garg

Your comments and suggestions are welcome!