Monthly Archives: July 2013

UNDO AND REDO IN ORACLE

In this post, I will demonstrate that undo for DML’s is stored both in undo tablespace and online redo logs. Then, we will analyze the reasons for duplicating the storage of undo.

First, let’s demonstrate that

– oracle redo contains undo
– Checkpointing causes
. dirty blocks to be written to datafiles
. buffers containing undo to be written to undo tablespace

Overview:

– Create a tablespace test_undo
– Create a table test_undo_tab in tablespace test_undo
– Insert two rows with txt – teststring1, teststring2 in test_undo_tab and perform manual checkpoint
– Switch logs so that new log does not contain string teststring1
– Switch undo tablespace to undotbs2
– Update the column value from teststring1 to teststring_uncommitted and do not commit
– In another session, update the column value from teststring2 to
teststring_committed and commit
– check that both new and old (undo) values have been written to current redo log
– check that undo tablespace does not contain pre update values i.e.
teststring1 and teststring2 as undo information is still in buffer cache and checkpoint has not taken place yet
– Perform manual checkpoint so that buffers containing undo information are flushed to disk
– check that undo tablespace contains pre update values i.e. teststring1 and teststring2

Implementation:

– Find out name of the current undo tablespace –

sql>sho parameter undo_tablespace
NAME                                 TYPE        VALUE
------------------------------------ ----------- 

------------------------------
undo_tablespace                      string      UNDOTBS1

– create an undo tablespace undotbs2

SQL> create undo tablespace undotbs2 datafile '/u01/app/oracle/undotbs2.dbf' 

size 100m;

– Create a tablespace test_undo

SQL> drop tablespace test_undo including contents and datafiles;
     create tablespace test_undo datafile '/u01/app/oracle/test_undo.dbf' size 128k;

– Create a test table test_undo_tab in the tablespace created above

SQL> drop table test_undo_tab purge;
     create table test_undo_tab(txt char(1000)) tablespace test_undo;

– Insert two rows in test_undo_tab and commit –

SQL> insert into test_undo_tab values ('teststring1');
     insert into test_undo_tab values ('teststring2');
     commit;

– Perform manual checkpoint so that above changes are written to the datafile

SQL>alter system checkpoint;

- Set undotbs2 as current undo_tablespace

SQL>alter system set undo_tablespace = undotbs2;
    sho parameter undo_tablespace

NAME                                 TYPE        VALUE
------------------------------------ ----------- 

------------------------------
undo_tablespace                      string      UNDOTBS2

– Switch logs so that the current log does not contain string teststring

SQL>alter system switch logfile;

– Find out name of current redo log

SQL> col member for a30
     select member, l.status from v$log l, v$logfile f where l.group# = f.group# and l.status = 'CURRENT';

MEMBER                         STATUS
------------------------------ ----------------
/u01/app/oracle/redo04.log     CURRENT

– In one session Update one row and do not commit –

SQL>  update test_undo_tab set txt = 'teststring_uncommitted'
      where txt = 'teststring1';

– In another session, Update the second row and commit –

SQL>  update test_undo_tab set txt = 'teststring_committed'
      where txt = 'teststring2';
      commit;

– Commit and then check that redo log contains both redo and undo for both the DML’s (committed and uncommitted)–

[oracle@node1 oracle]$ strings /u01/app/oracle/redo04.log|grep teststring

teststring_uncommitted                                                                                  
teststring1                                                          

teststring_committed                                                 

teststring2

– check that datafile does not contain updated values i.e. teststring_committed and teststring_uncommitted as checkpointing has not taken place yet and dirty buffers have not been flushed to disk

[oracle@node1 oracle]$ strings /u01/app/oracle/test_undo.dbf|grep teststring

teststring2                                                                                                                              ,
teststring1

– check that undo tablespace does not contain pre update value i.e. teststring1 and teststring2 as undo information is in buffer cache and checkpointing has not taken place yet

[oracle@node1 oracle]$ strings /u01/app/oracle/undotbs2.dbf|grep teststring

– Perform manual checkpoint so that buffers containing undo information are flushed to disk

SQL> alter system checkpoint;

– check that datafile does contains updated values (both committed and uncommitted) i.e. teststring_committed and teststring_uncommitted as checkpointing has taken place and dirty buffers have been flushed to disk

[oracle@node1 oracle]$ strings /u01/app/oracle/test_undo.dbf|grep teststring

teststring_committed                                                                                                               ,
teststring_uncommitted

– check that undo tablespace contains pre update values i.e. teststring1 and teststring2

[oracle@node1 oracle]$ strings /u01/app/oracle/undotbs2.dbf| grep teststring

teststring2                                                                  
teststring1

– Cleanup –

SQL>drop tablespace test_undo including contents and datafiles;
    alter system set undo_tablespace=undotbs1;
    drop tablespace undotbs2 including contents and datfiles;

Thus, it can be concluded that :

– Log writer writes both undo and redo for every DML in a transaction (committed/uncommitted) irrespective of whether checkpoint has taken place or not.
– On checkpoint,
. Undo present in buffers in database buffer cache is written to undo tablespace .
. Dirty buffers containing committed/uncommitted data are written to datafiles.

It implies that in a running database,

– datafiles may contain uncommitted data (buffers dirtied by uncommitted transaction and checkpoint takes place)

– committed data may not be there in datafiles (checkpointing not done after commit)

– redo/undo for committed changes will always be there in redo logs (Lgwr writes on commit).

– redo/undo for Uncommitted changes may be there in redo logs(Lgwr writes every 3 seconds)

– In addition to redo logs, undo information

  • . may be present in buffer cache (checkpoint has not taken place after change)
  • . will be written to undo tablespace on checkpoint
  • . will never be overwritten until a transaction is active

 

Now the question arises why undo information is being stored both in undo tablespace and redo logs.

Let’s see what will happen if undo is stored in redo logs only.

A redo log can be reused once changes protected by it have been written to datafiles (and archivelogs if database is in archivelog mode).

It implies that if I make a change and do not commit it
– Change is written to a redo log
– checkpoint takes place
– uncommitted change is written to datafile
– I decide to rollback the change
– If redo log has not been overwritten
. search entire redo log for the undo and then rollback
else (redo log has been overwritten)
. undo information is not available for rollback.

One might argue that if somehow a redo log is not allowed to be overwritten until it contains active undo, we might be able to manage with undo stored in redo logs only. This solution is not feasible as
– size of redo logs will grow enormously large very soon as thet contain both undo and redo (a user might decide not to end a transaction for months)
– to rollback a change, enormous amount of data in redo logs (both redo and undo) will have to be searched leading to degraded performance
– there will be contention on redo logs as they are being used for both
. writing redo and undo
. reading to rollback a change

Hence, undo information has to be stored separately from redo and is used for rolling back uncommited transactions . The undo stored in undo buffers/undo tablespace is additionally used for
– read consistency
– flashback query
– flashback version query

Now, let’s see what will happen if undo is stored in undo buffers and undo tablespace only.

SCENARIO – I

– I make a change and do not commit
– The redo for the change is present in redo logs
– The undo for the change is present in buffer cache
– The undo for the change is not present in undo tablespace as checkpoint has not taken place
– The instance crashes
– undo information present in buffer cache is wiped out
– As part of instance recovery, redo is applied and datafiles contain uncommitted data
– The database cannot be opened as undo information needed to rollback uncommitted change is not available leading to an inconsistent database.

SCENARIO – II

– I take hot backup of a tablespace.
– The datafile for the tablespace is lost or corrupted.
– I take the tablespace offline and restore the datafile from backup.
– I recover the datafile
– The redo logs and archivelogs contain redo data for both committed and uncommitted transactions
– The redo logs and archivelogs do not contain undo data (as per our
assumption).
– As part of recovery, redo for all the committed/uncommitted changes is read from online/archive redo logs and applied
– The tablespace cannot be brought online as the undo information needed to rollback uncommitted changes is not available

Hence, to resolve above problems, undo is stored in redo logs also. During roll forward phase of instance/media recovery, as changes (redo in redo logs) are applied to datafiles, undo stored in redo logs is used to generate undo segments. These segments are subsequently used to rollback uncommitted changes during rollback phase of instance/media recovery.

References:

Expert Oracle Database Architecture by Thomas Kyte

http://oraclenz.wordpress.com/2008/06/22/differences-between-undo-and-redo/
https://forums.oracle.com/forums/thread.jspa?threadID=1016183&start=0&tstart=0
https://forums.oracle.com/forums/thread.jspa?threadID=672861

https://jonathanlewis.wordpress.com/2015/04/20/manuals-2/
Does Select Statement Generate Redo?
Hot Backup Mode Causes Excessive Redo
https://www.dbametrix.com/download/Redo-Log-In-depth.pdf
Delayed Block Cleanout In Oracle

—————————————————————————————-

Related links:

Home

Database Index

Consistent Reads In Oracle : Part-I
Consistent Reads In Oracle : Part-II
Flush Buffer Cache
Oracle Checkpoints
Uncommitted Data In Datafiles

 

—————————————————————————————-

 

UNCOMMITTED DATA IN DATAFILES

In this post, I will demonstrate various scenarios in single instance where uncommitted data gets written to datafiles.

The database writer can write uncommitted info also to datafiles as a result of following activities:

Flush buffer cache
Before a logfile gets overwritten
Manual checkpoint
To free buffers
Tablespace offline
Tablespace read only
Tablespace begin backup

– SETUP –

– create tablespace test_dbwr

SQL>drop tablespace test_dbwr including contents and datafiles;
    create tablespace test_dbwr datafile '/u01/app/oracle/test_dbwr.dbf' size 100m;

– create table test_dbwr_tab

SQL>drop table test_dbwr_tab purge;
    create table test_dbwr_tab (txt char(100)) tablespace test_dbwr;

- insert a record with txt = test_init

SQL>insert into test_dbwr_tab values ('test_init');
    commit;
    alter system checkpoint;

– find out current checkpoint#

SQL>select checkpoint_change# from v$database;

CHECKPOINT_CHANGE#
——————
983019

– Switch logfile

SQL>alter system switch logfile;

Flush buffer cache

If buffer cache containing dirty blocks is flushed, dirty blocks containing uncomitted data are written to datafiles.

— Overview–

— find out current checkpoint#
— Update record in test_dbwr_tab to ‘test_dbwr’ and do not commit
— check that datafile does not contain string ‘test_dbwr’ as checkpoint has not taken place yet
— Flush buffer cache so that all the dirty blocks in the buffer cache are
flushed to disk by database writer
— Check that checkpoint# has been incremented
— check that datafile contains the string ‘test_dbwr’ as dirty blocks
have been flushed to disk

Implementation

– find out current checkpoint#

SQL>select checkpoint_change# from v$database;

CHECKPOINT_CHANGE#
------------------
            983019

– Update record in test_dbwr_tab to ‘test_dbwr’ and do not commit

SQL>update test_dbwr_tab set txt = 'test_dbwr';

-- check that datafile does not contain string ‘test_dbwr’ as checkpoint has not taken place yet

[oracle@node1 oracle]$ strings /u01/app/oracle/test_dbwr.dbf| grep test_dbwr

– Flush buffer cache so that all the dirty blocks in the buffer cache are flushed to disk by database writer

SQL>alter system flush buffer_cache;

– Check that checkpoint# has been incremented

SQL> select checkpoint_change# from v$database;

CHECKPOINT_CHANGE#
------------------
            983038

– check that datafile contains the string ‘test_dbwr’ as dirty blocks have been flushed to disk

[oracle@node1 oracle]$ strings /u01/app/oracle/test_dbwr.dbf| grep test_dbwr
dtest_dbwr

————————————————————————

Before a logfile gets overwritten

A redo log cannot be overwritten until changes recorded in it are written to disk by DBWR. Hence, before a redo log containing uncommitted data is overwritten, uncommitted changes recorded in it are written to datafiles by DBWR.

– Overview –
— check the no. of logfile groups and the current logfile group
— Check current checkpoint#
— update record to string ‘test_log_switch’ in table test_dbwr_tab
— check that datafile does not contain string ‘test_log_switch’
as dirty buffers have not been written to disk yet
— Switch logfile so that curent redo log gets overwritten
— Check that checkpoint# has incremented
— check that datafile contains the string ‘test_log_switch’ as dirty buffers
have been flushed to disk

– Implementation –
— check the no. of logfile groups and the current logfile group

SQL>select group#, status from v$log ;
   GROUP# STATUS
---------- ----------------
         4 INACTIVE
         5 INACTIVE
         6 CURRENT

– Check current checkpoint#

SQL>select checkpoint_change# from V$database;

CHECKPOINT_CHANGE#
------------------
            983038

– update record to string ‘test_log_switch’ in table test_dbwr_tab

SQL>update test_dbwr_tab set txt = 'test_log_switch';

– check that datafile does not contain string ‘test_log_switch’  as dirty buffers have not been written to disk yet

[oracle@node1 oracle]$ strings /u01/app/oracle/test_dbwr.dbf| grep test_log_switch

– Switch logfile so that curent redo log gets overwritten

SQL>alter system switch logfile;
    /
    /
    /
    select group#, status from v$log ;

– Check that checkpoint# has incremented

SQL>select checkpoint_change# from V$database;

CHECKPOINT_CHANGE#
------------------
            983337

– check that datafile contains the string ‘test_log_switch’ as dirty buffers
have been flushed to disk

[oracle@node1 oracle]$ strings /u01/app/oracle/test_dbwr.dbf| grep test_log_switch
dtest_log_switch

————————————————————————–

Manual checkpoint

Whenever checkpoint is initiated manually, all the dirty blocks containing committed/uncommitted data in buffer cache are flushed to disk .

– Overview –
— Find out current checkpoint#
— Update the record to string ‘test_checkpoint’ in table test_dbwr_tab
— check that datafile does not contain string ‘test_checkpoint’
— Perform manual checkpoint so that all the dirty buffers get written
to disk
— Check that checkpoint# has incremented
— check that datafile contains the string ‘test_checkpoint’ as dirty buffers
have been flushed to disk

– Implementation –

– Find out current checkpoint#

SQL>select checkpoint_change# from V$database;

CHECKPOINT_CHANGE#
------------------
            983337

– Update the record to string ‘test_checkpoint’ in table test_dbwr_tab

SQL>update test_dbwr_tab set txt = 'test_checkpoint';

– check that datafile does not contain string ‘test_checkpoint’

[oracle@node1 oracle]$ strings /u01/app/oracle/test_dbwr.dbf| grep test_checkpoint

– Perform manual checkpoint so that all the dirty buffers get written to disk

SQL>alter system  checkpoint;

– Check that checkpoint# has incremented

SQL>select checkpoint_change# from V$database;

CHECKPOINT_CHANGE#
------------------
            983842

– check that datafile contains the string ‘test_checkpoint’ as dirty buffers have been flushed to disk

[oracle@node1 oracle]$ strings /u01/app/oracle/test_dbwr.dbf| grep test_checkpoint
dtest_checkpoint

——————————————————————

Tablespace offline

When a tablespace is taken offline, the  dirty  buffers containinig committed/uncommitted data belonging to it are flushed to disk.

– Overview –
— Find out current checkpoint#
— Find out current checkpoint# for datafile of test_dbwr tablespace
— Update the record to string ‘test_offline’ in table test_dbwr_tab
— check that datafile does not contain string ‘test_offline’
— Take tablespace test_dbwr offline
— Note that database checkpoint# does not change
— Note that checkpoint# for the datafile for test_dbwr tablespace has been incremented
— check that datafile contains the string ‘test_offline’ as dirty buffers
— have been flushed to disk
— Bring tablespace online —

– Implementation –

– Find out current checkpoint#

SQL>select checkpoint_change# from V$database;

CHECKPOINT_CHANGE#
------------------
            983842

– Find out current checkpoint# for datafile of test_dbwr tablespace

SQL>select checkpoint_change# from V$datafile
    where upper(name) like '%TEST_DBWR%';

CHECKPOINT_CHANGE#
------------------
            983842

– Update the record to string ‘test_offline’ in table test_dbwr_tab

SQL>update test_dbwr_tab set txt = 'test_offline';

– check that datafile does not contain string ‘test_offline’

[oracle@node1 oracle]$ strings /u01/app/oracle/test_dbwr.dbf| grep test_offline

– Take tablespace test_dbwr offline –

SQL>alter tablespace test_dbwr offline;

– Note that database checkpoint# does not change

SQL>select checkpoint_change# from V$database;
CHECKPOINT_CHANGE#
------------------
            983842

– Note that checkpoint# for the datafile for test_dbwr tablespace has been incremented

SQL>select checkpoint_change# from V$datafile
    where upper(name) like '%TEST_DBWR%';

CHECKPOINT_CHANGE#
------------------
            984338

– check that datafile contains the string ‘test_offline’ as dirty buffers – have been flushed to disk

[oracle@node1 oracle]$ strings /u01/app/oracle/test_dbwr.dbf| grep test_offline
dtest_offline

– Bring tablespace online –

SQL>alter tablespace test_dbwr online;

————————————————————————–

Similarly, it can be demonstrated that uncommitted data gets written to datafiles whenever
– a tablespace is put in read only mode
– a tablespace is put in backup mode

In these cases also, database checkpoint# is not incremented but checkpoint# for the datafile of the tablespace gets incremented

To free buffers in buffer cache

When a server process is looking for free buffers and they are not available, dirty buffers containing uncommitted/committed data from cold end of LRU list  will be flushed to disk. The flushed buffers may not be the ones with oldest SCN.  Hence, in this case checkpoint# is not incremented.

– Overview –

— Find out current size of database buffer cache
— Decrease size of buffer cache to small value 4M
— check that buffer cache has been shrunk to 4M
— check checkpoint# for the database —
— Update the record to string ‘test_buffer’ in table test_dbwr_tab
— Issue a query on a large table so that dirty buffers in buffer cache have to be flushed to disk in order to free buffers
— check that datafile now contain the strings ‘test_buffer’ as dirty buffers have been written to disk
— Note that checkpoint # has not changed in this case although dirty buffers have been written to disk(LRU/TCH algorithm)
— Reset buffer cache to earlier size

Implementation

– Find out current size of database buffer cache

SQL> col component for a20
     select component, current_size/1024/1024 current_mb, min_size/1024/1024 MIN_MB
     from v$sga_dynamic_components
     where component = 'DEFAULT buffer cache';

COMPONENT            CURRENT_MB     MIN_MB
-------------------- ---------- ----------
DEFAULT buffer cache         28         28

– Decrease size of buffer cache to minimum value i.e. 4M

SQL>alter system set db_cache_size=4m;

– check that buffer cache has been shrunk to 4M

SQL> select component, current_size/1024/1024 current_mb, min_size/1024/1024 MIN_MB
     from v$sga_dynamic_components
     where component = 'DEFAULT buffer cache';

OMPONENT            CURRENT_MB     MIN_MB
-------------------- ---------- ----------
DEFAULT buffer cache          4          4

– check checkpoint# for the database –

SQL>select checkpoint_change# from V$database;

CHECKPOINT_CHANGE#
------------------
            983842

– Update the record to string ‘test_buffer’ in table test_dbwr_tab

SQL>update test_dbwr_tab set txt = 'test_buffer';

– check that datafile does not contain string ‘test_buffer”

[oracle@node1 oracle]$ strings /u01/app/oracle/test_dbwr.dbf| grep test_buffer

– Issue a query on a large table so that dirty buffers in buffer cache have to be flushed to disk in order to free buffers

SQL>select * from sh.sales;

– check that datafile now contain the strings ‘test_buffer’ as dirty buffers have been written to disk

[oracle@node1 oracle]$ strings /u01/app/oracle/test_dbwr.dbf| grep test_buffer
dtest_buffer

– Note that checkpoint # has not changed in this case although dirty buffers have been written to disk(LRU/TCH algorithm)

SQL>select checkpoint_change# from V$database;

CHECKPOINT_CHANGE#
------------------
            983842

– Reset buffer cache to earlier size

SQL>alter system set db_cache_size=44m;
    select component, current_size/1024/1024 current_mb, min_size/1024/1024 MIN_MB
     from v$sga_dynamic_components
     where component = 'DEFAULT buffer cache';

– cleanup –

SQL>drop tablespace test_dbwr including contents and datafiles;

Summary:

– When uncommitted data is flushed to disk as a result of

  •      . flush buffer cache,
  •      . manual checkpoint or
  •      . to write changes in a redo log to be overwritten,

checkpoint# for the database is incremented.

– When uncommitted data belonging to a tablespace  is flushed to disk as a result of

  •        . tablespace offline,
  •        . tablespace read only or
  •        . tablespace begin backup

checkpoint# for the database is not incremented but checkpoint# for the datafile(s) belonging to the tablespace is incremented.

– When uncommitted data   is flushed to disk to free buffers in the buffer cache,    checkpoint# for the database is not incremented .

References:

Oracle Documentation

https://community.oracle.com/thread/886580
———————————————————————–

Related links:

Home

Database  Index

Consistent Reads In Oracle : Part-I
Consistent Reads In Oracle : Part-II
Flush Buffer Cache
Oracle Checkpoints
Undo And Redo In Oracle