Tuning the Redolog Buffer Cache and Resolving Redo Latch Contention

posted Jul 26, 2011, 8:17 AM by Sachchida Ojha

TUNING THE REDOLOG BUFFER


1. What is the Redolog Buffer
 

The redo log buffer is a circular buffer in the SGA that holds information about changes made to the database. This information is stored in redo entries. Redo entries contain the information necessary to reconstruct, or redo, changes made to the database. They are used for database recovery, if necessary.

Redo entries are copied by Oracle server processes from the user's memory space to the redo log buffer in the SGA. The redo entries take up continuous, sequential space in the buffer. The background process LGWR writes the redo log buffer to the active online redo log file (or group of files) on disk.

The initialization parameter LOG_BUFFER determines the size (in bytes) of the redo log buffer. In general, larger values reduce log file I/O, particularly if transactions are long or numerous. In early releases the default setting is four times the maximum data block size for the host operating system; later releases use a larger default that depends on the platform and the number of CPUs.
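The configured and allocated sizes can be compared from the instance itself. The following is a minimal sketch, assuming a session that can read V$PARAMETER and V$SGA; the allocated size may differ from the configured value because the instance can round it.

```sql
-- Configured value of LOG_BUFFER, in bytes
SELECT name, value
FROM   v$parameter
WHERE  name = 'log_buffer';

-- Memory actually allocated to the redo log buffer in the SGA, in bytes
SELECT name, value
FROM   v$sga
WHERE  name = 'Redo Buffers';
```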


2. Redolog Latches

 
When a data block is changed, a redo record must be created in the redo log buffer through the following steps:
    • Ensure that no other process has generated a higher SCN.
    • Find space available to write the redo record. If no space is available, LGWR must write to disk or a log switch must be issued.
    • Allocate the space needed in the redo log buffer.
    • Copy the redo record to the log buffer and link it to the appropriate structures for recovery purposes.
The database has three redo latches to handle this process:
  • Redo copy latch
    • The redo copy latch is held for the whole duration of the process described above. The init.ora parameter LOG_SIMULTANEOUS_COPIES determines the number of redo copy latches. The latch is released early only when a log switch is needed to free space, and it is re-acquired once the log switch ends.
  • Redo allocation latch
    • The redo allocation latch is acquired to allocate memory space in the log buffer. Before Oracle 9.2, there is a single redo allocation latch, which therefore serializes the writing of entries to the log buffer. In Oracle 9.2 Enterprise Edition, the number of redo allocation latches is determined by the init.ora parameter LOG_PARALLELISM. The redo allocation latch protects the allocation of space in the log buffer for each transaction entry. If transactions are small, or if the server has only one CPU, the transaction data is also copied into the log buffer while the redo allocation latch is held. If a log switch is needed to free space, this latch is released along with the redo copy latch.
  • Redo writing latch
    • This single latch prevents multiple processes from simultaneously posting LGWR to request a log switch. A process that needs free space must acquire the latch before deciding whether to post LGWR to perform a write, execute a log switch, or just wait.


3. Instance Parameters Related to the Redolog Latches
 
In Oracle7 and Oracle 8.0, two parameters modify the behavior of latch allocation in the redo log buffer: LOG_SIMULTANEOUS_COPIES, which controls the number of redo copy latches when the system has more than one CPU, and LOG_SMALL_ENTRY_MAX_SIZE. When LOG_SIMULTANEOUS_COPIES is set to a non-zero value and the transaction entry is smaller than LOG_SMALL_ENTRY_MAX_SIZE, the entry is copied into the log buffer while the redo allocation latch is held. If the transaction entry exceeds LOG_SMALL_ENTRY_MAX_SIZE, it is copied into the log buffer under a redo copy latch instead.

In Oracle8i and Oracle9.0, a redo copy latch is always required regardless of the redo size, so the check is no longer performed. The init.ora parameter LOG_SIMULTANEOUS_COPIES becomes obsolete and the number of redo copy latches defaults to twice the number of CPUs. The parameter LOG_SMALL_ENTRY_MAX_SIZE is also obsolete. For further detail on the change to these parameters in Oracle 8i, see Note:94271.1.

In Oracle9.2 and higher, multiple redo allocation latches become possible with the init.ora parameter LOG_PARALLELISM. The log buffer is split into LOG_PARALLELISM areas, each with the size given by LOG_BUFFER. Allocation in each area is protected by its own redo allocation latch. The number of redo copy latches is still determined by the number of CPUs.
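
To see how many redo latches an instance has actually created, the child latches can be counted. This is a minimal sketch, assuming a release in which V$LATCH_CHILDREN exposes the NAME column; the alias child_latches is illustrative. A latch that exists only as a single parent latch is expected to return no row here.

```sql
-- Number of child latches per redo latch type
SELECT name, COUNT(*) AS child_latches
FROM   v$latch_children
WHERE  name IN ('redo allocation', 'redo copy')
GROUP  BY name;
```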


4. Detecting and Resolving Redolog Buffer Performance Problem

 
Contention in the redo log buffer affects the performance of the whole database, since every DML and DDL statement must record a redo entry before it is executed. Contention shows up either as latch contention or as excessive requests for free space in the log buffer.

Note: In general, log buffer contention is not a frequent problem unless the latches already mentioned are consistently among the top wait events. Experience usually shows that redo I/O throughput is the main culprit of redo contention.

The database allows you to detect both types of contention, as described below:
    • Latch contention

      The following query returns the figures needed to determine the miss ratio and the "immediate" miss ratio for the redo log latches.

      SELECT SUBSTR(ln.name, 1, 20) AS name, gets, misses, immediate_gets, immediate_misses
      FROM   v$latch l, v$latchname ln
      WHERE  ln.name IN ('redo allocation', 'redo copy')
      AND    ln.latch# = l.latch#;

      If the ratio of MISSES to GETS exceeds 1%, or the ratio of IMMEDIATE_MISSES to (IMMEDIATE_GETS + IMMEDIATE_MISSES) exceeds 1%, there is latch contention.
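
      The two ratios can also be computed directly in the query. This is a minimal sketch using the same views as above; the aliases miss_pct and immediate_miss_pct are illustrative, and DECODE is used only to avoid dividing by zero when a latch has not yet been requested.

      ```sql
      -- Miss percentages for the redo latches; compare each against the 1% threshold
      SELECT ln.name,
             ROUND(DECODE(l.gets, 0, NULL,
                          100 * l.misses / l.gets), 2) AS miss_pct,
             ROUND(DECODE(l.immediate_gets + l.immediate_misses, 0, NULL,
                          100 * l.immediate_misses /
                          (l.immediate_gets + l.immediate_misses)), 2) AS immediate_miss_pct
      FROM   v$latch l, v$latchname ln
      WHERE  ln.name IN ('redo allocation', 'redo copy')
      AND    ln.latch# = l.latch#;
      ```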

      Note: Oracle recommends tuning the redo allocation latch before the redo copy latch.

      In Oracle7 and Oracle8.0:
      If the contention is caused by the redo allocation latch, decrease the value of LOG_SMALL_ENTRY_MAX_SIZE. The recommended value is the average redo entry size, which can be calculated as (redo size / redo entries) from V$SYSSTAT.
      If you find redo copy latch contention, you can increase the parameter LOG_SIMULTANEOUS_COPIES to make more latches available. The recommended value is twice the number of CPUs.
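
      The average redo entry size mentioned above can be derived with a self-join on V$SYSSTAT. This is a minimal sketch; the alias avg_redo_entry_bytes is illustrative, and the figures are cumulative since instance startup.

      ```sql
      -- Average redo entry size in bytes: redo size / redo entries
      SELECT ROUND(rs.value / DECODE(re.value, 0, NULL, re.value)) AS avg_redo_entry_bytes
      FROM   v$sysstat rs, v$sysstat re
      WHERE  rs.name = 'redo size'
      AND    re.name = 'redo entries';
      ```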

      In Oracle8i and Oracle9.0:
      If the contention is caused by the redo allocation latch, you can either use the NOLOGGING option to reduce the number of redo log entries generated by certain operations (see Note:147474.1) or reduce the load on the latch by increasing the LOG_BUFFER parameter.
      If you find redo copy latch contention, you can increase the hidden init.ora parameter _LOG_SIMULTANEOUS_COPIES to make more latches available. The default is twice the number of CPUs.

      In Oracle 9.2:
      If the contention is caused by the redo allocation latch, you can try to increase the number of these latches via the init.ora parameter LOG_PARALLELISM.
      If you find redo copy latch contention, you can increase the hidden init.ora parameter _LOG_SIMULTANEOUS_COPIES to make more latches available. The default is twice the number of CPUs.

      In Oracle 10.2 and higher:
      The statistic "redo buffer allocation retries" reflects the number of times a user process waits for space in the redo log buffer. Its value should be near zero over an interval. If this value increments consistently, then processes have had to wait for space in the redo log buffer. The wait can be caused by the log buffer being too small or by checkpointing. Increase the size of the redo log buffer, if necessary, by changing the value of the initialization parameter LOG_BUFFER. The value of this parameter is expressed in bytes.
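
      This statistic can be read from V$SYSSTAT together with "redo entries" for context. This is a minimal sketch; because the values are cumulative since instance startup, run it twice over the interval of interest and compare the difference.

      ```sql
      -- Retries should stay close to zero relative to the number of redo entries
      SELECT name, value
      FROM   v$sysstat
      WHERE  name IN ('redo buffer allocation retries', 'redo entries');
      ```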

     
    • Request for space contention
       
      The statistic "redo log space requests" reflects the number of times a user process waits for space in the redo log file, not in the log buffer. This statistic is available through the dynamic performance table V$SYSSTAT. By default, this table is only available to the user SYS and to users granted the SELECT ANY TABLE system privilege, such as SYSTEM. Monitor this statistic over a period of time while your application is running with this query:

         SELECT name, value
         FROM v$sysstat
         WHERE name = 'redo log space requests';

      The value of "redo log space requests" should be near 0. If this value increments consistently, processes have had to wait for space in the redo log file. This may be caused by checkpointing or log switching, so the remedy is to improve the checkpointing or archiving process.
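
      To narrow down whether checkpointing or archiving is behind these waits, the log file switch wait events can be inspected. This is a minimal sketch, assuming access to V$SYSTEM_EVENT; the event names returned vary by release.

      ```sql
      -- Waits attributed to log switches (for example, checkpoint incomplete or archiving needed)
      SELECT event, total_waits, time_waited
      FROM   v$system_event
      WHERE  event LIKE 'log file switch%';
      ```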
