Tracing sessions: waiting on an enqueue

posted Jul 11, 2011, 6:44 AM by Sachchida Ojha
PURPOSE
-------

How to monitor and trace locking problems (enqueues).


SCOPE
-----

DBA's and people working with technical support.


APPLICATION
-----------

This bulletin will assist you when:

1. Requiring to figure out if an instance has had locking problems with 
   enqueues.  See section "Has locking been a problem".

2. Are there sessions just now in the instance that are waiting on
   enqueues and who are the blockers. See "Is locking a problem just now".


Abstract
--------

What are locks, enqueues and latches?
What are the common types of enqueues?
Has locking been a problem?
Is locking a problem just now?

All those questions will be answered in this document.

To get better performance (get a better flow), one approach is to
eliminate the time a session is waiting to get a resource (for example
an enqueue). You don’t want sessions that are waiting on other sessions,
because those waiting sessions may also hold enqueues and resources, which
will make other sessions wait for them - which can end up with a deadlock.

===============================================================================

What are locks, enqueues and latches
-----------------------------------

Often the word "lock" is synonymous to enqueue. In this document, "enqueue" 
will be used for the locking mechanism that you can find in V$LOCK. "locking" 
will be used when talking about requesting an enqueue, waiting on an enqueue 
and blocking others while holding the enqueue.

A lock is a resource that you need to hold to get access to the resource.
Oracle has two kind of locks: enqueues and latches. This bulletin will not
talk about latches. A latch is a fast, inexpensive and non-sophisticated
lock. A latch is used when you need serial access to operations/functions
in Oracle. There is no ordered queue for the waiters on a latch - first to
grab it gets it.

Enqueues are sophisticated locks for managing access to shared resources
(like tables, rows, jobs, redo threads). An enqueue can be requested in
different levels/mode: null, row share, row exclusive, share, share row
exclusive or exclusive. If a session holds an enqueue in share mode, other
sessions can then also take the enqueue in share mode (for the same resource).
If a session holds an enqueue in exclusive mode, other sessions that wants to
get it - independently in which level - they have to wait.

When access is required by a session, a lock structure is obtained and a
request is made to acquire access to the resource at a specific level (mode)
is made. The lock structure is placed on one of three linked lists which hang
off of the resource, called the OWNER (if the enqueue could be acquired),
WAITER (if the session is waiting to acquiring the enqueue) and CONVERTER
(the session is holding the enqueue in one level but wants to convert it to
another) lists.

An enqueue is uniquely defined by it's TYPE, ID1 and ID2 (which are columns
in the V$LOCK view). For example can there only be one enqueue for user
SCOTT's EMP table (identified by TYPE=TM, ID1=<object ID>, ID2=0).


Common types of enqueues
------------------------

· JQ - Job Queue. When a job (submitted by DBMS_JOB.SUBMIT) is running,
  it is protected by a JQ enqueue (which means that only one SNP-process
  can run the job).

· ST - Space management Transaction. The ST enqueue is need to be held
  every time the session is allocating/deallocating extents (which means
  wants to change the UET$ and FET$ dictionary tables), like coalescing,
  drop/truncate segments and disk -sorting. If the session gets a timeout
  when requesting the ST enqueue, "ORA-1575 timeout waiting for space
  management" is returned.

· TM - DML (Table) enqueue. Every time a session wants to lock a table,
  a TM enqueue is requested. If a session deletes a row in the parent-table
  (DEPT) and a referential constraint (foreign key) is created without
  an index on the child-table (EMP), or if the session is updating the
  column(s) that the foreign key references to then a share lock (level 4)
  is taken on the child table. If another session tries to do changes to
  the child-table they have to wait (because they want the enqueue in row
  exclusive mode, and that is not compatible with the share mode). If an
  index is created on the child-table’s foreign key-column, then no share-
  lock is required on the child-table.

· TX - Transaction. As soon as a transaction is started a TX enqueue is
  needed. A transaction is uniquely defined by the rollback segment number,
  the slot number in the rollback segment’s transaction table and the slot
  number’s sequence number. A session can be waiting on a TX enqueue for
  several reasons: 1) Another session is locking the requested row.
  2) When two sessions tries to insert the same unique key into a table
  (none of them has done a COMMIT), then the last session is waiting for
  the first one to COMMIT or ROLLBACK. 3) There are no free ITL (Interested
  Transaction List) in the block header (increase INI_TRANS och PCT_FREE
  for the segment).

· UL - User Lock. A session has taken a lock with the DBMS_LOCK.REQUEST
  function.

===============================================================================

Has locking been a problem
--------------------------

To see if the instance has had locking problems since startup, execute the
following SELECT:

  SELECT * 
    FROM v$sysstat 
   WHERE class=4;

STATISTIC# NAME                     CLASS     VALUE
        22 enqueue timeouts             4         0
        23 enqueue waits                4         2
        24 enqueue deadlocks            4         0
        25 enqueue requests             4       213
        26 enqueue conversions          4         0
        27 enqueue releases             4       204

· "enqueue timeouts" shows the total number of enqueue operations (get
  and convert) that timed out before they could complete.

· "enqueue waits" shows how many times a session had to wait for an
   enqueue. This column is updated first after the wait is finished.

   FYI:

      enqueue waits time out (usually every 3 seconds). Hence a single session 
      waiting for 1 minute could show up as 20 waits in the "wait event" section
      of a BSTAT/ESTAT report but only ONE enqueue wait in the 
      "statistics"(output from V$SYSSTAT) section of the BSTAT/ESTAT.      

· "enqueue deadlocks" is how many times a deadlock situation has
  occured (every time the client receives an ORA-60 and a trace file
  will be created). This value should be zero, else an investigation
  should be made and the trace files should be checked.

· "enqueue requests" minus "enqueue releases" shows how many locks that
  are held just now (which is equal to the number of rows in V$LOCK).

· "enqueue conversions" is how many times an session is holding a lock
  in one mode and then wants to change the mode (for example, first the
  session is doing an SELECT * FROM emp FOR UPDATE and then a UPDATE emp
  SET sal=9876 WHERE empno=7839).

· "enqueue releases" shows how many times an enqueue was released (freed).

Maybe the session(s) that created these waits is still connected. If
they are, you find them in V$SESSTAT (check first with V$STATNAME or
V$SYSSTAT that statistic# 23 is "enqueue waits"):

  SELECT * 
    FROM v$sesstat 
   WHERE statistic# = 23 AND value > 0;

--------------------------------------------------------------------------------

If a session has been waiting for an enqueue, then a row in V$SYSTEM_EVENT is
created. If the instance is running with TIMED_STATISTICS = TRUE, you can see
for how long time the session had to wait for the enqueue:

  SELECT *
    FROM v$system_event
   WHERE event = 'enqueue';

EVENT        TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED AVERAGE_WAIT
enqueue               15             13        4193    279.53333

· TOTAL_WAITS and AVERAGE_WAIT are not showing what you are expecting.
  The real number of waits are "enqueue waits" in V$SYSSTAT, or by taking
  TOTAL_WAITS minus TOTAL_TIMEOUTS. To get the correct AVERAGE_WAIT you
  should take TIME_WAITED (4193) divided by the real number of waits (2).
  So the real average wait time is not 2.79 seconds - it is 20.96 seconds.

With the following SELECT you can see for how long time the connected
sessions had to wait for an enqueue and how long the maximum wait time was:

  SELECT COUNT( * ), SUM( time_waited ), MAX( max_wait )
    FROM v$session_event
   WHERE event = 'enqueue';

*) Remember that in V$SYSTEM_EVENT you see the total values for all sessions
   that has been connected to the instance and those that are still connected.
   In V$SESSION_EVENT you will only see values for those sessions that are
   connected to the instance just now.
-------------------------------------------------------------------------------

To see the gets and waits that had been on the different types of enqueues,
check in the fixed internal table X$KSQST:

  SELECT *
    FROM x$ksqst
   WHERE ksqstget > 0;

ADDR          INDX   INST_ID KS  KSQSTGET  KSQSTWAT
01ECBE30        69         1 CF         9         0
01ECBE48        72         1 CI        12         0
01ECBEA8        84         1 CU        57         0
01ECBF60       107         1 DL        28         0
01ECBF90       113         1 DR         1         0
01ECBFB0       117         1 DV        12         0
01ECC498       274         1 IS         5         0
01ECC890       401         1 MR        16         0
01ECCB30       485         1 PF         1         0
01ECCDA0       563         1 RT         1         0
01ECCE88       592         1 SQ         1         0
01ECCEA0       595         1 ST         1         0
01ECCF68       620         1 TM        14         0
01ECCF98       626         1 TS         1         0
01ECCFC0       631         1 TX        13         2
01ECD098       658         1 US        59         0

· The sum of KSQSTWAT is equal to "enqueue waits" in V$SYSSTAT, so here you
  can see what type of locks that the sessions and background processes
  had been waiting on.

===============================================================================

Is locking a problem currently
------------------------------

This is an example of how you can trace locking problems.

Background: User SCOTT has updated the salary for all employees in table EMP.
Then user SYSTEM tries to update the JOB-description for employee KING in the
same table.


The SELECTS
-----------

Which session is the blocker and which sessions are waiting to get the lock?
This SELECT will return no rows if there are no waiters and blockers. You can
never find situations when there is only blocker(s) or waiter(s). One session
can block many others.

    column Username format A15 column Sid format 9990 heading SID
    column Type format A4 column Lmode format 990 heading 'HELD'
    column Request format 990 heading 'REQ' column Id1 format 9999990
    column Id2 format 9999990 break on Id1 skip 1 dup

    SELECT SN.Username, M.Sid, M.Type,
        DECODE(M.Lmode, 0, 'None', 1, 'Null', 2, 'Row Share', 3, 'Row
        Excl.', 4, 'Share', 5, 'S/Row Excl.', 6, 'Exclusive',
        LTRIM(TO_CHAR(Lmode,'990'))) Lmode,
        DECODE(M.Request, 0, 'None', 1, 'Null', 2, 'Row Share', 3, 'Row
        Excl.', 4, 'Share', 5, 'S/Row Excl.', 6, 'Exclusive',
        LTRIM(TO_CHAR(M.Request, '990'))) Request,
        M.Id1, M.Id2 
    FROM V$SESSION SN, V$LOCK M
    WHERE (SN.Sid = M.Sid and M.Request ! = 0)
        or (SN.Sid = M.Sid and M.Request = 0 and Lmode != 4 and (id1, id2)
        in (select S.Id1, S.Id2 from V$LOCK S where Request != 0 and S.Id1
        = M.Id1 and S.Id2 = M.Id2) ) order by Id1, Id2, M.Request; 


USERNAME             SID TY LMODE         REQUEST             ID1        ID2
---------------- ------- -- ------------- -------------  ---------- --------
                               
SCOTT                 9  TX Exclusive     None             262154        117

SCOTT                11  TX None          Exclusive        262154        117

 Session 9 is blocking(LMODE=Exclusive)
 Session 11 is waiting(REQUEST=Exclusive)
 The meaning of ID1 and ID2 depend on the lock TYPE.

· We can see situations where a session is both a Blocker and a Waiter.
· If there are only two sessions and both are Blockers and Waiters then we
  got a deadlock situation (which Oracle will solve automatically).
· A fast way to check blocking/waiting situations is to run:

  SELECT * FROM v$lock WHERE block > 0 OR request > 0;


**NEW for Oracle 10g**
~~~~~~~~~~~~~~~~~~~~~~

 V$SESSION now tracks the same wait data as V$SESSION_WAIT

Please see Note:245981.1 10g New Functionality of V$SESSION and WAIT MODEL


===============================================================================

Get some common information about these sessions from V$SESSION:

  SELECT sid, taddr, lockwait, status, sql_address,
         row_wait_obj# RW_OBJ#, row_wait_file# RW_FILE#, row_wait_block#
         RW_BLOCK#, row_wait_row# RW_ROW#
    FROM v$session 
   WHERE sid IN( 7, 8 ) 
   ORDER BY sid;

SID TADDR    LOCKWAIT STATUS   SQL_ADDRESS RW_OBJ# RW_FILE# RW_BLOCK# RW_ROW#
  7 01F2D014 01EC9858 ACTIVE      0224C9A8    2494        1     15631       8
  8 01F2CCBC          INACTIVE    02258D80      -1        0         0       0

· The column TADDR could be joined with ADDR in V$TRANSACTION.

· The sessions that are waiting to get a lock has a value (IS NOT NULL)
  in LOCKWAIT.

· Status for a waiting session is ACTIVE. There are a lot of SQL:s around,
  that counts the number of active sessions by looking at this column.

· SQL_ADDR is the address for the last SQL statement that the session executed.
  Could be joined with V$SQL.ADDRESS and V$OPEN_CURSOR.ADDRESS. For the Waiter
  this is the SQL-command that made him wait. For the blocker this could be the
  SQL that makes the block but it could also be any later SQL.

· If ROW_WAIT_OBJ# has not the value -1, then it means that the session is
  waiting on a lock for this object. Could be joined with
  DBA_OBJECTS.OBJECT_ID or DBA_OBJECTS.DATA_OBJECT_ID.


     select DATA_OBJECT_ID from  dba_objects
     where OBJECT_ID = ROW_WAIT_OBJ#;

· The other ROW_WAIT* columns shows where the row is located that the
  waiting session wants to get a lock on.

· From DBA_OBJECTS you can see who is the owner to the table and the table
  name, then you can with the information from the ROW_WAIT* columns build
  the following SELECT (the first argument is 1 for the Oracle8 extended ROWID,
  the second is the object ID, and then the file#, block#, slot#):

  SELECT * FROM scott.emp
   WHERE rowid = DBMS_ROWID.ROWID_CREATE( 1, 2494, 1, 15631, 8 );

===============================================================================

Get information about the enqueues the sessions are holding and waiting on:

  SELECT * 
    FROM v$lock 
   WHERE sid IN( 7, 8 ) 
   ORDER BY sid, type;

ADDR     KADDR    SID  TY    ID1   ID2  LMODE  REQUEST  CTIME  BLOCK
01F27D0C 01F27D20   7  TM   2494     0      3        0   1008      0
01EC9848 01EC9858   7  TX  65546  5989      0        6   1008      0
01F27C98 01F27CAC   8  TM   2494     0      3        0   1032      0
01CD8820 01CD88EC   8  TX  65546  5989      6        0   1032      1

· If a session is waiting to get a lock, then column KADDR contains the
  same value as V$SESSION.LOCKWAIT.

· The type column shows what type of lock the session is holding or waiting on.

· For the lock type TM (table lock), column ID1 is the object_id (or the
  data_object_id) for that table (could be joined with DBA_OBJECTS). For a
  lock of type TX (transaction lock), TRUNC( ID1 / 65536 ) shows which
  rollback segment that is used for the transaction and ID1 - 
  ( rollbacksegment# * 65536 ) shows which slot number in the transaction
  table that this sessions is using (or want to used).

· LMODE shows in which mode the session is holding  the lock.
  0=None, 1=NULL, 2=Row share (SS), 3=Row exclusive (SX), 4=Share (S),
  5=Share row exclusive (SSX), 6=Exclusive (X)

· REQUEST is the mode for which the session wants the lock.

· If a session has values greater than 0 in both LMODE and REQUEST, then the
  session is a Converter.

· CTIME is how many seconds the session has either hold the lock or waited
  for the lock.

· BLOCK is greater than 0 if the lock is blocking other sessions (if there
  are other sessions waiting for this lock to be released).

===============================================================================

What are the sessions doing (which event are the sessions waiting on just
now or last waited on):

  SELECT sid, seq#, event, p1raw, p2raw, p3raw, state, wait_time WAIT,
         seconds_in_wait SECS
    FROM v$session_wait 
   WHERE sid IN( 7, 8 ) 
   ORDER BY sid;

SID SEQ# EVENT                      P1RAW    P2RAW    P3RAW    STATE  WAIT SECS
  7 488 enqueue                     54580006 0001000A 00001765 WAITING   0 1011
  8 147 SQL*Net message from client 28444553 00000001 00       WAITING   0 1035

· SEQ# is increasing by 1 for each wait.

· EVENT is the name of the event (or resource) that the session is waiting on.

· From this you can draw the conclusion that session 8 has locked a row that
  session 7 wants, but session 8 has not done a COMMIT or ROLLBACK (and
  that’s why session 7 is waiting). Session 8 has been idle in 1035 seconds
  (typically the user has gone for a cup of coffee).

===============================================================================

Get some information about the Blocker’s transaction:

  SELECT t.addr, t.xidusn USN, t.xidslot SLOT, t.xidsqn SQL, t.status,
         t.start_time, t.used_ublk UBLK, t.used_urec UREC, t.log_io LOG,
         t.phy_io PHY, t.cr_get, t.cr_change CR_CHA
    FROM v$transaction t, v$session s 
   WHERE s.sid IN( 7, 8 ) AND t.addr = s.taddr;

ADDR    USN SLOT  SQN STATUS START_TIME        UBLK UREC LOG PHY CR_GET CR_CHA
01F2CCBC  1   10 5989 ACTIVE 01/22/00 11:11:28    2   15  54  34     32      0

· ADDR can be joined with V$SESSION.TADDR.

· USN is the Undo Segment Number. Can be joined with V$ROLLNAME.USN and
  V$ROLLSTAT.USN.

· SLOT is the slot number in the rollback segment’s transaction table.

· SQN is the sequence number for the transaction.

· USN+SLOT+SQN are the three values that uniquely identifies a transaction.

· UBLK is how many undo blocks that the transaction has used.

  @updated or deleted. If you are doing an INSERT or DELETE, then you will
 @ see that UREC is set to <number of indexes for this table> + how many rows
 @you inserts/deletes. If you UPDATE a column then UREC will be set to
  @<number of indexes that this column belongs to> * 2 + number of updated
  @rows (so if the column belongs to no index, then UREC is set to the number
  @of rows that was updated).

· If USED_UBLK and USED_UREC is decreasing for a transaction every time you
  look, it means that the transaction is rolling back. When USED_UREC is down
  to zero the rollback is finished. So here you can see when the rollback
  phase is finished after an instance/crash recovery.

===============================================================================

Which SQL statements are the sessions currently executing (or last executed):

 SELECT s.sid, q.users_executing, q.sql_text 
   FROM v$session s, v$sql q
  WHERE s.sid IN( 7, 8 ) AND q.address = s.sql_address 
  ORDER BY s.sid;

SID USERS_EXECUTING SQL_TEXT
  7               1 update scott.emp set job=job where empno=7839
  8               0 SELECT * FROM DUAL

· The SQL statement for the Waiter is the SQL that the user is hanging on.
· The SQL statement for the Blocker may not be the SQL that is holding the
  lock.
· If USERS_EXECUTING greater than 0 then this SQL is executing just now.

===============================================================================

Get all open cursors for the two session:

  SELECT sid, user_name, address, sql_text
    FROM v$open_cursor 
   WHERE sid IN( 7, 8 ) 
   ORDER BY sid;

SID USER_NAME ADDRESS  SQL_TEXT
  7 SYSTEM    0224C9A8 update scott.emp set job=job where empno=7839
  8 SCOTT     02258D80 update emp set sal=sal
  8 SCOTT     02263154 SELECT * FROM DUAL

· Here you can see the SQL that is blocking others (if the session has not
  closed the cursor yet, tools like SQL*Plus closes the cursor automatically
  after execution).

===============================================================================

Get some information about the Blocker’s session so you can call him and tell
him to COMMIT/ROLLBACK his transaction.

  SELECT sid, serial#, username, osuser, program, machine
    FROM v$session 
   WHERE sid = 8;

SID  SERIAL#  USERNAME  OSUSER    PROGRAM                 MACHINE
  8      147  SCOTT     jtreugut  c:\orawin95\bin\plus80  JTREUGUT2

· If you need to kill the session, only do it with ALTER SYSTEM KILL
  SESSION '<sid>, <serial#>' command. The session will then have status
  KILLED or MARKED FOR KILLED (if PMON couldn’t terminate the session directly).
  But the locks on the resources can be held for a long time (until PMON has
  cleaned up everything).
Tracing sessions: waiting on an enqueue [ID 102925.1]
Comments