How to monitor and trace locking problems (enqueues)

posted Mar 9, 2011, 11:05 AM by Sachchida Ojha
What are locks, enqueues and latches?
What are the common types of enqueues?
Has locking been a problem?
Is locking a problem just now?


All those questions will be answered in this document.

To get better performance (get a better flow), one approach is to eliminate the time a session is waiting to get a resource (for example an enqueue). You don’t want sessions that are waiting on other sessions,because those waiting sessions may also hold enqueues and resources, which will make other sessions wait for them - which can end up with a deadlock.

===============================================================================

What are locks, enqueues and latches
-----------------------------------

Often the word "lock" is synonymous to enqueue. In this document, "enqueue"  will be used for the locking mechanism that you can find in V$LOCK. "locking"  will be used when talking about requesting an enqueue, waiting on an enqueue
and blocking others while holding the enqueue.

A lock is a resource that you need to hold to get access to the resource. Oracle has two kind of locks: enqueues and latches. This bulletin will not talk about latches. A latch is a fast, inexpensive and non-sophisticated lock. A latch is used when you need serial access to operations/functions in Oracle. There is no ordered queue for the waiters on a latch - first to grab it gets it.

Enqueues are sophisticated locks for managing access to shared resources (like tables, rows, jobs, redo threads). An enqueue can be requested in different levels/mode: null, row share, row exclusive, share, share row exclusive or exclusive. If a session holds an enqueue in share mode, other sessions can then also take the enqueue in share mode (for the same resource). If a session holds an enqueue in exclusive mode, other sessions that wants to get it - independently in which level - they have to wait.

When access is required by a session, a lock structure is obtained and a request is made to acquire access to the resource at a specific level (mode) is made. The lock structure is placed on one of three linked lists which hang off of the resource, called the OWNER (if the enqueue could be acquired),WAITER (if the session is waiting to acquiring the enqueue) and CONVERTER (the session is holding the enqueue in one level but wants to convert it to another) lists.

An enqueue is uniquely defined by it's TYPE, ID1 and ID2 (which are columns in the V$LOCK view). For example can there only be one enqueue for user SCOTT's EMP table (identified by TYPE=TM, ID1=<object ID>, ID2=0).


Common types of enqueues
------------------------

· JQ - Job Queue. When a job (submitted by DBMS_JOB.SUBMIT) is running,it is protected by a JQ enqueue (which means that only one SNP-process can run the job).

· ST - Space management Transaction. The ST enqueue is need to be held every time the session is allocating/deallocating extents (which means wants to change the UET$ and FET$ dictionary tables), like coalescing,drop/truncate segments and disk -sorting. If the session gets a timeout when requesting the ST enqueue, "ORA-1575 timeout waiting for space management" is returned.

· TM - DML (Table) enqueue. Every time a session wants to lock a table,a TM enqueue is requested. If a session deletes a row in the parent-table (DEPT) and a referential constraint (foreign key) is created without an index on the child-table (EMP), or if the session is updating the column(s) that the foreign key references to then a share lock (level 4) is taken on the child table. If another session tries to do changes to the child-table they have to wait (because they want the enqueue in row exclusive mode, and that is not compatible with the share mode). If an index is created on the child-table’s foreign key-column, then no share-lock is required on the child-table.

· TX - Transaction. As soon as a transaction is started a TX enqueue is needed. A transaction is uniquely defined by the rollback segment number, the slot number in the rollback segment’s transaction table and the slot number’s sequence number. A session can be waiting on a TX enqueue for several reasons:

1) Another session is locking the requested row.

2) When two sessions tries to insert the same unique key into a table (none of them has done a COMMIT), then the last session is waiting for the first one to COMMIT or ROLLBACK. 3) There are no free ITL (Interested Transaction List) in the block header (increase INI_TRANS och PCT_FREE for the segment).

· UL - User Lock. A session has taken a lock with the DBMS_LOCK.REQUEST function.

===============================================================================

Has locking been a problem
--------------------------

To see if the instance has had locking problems since startup, execute the
following SELECT:

SELECT *
FROM v$sysstat
WHERE class=4;

STATISTIC# NAME                     CLASS     VALUE
22 enqueue timeouts             4         0
23 enqueue waits                4         2
24 enqueue deadlocks            4         0
25 enqueue requests             4       213
26 enqueue conversions          4         0
27 enqueue releases             4       204

· "enqueue timeouts" shows the total number of enqueue operations (get and convert) that timed out before they could complete.

· "enqueue waits" shows how many times a session had to wait for an enqueue. This column is updated first after the wait is finished.

FYI:

enqueue waits time out (usually every 3 seconds). Hence a single session  waiting for 1 minute could show up as 20 waits in the "wait event" section of a BSTAT/ESTAT report but only ONE enqueue wait in the
"statistics"(output from V$SYSSTAT) section of the BSTAT/ESTAT.     

· "enqueue deadlocks" is how many times a deadlock situation has occured (every time the client receives an ORA-60 and a trace file will be created). This value should be zero, else an investigation should be made and the trace files should be checked.

· "enqueue requests" minus "enqueue releases" shows how many locks that are held just now (which is equal to the number of rows in V$LOCK).

· "enqueue conversions" is how many times an session is holding a lock in one mode and then wants to change the mode (for example, first the session is doing an SELECT * FROM emp FOR UPDATE and then a UPDATE emp
SET sal=9876 WHERE empno=7839).

· "enqueue releases" shows how many times an enqueue was released (freed).

Maybe the session(s) that created these waits is still connected. If they are, you find them in V$SESSTAT (check first with V$STATNAME or V$SYSSTAT that statistic# 23 is "enqueue waits"):

SELECT *
FROM v$sesstat
WHERE statistic# = 23 AND value > 0;

--------------------------------------------------------------------------------

If a session has been waiting for an enqueue, then a row in V$SYSTEM_EVENT is created. If the instance is running with TIMED_STATISTICS = TRUE, you can see for how long time the session had to wait for the enqueue:

SELECT *
FROM v$system_event
WHERE event = 'enqueue';

EVENT        TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED AVERAGE_WAIT
enqueue               15             13        4193    279.53333

· TOTAL_WAITS and AVERAGE_WAIT are not showing what you are expecting.
The real number of waits are "enqueue waits" in V$SYSSTAT, or by taking TOTAL_WAITS minus TOTAL_TIMEOUTS. To get the correct AVERAGE_WAIT you should take TIME_WAITED (4193) divided by the real number of waits (2).
So the real average wait time is not 2.79 seconds - it is 20.96 seconds.

With the following SELECT you can see for how long time the connected sessions had to wait for an enqueue and how long the maximum wait time was:

SELECT COUNT( * ), SUM( time_waited ), MAX( max_wait )
FROM v$session_event
WHERE event = 'enqueue';

*) Remember that in V$SYSTEM_EVENT you see the total values for all sessions that has been connected to the instance and those that are still connected. In V$SESSION_EVENT you will only see values for those sessions that are
connected to the instance just now.
-------------------------------------------------------------------------------

To see the gets and waits that had been on the different types of enqueues, check in the fixed internal table X$KSQST:

SELECT *
FROM x$ksqst
WHERE ksqstget > 0;

ADDR          INDX   INST_ID KS  KSQSTGET  KSQSTWAT
01ECBE30        69         1 CF         9         0
01ECBE48        72         1 CI        12         0
01ECBEA8        84         1 CU        57         0
01ECBF60       107         1 DL        28         0
01ECBF90       113         1 DR         1         0
01ECBFB0       117         1 DV        12         0
01ECC498       274         1 IS         5         0
01ECC890       401         1 MR        16         0
01ECCB30       485         1 PF         1         0
01ECCDA0       563         1 RT         1         0
01ECCE88       592         1 SQ         1         0
01ECCEA0       595         1 ST         1         0
01ECCF68       620         1 TM        14         0
01ECCF98       626         1 TS         1         0
01ECCFC0       631         1 TX        13         2
01ECD098       658         1 US        59         0

· The sum of KSQSTWAT is equal to "enqueue waits" in V$SYSSTAT, so here you can see what type of locks that the sessions and background processes had been waiting on.

===============================================================================

Is locking a problem currently
------------------------------

This is an example of how you can trace locking problems.

Background: User SCOTT has updated the salary for all employees in table EMP. Then user SYSTEM tries to update the JOB-description for employee KING in the same table.


The SELECTS
-----------

Which session is the blocker and which sessions are waiting to get the lock?

This SELECT will return no rows if there are no waiters and blockers. You can never find situations when there is only blocker(s) or waiter(s). One session can block many others.

column Username format A15 column Sid format 9990 heading SID
column Type format A4 column Lmode format 990 heading 'HELD'
column Request format 990 heading 'REQ' column Id1 format 9999990
column Id2 format 9999990 break on Id1 skip 1 dup

SELECT SN.Username, M.Sid, M.Type,
DECODE(M.Lmode, 0, 'None', 1, 'Null', 2, 'Row Share', 3, 'Row
Excl.', 4, 'Share', 5, 'S/Row Excl.', 6, 'Exclusive',
LTRIM(TO_CHAR(Lmode,'990'))) Lmode,
DECODE(M.Request, 0, 'None', 1, 'Null', 2, 'Row Share', 3, 'Row
Excl.', 4, 'Share', 5, 'S/Row Excl.', 6, 'Exclusive',
LTRIM(TO_CHAR(M.Request, '990'))) Request,
M.Id1, M.Id2
FROM V$SESSION SN, V$LOCK M
WHERE (SN.Sid = M.Sid and M.Request ! = 0)
or (SN.Sid = M.Sid and M.Request = 0 and Lmode != 4 and (id1, id2)
in (select S.Id1, S.Id2 from V$LOCK S where Request != 0 and S.Id1
= M.Id1 and S.Id2 = M.Id2) ) order by Id1, Id2, M.Request;


USERNAME             SID TY LMODE         REQUEST             ID1        ID2
---------------- ------- -- ------------- -------------  ---------- --------

SCOTT                 9  TX Exclusive     None             262154        117

SCOTT                11  TX None          Exclusive        262154        117

Session 9 is blocking(LMODE=Exclusive)
Session 11 is waiting(REQUEST=Exclusive)
The meaning of ID1 and ID2 depend on the lock TYPE.

· We can see situations where a session is both a Blocker and a Waiter.
· If there are only two sessions and both are Blockers and Waiters then we got a deadlock situation (which Oracle will solve automatically).
· A fast way to check blocking/waiting situations is to run:

SELECT * FROM v$lock WHERE block > 0 OR request > 0;


**NEW for Oracle 10g**
~~~~~~~~~~~~~~~~~~~~~~

V$SESSION now tracks the same wait data as V$SESSION_WAIT

Please see Note:245981.1 10g New Functionality of V$SESSION and WAIT MODEL


===============================================================================

Get some common information about these sessions from V$SESSION:

SELECT sid, taddr, lockwait, status, sql_address,
row_wait_obj# RW_OBJ#, row_wait_file# RW_FILE#, row_wait_block#
RW_BLOCK#, row_wait_row# RW_ROW#
FROM v$session
WHERE sid IN( 7, 8 )
ORDER BY sid;

SID TADDR    LOCKWAIT STATUS   SQL_ADDRESS RW_OBJ# RW_FILE# RW_BLOCK# RW_ROW#
7 01F2D014 01EC9858 ACTIVE      0224C9A8    2494        1     15631       8
8 01F2CCBC          INACTIVE    02258D80      -1        0         0       0

· The column TADDR could be joined with ADDR in V$TRANSACTION.

· The sessions that are waiting to get a lock has a value (IS NOT NULL)
in LOCKWAIT.

· Status for a waiting session is ACTIVE. There are a lot of SQL:s around, that counts the number of active sessions by looking at this column.

· SQL_ADDR is the address for the last SQL statement that the session executed.
Could be joined with V$SQL.ADDRESS and V$OPEN_CURSOR.ADDRESS. For the Waiter this is the SQL-command that made him wait. For the blocker this could be the SQL that makes the block but it could also be any later SQL.

· If ROW_WAIT_OBJ# has not the value -1, then it means that the session is waiting on a lock for this object. Could be joined with DBA_OBJECTS.OBJECT_ID or DBA_OBJECTS.DATA_OBJECT_ID.


select DATA_OBJECT_ID from  dba_objects
where OBJECT_ID = ROW_WAIT_OBJ#;

· The other ROW_WAIT* columns shows where the row is located that the
waiting session wants to get a lock on.

· From DBA_OBJECTS you can see who is the owner to the table and the table name, then you can with the information from the ROW_WAIT* columns build the following SELECT (the first argument is 1 for the Oracle8 extended ROWID,
the second is the object ID, and then the file#, block#, slot#):

SELECT * FROM scott.emp
WHERE rowid = DBMS_ROWID.ROWID_CREATE( 1, 2494, 1, 15631, 8 );

===============================================================================

Get information about the enqueues the sessions are holding and waiting on:

SELECT *
FROM v$lock
WHERE sid IN( 7, 8 )
ORDER BY sid, type;

ADDR     KADDR    SID  TY    ID1   ID2  LMODE  REQUEST  CTIME  BLOCK
01F27D0C 01F27D20   7  TM   2494     0      3        0   1008      0
01EC9848 01EC9858   7  TX  65546  5989      0        6   1008      0
01F27C98 01F27CAC   8  TM   2494     0      3        0   1032      0
01CD8820 01CD88EC   8  TX  65546  5989      6        0   1032      1

· If a session is waiting to get a lock, then column KADDR contains the
same value as V$SESSION.LOCKWAIT.

· The type column shows what type of lock the session is holding or waiting on.

· For the lock type TM (table lock), column ID1 is the object_id (or the data_object_id) for that table (could be joined with DBA_OBJECTS). For a lock of type TX (transaction lock), TRUNC( ID1 / 65536 ) shows which rollback segment that is used for the transaction and ID1 - ( rollbacksegment# * 65536 ) shows which slot number in the transaction table that this sessions is using (or want to used).

· LMODE shows in which mode the session is holding  the lock.
0=None, 1=NULL, 2=Row share (SS), 3=Row exclusive (SX), 4=Share (S),
5=Share row exclusive (SSX), 6=Exclusive (X)

· REQUEST is the mode for which the session wants the lock.

· If a session has values greater than 0 in both LMODE and REQUEST, then the session is a Converter.

· CTIME is how many seconds the session has either hold the lock or waited for the lock.

· BLOCK is greater than 0 if the lock is blocking other sessions (if there are other sessions waiting for this lock to be released).

===============================================================================

What are the sessions doing (which event are the sessions waiting on just now or last waited on):

SELECT sid, seq#, event, p1raw, p2raw, p3raw, state, wait_time WAIT,
seconds_in_wait SECS
FROM v$session_wait
WHERE sid IN( 7, 8 )
ORDER BY sid;

SID SEQ# EVENT                      P1RAW    P2RAW    P3RAW    STATE  WAIT SECS
7 488 enqueue                     54580006 0001000A 00001765 WAITING   0 1011
8 147 SQL*Net message from client 28444553 00000001 00       WAITING   0 1035

· SEQ# is increasing by 1 for each wait.

· EVENT is the name of the event (or resource) that the session is waiting on.

· From this you can draw the conclusion that session 8 has locked a row that session 7 wants, but session 8 has not done a COMMIT or ROLLBACK (and that’s why session 7 is waiting). Session 8 has been idle in 1035 seconds
(typically the user has gone for a cup of coffee).

===============================================================================

Get some information about the Blocker’s transaction:

SELECT t.addr, t.xidusn USN, t.xidslot SLOT, t.xidsqn SQL, t.status,
t.start_time, t.used_ublk UBLK, t.used_urec UREC, t.log_io LOG,
t.phy_io PHY, t.cr_get, t.cr_change CR_CHA
FROM v$transaction t, v$session s
WHERE s.sid IN( 7, 8 ) AND t.addr = s.taddr;

ADDR    USN SLOT  SQN STATUS START_TIME        UBLK UREC LOG PHY CR_GET CR_CHA
01F2CCBC  1   10 5989 ACTIVE 01/22/00 11:11:28    2   15  54  34     32      0

· ADDR can be joined with V$SESSION.TADDR.

· USN is the Undo Segment Number. Can be joined with V$ROLLNAME.USN and
V$ROLLSTAT.USN.

· SLOT is the slot number in the rollback segment’s transaction table.

· SQN is the sequence number for the transaction.

· USN+SLOT+SQN are the three values that uniquely identifies a transaction.

· UBLK is how many undo blocks that the transaction has used.

@updated or deleted. If you are doing an INSERT or DELETE, then you will
@ see that UREC is set to <number of indexes for this table> + how many rows
@you inserts/deletes. If you UPDATE a column then UREC will be set to
@<number of indexes that this column belongs to> * 2 + number of updated
@rows (so if the column belongs to no index, then UREC is set to the number
@of rows that was updated).

· If USED_UBLK and USED_UREC is decreasing for a transaction every time you look, it means that the transaction is rolling back. When USED_UREC is down to zero the rollback is finished. So here you can see when the rollback phase is finished after an instance/crash recovery.

===============================================================================

Which SQL statements are the sessions currently executing (or last executed):

SELECT s.sid, q.users_executing, q.sql_text
FROM v$session s, v$sql q
WHERE s.sid IN( 7, 8 ) AND q.address = s.sql_address
ORDER BY s.sid;

SID USERS_EXECUTING SQL_TEXT
7               1 update scott.emp set job=job where empno=7839
8               0 SELECT * FROM DUAL

· The SQL statement for the Waiter is the SQL that the user is hanging on.
· The SQL statement for the Blocker may not be the SQL that is holding the
lock.
· If USERS_EXECUTING greater than 0 then this SQL is executing just now.

===============================================================================

Get all open cursors for the two session:

SELECT sid, user_name, address, sql_text
FROM v$open_cursor
WHERE sid IN( 7, 8 )
ORDER BY sid;

SID USER_NAME ADDRESS  SQL_TEXT
7 SYSTEM    0224C9A8 update scott.emp set job=job where empno=7839
8 SCOTT     02258D80 update emp set sal=sal
8 SCOTT     02263154 SELECT * FROM DUAL

· Here you can see the SQL that is blocking others (if the session has not closed the cursor yet, tools like SQL*Plus closes the cursor automatically after execution).

===============================================================================

Get some information about the Blocker’s session so you can call him and tell him to COMMIT/ROLLBACK his transaction.

SELECT sid, serial#, username, osuser, program, machine
FROM v$session
WHERE sid = 8;

SID  SERIAL#  USERNAME  OSUSER    PROGRAM                 MACHINE
8      147  SCOTT     jtreugut  c:\orawin95\bin\plus80  JTREUGUT2

· If you need to kill the session, only do it with ALTER SYSTEM KILL SESSION '<sid>, <serial#>' command. The session will then have status KILLED or MARKED FOR KILLED (if PMON couldn’t terminate the session directly).
But the locks on the resources can be held for a long time (until PMON has cleaned up everything).
Comments