Reading EXPLAIN ANALYZE Output in Greenplum database

posted Sep 14, 2012, 4:53 PM by Sachchida Ojha
EXPLAIN ANALYZE causes the statement to be actually executed, not only planned. The EXPLAIN ANALYZE plan shows the actual results along with the planner’s estimates. This is useful for seeing whether the planner’s estimates are close to reality. In addition to the information shown in the EXPLAIN plan, EXPLAIN ANALYZE will show the following additional information:

1. The total elapsed time (in milliseconds) that it took to run the query.

2. The number of workers (segments) involved in a plan node operation. Only segments that return rows are counted.

3. The maximum number of rows returned by the segment that produced the most rows for an operation. If multiple segments produce an equal number of rows, the one with the longest time to end is the one chosen.

4. The segment id number of the segment that produced the most rows for an operation.

5. For relevant operations, the work_mem used by the operation. If work_mem was not sufficient to perform the operation in memory, the plan will show how much data was spilled to disk and how many passes over the data were required for the lowest performing segment. For example:

Work_mem used: 64K bytes avg, 64K bytes max (seg0).
Work_mem wanted: 90K bytes avg, 90K bytes max (seg0) to abate workfile I/O affecting 2 workers.
[seg0] pass 0: 488 groups made from 488 rows; 263 rows written to workfile
[seg0] pass 1: 263 groups made from 263 rows

6. The time (in milliseconds) it took to retrieve the first row from the segment that produced the most rows, and the total time taken to retrieve all rows from that segment. The <time> to first row may be omitted if it is the same as the <time> to end.

EXPLAIN ANALYZE Example

To illustrate how to read an EXPLAIN ANALYZE query plan, we will use the same simple query we used in the “EXPLAIN Example” on page 130. Notice that there is some additional information in this plan that is not in a regular EXPLAIN plan. The parts of the plan in bold show the actual timing and rows returned for each plan node:

EXPLAIN ANALYZE SELECT * FROM names WHERE name = 'Joelle';
QUERY PLAN
------------------------------------------------------------
Gather Motion 2:1 (slice1) (cost=0.00..20.88 rows=1 width=13)
recv: Total 1 rows with 0.305 ms to first row, 0.537 ms to end.
-> Seq Scan on 'names' (cost=0.00..20.88 rows=1 width=13)
Total 1 rows (seg0) with 0.255 ms to first row, 0.486 ms to end.
Filter: name::text ~~ 'Joelle'::text
22.548 ms elapsed

If we read the plan from the bottom up, you will see some additional information for each plan node operation. The total elapsed time it took to run this query was 22.548 milliseconds.

The sequential scan operation had only one segment (seg0) that returned rows, and it returned just 1 row. It took 0.255 milliseconds to find the first row and 0.486 to scan all rows. Notice that this is pretty close to the planner’s estimate — the query planner estimated that it would return one row for this query, which it did. The gather motion operation then received 1 row (segments sending up to the master). The total elapsed time for this operation was 0.537 milliseconds.
Comments