gpperfmon consists of three sets of tables:
•now tables store data on current system metrics, such as active queries.
•history tables store data on historical metrics.
•tail tables are for data in transition. Tail tables are for internal use only and should not be queried by users.
The now and tail data are stored as text files on the master host file system, and accessed by the Command Center database via external tables. The history tables are regular database tables stored within the Command Center (gpperfmon) database.
The tables fall into the following categories:
•The database_* tables store query workload information for a Greenplum Database instance.
•The emcconnect_history table displays information about ConnectEMC events and alerts. ConnectEMC events are triggered based on a hardware failure, a fix to a failed hardware component, or a Greenplum Database startup. Once a ConnectEMC event is triggered, an alert is sent to EMC Support.
•The filerep_* tables store health and status metrics for the file replication process. This process is how high-availability/mirroring is achieved in a Greenplum Database instance. Statistics are maintained for each primary-mirror pair.
•The health_* tables store system health metrics for the EMC Data Computing Appliance.
•The interface_stats_* tables store statistical metrics for each active interface of a Greenplum Database instance. Note: These tables are in place for future use and are not currently populated.
•The iterators_* tables store information about query plan iterators and their metrics. A query iterator refers to a node or operation in a query plan.
•The queries_* tables store high-level query status information.
•The segment_* tables store memory allocation statistics for the Greenplum Database segment instances.
•The socket_stats_* tables store statistical metrics about socket usage for a Greenplum Database instance. Note: These tables are in place for future use and are not currently populated.
•The system_* tables store system utilization metrics.
•The tcp_stats_* tables store statistical metrics about TCP communications for a Greenplum Database instance. Note: These tables are in place for future use and are not currently populated.
•The udp_stats_* tables store statistical metrics about UDP communications for a Greenplum Database instance. Note: These tables are in place for future use and are not currently populated.
The Command Center database also contains the following views:
•The dynamic_memory_info view shows an aggregate of all the segments per host and the amount of dynamic memory used per host.
•The iterators_*_rollup set of views summarizes the query iterator metrics across all segments in the system.
•The memory_info view shows per-host memory information from the system_history and segment_history tables.
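For a quick look at recent per-host memory data, you can select directly from the memory_info view. This is a minimal sketch; because the view's exact column list is not reproduced here, it selects all columns rather than assuming specific names, and the ctime ordering column is an assumption carried over from the underlying history tables:

    -- Run while connected to the gpperfmon (Command Center) database.
    SELECT *
    FROM memory_info
    ORDER BY ctime DESC   -- ctime is assumed; adjust if your view exposes a different timestamp column
    LIMIT 20;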
database_*
The database_* tables store query workload information for a Greenplum Database instance. There are three database tables, all having the same columns:
•database_now is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. Current query workload data is stored in database_now during the period between data collection from the Command Center agents and automatic commitment to the database_history table.
•database_tail is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. This is a transitional table for query workload data that has been cleared from database_now but has not yet been committed to database_history. It typically only contains a few minutes worth of data.
•database_history is a regular table that stores historical database-wide query workload data. It is pre-partitioned into monthly partitions. Partitions are automatically added in two month increments as needed. Administrators must drop old partitions for the months that are no longer needed.
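Since partitions are only ever added automatically, dropping old ones is a manual task. The following is a hedged sketch of that maintenance step using Greenplum's range-partition DDL; RANK(1) addresses the oldest partition, and the retention policy is an assumption to adapt to your site:

    -- Connect to the gpperfmon database as a superuser, then drop the oldest
    -- monthly partition of database_history once it is no longer needed.
    ALTER TABLE database_history DROP PARTITION FOR (RANK(1));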
Column | Type | Description |
ctime | timestamp | Time this row was created. |
queries_running | int | The number of active queries running at data collection time. |
queries_queued | int | The number of queries waiting in a resource queue at data collection time. |
queries_total | int | The total number of queries in Greenplum Database at data collection time. |
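As an illustration, the columns above support simple workload-trend queries against database_history, for example (a sketch; adjust the time window to suit):

    -- Hourly average query load over the past seven days.
    SELECT date_trunc('hour', ctime) AS hour,
           avg(queries_running) AS avg_running,
           avg(queries_queued)  AS avg_queued,
           avg(queries_total)   AS avg_total
    FROM database_history
    WHERE ctime > now() - interval '7 days'
    GROUP BY 1
    ORDER BY 1;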
emcconnect_history
The emcconnect_history table displays information about ConnectEMC events and alerts. ConnectEMC events are triggered based on a hardware failure, a fix to a failed hardware component, or a Greenplum Database instance startup. Once a ConnectEMC event is triggered, an alert is sent to EMC Support.
This table is pre-partitioned into monthly partitions. Partitions are automatically added in one month increments as needed. Administrators must drop old partitions for the months that are no longer needed.
Note: This table only applies to the Greenplum Data Computing Appliance platform.
Column | Type | Description |
ctime | timestamp(0) without time zone | Time this ConnectEMC event occurred. |
hostname | varchar(64) | The hostname associated with the ConnectEMC event. |
symptom_code | int | A general symptom code for this type of event. For a list of symptom codes, see the EMC Greenplum DCA Installation and Configuration Guide. |
detailed_symptom_code | int | A specific symptom code for this type of event. |
description | text | A description of this type of event, based on the detailed_symptom_code. |
snmp_oid | text | The SNMP object ID of the element/component where the event occurred, where applicable. |
severity | text | The severity level of the ConnectEMC event. One of: WARNING: A condition that might require immediate attention. ERROR: An error occurred on the DCA. System operation and/or performance is likely affected. This alert requires immediate attention. UNKNOWN: This severity level is associated with hosts and devices on the DCA that are either disabled (due to hardware failure) or unreachable for some other reason. This alert requires immediate attention. INFO: A previously reported error condition is now resolved. Greenplum Database startup also triggers an INFO alert. |
status | text | The current status of the system. The status is always OK unless a connection to the server/switch cannot be made, in which case the status is FAILED. |
attempted_transport | boolean | True if an attempt was made to send an alert to EMC support. False if your system was configured not to send alerts. |
message | text | The text of the error message created as a result of this event. |
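For example, the following hedged query lists recent ConnectEMC events that may require attention, using the columns described above:

    SELECT ctime, hostname, severity, description, message
    FROM emcconnect_history
    WHERE severity IN ('ERROR', 'WARNING', 'UNKNOWN')
      AND ctime > now() - interval '30 days'
    ORDER BY ctime DESC;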
filerep_*
The filerep_* tables store high-availability file replication process information for a Greenplum Database instance. There are three filerep tables, all having the same columns:
•filerep_now is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. Current file replication data is stored in filerep_now during the period between data collection from the Command Center agents and automatic commitment to the filerep_history table.
•filerep_tail is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. This is a transitional table for file replication data that has been cleared from filerep_now but has not yet been committed to filerep_history. It typically only contains a few minutes worth of data.
•filerep_history is a regular table that stores historical database-wide file replication data. It is pre-partitioned into monthly partitions. Partitions are automatically added in two month increments as needed. Administrators must drop old partitions for the months that are no longer needed.
Column | Type | Description |
ctime | timestamp | Time this row was created. |
primary_measurement_microsec | bigint | The length of time over which primary metrics (contained in UDP messages) were gathered. |
mirror_measurement_microsec | bigint | The length of time over which mirror metrics (contained in UDP messages) were gathered. |
primary_hostname | varchar(64) | The name of the primary host. |
primary_port | int | The port number of the primary host. |
mirror_hostname | varchar(64) | The name of the mirror host. |
mirror_port | int | The port number of the mirror host. |
primary_write_syscall_bytes_avg | bigint | The average amount of data written to disk on the primary for write system calls per interval. |
primary_write_syscall_byte_max | bigint | The maximum amount of data written to disk on the primary for write system calls per interval. |
primary_write_syscall_microsecs_avg | bigint | The average time required for a write system call to write data to disk on the primary per interval. |
primary_write_syscall_microsecs_max | bigint | The maximum time required for a write system call to write data to disk on the primary per interval. |
primary_write_syscall_per_sec | double precision | The number of write system calls on the primary per second. It reflects only the time to queue the write to disk in memory. |
primary_fsync_syscall_microsec_avg | bigint | The average amount of time required for a file sync system call to write data to disk on the primary per interval. |
primary_fsync_syscall_microsec_max | bigint | The maximum amount of time required for a file sync system call to write data to disk on the primary per interval. |
primary_fsync_syscall_per_sec | double precision | The number of file sync system calls on the primary per second. Unlike write system calls which return immediately after the data is posted/queued, file sync system calls wait for all outstanding writes to be written to disk. File sync system call values in this column reflect actual disk access times for potentially large amounts of data. |
primary_write_shmem_bytes_avg | bigint | The average amount of data written to shared memory on the primary per interval. |
primary_write_shmem_bytes_max | bigint | The maximum amount of data written to shared memory on the primary per interval. |
primary_write_shmem_microsec_avg | bigint | The average amount of time required to write data to shared memory on the primary per interval. |
primary_write_shmem_microsec_max | bigint | The maximum amount of time required to write data to shared memory on the primary per interval. |
primary_write_shmem_per_sec | double precision | The number of writes to shared memory on the primary per second. |
primary_fsync_shmem_microsec_avg | bigint | The average amount of time required by the file sync system call to write data to shared memory on the primary per interval. |
primary_fsync_shmem_microsec_max | bigint | The maximum amount of time required by the file sync system call to write data to shared memory on the primary per interval. |
primary_fsync_shmem_per_sec | double precision | The number of file sync calls to shared memory on the primary per second. File sync system call values in this column reflect actual disk access times for potentially large amounts of data. |
primary_roundtrip_fsync_msg_microsec_avg | bigint | The average amount of time required for a roundtrip file sync between the primary and the mirror per interval. This includes: 1. The queueing of a file sync message from the primary to the mirror. 2. The message traversing the network. 3. The execution of the file sync by the mirror. 4. The file sync acknowledgement traversing the network back to the primary. |
primary_roundtrip_fsync_msg_microsec_max | bigint | The maximum amount of time required for a roundtrip file sync between the primary and the mirror per interval. This includes: 1. The queueing of a file sync message from the primary to the mirror. 2. The message traversing the network. 3. The execution of the file sync by the mirror. 4. The file sync acknowledgement traversing the network back to the primary. |
primary_roundtrip_fsync_msg_per_sec | double precision | The number of roundtrip file syncs per second. |
primary_roundtrip_test_msg_microsec_avg | bigint | The average amount of time required for a roundtrip test message between the primary and the mirror to complete per interval. This is similar to primary_roundtrip_fsync_msg_microsec_avg, except it does not include a disk access component. Because of this, this is a useful metric that shows the average amount of network delay in the file replication process. |
primary_roundtrip_test_msg_microsec_max | bigint | The maximum amount of time required for a roundtrip test message between the primary and the mirror to complete per interval. This is similar to primary_roundtrip_fsync_msg_microsec_max, except it does not include a disk access component. Because of this, this is a useful metric that shows the maximum amount of network delay in the file replication process. |
primary_roundtrip_test_msg_per_sec | double precision | The number of roundtrip test messages per second. This is similar to primary_roundtrip_fsync_msg_per_sec, except it does not include a disk access component. As such, this is a useful metric that shows the amount of network delay in the file replication process. Note that test messages typically occur once per minute, so it is common to see a value of “0” for time periods not containing a test message. |
mirror_write_syscall_size_avg | bigint | The average amount of data written to disk on the mirror for write system calls per interval. |
mirror_write_syscall_size_max | bigint | The maximum amount of data written to disk on the mirror for write system calls per interval. |
mirror_write_syscall_microsec_avg | bigint | The average time required for a write system call to write data to disk on the mirror per interval. |
mirror_write_syscall_microsec_max | bigint | The maximum time required for a write system call to write data to disk on the mirror per interval. |
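The round-trip columns above make it possible to separate disk latency from network latency in the mirroring path. A minimal sketch comparing the file sync round trip to the disk-free test-message round trip per primary-mirror pair:

    SELECT ctime, primary_hostname, mirror_hostname,
           primary_roundtrip_fsync_msg_microsec_avg AS fsync_roundtrip_usec,
           primary_roundtrip_test_msg_microsec_avg  AS test_msg_roundtrip_usec
    FROM filerep_history
    WHERE ctime > now() - interval '1 day'
    ORDER BY ctime DESC;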
health_*
The health_* tables store system health metrics for the EMC Data Computing Appliance. There are three health tables, all having the same columns:
Note: This table only applies to the Greenplum Data Computing Appliance platform.
•health_now is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. Current system health data is stored in health_now during the period between data collection from the Command Center agents and automatic commitment to the health_history table.
•health_tail is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. This is a transitional table for system health data that has been cleared from health_now but has not yet been committed to health_history. It typically only contains a few minutes worth of data.
•health_history is a regular table that stores historical system health metrics. It is pre-partitioned into monthly partitions. Partitions are automatically added in two month increments as needed. Administrators must drop old partitions for the months that are no longer needed.
Column | Type | Description |
ctime | timestamp(0) without time zone | Time this snapshot of health information about this system was created. |
hostname | varchar(64) | Segment or master hostname associated with this health information. |
symptom_code | int | The symptom code related to the current health/status of an element or component of the system. |
detailed_symptom_code | int | A more granular symptom code related to the health/status of an element or component of the system. |
description | text | A description of the health/status of this symptom code. |
snmp_oid | text | The SNMP object ID of the element/component where the event occurred, where applicable. |
status | text | The current status of the system. The status is always OK unless a connection to the server/switch cannot be made, in which case the status is FAILED. |
message | text | The text of the error message created as a result of this event. |
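For example, a hedged check for components that are not reporting an OK status over the last day:

    SELECT ctime, hostname, status, description, message
    FROM health_history
    WHERE status <> 'OK'
      AND ctime > now() - interval '1 day'
    ORDER BY ctime DESC;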
interface_stats_*
The interface_stats_* tables store statistical metrics about communications over each active interface for a Greenplum Database instance.
Note: These tables are in place for future use and are not currently populated.
There are three system tables, all having the same columns:
•interface_stats_now is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data.
•interface_stats_tail is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. This is a transitional table for statistical interface metrics that have been cleared from interface_stats_now but have not yet been committed to interface_stats_history. It typically only contains a few minutes worth of data.
•interface_stats_history is a regular table that stores statistical interface metrics. It is pre-partitioned into monthly partitions. Partitions are automatically added in one month increments as needed. Administrators must drop old partitions for the months that are no longer needed.
Column | Type | Description |
interface_name | string | Name of the interface. For example: eth0, eth1, lo. |
bytes_received | bigint | Amount of data received in bytes. |
packets_received | bigint | Number of packets received. |
receive_errors | bigint | Number of errors encountered while data was being received. |
receive_drops | bigint | Number of times packets were dropped while data was being received. |
receive_fifo_errors | bigint | Number of times FIFO (first in first out) errors were encountered while data was being received. |
receive_frame_errors | bigint | Number of frame errors while data was being received. |
receive_compressed_packets | int | Number of packets received in compressed format. |
receive_multicast_packets | int | Number of multicast packets received. |
bytes_transmitted | bigint | Amount of data transmitted in bytes. |
packets_transmitted | bigint | Number of packets transmitted. |
transmit_errors | bigint | Number of errors encountered during data transmission. |
transmit_drops | bigint | Number of times packets were dropped during data transmission. |
transmit_fifo_errors | bigint | Number of times FIFO errors were encountered during data transmission. |
transmit_collision_errors | bigint | Number of times collision errors were encountered during data transmission. |
transmit_carrier_errors | bigint | Number of times carrier errors were encountered during data transmission. |
transmit_compressed_packets | int | Number of packets transmitted in compressed format. |
iterators_*
The iterators_* tables store information about query plan iterators and their metrics. A query iterator refers to a node or operation in a query plan. For example, a sequential scan operation on a table may be one type of iterator in a particular query plan.
The tmid, ssid and ccnt columns are the composite key that uniquely identifies a particular query. These columns can be used to join with the queries_* data tables.
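For example, iterator rows can be tied back to their query by joining on the composite key, as in this sketch (the key values in the WHERE clause are hypothetical placeholders; queries_* columns beyond tmid, ssid, and ccnt are not assumed here):

    SELECT i.nid, i.pnid, i.ntype, i.nstatus, i.tstart, i.tduration, i.rowsout_est
    FROM iterators_history i
    JOIN queries_history q
      ON  i.tmid = q.tmid
      AND i.ssid = q.ssid
      AND i.ccnt = q.ccnt
    WHERE q.tmid = 1400000000   -- hypothetical key values identifying one query
      AND q.ssid = 1234
      AND q.ccnt = 1
    ORDER BY i.nid;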
There are three iterator tables, all having the same columns:
•iterators_now is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. Current query plan iterator data is stored in iterators_now during the period between data collection from the Command Center agents and automatic commitment to the iterators_history table.
•iterators_tail is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. This is a transitional table for query plan iterator data that has been cleared from iterators_now but has not yet been committed to iterators_history. It typically only contains a few minutes worth of data.
•iterators_history is a regular table that stores historical query plan iterator data. It is pre-partitioned into monthly partitions. Partitions are automatically added in two month increments as needed. Administrators must drop old partitions for the months that are no longer needed.
See also the iterator_rollup views for summary metrics of the query plan iterator data.
Column | Type | Description |
ctime | timestamp | Time this row was created. |
tmid | int | A time identifier for a particular query. All iterator records associated with the query will have the same tmid. |
ssid | int | The session id as shown by the gp_session_id parameter. All iterator records associated with the query will have the same ssid. |
ccnt | int | The command number within this session as shown by the gp_command_count parameter. All iterator records associated with the query will have the same ccnt. |
segid | int | The segment ID (dbid from gp_segment_configuration). |
pid | int | The postgres process ID for this iterator. |
nid | int | The query plan node ID from the Greenplum slice plan. |
pnid | int | The parent query plan node ID from the Greenplum slice plan. |
hostname | varchar(64) | Segment hostname. |
ntype | varchar(64) | The iterator operation type. Possible values are listed in “Iterator Metrics”. |
nstatus | varchar(64) | The status of this iterator. Possible values are: Initialize, Executing and Finished. |
tstart | timestamp | Start time for the iterator. |
tduration | int | Duration of the execution. |
pmemsize | bigint | Work memory allocated by the Greenplum planner to this iterator’s query process. |
pmemmax | bigint | Maximum work memory allocated by the Greenplum planner to this iterator’s query process. |
memsize | bigint | OS memory allocated to this iterator’s process. |
memresid | bigint | Resident memory allocated to this iterator’s process (as opposed to shared memory). |
memshare | bigint | Shared memory allocated to this iterator’s process. |
cpu_elapsed | bigint | Total CPU usage of the process executing the iterator. |
cpu_currpct | float | The percentage of CPU currently being utilized by this iterator process. This value is always zero for historical (completed) iterators. |
rowsout | bigint | The actual number of rows output by the iterator. |
rowsout_est | bigint | The query planner’s estimate of rows output by the iterator. |
m0_name | varchar(64) | Each operation in a query plan (ntype) has metrics associated with it. For all operations, this metric name is Rows In. |
m0_unit | varchar(64) | The unit of measure for this metric. For all operations (ntype), this unit of measure is Rows. |
m0_val | bigint | The value of this metric. |
m0_est | bigint | The estimated value of this metric. |
m1_name | varchar(64) | Each operation in a query plan (ntype) has metrics associated with it. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m1_unit | varchar(64) | The unit of measure for this metric. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m1_val | bigint | The value of this metric. |
m1_est | bigint | The estimated value of this metric. |
m2_name | varchar(64) | Each operation in a query plan (ntype) has metrics associated with it. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m2_unit | varchar(64) | The unit of measure for this metric. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m2_val | bigint | The value of this metric. |
m2_est | bigint | The estimated value of this metric. |
m3_name | varchar(64) | Each operation in a query plan (ntype) has metrics associated with it. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m3_unit | varchar(64) | The unit of measure for this metric. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m3_val | bigint | The value of this metric. |
m3_est | bigint | The estimated value of this metric. |
m4_name | varchar(64) | Each operation in a query plan (ntype) has metrics associated with it. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m4_unit | varchar(64) | The unit of measure for this metric. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m4_val | bigint | The value of this metric. |
m4_est | bigint | The estimated value of this metric. |
m5_name | varchar(64) | Each operation in a query plan (ntype) has metrics associated with it. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m5_unit | varchar(64) | The unit of measure for this metric. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m5_val | bigint | The value of this metric. |
m5_est | bigint | The estimated value of this metric. |
m6_name | varchar(64) | Each operation in a query plan (ntype) has metrics associated with it. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m6_unit | varchar(64) | The unit of measure for this metric. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m6_val | bigint | The value of this metric. |
m6_est | bigint | The estimated value of this metric. |
m7_name | varchar(64) | Each operation in a query plan (ntype) has metrics associated with it. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m7_unit | varchar(64) | The unit of measure for this metric. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m7_val | bigint | The value of this metric. |
m7_est | bigint | The estimated value of this metric. |
m8_name | varchar(64) | Each operation in a query plan (ntype) has metrics associated with it. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m8_unit | varchar(64) | The unit of measure for this metric. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m8_val | bigint | The actual value of this metric. |
m8_est | bigint | The estimated value of this metric. |
m9_name | varchar(64) | Each operation in a query plan (ntype) has metrics associated with it. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m9_unit | varchar(64) | The unit of measure for this metric. See Iterator Metrics for a complete list of iterator attributes and their corresponding units. |
m9_val | bigint | The actual value of this metric. |
m9_est | bigint | The estimated value of this metric. |
m10_name - m15_name | varchar(64) | The iterator name (ntype) associated with this metric. Metrics m10 through m15 are currently not used. |
m10_unit - m15_unit | varchar(64) | The unit of measure for this metric. Metrics m10 through m15 are currently not used. |
m10_value - m15_value | bigint | The actual value of this metric. Metrics m10 through m15 are currently not used. |
m10_est - m15_est | bigint | The estimated value of this metric. Metrics m10 through m15 are currently not used. |
t0_name | varchar(64) | This column is a label for t0_val. Its value is always Name. |
t0_val | varchar(128) | The name of the relation being scanned by an iterator. This metric is collected only for iterators that perform scan operations such as a sequential scan or function scan. |
Iterator Metrics
The tables in this section list all possible iterators in a query on a Greenplum Database instance. The iterator tables include the metric name, the column in the iterators_* table in the gpperfmon database where the metric appears, how the metric is measured (unit), and a description of the metric. The following iterators are listed:
- Append
- Append-only Scan
- Append-only Columnar Scan
- Aggregate
- BitmapAnd
- BitmapOr
- Bitmap Append-Only Scan
- Bitmap Heap Scan
- Bitmap Index Scan
- Broadcast Motion
- Explicit Redistribute Motion
- External Scan
- Function Scan
- Gather Motion
- GroupAggregate
- Hash Join
- HashAggregate
- Index Scan
- Limit
- Materialize
- Merge Join
- Nested Loop
- Redistribute Motion
- Result
- Repeat
- Seq Scan
- SetOp
- Shared Scan
- Sort
- Subquery Scan
- Tid Scan
- Unique
- Values Scan
- Window
Metric Terminology
The following information explains some of the database terms and concepts that appear in iterator metrics in Greenplum Database:
Node: Refers to a step in a query plan. A query plan has sets of operations that Greenplum Database performs to produce the answer to a given query. A node in the plan represents a specific database operation, such as a table scan, join, aggregation, sort, etc.
Iterator: Represents the actual execution of the node in a query plan. Node and iterator are sometimes used interchangeably.
Tuple: Refers to a row returned as part of a result set from a query, as well as a record in a table.
Spill: When there is not enough memory to perform a database operation, data must be written (or spilled) to disk.
Passes: Occur when an iterator must scan (or pass) over spilled data to obtain a result. A pass represents one pass through all input tuples, or all data in batch files generated after spill, which happens hierarchically. In the first pass, all input tuples are read, and intermediate results are spilled to a specified number of batch files. In the second pass, the data in all batch files is processed. If the results are still too large to store in memory, the intermediate results are spilled to the second level of spill files, and the process repeats again.
Batches: Refers to the actual files created when data is spilled to disk. This is most often associated to Hash operations.
Join: A clause in a query that joins two or more tables. There are three types of join algorithms in a Greenplum Database instance:
•Hash Join
•Merge Join
•Nested Loop
Each of these operations includes its own respective join semantics. The Command Center Console displays iterator metrics for each of these semantics.
Append
An Append iterator has two or more input sets. Append returns all rows from the first input set, then all rows from the second input set, and so on, until all rows from all input sets are processed. Append is also used when you select from a table involved in an inheritance hierarchy.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Append Current Input Source | m1_name | Inputs | The number of the current table being scanned. |
Append-only Scan
This iterator scans append-only type tables.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Append-only Scan Rescan | m1_name | Rescans | The number of append-only rescans by this iterator. |
Append-only Columnar Scan
This iterator scans append-only columnar-type tables.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Append-Only Columnar Scan Rescan | m1_name | Rescans | The number of append-only columnar rescans by this iterator. |
Aggregate
The query planner produces an aggregate iterator whenever the query includes an aggregate function. For example, the following functions are aggregate functions: AVG(), COUNT(), MAX(), MIN(), STDDEV(), SUM(), and VARIANCE(). Aggregate reads all the rows in the input set and computes the aggregate values. If the input set is not grouped, Aggregate produces a single result row.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Aggregate Total Spill Tuples | m1_name | Tuples | The number of tuples spilled to disk. |
Aggregate Total Spill Bytes | m2_name | Bytes | The number of bytes spilled to disk. |
Aggregate Total Spill Batches | m3_name | Batches | The number of spill batches required. |
Aggregate Total Spill Pass | m4_name | Passes | The number of passes across all of the batches. |
Aggregate Current Spill Pass Read Tuples | m5_name | Tuples | The number of tuples read in for this spill batch. |
Aggregate Current Spill Pass Read Bytes | m6_name | Bytes | The number of bytes read in for this spill batch. |
Aggregate Current Spill Pass Tuples | m7_name | Tuples | The number of tuples that are in each spill file in the current pass. |
Aggregate Current Spill Pass Bytes | m8_name | Bytes | The number of bytes that are in each spill file in the current pass. |
Aggregate Current Spill Pass Batches | m9_name | Batches | The number of batches created in the current pass. |
BitmapAnd
This iterator takes the bitmaps generated from multiple BitmapIndexScan iterators, puts them together with an AND clause, and generates a new bitmap as its output.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
BitmapOr
This iterator takes the bitmaps generated from multiple BitmapIndexScan iterators, puts them together with an OR clause, and generates a new bitmap as its output.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Bitmap Append-Only Scan
This iterator retrieves all rows from the bitmap generated by BitmapAnd, BitmapOr, or BitmapIndexScan and accesses the append-only table to retrieve the relevant rows.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Bitmap Heap Scan
This iterator retrieves all rows from the bitmap generated by BitmapAnd, BitmapOr, or BitmapIndexScan and accesses the heap table to retrieve the relevant rows.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Bitmap Heap Scan Pages | m1_name | Pages | The number of bitmap heap pages scanned. |
Bitmap Heap Scan Rescan | m2_name | Rescans | The number of bitmap heap page rescans by this iterator. |
Bitmap Index Scan
This iterator produces a bitmap that corresponds to the rows that satisfy the query plan.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Bitmap Index Scan Rescan | m1_name | Rescans | The number of bitmap index rescans by this iterator. |
Broadcast Motion
This iterator sends an outbound tuple to all of its destinations.
Note that the Motion metrics for the iterator are useful when investigating potential networking issues in the Greenplum Database system. Typically, the "Ack Time" values should be very small (microseconds or milliseconds). However if the "Ack Time" values are one or more seconds (particularly the "Motion Min Ack Time" metric), then a network performance issue likely exists.
Also, if there are a large number of packets being dropped because of queue overflow, you can increase the value for the gp_interconnect_queue_depth system configuration parameter to improve performance. See the Greenplum Database Administrator Guide for more information about system configuration parameters.
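Before changing the parameter, you can check its current value from any database session (a minimal sketch; cluster-wide changes are typically made with the gpconfig utility rather than per session):

    -- Show the current interconnect queue depth setting.
    SHOW gp_interconnect_queue_depth;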
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Motion Bytes Sent | m1_name | Bytes | The number of bytes sent by the iterator. |
Motion Total Ack Time | m2_name | Microseconds | The total amount of time that the iterator waited for an acknowledgement after sending a packet of data. |
Motion Average Ack Time | m3_name | Microseconds | The average amount of time that the iterator waited for an acknowledgement after sending a packet of data. |
Motion Max Ack Time | m4_name | Microseconds | The maximum amount of time that the iterator waited for an acknowledgement after sending a packet of data. |
Motion Min Ack Time | m5_name | Microseconds | The minimum amount of time that the iterator waited for an acknowledgement after sending a packet of data. |
Motion Count Resent | m6_name | Packets | The total number of packets that the iterator did not acknowledge when they first arrived in the queue. |
Motion Max Resent | m7_name | Packets | The maximum number of packets that the iterator did not acknowledge when they first arrived in the queue. This metric is applied on a per packet basis. For example, a value of "10" indicates that a particular packet did not get acknowledged by this iterator 10 times, and that this was the maximum for this iterator. |
Motion Bytes Received | m8_name | Bytes | The number of bytes received by the iterator. |
Motion Count Dropped | m9_name | Packets | The number of packets dropped by the iterator because of buffer overruns. |
Explicit Redistribute Motion
The Explicit Redistribute iterator moves tuples to segments explicitly specified in the segment ID column of the tuples. This differs from a Redistribute Motion iterator, where target segments are indirectly specified through hash expressions. The Explicit Redistribute iterator is used when the query portion of a DML planned statement requires moving tuples across distributed tables.
Note that the Motion metrics for the iterator are useful when investigating potential networking issues in the Greenplum Database system. Typically, the "Ack Time" values should be very small (microseconds or milliseconds). However if the "Ack Time" values are one or more seconds (particularly the "Motion Min Ack Time" metric), then a network performance issue likely exists.
Also, if there are a large number of packets being dropped because of queue overflow, you can increase the value for the gp_interconnect_queue_depth system configuration parameter to improve performance. See the Greenplum Database Administrator Guide for more information about system configuration parameters.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Motion Bytes Sent | m1_name | Bytes | The number of bytes sent by the iterator. |
Motion Total Ack Time | m2_name | Microseconds | The total amount of time that the iterator waited for an acknowledgement after sending a packet of data. |
Motion Average Ack Time | m3_name | Microseconds | The average amount of time that the iterator waited for an acknowledgement after sending a packet of data. |
Motion Max Ack Time | m4_name | Microseconds | The maximum amount of time that the iterator waited for an acknowledgement after sending a packet of data. |
Motion Min Ack Time | m5_name | Microseconds | The minimum amount of time that the iterator waited for an acknowledgement after sending a packet of data. |
Motion Count Resent | m6_name | Packets | The total number of packets that the iterator did not acknowledge when they first arrived in the queue. |
Motion Max Resent | m7_name | Packets | The maximum number of packets that the iterator did not acknowledge when they first arrived in the queue. This metric is applied on a per packet basis. For example, a value of "10" indicates that a particular packet did not get acknowledged by this iterator 10 times, and that this was the maximum for this iterator. |
Motion Bytes Received | m8_name | Bytes | The number of bytes received by the iterator. |
Motion Count Dropped | m9_name | Packets | The number of packets dropped by the iterator because of buffer overruns. |
External Scan
This iterator scans an external table.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
External Scan Rescan | m1_name | Rescans | The number of external table rescans by this iterator. |
Function Scan
This iterator returns tuples produced by a function.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Gather Motion
This iterator gathers streams of tuples that are sent by "sending" motions. If a merge key is specified, it merges many streams into a single order-preserved stream.
Note that the Motion metrics for the iterator are useful when investigating potential networking issues in the Greenplum Database system. Typically, the "Ack Time" values should be very small (microseconds or milliseconds). However if the "Ack Time" values are one or more seconds (particularly the "Motion Min Ack Time" metric), then a network performance issue likely exists.
Also, if there are a large number of packets being dropped because of queue overflow, you can increase the value for the gp_interconnect_queue_depth system configuration parameter to improve performance. See the Greenplum Database Administrator Guide for more information about system configuration parameters.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Motion Bytes Sent | m1_name | Bytes | The number of bytes sent by the iterator. |
Motion Total Ack Time | m2_name | Microseconds | The total amount of time that the iterator waited for an acknowledgement after sending a packet of data. |
Motion Average Ack Time | m3_name | Microseconds | The average amount of time that the iterator waited for an acknowledgement after sending a packet of data. |
Motion Max Ack Time | m4_name | Microseconds | The maximum amount of time that the iterator waited for an acknowledgement after sending a packet of data. |
Motion Min Ack Time | m5_name | Microseconds | The minimum amount of time that the iterator waited for an acknowledgement after sending a packet of data. |
Motion Count Resent | m6_name | Packets | The total number of packets that the iterator did not acknowledge when they first arrived in the queue. |
Motion Max Resent | m7_name | Packets | The maximum number of packets that the iterator did not acknowledge when they first arrived in the queue. This metric is applied on a per packet basis. For example, a value of "10" indicates that a particular packet did not get acknowledged by this iterator 10 times, and that this was the maximum for this iterator. |
Motion Bytes Received | m8_name | Bytes | The number of bytes received by the iterator. |
Motion Count Dropped | m9_name | Packets | The number of packets dropped by the iterator because of buffer overruns. |
GroupAggregate
The GroupAggregate iterator is a way to compute vector aggregates, and it is used to satisfy a GROUP BY clause. A single input set is required by the GroupAggregate iterator, and it must be ordered by the grouping column(s). This iterator returns a single row for each unique value of the grouping columns.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Aggregate Total Spill Tuples | m1_name | Tuples | The number of tuples spilled to disk. |
Aggregate Total Spill Bytes | m2_name | Bytes | The number of bytes spilled to disk. |
Aggregate Total Spill Batches | m3_name | Batches | The number of spill batches required. |
Aggregate Total Spill Pass | m4_name | Passes | The number of passes across all of the batches. |
Aggregate Current Spill Pass Read Tuples | m5_name | Tuples | The number of tuples read in for this spill batch. |
Aggregate Current Spill Pass Read Bytes | m6_name | Bytes | The number of bytes read in for this spill batch. |
Aggregate Current Spill Pass Tuples | m7_name | Tuples | The number of tuples that are in each spill file in the current pass. |
Aggregate Current Spill Pass Bytes | m8_name | Bytes | The number of bytes that are in each spill file in the current pass. |
Aggregate Current Spill Pass Batches | m9_name | Batches | The number of batches created in the current pass. |
Hash Join
The Hash Join iterator requires two input sets - the outer and inner tables.
The Hash Join iterator starts by creating its inner table using the Hash operator. The Hash operator creates a temporary Hash index that covers the join column in the inner table. When the hash table (that is, the inner table) is created, Hash Join reads each row in the outer table, hashes the join column (from the outer table), and searches the temporary Hash index for a matching value.
In a Greenplum Database instance, a Hash Join algorithm can be used with the following join semantics:
- Left Join
- Left Anti Semi Join
- Full Join
- Right Join
- EXISTS Join
- Reverse In Join
- Unique Inner Join
- Unique Outer Join
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Hash Spill Batches | m1_name | Batches | The current batch being spilled. |
Hash Spill Tuples | m2_name | Tuples | The current number of spilled tuples. |
Hash Spill Bytes | m3_name | Bytes | The current number of bytes spilled to disk. |
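To see whether the planner actually chooses a Hash Join for a given query, you can inspect its plan with EXPLAIN. This is a sketch only; the table and column names are hypothetical, and the planner may pick a different join method depending on statistics and table sizes:

    EXPLAIN
    SELECT c.customer_id, sum(o.amount)
    FROM customers c
    JOIN orders o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id;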
HashAggregate
The HashAggregate iterator is similar to the GroupAggregate iterator. A single input set is required by the HashAggregate iterator and it creates a hash table from the input. However, it does not require its input to be ordered.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Aggregate Total Spill Tuples | m1_name | Tuples | The number of tuples spilled to disk. |
Aggregate Total Spill Bytes | m2_name | Bytes | The number of bytes spilled to disk. |
Aggregate Total Spill Batches | m3_name | Batches | The number of spill batches required. |
Aggregate Total Spill Pass | m4_name | Passes | The number of passes across all of the batches. |
Aggregate Current Spill Pass Read Tuples | m5_name | Tuples | The number of tuples read in for this spill batch. |
Aggregate Current Spill Pass Read Bytes | m6_name | Bytes | The number of bytes read in for this spill batch. |
Aggregate Current Spill Pass Tuples | m7_name | Tuples | The number of tuples that are in each spill file in the current pass. |
Aggregate Current Spill Pass Bytes | m8_name | Bytes | The number of bytes that are in each spill file in the current pass. |
Aggregate Current Spill Pass Batches | m9_name | Batches | The number of batches created in the current pass. |
Index Scan
An Index Scan operator traverses an index structure. If you specify a starting value for an indexed column, the Index Scan will begin at the appropriate value. If you specify an ending value, the Index Scan will complete as soon as it finds an index entry greater than the ending value. A query planner uses an Index Scan operator when it can reduce the size of the result set by traversing a range of indexed values, or when it can avoid a sort because of the implicit ordering offered by an index.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Index Scan Restore Pos | m1_name | Restores | The number of restores. |
Index Scan Rescan | m2_name | Rescans | The number of rescans. |
Limit
The Limit operator is used to limit the size of a result set. A Greenplum Database instance uses the Limit operator for both Limit and Offset processing. The Limit operator works by discarding the first x rows from its input set, returning the next y rows, and discarding the remainder. If the query includes an OFFSET clause, x represents the offset amount; otherwise, x is zero. If the query includes a LIMIT clause, y represents the Limit amount; otherwise, y is at least as large as the number of rows in the input set.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
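A minimal example of a query that produces a Limit iterator with both an offset and a limit (the table and column names are hypothetical):

    -- Skip the first 20 rows of the ordered result, then return the next 10.
    SELECT event_time, hostname
    FROM events
    ORDER BY event_time DESC
    LIMIT 10 OFFSET 20;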
Materialize
The materialize iterator is used for some sub-select operations. The query planner can decide that it is less expensive to materialize a sub-select one time than it is to repeat the work for each top-level row. Materialize is also used for some merge/join operations.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Materialize Rescan | m1_name | Rescans | The number of times the executor requested to rescan the data for this iterator. |
Merge Join
The Merge Join iterator joins two tables. Like the Nested Loop iterator, Merge Join requires two input sets: An outer table and an inner table. Each input set must be ordered by the join columns. In a Greenplum Database instance, the Merge Join algorithm can be used with the following join semantics:
•Left Join
•Left Anti Semi Join
•Full Join
•Right Join
•EXISTS Join
•Reverse In Join
•Unique Outer Join
•Unique Inner Join
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Merge Join Inner Tuples | m1_name | Tuples | The number of rows from the inner part of the query plan. |
Merge Join Outer Tuples | m2_name | Tuples | The number of rows from the Outer part of the query plan. |
Nested Loop
The Nested Loop iterator is used to perform a join between two tables, and as a result requires two input sets. It fetches each row from one of the input sets (called the outer table). For each row in the outer table, the other input (called the inner table) is searched for a row that meets the join qualifier. In a Greenplum Database instance, a Nested Loop algorithm can be used with the following join semantics:
•Left Join
•Left Anti Semi Join
•Full Join
•Right Join
•EXISTS Join
•Reverse In Join
•Unique Outer Join
•Unique Inner Join
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Nested Loop Inner Tuples | m1_name | Tuples | The number of rows from the inner part of the query plan. |
Nested Loop Outer Tuples | m2_name | Tuples | The number of rows from the outer part of the query plan. |
Redistribute Motion
This iterator sends an outbound tuple to only one destination based on the value of a hash.
Note that the Motion metrics for the iterator are useful when investigating potential networking issues in the Greenplum Database system. Typically, the "Ack Time" values should be very small (microseconds or milliseconds). However if the "Ack Time" values are one or more seconds (particularly the "Motion Min Ack Time" metric), then a network performance issue likely exists.
Also, if there are a large number of packets being dropped because of queue overflow, you can increase the value for the gp_interconnect_queue_depth system configuration parameter to improve performance. See the Greenplum Database Administrator Guide for more information about system configuration parameters.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Motion Bytes Sent | m1_name | Bytes | The number of bytes sent by the iterator. |
Motion Total Ack Time | m2_name | Microseconds | The total amount of time that the iterator waited for an acknowledgement after sending a packet of data. |
Motion Average Ack Time | m3_name | Microseconds | The average amount of time that the iterator waited for an acknowledgement after sending a packet of data. |
Motion Max Ack Time | m4_name | Microseconds | The maximum amount of time that the iterator waited for an acknowledgement after sending a packet of data. |
Motion Min Ack Time | m5_name | Microseconds | The minimum amount of time that the iterator waited for an acknowledgement after sending a packet of data. |
Motion Count Resent | m6_name | Packets | The total number of packets that the iterator did not acknowledge when they first arrived in the queue. |
Motion Max Resent | m7_name | Packets | The maximum number of packets that the iterator did not acknowledge when they first arrived in the queue. This metric is applied on a per packet basis. For example, a value of "10" indicates that a particular packet did not get acknowledged by this iterator 10 times, and that this was the maximum for this iterator. |
Motion Bytes Received | m8_name | Bytes | The number of bytes received by the iterator. |
Motion Count Dropped | m9_name | Packets | The number of packets dropped by the iterator because of buffer overruns. |
Result
The Result iterator is used either to execute a query that does not retrieve data from a table, or to evaluate the parts of a WHERE clause that do not depend on data retrieved from a table. It can also be used if the top node in the query plan is an Append iterator.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Repeat
This iterator repeats every input operator a certain number of times. This is typically used for certain grouping operations.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Seq Scan
The Seq Scan iterator scans heap tables, and is the most basic query iterator. Any single-table query can be done by using the Seq Scan iterator. Seq Scan starts at the beginning of a heap table and scans to the end of the heap table. For each row in the heap table, Seq Scan evaluates the query constraints (the WHERE clause). If the constraints are satisfied, the required columns are added to the result set.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Seq Scan Page Stats | m1_name | Pages | The number of pages scanned. |
Seq Scan Restore Pos | m2_name | Restores | The number of times the executor restored the scan position. |
Seq Scan Rescan | m3_name | Rescans | The number of times the executor requested to rescan the data for this iterator. |
SetOp
There are four SetOp iterators:
•Intersect
•Intersect All
•Except
•Except All
These iterators are produced only when the query planner encounters an INTERSECT, INTERSECT ALL, EXCEPT, or EXCEPT ALL clause, respectively.
All SetOp iterators require two input sets. They combine the input sets into a sorted list, and then groups of identical rows are identified. For each group, the SetOp iterator counts the number of rows contributed by each input set, then uses the counts to determine the number of rows to add to the result set.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Shared Scan
This iterator evaluates the common parts of a query plan. It shares the output of the common sub-plans with all other iterators, so that the sub-plan only needs to execute one time.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Seq Scan Page Stats | m1_name | Pages | The number of pages scanned. |
Seq Scan Restore Pos | m2_name | Restores | The number of times the executor restored the scan position. |
Seq Scan Rescan | m3_name | Rescans | The number of times the executor requested to rescan the data for this iterator. |
Sort
The Sort iterator imposes an ordering on the result set. A Greenplum Database instance uses two different sort strategies: An in-memory sort and an on-disk sort. If the size of the result set exceeds the available memory, the Sort iterator distributes the input set to a collection of sorted work files and then merges the work files back together again. If the result set is less than the available memory, the sort is done in memory.
The Sort iterator is used for many purposes. A Sort can be used to satisfy an ORDER BY clause. Also, some query operators require their input sets to be ordered.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Sort Memory Usage | m1_name | Bytes | The number of bytes used by the sort. |
Sort Spill Tuples | m2_name | Tuples | The current number of spilled tuples. |
Sort Spill Bytes | m3_name | Bytes | The current number of spilled bytes. |
Sort Spill Pass | m4_name | Passes | The number of merge passes. Each merge pass merges several sorted runs into larger runs. |
Sort Current Spill Pass Tuples | m5_name | Tuples | The number of tuples spilled in the current spill pass. |
Sort Current Spill Pass Bytes | m6_name | Bytes | The number of bytes spilled in the current spill pass. |
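Because these spill metrics are recorded per segment in the iterators_* tables, a query along the following lines can surface sorts that spilled to disk. This is a sketch only, assuming the iterator type is stored as 'Sort' in the ntype column and that the metric value columns are named m1_val and m3_val:
    -- Sketch: recent Sort iterators with the largest spill volume.
    SELECT tmid, ssid, ccnt, m1_val AS sort_memory_bytes, m3_val AS spill_bytes
    FROM iterators_history
    WHERE ntype = 'Sort'
    ORDER BY m3_val DESC
    LIMIT 10;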
Subquery Scan
A Subquery Scan iterator is a pass-through iterator. It scans through its input set, adding each row to the result set. This iterator is used for internal purposes and has no effect on the overall query plan.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
Subquery Scan Rescan | m1_name | Rescans | The number of times the executor requested to rescan the data for this iterator. |
Tid Scan
The Tid Scan (tuple ID scan) iterator is used whenever the query planner encounters a constraint of the form ctid = expression or expression = ctid. This specifies a tuple ID, an identifier that is unique within a table. The tuple ID works like a bookmark, but is valid only within a single transaction. After the transaction completes, the tuple ID is not used again.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
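As a sketch, a constraint on ctid is what triggers a Tid Scan (the sales table is a hypothetical name):
    -- ctid is the physical tuple identifier; it is valid only within the current transaction.
    SELECT ctid, * FROM sales WHERE ctid = '(0,1)';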
Unique
The Unique iterator eliminates duplicate values from the input set. The input set must be ordered by the columns, and the columns must be unique. The Unique operator removes only rows — it does not remove columns and it does not change the ordering of the result set. Unique can return the first row in the result set before it has finished processing the input set. The query planner uses the Unique operator to satisfy a DISTINCT clause. Unique is also used to eliminate duplicates in a UNION.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
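For example, a DISTINCT query over a sorted input can be satisfied with a Sort followed by a Unique iterator (sales and region are hypothetical names; the planner may also choose a hash-based strategy):
    -- Sort + Unique is one way the planner can satisfy DISTINCT.
    SELECT DISTINCT region FROM sales ORDER BY region;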
Values Scan
The Values Scan iterator is used to iterate over a set of constant tuples.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
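A minimal sketch of a query that is executed with a Values Scan over constant tuples:
    -- A constant VALUES list is read by a Values Scan iterator.
    SELECT * FROM (VALUES (1, 'small'), (2, 'medium'), (3, 'large')) AS t(id, label);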
Window
A window function performs calculations across sets of rows that are related to the current query row. The Window iterator computes window functions on the input set of rows.
Metric | Metric Column | Unit | Description |
Rows in | m0_name | Rows | The number of tuples received by the iterator. |
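For illustration, a query using a window function is executed with a Window iterator over its input rows (sales, region, and amount are hypothetical names):
    -- Rank rows within each region; the OVER clause is computed by the Window iterator.
    SELECT region, amount,
           row_number() OVER (PARTITION BY region ORDER BY amount DESC) AS rank_in_region
    FROM sales;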
queries_*
The queries_* tables store high-level query status information.
The tmid, ssid and ccnt columns are the composite key that uniquely identifies a particular query. These columns can be used to join with the iterators_* tables.
There are three queries tables, all having the same columns:
•queries_now is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. Current query status is stored in queries_now during the period between data collection from the Command Center agents and automatic commitment to the queries_history table.
•queries_tail is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. This is a transitional table for query status data that has been cleared from queries_now but has not yet been committed to queries_history. It typically only contains a few minutes worth of data.
•queries_history is a regular table that stores historical query status data. It is pre-partitioned into monthly partitions. Partitions are automatically added in two month increments as needed. Administrators must drop old partitions for the months that are no longer needed.
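As a sketch of the partition maintenance mentioned above, an administrator might remove a month of history that is no longer needed. The exact partition boundaries depend on the installation, and the DROP PARTITION FOR syntax assumes a Greenplum release that supports it; consult the Greenplum Database documentation for your version:
    -- Drops the monthly partition of queries_history that contains the given date.
    ALTER TABLE queries_history DROP PARTITION FOR (DATE '2012-01-01');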
Column | Type | Description |
ctime | timestamp | Time this row was created. |
tmid | int | A time identifier for a particular query. All records associated with the query will have the same tmid. |
ssid | int | The session id as shown by gp_session_id. All records associated with the query will have the same ssid. |
ccnt | int | The command number within this session as shown by gp_command_count. All records associated with the query will have the same ccnt. |
username | varchar(64) | Greenplum role name that issued this query. |
db | varchar(64) | Name of the database queried. |
cost | integer | Not implemented in this release. |
tsubmit | timestamp | Time the query was submitted. |
tstart | timestamp | Time the query was started. |
tfinish | timestamp | Time the query finished. |
status | varchar(64) | Status of the query -- start, done, or abort. |
rows_out | bigint | Rows out for the query. |
cpu_elapsed | bigint | CPU usage by all processes across all segments executing this query (in seconds). It is the sum of the CPU usage values taken from all active primary segments in the database system. Note that Greenplum Command Center logs the value as 0 if the query runtime is shorter than the value for the quantum. This occurs even if the query runtime is greater than the values for min_query_time and min_detailed_query, and these values are lower than the value for the quantum. |
cpu_currpct | float | Current CPU percent average for all processes executing this query. The percentages for all processes running on each segment are averaged, and then the average of all those values is calculated to render this metric. Current CPU percent average is always zero in historical and tail data. |
skew_cpu | float | Displays the amount of processing skew in the system for this query. Processing/CPU skew occurs when one segment performs a disproportionate amount of processing for a query. This value is the coefficient of variation in the CPU% metric of all iterators across all segments for this query, multiplied by 100. For example, a value of .95 is shown as 95. |
skew_rows | float | Displays the amount of row skew in the system. Row skew occurs when one segment produces a disproportionate number of rows for a query. This value is the coefficient of variation for the rows_in metric of all iterators across all segments for this query, multiplied by 100. For example, a value of .95 is shown as 95. |
query_hash | bigint | Not implemented in this release. |
query_text | text | The SQL text of this query. |
query_plan | text | Text of the query plan. Not implemented in this release. |
application_name | varchar(64) | The name of the application. |
rsqname | varchar(64) | The name of the resource queue. |
rqppriority | varchar(16) | The priority of the query -- max, high, med, low, or min. |
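Because tmid, ssid, and ccnt form the composite key, the queries_* tables can be joined to the iterator data. A minimal sketch against the history tables, using the iterators_history_rollup view described later in this section:
    -- Sketch: queries from the last day with their per-iterator rollup metrics.
    SELECT q.username, q.tfinish - q.tstart AS duration, i.ntype, i.rows_out
    FROM queries_history q
    JOIN iterators_history_rollup i
      ON q.tmid = i.tmid AND q.ssid = i.ssid AND q.ccnt = i.ccnt
    WHERE q.tstart > now() - interval '1 day'
    ORDER BY duration DESC;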
segment_*
The segment_* tables contain memory allocation statistics for the Greenplum Database segment instances. This tracks the amount of memory consumed by all postgres processes of a particular segment instance, and the remaining amount of memory available to a segment as per the setting of the postgresql.conf configuration parameter: gp_vmem_protect_limit. Query processes that cause a segment to exceed this limit will be cancelled in order to prevent system-level out-of-memory errors. See the Greenplum Database Administrator Guide for more information about this parameter.
There are three segment tables, all having the same columns:
•segment_now is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. Current memory allocation data is stored in segment_now during the period between data collection from the Command Center agents and automatic commitment to the segment_history table.
•segment_tail is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. This is a transitional table for memory allocation data that has been cleared from segment_now but has not yet been committed to segment_history. It typically only contains a few minutes worth of data.
•segment_history is a regular table that stores historical memory allocation metrics. It is pre-partitioned into monthly partitions. Partitions are automatically added in two month increments as needed. Administrators must drop old partitions for the months that are no longer needed.
A particular segment instance is identified by its hostname and dbid (the unique segment identifier as per the gp_segment_configuration system catalog table).
Column | Type | Description |
ctime | timestamp(0) without time zone | Time this row was created. |
dbid | integer | The segment ID (dbid from gp_segment_configuration). |
hostname | varchar(64) | The segment hostname. |
dynamic_memory_used | bigint | The amount of dynamic memory (in bytes) allocated to query processes running on this segment. |
dynamic_memory_available | bigint | The amount of additional dynamic memory (in bytes) that the segment can request before reaching the limit set by the gp_vmem_protect_limit parameter. See also the views memory_info and dynamic_memory_info for aggregated memory allocation and utilization by host. |
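For example, a sketch that reports per-segment memory headroom from recent samples (the one-hour window and the 20-row limit are arbitrary):
    -- Sketch: segments closest to their gp_vmem_protect_limit in the last hour.
    SELECT hostname, dbid,
           dynamic_memory_used / (1024*1024) AS used_mb,
           dynamic_memory_available / (1024*1024) AS available_mb
    FROM segment_history
    WHERE ctime > now() - interval '1 hour'
    ORDER BY available_mb ASC
    LIMIT 20;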
socket_stats_*
The socket_stats_* tables store statistical metrics about socket usage for a Greenplum Database instance. There are three socket_stats tables, all having the same columns:
Note: These tables are in place for future use and are not currently populated.
•socket_stats_now is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data.
•socket_stats_tail is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. This is a transitional table for socket statistical metrics that has been cleared from socket_stats_now but has not yet been committed to socket_stats_history. It typically only contains a few minutes worth of data.
•socket_stats_history is a regular table that stores historical socket statistical metrics. It is pre-partitioned into monthly partitions. Partitions are automatically added in two month increments as needed. Administrators must drop old partitions for the months that are no longer needed.
Column | Type | Description |
total_sockets_used | int | Total sockets used in the system. |
tcp_sockets_inuse | int | Number of TCP sockets in use. |
tcp_sockets_orphan | int | Number of TCP sockets orphaned. |
tcp_sockets_timewait | int | Number of TCP sockets in Time-Wait. |
tcp_sockets_alloc | int | Number of TCP sockets allocated. |
tcp_sockets_memusage_inbytes | int | Amount of memory consumed by TCP sockets. |
udp_sockets_inuse | int | Number of UDP sockets in use. |
udp_sockets_memusage_inbytes | int | Amount of memory consumed by UDP sockets. |
raw_sockets_inuse | int | Number of RAW sockets in use. |
frag_sockets_inuse | int | Number of FRAG sockets in use. |
frag_sockets_memusage_inbytes | int | Amount of memory consumed by FRAG sockets. |
system_*
The system_* tables store system utilization metrics. There are three system tables, all having the same columns:
•system_now is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. Current system utilization data is stored in system_now during the period between data collection from the Command Center agents and automatic commitment to the system_history table.
•system_tail is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. This is a transitional table for system utilization data that has been cleared from system_now but has not yet been committed to system_history. It typically only contains a few minutes worth of data.
•system_history is a regular table that stores historical system utilization metrics. It is pre-partitioned into monthly partitions. Partitions are automatically added in two month increments as needed. Administrators must drop old partitions for the months that are no longer needed.
Column | Type | Description |
ctime | timestamp | Time this row was created. |
hostname | varchar(64) | Segment or master hostname associated with these system metrics. |
mem_total | bigint | Total system memory in Bytes for this host. |
mem_used | bigint | Used system memory in Bytes for this host. |
mem_actual_used | bigint | Used actual memory in Bytes for this host (not including the memory reserved for cache and buffers). |
mem_actual_free | bigint | Free actual memory in Bytes for this host (not including the memory reserved for cache and buffers). |
swap_total | bigint | Total swap space in Bytes for this host. |
swap_used | bigint | Used swap space in Bytes for this host. |
swap_page_in | bigint | Number of swap pages in. |
swap_page_out | bigint | Number of swap pages out. |
cpu_user | float | CPU usage by the Greenplum system user. |
cpu_sys | float | System CPU usage for this host. |
cpu_idle | float | Idle CPU capacity at metric collection time. |
load0 | float | CPU load average for the prior one-minute period. |
load1 | float | CPU load average for the prior five-minute period. |
load2 | float | CPU load average for the prior fifteen-minute period. |
quantum | int | Interval between metric collection for this metric entry. |
disk_ro_rate | bigint | Disk read operations per second. |
disk_wo_rate | bigint | Disk write operations per second. |
disk_rb_rate | bigint | Bytes per second for disk read operations. |
disk_wb_rate | bigint | Bytes per second for disk write operations. |
net_rp_rate | bigint | Packets per second on the system network for read operations. |
net_wp_rate | bigint | Packets per second on the system network for write operations. |
net_rb_rate | bigint | Bytes per second on the system network for read operations. |
net_wb_rate | bigint | Bytes per second on the system network for write operations. |
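As an example of working with these metrics, a sketch that computes per-host memory and CPU utilization over the last hour (the window is arbitrary; column names follow the table above):
    -- Sketch: recent utilization per host from system_history.
    SELECT hostname, ctime,
           round(100.0 * mem_actual_used / mem_total, 1) AS mem_used_pct,
           cpu_user + cpu_sys AS cpu_busy_pct
    FROM system_history
    WHERE ctime > now() - interval '1 hour'
    ORDER BY hostname, ctime;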
tcp_stats_*
The tcp_stats_* tables store statistical metrics about TCP communications for a Greenplum Database instance.
Note: These tables are in place for future use and are not currently populated.
There are three tcp_stats tables, all having the same columns:
•tcp_stats_now is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data.
•tcp_stats_tail is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. This is a transitional table for TCP statistical data that has been cleared from tcp_stats_now but has not yet been committed to tcp_stats_history. It typically only contains a few minutes worth of data.
•tcp_stats_history is a regular table that stores historical TCP statistical data. It is pre-partitioned into monthly partitions. Partitions are automatically added in two month increments as needed. Administrators must drop old partitions for the months that are no longer needed.
Column | Type | Description |
segments_received | bigint | Number of TCP segments received. |
segments_sent | bigint | Number of TCP segments sent. |
segments_retransmitted | bigint | Number of TCP segments retransmitted. |
bad_segments_received | int | Number of bad TCP segments received. |
active_connections | int | Number of active TCP connections. |
passive_connections | int | Number of passive TCP connections. |
failed_connection_attempts | int | Number of failed TCP connection attempts. |
connections_established | int | Number of TCP connections established. |
connection_resets_received | int | Number of TCP connection resets received. |
connection_resets_sent | int | Number of TCP connection resets sent. |
udp_stats_*
The udp_stats_* tables store statistical metrics about UDP communications for a Greenplum Database instance.
Note: These tables are in place for future use and are not currently populated.
There are three udp_stats tables, all having the same columns:
•udp_stats_now is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data.
•udp_stats_tail is an external table whose data files are stored in $MASTER_DATA_DIRECTORY/gpperfmon/data. This is a transitional table for UDP statistical data that has been cleared from udp_stats_now but has not yet been committed to udp_stats_history. It typically only contains a few minutes worth of data.
•udp_stats_history is a regular table that stores historical UDP statistical metrics. It is pre-partitioned into monthly partitions. Partitions are automatically added in two month increments as needed. Administrators must drop old partitions for the months that are no longer needed.
Column | Type | Description |
packets_received | bigint | Number of UDP packets received. |
packets_sent | bigint | Number of UDP packets sent. |
packets_received_unknown_port | int | Number of UDP packets received on unknown ports. |
packet_receive_errors | bigint | Number of errors encountered during UDP packet receive. |
iterators_*_rollup
The iterators_*_rollup set of views aggregate the metrics stored in the iterators_* tables. A query iterator refers to a node or operation in a query plan. For example, a sequential scan operation may be one type of iterator in a particular query plan. For each iterator in a query plan, the iterators_* tables store the metrics collected from each segment instance. The iterators_*_rollup views summarize the query iterator metrics across all segments in the system.
The tmid, ssid and ccnt columns are the composite key that uniquely identifies a particular query.
There are three iterators rollup views, all having the same columns:
•The iterators_now_rollup view shows iterator data from the iterators_now table aggregated across all segments in the system.
•The iterators_tail_rollup view shows iterator data from the iterators_tail table aggregated across all segments in the system.
•The iterators_history_rollup view shows iterator data from the iterators_history table aggregated across all segments in the system.
See also the iterators_* tables for more information about the query plan iterator types and the metrics collected for each iterator.
Column | Type | Description |
sample_time | timestamp | The ctime from the associated iterators_* table. |
tmid | int | A time identifier for a particular query. All iterator records associated with the query will have the same tmid. |
ssid | int | The session id as shown by the gp_session_id parameter. All iterator records associated with the query will have the same ssid. |
ccnt | int | The command number within this session as shown by gp_command_count parameter. All iterator records associated with the query will have the same ccnt. |
nid | int | The ID of this query plan node from the slice plan. |
pnid | int | The pnid (slice plan parent node ID) from the associated iterators_* table. |
ntype | text | The ntype (node/iterator type) from the associated iterators_* table. |
nstatus | text | The accumulated status of this iterator. Possible values are: Initialize, Executing, or Finished. |
tstart | timestamp | The average start time for this iterator. |
tduration | numeric | The average execution time for this iterator. |
pmemsize | numeric | The average work memory allocated by the Greenplum planner to this iterator’s query processes. |
pmemmax | numeric | The average of the maximum planner work memory used by this iterator’s query processes. |
memsize | numeric | The average OS memory allocated to this iterator’s processes. |
memresid | numeric | The average resident memory allocated to this iterator’s processes (as opposed to shared memory). |
memshare | numeric | The average shared memory allocated to this iterator’s processes. |
cpu_elapsed | numeric | Sum of the CPU usage of all segment processes executing this iterator. |
cpu_currpct | double precision | The current average percentage of CPU utilization used by this iterator’s processes. This value is always zero for historical (completed) iterators. |
rows_out | numeric | The total number of actual rows output for this iterator on all segments. |
rows_out_est | numeric | The total number of output rows for all segments as estimated by the query planner. |
skew_cpu | numeric | Coefficient of variation for cpu_elapsed of iterators across all segments for this query, multiplied by 100. For example, a value of .95 is rendered as 95. |
skew_rows | numeric | Coefficient of variation for rows_out of iterators across all segments for this query, multiplied by 100. For example, a value of .95 is rendered as 95. |
m0 | text | The name (m0_name), unit of measure (m0_unit), average actual value (m0_val), and average estimated value (m0_est) for this iterator metric across all segments. The m0 metric is always rows for all iterator types. |
m1 | text | The name (m1_name), unit of measure (m1_unit), average actual value (m1_val), and average estimated value (m1_est) for this iterator metric across all segments. |
m2 | text | The name (m2_name), unit of measure (m2_unit), average actual value (m2_val), and average estimated value (m2_est) for this iterator metric across all segments. |
m3 | text | The name (m3_name), unit of measure (m3_unit), average actual value (m3_val), and average estimated value (m3_est) for this iterator metric across all segments. |
m4 | text | The name (m4_name), unit of measure (m4_unit), average actual value (m4_val), and average estimated value (m4_est) for this iterator metric across all segments. |
m5 | text | The name (m5_name), unit of measure (m5_unit), average actual value (m5_val), and average estimated value (m5_est) for this iterator metric across all segments. |
m6 | text | The name (m6_name), unit of measure (m6_unit), average actual value (m6_val), and average estimated value (m6_est) for this iterator metric across all segments. |
m7 | text | The name (m7_name), unit of measure (m7_unit), average actual value (m7_val), and average estimated value (m7_est) for this iterator metric across all segments. |
m8 | text | The name (m8_name), unit of measure (m8_unit), average actual value (m8_val), and average estimated value (m8_est) for this iterator metric across all segments. |
m9 | text | The name (m9_name), unit of measure (m9_unit), average actual value (m9_val), and average estimated value (m9_est) for this iterator metric across all segments. |
m10 - m15 | text | Metrics m10 through m15 are not currently used by any iterator types. |
t0 | text | The name of the relation (t0_val) being scanned by this iterator. This metric is collected only for iterators that perform scan operations such as a sequential scan or function scan. |
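For example, a sketch that uses the rollup view to find iterators with significant row skew, joined back to the originating query (the threshold of 20 and the row limit are arbitrary):
    -- Sketch: iterators whose row counts vary widely across segments.
    SELECT q.query_text, i.ntype, i.skew_rows, i.skew_cpu
    FROM iterators_history_rollup i
    JOIN queries_history q
      ON i.tmid = q.tmid AND i.ssid = q.ssid AND i.ccnt = q.ccnt
    WHERE i.skew_rows > 20
    ORDER BY i.skew_rows DESC
    LIMIT 20;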
dynamic_memory_info
The dynamic_memory_info view shows a sum of the used and available dynamic memory for all segment instances on a segment host. Dynamic memory refers to the maximum amount of memory that a Greenplum Database instance will allow the query processes of a single segment instance to consume before it starts cancelling processes. This limit is set by the gp_vmem_protect_limit server configuration parameter, and is evaluated on a per-segment basis.
Column | Type | Description |
ctime | timestamp(0) without time zone | Time this row was created in the segment_history table. |
hostname | varchar(64) | Segment or master hostname associated with these system memory metrics. |
dynamic_memory_used_mb | numeric | The amount of dynamic memory in MB allocated to query processes running on this segment. |
dynamic_memory_available_mb | numeric | The amount of additional dynamic memory (in MB) available to the query processes running on this segment host. Note that this value is a sum of the available memory for all segments on a host. Even though this value reports available memory, it is possible that one or more segments on the host have exceeded their memory limit as set by the gp_vmem_protect_limit parameter. |
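A minimal sketch that uses this view to flag hosts with the least remaining dynamic memory in recent samples (the ten-minute window is arbitrary):
    -- Sketch: hosts nearest to exhausting dynamic memory.
    SELECT hostname, ctime, dynamic_memory_used_mb, dynamic_memory_available_mb
    FROM dynamic_memory_info
    WHERE ctime > now() - interval '10 minutes'
    ORDER BY dynamic_memory_available_mb ASC;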