Verifying Your Partition Strategy in Greenplum

The purpose for partitioning a table is to reduce the number of rows that must be scanned in order to satisfy a given query. If a table is partitioned based on the query predicate, you can verify that the query planner is selectively scanning the relevant data by using EXPLAIN to look at the query plan.

For example, suppose we have a sales table that is date-range partitioned by month and subpartitioned by region.

For the following query:

EXPLAIN SELECT * FROM sales WHERE date='01-07-08' AND region='usa';

The query plan for this query should show a table scan of the following tables only:

1. the default partition returning 0-1 rows (if your partition design has one)
2. the January 2008 partition (sales_1_prt_1) returning 0-1 rows
3. the USA region subpartition (sales_1_2_prt_usa) returning some number of rows.

Below is an example of the relevant portion of the query plan:
-> Seq Scan on sales_1_prt_1 sales (cost=0.00..0.00 rows=0
width=0)
Filter: "date"=01-07-08::date AND region='USA'::text
-> Seq Scan on sales_1_2_prt_usa sales (cost=0.00..9.87 rows=20 width=40)

Make sure that the query planner is not scanning unnecessary partitions or subpartitions (for example, scans of other months or regions not specified in the query predicate), and that scans of the top-level tables are returning 0-1 rows.

Some Limitations of Selective Partition Scanning

If the query plan shows that your partition hierarchy is not being selectively scanned, it may be due to one of the following limitations:

1. The query planner is only able to selectively scan partitioned tables when the query contains a direct and simple restriction of the table using immutable operators such as:

= < <= > >= <>

2. Selective scanning does not currently recognize VOLATILE functions within a query (STABLE and IMMUTABLE functions are recognized). For example, WHERE clauses such as date > CURRENT_DATE will cause the query planner to selectively scan partitioned tables, but time > TIMEOFDAY will not.

3. Selective scanning does not currently support the runtime evaluation of table data as a basis for determining which partitions to eliminate. For example, a query with a WHERE predicate such as the following would not cause selective scanning of partitions because the evaluation criteria is based on the runtime evaluation of the value in some_column:

SELECT * from partition_table PT, other_table OT WHERE PT.id=OT.id and OT.some_column = 'value';

Viewing Your Partition Design

You can look up information about your partition design using the pg_partitions view. For example to see the partition design of the sales table:
SELECT partitionboundary, partitiontablename, partitionname, partitionlevel, partitionrank FROM pg_partitions WHERE tablename='sales';
There are also the following views that show information about partitioned tables:

1. pg_partition_templates - Shows subpartitions that were created using a subpartition template.
2. pg_partition_columns - Shows the partition key columns used in a partition design.


Comments