How to alter table distribution policy in greenplum?

posted Apr 28, 2017, 4:33 PM by Sachchida Ojha
Altering Table Distribution
ALTER TABLE provides options to change the distribution policy of a table. When the distribution options of a table change, the table data is redistributed on disk, which can be resource intensive. There is also an option to redistribute table data using the existing distribution policy.

Changing the Distribution Policy
You can use the ALTER TABLE command to change the distribution policy for a table. For partitioned tables, changes to the distribution policy recursively apply to the child partitions. This operation preserves the ownership and all other attributes of the table. For example, the following command redistributes the table sales across all segments using the customer_id column as the distribution key:
ALTER TABLE sales SET DISTRIBUTED BY (customer_id);

When you change the hash distribution of a table, table data is automatically redistributed. However, changing the distribution policy to a random distribution will not cause the data to be redistributed. For example:

ALTER TABLE sales SET DISTRIBUTED RANDOMLY;

Redistributing Table Data
To redistribute table data for tables with a random distribution policy (or when the hash distribution policy has not changed) use REORGANIZE=TRUE. This sometimes may be necessary to correct a data skew problem, or when new segment resources have been added to the system. For example:
ALTER TABLE sales SET WITH (REORGANIZE=TRUE);
This command rebalances a table evenly across all segments using the current distribution policy (including random distribution).
Comments