A simple shell script to redistribute, index and analyze table in Greenplum

posted Apr 28, 2017, 10:25 AM by Sachchida Ojha
[gpadmin@sachi reports]$ cat a.sh
#!/bin/bash
set -e
set -u
dbconnect="psql -X --set ON_ERROR_STOP=on --set AUTOCOMMIT=off sachi"

$dbconnect<<SQL
\timing on
\qecho Altering the distribution keys .. Please wait
ALTER TABLE abc set DISTRIBUTED by (id);
\qecho
\qecho Creating Indexes on the tables .. Please wait
create index idx_abc_id on abc(id);
\qecho
\qecho Analyzing  tables .. Please wait
ANALYZE abc;
SQL
For example

[gpadmin@sachi]$ ./a.sh
Timing is on.
Altering the distribution keys .. Please wait
WARNING:  distribution policy of relation "abc" already set to (id)
HINT:  Use ALTER TABLE "abc" SET WITH (REORGANIZE=TRUE) DISTRIBUTED BY (id) to force redistribution
ALTER TABLE
Time: 3.435 ms

Creating Indexes on the tables .. Please wait
CREATE INDEX
Time: 155.207 ms

Analyzing tables .. Please wait
ANALYZE
Time: 55.675 ms
[gpadmin@sachi]$ 
You may want to display some more information about distribution keys before and after. Below is second version of the above script.


[gpadmin@sachi]$ cat a.sh
#!/bin/bash
set -e
set -u

dbconnect="psql -X --set ON_ERROR_STOP=on --set AUTOCOMMIT=off sachi"

$dbconnect<<SQL

\timing on
SELECT 'You are connected to database '|| current_database()||' as user '||current_user as Connection_details;

\qecho Distribution keys before update.....
select table_owner,table_name, COALESCE(distribution_keys,'DISTRIBUTED RANDOMLY') distribution_keys
 from
(
SELECT pgn.nspname as table_owner,
pgc.relname as table_name,
pga.attname as distribution_keys
FROM (SELECT gdp.localoid,
CASE
WHEN ( Array_upper(gdp.attrnums, 1) > 0 ) THEN
Unnest(gdp.attrnums)
ELSE NULL
END AS attnum
FROM gp_distribution_policy gdp
ORDER BY gdp.localoid) AS distrokey
INNER JOIN pg_class AS pgc
ON distrokey.localoid = pgc.oid
INNER JOIN pg_namespace pgn
ON pgc.relnamespace = pgn.oid
LEFT OUTER JOIN pg_attribute pga
ON distrokey.attnum = pga.attnum
AND distrokey.localoid = pga.attrelid
ORDER BY pgn.nspname,
pgc.relname) as a
where upper(a.table_owner)='PUBLIC'
and lower(a.table_name) in
('abc');

\qecho Altering the distribution keys .. Please wait
ALTER TABLE abc set DISTRIBUTED by (id);
\qecho
SELECT 'You are connected to database '|| current_database()||' as user '||current_user as Connection_details;

\qecho Distribution keys before update.....
select table_owner,table_name, COALESCE(distribution_keys,'DISTRIBUTED RANDOMLY') distribution_keys
 from
(
SELECT pgn.nspname as table_owner,
pgc.relname as table_name,
pga.attname as distribution_keys
FROM (SELECT gdp.localoid,
CASE
WHEN ( Array_upper(gdp.attrnums, 1) > 0 ) THEN
Unnest(gdp.attrnums)
ELSE NULL
END AS attnum
FROM gp_distribution_policy gdp
ORDER BY gdp.localoid) AS distrokey
INNER JOIN pg_class AS pgc
ON distrokey.localoid = pgc.oid
INNER JOIN pg_namespace pgn
ON pgc.relnamespace = pgn.oid
LEFT OUTER JOIN pg_attribute pga
ON distrokey.attnum = pga.attnum
AND distrokey.localoid = pga.attrelid
ORDER BY pgn.nspname,
pgc.relname) as a
where upper(a.table_owner)='PUBLIC'
and lower(a.table_name) in
('abc');

\qecho Creating Indexes on the tables .. Please wait
create index idx_abc_id on abc(id);
\qecho
\qecho Analyzing  tables .. Please wait
ANALYZE abc;

SQL
[gpadmin@sachi ]$ 

[gpadmin@sachi ]$ ./a.sh
Timing is on.
                 connection_details                  
-----------------------------------------------------
 You are connected to database sachi as user gpadmin
(1 row)

Time: 2.217 ms
Distribution keys before update.....
 table_owner | table_name | distribution_keys 
-------------+------------+-------------------
 public      | abc        | id
(1 row)

Time: 12.483 ms
Altering the distribution keys .. Please wait
WARNING:  distribution policy of relation "abc" already set to (id)
HINT:  Use ALTER TABLE "abc" SET WITH (REORGANIZE=TRUE) DISTRIBUTED BY (id) to force redistribution
ALTER TABLE
Time: 1.566 ms

                 connection_details                  
-----------------------------------------------------
 You are connected to database sachi as user gpadmin
(1 row)

Time: 0.743 ms
Distribution keys before update.....
 table_owner | table_name | distribution_keys 
-------------+------------+-------------------
 public      | abc        | id
(1 row)

Time: 5.513 ms
Creating Indexes on the tables .. Please wait
CREATE INDEX
Time: 141.049 ms

Analyzing tables .. Please wait
ANALYZE
Time: 54.354 ms
[gpadmin@sachi ]$ 
Comments