Greenplum DBA Quick Reference

GreenplumDBA.com website will no longer be available on public domain. Email us at gpdba@greenplumdba.com for further details.  Become a registered member of DBAref.com and you will get full access to all of our blog posts, white papers, Greenplum Database eBooks and much more.

Understanding GPDB BLOAT on heap, AO, AO Columnar tables by examples

posted Apr 22, 2022, 3:08 PM by Sachchida Ojha

Understanding Greenplum Bloat by Examples    

Step 1: Create heap, ao and ao columnar (column oriented) table 

drop table IF EXISTS sachi.vac_test_heap; 

drop table IF EXISTS sachi.vac_test_ao

drop table IF EXISTS sachi.vac_test_aoco

create table 

sachi.vac_test_heap( 

a int,  

b text)  

distributed by (a)

create table 

sachi.vac_test_ao( 

a int,  

b text 

)  

with (appendonly=true)  

distributed by (a)

create table 

sachi.vac_test_aoco( 

a int,  

b text 

)  

with (appendonly=true, orientation=column)  

distributed by (a)

  

DROP TABLE 

DROP TABLE 

DROP TABLE 

CREATE TABLE 

CREATE TABLE 

CREATE TABLE 

  

  

select 

c.oid, 

c.relname, 

n.nspname, 

c.relkind, 

c.reltuples, 

c.relstorage 

 from pg_class c  

 JOIN pg_namespace n ON c.relnamespace = n.oid  

 where n.nspname='sachi' 

 and c.relname in ('vac_test_heap','vac_test_ao','vac_test_aoco')

oid | relname | nspname | relkind | reltuples | relstorage ----------+---------------+---------+---------+-----------+------------ 19807184 | vac_test_heap | sachi | r | 0 | h 19807190 | vac_test_ao | sachi | r | 0 | a 19807200 | vac_test_aoco | sachi | r | 0 | c (3 rows)

  

  

SELECT 

n.nspname AS schema_name, 

c.relname AS table_name, 

c.reltuples AS num_rows, 

(gp_toolkit.__gp_aovisimap_hidden_info(c.oid)).total_tupcount AS ao_num_rows FROM pg_appendonly a 

JOIN pg_class c ON c.oid = a.relid 

JOIN pg_namespace n ON c.relnamespace = n.oid 

WHERE c.relkind = 'r' 

and n.nspname='sachi' 

and c.relname in ('vac_test_heap','vac_test_ao','vac_test_aoco')

schema_name | table_name | num_rows | ao_num_rows 

-------------+------------+----------+------------- 

(0 rows) 

  

  

insert into vac_test_heap values (generate_series(1,10000000),'this is a test'); insert into vac_test_ao values (generate_series(1,10000000),'this is a test'); insert into vac_test_aoco values (generate_series(1,10000000),'this is a test')

INSERT 0 10000000 

INSERT 0 10000000 

INSERT 0 10000000 

  

select c.oid,c.relname,n.nspname,c.relkind,c.relpages, c.reltuples,c.relstorage from pg_class c 

JOIN pg_namespace n ON c.relnamespace = n.oid 

where n.nspname='sachi' and c.relname in 

('vac_test_heap','vac_test_ao','vac_test_aoco')

oid | relname | nspname | relkind | relpages | reltuples | relstorage ----------+---------------+---------+---------+----------+-----------+------------ 19807184 | vac_test_heap | sachi | r | 0 | 0 | h 19807190 | vac_test_ao | sachi | r | 0 | 0 | a 19807200 | vac_test_aoco | sachi | r | 0 | 0 | c (3 rows) 

SELECT n.nspname AS schema_name,  

 c.relname AS table_name,  

 c.reltuples AS num_rows, 

 (gp_toolkit.__gp_aovisimap_hidden_info(c.oid)).total_tupcount AS ao_num_rows  FROM pg_appendonly a 

 JOIN pg_class c ON c.oid = a.relid 

 JOIN pg_namespace n ON c.relnamespace = n.oid 

 WHERE c.relkind = 'r' 

 and n.nspname='sachi' 

 and c.relname in ('vac_test_heap','vac_test_ao','vac_test_aoco')

schema_name | table_name | num_rows | ao_num_rows 

-------------+---------------+----------+------------- 

sachi | vac_test_ao | 0 | 10000000 

sachi | vac_test_aoco | 0 | 10000000 

(2 rows) 

 

SELECT 

c.oid, 

n.nspname AS schema_name, 

c.relname AS table_name, 

c.reltuples::bigint AS num_rows, 

(gp_toolkit.__gp_aovisimap_hidden_info(c.oid)).total_tupcount AS ao_num_rows, (SELECT max(percent_hidden) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as perhid, 

(SELECT sum(total_tupcount) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as tottup, 

(SELECT max(hidden_tupcount) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid))  as hidtup 

FROM pg_appendonly a 

JOIN pg_class c ON c.oid=a.relid 

JOIN pg_namespace n ON c.relnamespace=n.oid 

WHERE c.relkind='r' 

and n.nspname='sachi' 

and c.relname in ('vac_test_heap','vac_test_ao','vac_test_aoco')

oid | schema_name | table_name | num_rows | ao_num_rows | perhid | tottup |  hidtup 

----------+-------------+---------------+----------+-------------+--------+---------- +-------- 

19807190 | sachi | vac_test_ao | 0 | 10000000 | 0.00 | 10000000  | 0 

19807200 | sachi | vac_test_aoco | 0 | 10000000 | 0.00 | 10000000  | 0 

(2 rows) 

  

-- Now run analyze on each table. 

  

analyze sachi.vac_test_heap

analyze sachi.vac_test_ao

analyze sachi.vac_test_aoco

  

ANALYZE 

ANALYZE 

ANALYZE  

  

select c.oid,c.relname,n.nspname,c.relkind,c.relpages,  

c.reltuples::bigint,c.relstorage 

from pg_class c 

JOIN pg_namespace n ON c.relnamespace = n.oid 

where n.nspname='sachi' and c.relname in 

('vac_test_heap','vac_test_ao','vac_test_aoco')

oid | relname | nspname | relkind | relpages | reltuples | relstorage ----------+---------------+---------+---------+----------+-----------+------------ 19807184 | vac_test_heap | sachi | r | 16000 | 10011500 | h 19807190 | vac_test_ao | sachi | r | 10496 | 10000000 | a 19807200 | vac_test_aoco | sachi | r | 5888 | 10000000 | c (3 rows) 

 

select count(*) from sachi.vac_test_heap

count 

---------- 

10000000 

(1 row) 

  

SELECT 

c.oid, 

n.nspname AS schema_name, 

c.relname AS table_name, 

c.reltuples::bigint AS num_rows, 

(gp_toolkit.__gp_aovisimap_hidden_info(c.oid)).total_tupcount AS ao_num_rows, (SELECT max(percent_hidden) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as perhid, 

(SELECT sum(total_tupcount) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as tottup, 

(SELECT max(hidden_tupcount) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid))  as hidtup 

FROM pg_appendonly a 

JOIN pg_class c ON c.oid=a.relid 

JOIN pg_namespace n ON c.relnamespace=n.oid 

WHERE c.relkind='r' 

and n.nspname='sachi' 

and c.relname in ('vac_test_heap','vac_test_ao','vac_test_aoco')

oid | schema_name | table_name | num_rows | ao_num_rows | perhid | tottup |  hidtup 

----------+-------------+---------------+----------+-------------+--------+---------- +-------- 

19807190 | sachi | vac_test_ao | 10000000 | 10000000 | 0.00 | 10000000  | 0 

19807200 | sachi | vac_test_aoco | 10000000 | 10000000 | 0.00 | 10000000  | 0 

(2 rows) 

insert into vac_test_heap values (generate_series(1,500000),'this is a new 5 percent  rows test')

insert into vac_test_ao values (generate_series(1,500000),'this is a new 5 percent  rows test')

insert into vac_test_aoco values (generate_series(1,500000),'this is a new 5 percent  rows test')

INSERT 0 500000 

INSERT 0 500000 

INSERT 0 500000 

 

select c.oid,c.relname,n.nspname,c.relkind,c.relpages,  

c.reltuples::bigint,c.relstorage 

from pg_class c 

JOIN pg_namespace n ON c.relnamespace = n.oid 

where n.nspname='sachi' and c.relname in 

('vac_test_heap','vac_test_ao','vac_test_aoco')

oid | relname | nspname | relkind | relpages | reltuples | relstorage ----------+---------------+---------+---------+----------+-----------+------------ 19807184 | vac_test_heap | sachi | r | 16000 | 10011500 | h 19807190 | vac_test_ao | sachi | r | 10496 | 10000000 | a 19807200 | vac_test_aoco | sachi | r | 5888 | 10000000 | c (3 rows) 

  

SELECT 

c.oid, 

n.nspname AS schema_name, 

c.relname AS table_name, 

c.reltuples::bigint AS num_rows, 

(gp_toolkit.__gp_aovisimap_hidden_info(c.oid)).total_tupcount AS ao_num_rows, (SELECT max(percent_hidden) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as perhid, 

(SELECT sum(total_tupcount) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as tottup, 

(SELECT max(hidden_tupcount) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid))  as hidtup 

FROM pg_appendonly a 

JOIN pg_class c ON c.oid=a.relid 

JOIN pg_namespace n ON c.relnamespace=n.oid 

WHERE c.relkind='r' 

and n.nspname='sachi' 

and c.relname in ('vac_test_heap','vac_test_ao','vac_test_aoco');   

oid | schema_name | table_name | num_rows | ao_num_rows | perhid | tottup |  hidtup 

----------+-------------+---------------+----------+-------------+--------+---------- +-------- 

19807190 | sachi | vac_test_ao | 10000000 | 10500000 | 0.00 | 10500000 | 0 

19807200 | sachi | vac_test_aoco | 10000000 | 10500000 | 0.00 | 10500000 | 0 

(2 rows) 

  

select 

oid, 

schema_name, 

table_name, 

num_rows, 

ao_num_rows, 

(sum(ao_num_rows)/num_rows) as pct 

from 

SELECT 

c.oid, 

n.nspname AS schema_name, 

c.relname AS table_name, 

c.reltuples::bigint AS num_rows,

(gp_toolkit.__gp_aovisimap_hidden_info(c.oid)).total_tupcount AS ao_num_rows FROM pg_appendonly a 

JOIN pg_class c ON c.oid=a.relid 

JOIN pg_namespace n ON c.relnamespace=n.oid 

WHERE c.relkind='r' 

AND c.reltuples >

and n.nspname='sachi' 

and c.relname in ('vac_test_heap','vac_test_ao','vac_test_aoco'

--order by 4 desc limit 50000 

as sub 

GROUP BY 1,2,3,4,

HAVING sum(ao_num_rows) > num_rows * (1::numeric + (5::numeric/100::numeric)) order by 4 desc

oid | schema_name | table_name | num_rows | ao_num_rows | pct 

-----+-------------+------------+----------+-------------+----- 

(0 rows) 

insert into vac_test_heap values (generate_series(1,5000),'this is a new 1 percent  rows test')

insert into vac_test_ao values (generate_series(1,5000),'this is a new 1 percent rows  testt')

insert into vac_test_aoco values (generate_series(1,5000),'this is a new 1 percent  rows test')

INSERT 0 5000 

INSERT 0 5000 

INSERT 0 5000 

select 

oid, 

schema_name, 

table_name, 

num_rows, 

ao_num_rows, 

(sum(ao_num_rows)/num_rows) as pct 

from 

SELECT 

c.oid, 

n.nspname AS schema_name, 

c.relname AS table_name, 

c.reltuples::bigint AS num_rows, 

(gp_toolkit.__gp_aovisimap_hidden_info(c.oid)).total_tupcount AS ao_num_rows FROM pg_appendonly a 

JOIN pg_class c ON c.oid=a.relid 

JOIN pg_namespace n ON c.relnamespace=n.oid 

WHERE c.relkind='r' 

AND c.reltuples >

and n.nspname='sachi' 

and c.relname in ('vac_test_heap','vac_test_ao','vac_test_aoco'

--order by 4 desc limit 50000 

as sub 

GROUP BY 1,2,3,4,5

HAVING sum(ao_num_rows) > num_rows * (1::numeric + (5::numeric/100::numeric)) order by 4 desc

 oid | schema_name | table_name | num_rows | ao_num_rows | pct ----------+-------------+---------------+----------+-------------+------------------- - 

19807190 | sachi | vac_test_ao | 10000000 | 10505000 | 1.0505000000000000 19807200 | sachi | vac_test_aoco | 10000000 | 10505000 | 1.0505000000000000 (2 rows) 

SELECT 

c.oid, 

n.nspname AS schema_name, 

c.relname AS table_name, 

c.reltuples::bigint AS num_rows, 

(gp_toolkit.__gp_aovisimap_hidden_info(c.oid)).total_tupcount AS ao_num_rows, (SELECT max(percent_hidden) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as perhid, 

(SELECT sum(total_tupcount) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as tottup, 

(SELECT max(hidden_tupcount) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid))  as hidtup 

FROM pg_appendonly a 

JOIN pg_class c ON c.oid=a.relid 

JOIN pg_namespace n ON c.relnamespace=n.oid 

WHERE c.relkind='r' 

and n.nspname='sachi' 

and c.relname in ('vac_test_heap','vac_test_ao','vac_test_aoco')

 oid | schema_name | table_name | num_rows | ao_num_rows | perhid | tottup  | hidtup 

----------+-------------+---------------+----------+-------------+--------+---------- +-------- 

19807190 | sachi | vac_test_ao | 10000000 | 10505000 | 0.00 | 10505000  | 0 

19807200 | sachi | vac_test_aoco | 10000000 | 10505000 | 0.00 | 10505000  | 0 

(2 rows) 

  

insert into vac_test_heap values (generate_series(1,10000000),'this is a test'); insert into vac_test_ao values (generate_series(1,10000000),'this is a test'); insert into vac_test_aoco values (generate_series(1,10000000),'this is a test')

INSERT 0 10000000 

INSERT 0 10000000 

INSERT 0 10000000 

  

select c.oid,c.relname,n.nspname,c.relkind,c.relpages,  

c.reltuples::bigint,c.relstorage 

from pg_class c 

JOIN pg_namespace n ON c.relnamespace = n.oid 

where n.nspname='sachi' and c.relname in 

('vac_test_heap','vac_test_ao','vac_test_aoco')

oid | relname | nspname | relkind | relpages | reltuples | relstorage ----------+---------------+---------+---------+----------+-----------+------------ 19807184 | vac_test_heap | sachi | r | 16000 | 10011500 | h 19807190 | vac_test_ao | sachi | r | 10496 | 10000000 | a

19807200 | vac_test_aoco | sachi | r | 5888 | 10000000 | c (3 rows) 

  

analyze sachi.vac_test_heap

analyze sachi.vac_test_ao

analyze sachi.vac_test_aoco

select c.oid,c.relname,n.nspname,c.relkind,c.relpages,  

c.reltuples::bigint,c.relstorage 

from pg_class c 

JOIN pg_namespace n ON c.relnamespace = n.oid 

where n.nspname='sachi' and c.relname in 

('vac_test_heap','vac_test_ao','vac_test_aoco')

  

delete from sachi.vac_test_heap where a<800000

delete from sachi.vac_test_ao where a<800000

delete from sachi.vac_test_aoco where a<800000

analyze sachi.vac_test_heap

analyze sachi.vac_test_ao

analyze sachi.vac_test_aoco

DELETE 14400000 

DELETE 14400000 

DELETE 14400000 

ANALYZE 

ANALYZE 

ANALYZE 

  

 ----no Bloat 

  

select bdinspname as "Schema Name"

bdirelname as "Table/Index Name"

bdirelpages as "Pages(current)",bdiexppages as "Pages(Expected)", bdidiag as "Comments" 

from gp_toolkit.gp_bloat_diag where bdinspname='sachi' and bdirelname in ('vac_test_heap','vac_test_ao','vac_test_aoco')

Schema Name | Table/Index Name | Pages(current) | Pages(Expected) | Comments -------------+------------------+----------------+-----------------+---------- (0 rows) 

  

analyze sachi.vac_test_heap

analyze sachi.vac_test_ao

analyze sachi.vac_test_aoco

select bdinspname as "Schema Name"

bdirelname as "Table/Index Name"

bdirelpages as "Pages(current)",bdiexppages as "Pages(Expected)", bdidiag as "Comments" 

from gp_toolkit.gp_bloat_diag where bdinspname='sachi' and bdirelname in ('vac_test_heap','vac_test_ao','vac_test_aoco')

Schema Name | Table/Index Name | Pages(current) | Pages(Expected) | Comments -------------+------------------+----------------+-----------------+---------- (0 rows) 

 

delete from sachi.vac_test_heap where a<8000000

delete from sachi.vac_test_ao where a<8000000

delete from sachi.vac_test_aoco where a<8000000

  

analyze sachi.vac_test_heap

analyze sachi.vac_test_ao

analyze sachi.vac_test_aoco

  

-- Now bloat diag views shows the bloat. AO and AOCO Table does not from this view. select bdinspname as "Schema Name"

bdirelname as "Table/Index Name"

bdirelpages as "Pages(current)",bdiexppages as "Pages(Expected)", bdidiag as "Comments" 

from gp_toolkit.gp_bloat_diag where bdinspname='sachi' and bdirelname in ('vac_test_heap','vac_test_ao','vac_test_aoco')

Schema Name | Table/Index Name | Pages(current) | Pages(Expected) |  Comments 

-------------+------------------+----------------+-----------------+----------------- ------------------- 

sachi | vac_test_heap | 32896 | 5086 | moderate amount  of bloat suspected 

(1 row) 

SELECT 

c.oid, 

n.nspname AS schema_name, 

c.relname AS table_name, 

c.reltuples::bigint AS num_rows, 

(gp_toolkit.__gp_aovisimap_hidden_info(c.oid)).total_tupcount AS ao_num_rows, (SELECT max(percent_hidden) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as perhid, 

(SELECT sum(total_tupcount) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as tottup, 

(SELECT max(hidden_tupcount) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid))  as hidtup 

FROM pg_appendonly a 

JOIN pg_class c ON c.oid=a.relid 

JOIN pg_namespace n ON c.relnamespace=n.oid 

WHERE c.relkind='r' 

and n.nspname='sachi' 

and c.relname in ('vac_test_heap','vac_test_ao','vac_test_aoco')

 oid | schema_name | table_name | num_rows | ao_num_rows | perhid | tottup |  hidtup 

----------+-------------+---------------+----------+-------------+--------+---------- +-------- 

19807190 | sachi | vac_test_ao | 4000002 | 20505000 | 80.49 | 20505000  | 128947 

19807200 | sachi | vac_test_aoco | 4000002 | 20505000 | 80.49 | 20505000  | 128947 

(2 rows) 

  

vacuum sachi.vac_test_heap

vacuum sachi.vac_test_ao

vacuum sachi.vac_test_aoco;

VACUUM 

VACUUM 

VACUUM 

  

 -- Still show bloat on heap tables.  

select bdinspname as "Schema Name"

bdirelname as "Table/Index Name"

bdirelpages as "Pages(current)",bdiexppages as "Pages(Expected)", bdidiag as "Comments" 

from gp_toolkit.gp_bloat_diag where bdinspname='sachi' and bdirelname in ('vac_test_heap','vac_test_ao','vac_test_aoco')

Schema Name | Table/Index Name | Pages(current) | Pages(Expected) |  Comments 

-------------+------------------+----------------+-----------------+----------------- ------------------- 

sachi | vac_test_heap | 32896 | 5372 | moderate amount  of bloat suspected 

(1 row) 

-- Hidden tuple is gone but see additional rows. 

 SELECT 

c.oid, 

n.nspname AS schema_name, 

c.relname AS table_name, 

c.reltuples::bigint AS num_rows, 

(gp_toolkit.__gp_aovisimap_hidden_info(c.oid)).total_tupcount AS 

ao_num_rows, 

(SELECT max(percent_hidden) FROM 

gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as perhid, 

(SELECT sum(total_tupcount) FROM 

gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as tottup, 

(SELECT max(hidden_tupcount) FROM 

gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as hidtup 

FROM pg_appendonly a 

JOIN pg_class c ON c.oid=a.relid 

JOIN pg_namespace n ON c.relnamespace=n.oid 

WHERE c.relkind='r' 

and n.nspname='sachi' 

 and c.relname in ('vac_test_heap','vac_test_ao','vac_test_aoco')

 oid | schema_name | table_name | num_rows | ao_num_rows | perhid | tottup |  hidtup 

----------+-------------+---------------+----------+-------------+--------+--------- +-------- 

19807190 | sachi | vac_test_ao | 4000002 | 0 | 0.00 | 4000002 |  0 

19807190 | sachi | vac_test_ao | 4000002 | 4000002 | 0.00 | 4000002 |  0 

19807200 | sachi | vac_test_aoco | 4000002 | 0 | 0.00 | 4000002 |  0 

19807200 | sachi | vac_test_aoco | 4000002 | 4000002 | 0.00 | 4000002 |  0 

(4 rows) 

  

 

vacuum full sachi.vac_test_heap

vacuum full sachi.vac_test_ao

vacuum full sachi.vac_test_aoco

VACUUM 

VACUUM 

VACUUM 

  

select bdinspname as "Schema Name"

bdirelname as "Table/Index Name"

bdirelpages as "Pages(current)",bdiexppages as "Pages(Expected)", bdidiag as "Comments" 

from gp_toolkit.gp_bloat_diag where bdinspname='sachi' and bdirelname in ('vac_test_heap','vac_test_ao','vac_test_aoco')

Schema Name | Table/Index Name | Pages(current) | Pages(Expected) | Comments -------------+------------------+----------------+-----------------+---------- (0 rows) 

SELECT 

c.oid, 

n.nspname AS schema_name, 

c.relname AS table_name, 

c.reltuples::bigint AS num_rows, 

(gp_toolkit.__gp_aovisimap_hidden_info(c.oid)).total_tupcount AS ao_num_rows, (SELECT max(percent_hidden) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as perhid, 

(SELECT sum(total_tupcount) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as tottup, 

(SELECT max(hidden_tupcount) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid))  as hidtup 

FROM pg_appendonly a 

JOIN pg_class c ON c.oid=a.relid 

JOIN pg_namespace n ON c.relnamespace=n.oid 

WHERE c.relkind='r' 

and n.nspname='sachi' 

and c.relname in ('vac_test_heap','vac_test_ao','vac_test_aoco')

oid | schema_name | table_name | num_rows | ao_num_rows | perhid | tottup |  hidtup 

----------+-------------+---------------+----------+-------------+--------+--------- +-------- 

19807190 | sachi | vac_test_ao | 4000002 | 0 | 0.00 | 4000002 |  0 

19807190 | sachi | vac_test_ao | 4000002 | 4000002 | 0.00 | 4000002 |  0 

19807200 | sachi | vac_test_aoco | 4000002 | 0 | 0.00 | 4000002 |  0 

19807200 | sachi | vac_test_aoco | 4000002 | 4000002 | 0.00 | 4000002 |  0 

(4 rows) 

  

analyze sachi.vac_test_heap

analyze sachi.vac_test_ao

analyze sachi.vac_test_aoco

ANALYZE

ANALYZE 

ANALYZE 

  

 select bdinspname as "Schema Name"

bdirelname as "Table/Index Name"

bdirelpages as "Pages(current)",bdiexppages as "Pages(Expected)", bdidiag as "Comments" 

from gp_toolkit.gp_bloat_diag where bdinspname='sachi' and bdirelname in ('vac_test_heap','vac_test_ao','vac_test_aoco')

Schema Name | Table/Index Name | Pages(current) | Pages(Expected) | Comments -------------+------------------+----------------+-----------------+---------- (0 rows) 

SELECT 

c.oid, 

n.nspname AS schema_name, 

c.relname AS table_name, 

c.reltuples::bigint AS num_rows, 

(gp_toolkit.__gp_aovisimap_hidden_info(c.oid)).total_tupcount AS ao_num_rows, (SELECT max(percent_hidden) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as perhid, 

(SELECT sum(total_tupcount) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid)) as tottup, 

(SELECT max(hidden_tupcount) FROM gp_toolkit.__gp_aovisimap_compaction_info(c.oid))  as hidtup 

FROM pg_appendonly a 

JOIN pg_class c ON c.oid=a.relid 

JOIN pg_namespace n ON c.relnamespace=n.oid 

WHERE c.relkind='r' 

and n.nspname='sachi' 

and c.relname in ('vac_test_heap','vac_test_ao','vac_test_aoco')

 oid | schema_name | table_name | num_rows | ao_num_rows | perhid | tottup |  hidtup 

----------+-------------+---------------+----------+-------------+--------+--------- +-------- 

19807190 | sachi | vac_test_ao | 4000002 | 0 | 0.00 | 4000002 |  0 

19807190 | sachi | vac_test_ao | 4000002 | 4000002 | 0.00 | 4000002 |  0 

19807200 | sachi | vac_test_aoco | 4000002 | 0 | 0.00 | 4000002 |  0 

19807200 | sachi | vac_test_aoco | 4000002 | 4000002 | 0.00 | 4000002 |  0 

(4 rows) 

  

  

Get more information 

  

SELECT * FROM gp_toolkit.__gp_aovisimap_compaction_info(19807190); SELECT * FROM gp_toolkit.__gp_aovisimap_compaction_info(19807200); SELECT * FROM gp_toolkit.__gp_aovisimap_hidden_info(19807200)

SELECT * FROM gp_toolkit.__gp_aovisimap_hidden_info(19807190);


Fixing gp_bloat_diag view - showing bloated tables even there are no bloats on those tables

posted Aug 30, 2018, 11:12 AM by Sachchida Ojha

Recently we found that gp_toolkit.gp_bloat_diag view shows some bloated tables which were recently reorg (vacuum full) and analyzed .

Problem was due to incorrect data reported by gp_toolkit.gp_bloat_expected_pages called by gp_toolkit.gp_bloat_diag.  Fix is to recreate gp_toolkit.gp_bloat_expected_pages using following SQL. This fix came from Pivotal support and will be part of next release.

create or replace view  gp_toolkit.gp_bloat_expected_pages as  SELECT subq.btdrelid, subq.btdrelpages,
        CASE
            WHEN subq.btdexppages < subq.numsegments::numeric THEN subq.numsegments::numeric
            ELSE subq.btdexppages
        END AS btdexppages
   FROM ( SELECT pgc.oid AS btdrelid, pgc.relpages AS btdrelpages, ceil((pgc.reltuples * (25::double precision + btwcols.width))::numeric / current_setting('block_size'::text)::numeric) AS btdexppages, ( SELECT __gp_number_of_segments.numsegments
                   FROM gp_toolkit.__gp_number_of_segments) AS numsegments
           FROM ( SELECT pgc.oid, pgc.reltuples, pgc.relpages
                   FROM pg_class pgc
                  WHERE NOT EXISTS ( SELECT __gp_is_append_only.iaooid
                           FROM gp_toolkit.__gp_is_append_only
                          WHERE __gp_is_append_only.iaooid = pgc.oid AND __gp_is_append_only.iaotype = true)
                 AND NOT EXISTS
                (
                    SELECT parrelid
                    FROM pg_partition
                    WHERE parrelid = pgc.oid
                )
                          ) pgc
      LEFT JOIN ( SELECT pgs.starelid, sum(pgs.stawidth::double precision * (1.0::double precision - pgs.stanullfrac)) AS width
                   FROM pg_statistic pgs
                  GROUP BY pgs.starelid) btwcols ON pgc.oid = btwcols.starelid
     WHERE btwcols.starelid IS NOT NULL) subq;
 

Dynamic UNIX script to change the owner of tables in Greenplum

posted Oct 25, 2014, 4:57 PM by Sachchida Ojha   [ updated Oct 26, 2014, 6:34 AM ]

-- Change owner of all tables of a given schema

for tbl in `psql -qAt -c "select tablename from pg_tables where schemaname = 'public';" sachi` ; do  psql -c "alter table $tbl owner to new_user"  sachi; done

-- list all tables of a schema

for tbl in `psql -qAt -c "select tablename from pg_tables where schemaname = 'prod';" sachi` ; do  echo $tbl; done

PGOPTIONS='--client-min-messages=warning' psql -X -q -a -1 -v ON_ERROR_STOP=1 --pset pager=off -d mydb -f script.sql

Checking bloats with customized scripts

posted Oct 23, 2014, 6:08 PM by Sachchida Ojha   [ updated Oct 23, 2014, 6:08 PM ]

gp_bloat_diag view helps to identify tables that need routine table maintenance (VACUUM and/or ANALYZE).

See also : Creating a test case for bloat and understanding how it works in Greenplum


gpadmin-# \d gp_toolkit.gp_bloat_diag
View "gp_toolkit.gp_bloat_diag"
Column | Type | Modifiers
-------------+---------+-----------
bdirelid | oid |
bdinspname | name |
bdirelname | name |
bdirelpages | integer |
bdiexppages | numeric |
bdidiag | text |
View definition:
SELECT bloatsummary.btdrelid AS bdirelid, bloatsummary.fnnspname AS bdinspname, bloatsummary.fnrelname AS bdirelname, bloatsummary.btdrelpages AS bdirelpages, bloatsummary.btdexppages AS bdiexppages, (bloatsummary.bd).bltdiag AS bdidiag
FROM ( SELECT fn.fnoid, fn.fnnspname, fn.fnrelname, beg.btdrelid, beg.btdrelpages, beg.btdexppages, gp_toolkit.gp_bloat_diag(beg.btdrelpages, beg.btdexppages::integer, iao.iaotype) AS bd
FROM gp_toolkit.gp_bloat_expected_pages beg, pg_class pgc, gp_toolkit.__gp_fullname fn, gp_toolkit.__gp_is_append_only iao
WHERE beg.btdrelid = pgc.oid AND pgc.oid = fn.fnoid AND iao.iaooid = pgc.oid) bloatsummary
WHERE (bloatsummary.bd).bltidx > 0;

bdirelid=>Table object id.
bdinspname=>Schema name.
bdirelname=>Table name.
bdirelpages=>Actual number of pages on disk.
bdiexppages=>Expected number of pages given the table data.
bdidiag=>Bloat diagnostic message.

gpadmin=# select * from gp_toolkit.gp_bloat_diag;
bdirelid | bdinspname | bdirelname | bdirelpages | bdiexppages | bdidiag
----------+------------+------------+-------------+-------------+---------
(0 rows)

This view shows tables that have bloat (the actual number of pages on disk exceeds the expected number of pages given the table statistics). Tables that are bloated require a VACUUM or a VACUUM FULL in order to reclaim disk space occupied by deleted or obsolete rows. This view is accessible to all users, however non-superusers will only be able to see the tables that they have permission to access.


You can also use following SQL  to check the bloat.

select

current_database() as dbname
, relid
, schemaname
, tablename
, round(bloat,1) as bloat_ratio
, pg_size_pretty(expbytes) as expected_size
, pg_size_pretty(relbytes) as relation_size
, pg_size_pretty(wastedbytes) as wasted_space
, round(1.0 * relbytes / nullif((sum(relbytes) over ())::bigint,0),4) as relative_size
, round((bloat-1) * (1.0 * relbytes / nullif((sum(relbytes) over ())::bigint, 0)), 8) as concern
from
(
SELECT
relid
, schemaname
, tablename
, CASE WHEN otta=0 THEN 
0.0 
ELSE 
sml.relpages/otta::numeric 
END AS bloat
, reltuples::bigint
, relpages::bigint
, otta
, (bs*otta)::bigint as expbytes
, CASE WHEN relpages < otta THEN 0 ELSE 
(bs*(sml.relpages-otta))::bigint 
END AS wastedbytes
, CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages
, (bs*relpages)::bigint as relbytes
FROM
(
SELECT
schemaname
, tablename
, cc.oid as relid
, cc.reltuples
, cc.relpages
, bs
, CEIL((cc.reltuples*((datahdr+ma-(CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta 
FROM 
(
SELECT
ma
, bs
, schemaname
, tablename
, (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr
, (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
FROM
(
SELECT
schemaname
, tablename
, hdr
, ma
, bs
, SUM((1-s.null_frac)*avg_width) AS datawidth
, MAX(s.null_frac) AS maxfracsum
, hdr+(1+(count(case when s.null_frac<>0 then 1 end))/8) as nullhdr
FROM 
pg_stats s
cross join 
(
SELECT
current_setting('block_size')::numeric AS bs
, CASE WHEN substring(version(),12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr
, CASE WHEN version() ~ 'mingw32' THEN 8 ELSE 4 END AS ma
) AS constants
GROUP BY schemaname, tablename, hdr, ma, bs
) AS foo
) AS rs
JOIN pg_class cc 
ON
cc.relname = rs.tablename 
and cc.relkind = 'r'
JOIN pg_namespace nn 
ON
cc.relnamespace = nn.oid 
AND nn.nspname = rs.schemaname 
) AS sml
) wrapper
where wastedbytes > 2*1024*1024 and bloat >= 1.4order by (bloat-1) * (1.0 * relbytes / nullif((sum(relbytes) over ())::bigint, 0)) desc;

Greenplum gpconfig command to get and set GUC parameters

posted Oct 11, 2014, 8:42 AM by Sachchida Ojha

[gpadmin@sachi ~]$ gpconfig -?
COMMAND NAME: gpconfig

Sets server configuration parameters on all segments within 
a Greenplum Database system.

*****************************************************
SYNOPSIS
*****************************************************

gpconfig -c <param_name> -v <value> [-m <master_value> | --masteronly]
       | -r <param_name> [--masteronly]
       | -l 
   [--skipvalidation] [--verbose] [--debug]

gpconfig -s <param_name> [--verbose] [--debug]

gpconfig --help


*****************************************************
DESCRIPTION
*****************************************************

The gpconfig utility allows you to set, unset, or view configuration parameters from the postgresql.conf files of all instances (master, segments, and mirrors) in your Greenplum Database system. When setting a parameter, you can also specify a different value for the master if necessary. For example, parameters such as max_connections require 
a different setting on the master than what is used for the segments. 

If you want to set or unset a global or master only parameter, use the --masteronly option.

gpconfig can only be used to manage certain parameters. For example, you cannot use it to set parameters such as port, which is required to be distinct for every segment instance. Use the -l (list) option to see a complete list of configuration parameters supported by gpconfig.

When gpconfig sets a configuration parameter in a segment postgresql.conf file, the new parameter setting always displays at the bottom of the file. When you use gpconfig to remove a configuration parameter setting, gpconfig comments out the parameter in all segment postgresql.conf files, thereby restoring the system default setting. 
For example, if you use gpconfig to remove (comment out) a parameter and later add it back (set a new value), there will be two instances of the parameter; one that is commented out, and one that is enabled and inserted at the bottom of the postgresql.conf file.

After setting a parameter, you must restart your Greenplum Database system or reload the postgresql.conf files in order for the change to take effect. Whether you require a restart or a reload depends on the parameter. See the Server Configuration Parameters reference for more information about the server configuration parameters.

To show the currenly set values for a parameter across the system, 
use the -s option.

gpconfig uses the following environment variables to connect to the Greenplum Database master instance and obtain system configuration information: 
  * PGHOST
  * PGPORT
  * PGUSER
  * PGPASSWORD
  * PGDATABASE


*****************************************************
DESCRIPTION
*****************************************************

-c | --change <param_name>

Changes a configuration parameter setting by adding the new setting to the bottom of the postgresql.conf files.

-v | --value value

The value to use for the configuration parameter you specified with the -c option. By default, this value is applied to all segments, their mirrors, the master, and the standby master.

-m | --mastervalue master_value

The master value to use for the configuration parameter you specified with the -c option. If specified, this value only applies to the master and standby master. The option can only be used with -v. 

--masteronly

When specified, gpconfig only edits the master postgresql.conf file.

-r | --remove <param_name>

Specifies the configuration parameter name to unset or remove by commenting out the entry in the postgresql.conf files.

-l | --list

Lists all configuration parameters supported by the gpconfig utility.

-s | --show <param_name>

Shows the value for a configuration parameter used on all instances (master and segments) in the Greenplum Database system. If there is a discrepancy in a parameter value between segment instances, the gpconfig utility displays an error message.

Note that the gpconfig utility reads parameter values directly from the database, and not the postgresql.conf file. If you are using gpconfig to set configuration parameters across all segments, then running gpconfig -s to verify the changes, you might still see the previous (old) values. You must reload the configuration files (gpstop -u) or restart the system (gpstop -r) for changes to take effect.

--skipvalidation

Overrides the system validation checks of gpconfig and allows you to operate on any server configuration parameter, including hidden parameters and restricted parameters that cannot be changed by gpconfig. When used with the -l option (list), it shows the list of restricted parameters. This option should only be used to set parameters when directed by Greenplum Customer Support.

--verbose 

Displays additional log information during gpconfig command execution.

--debug

Sets logging output to debug level. 

-? | -h | --help

Displays the online help.

*****************************************************
EXAMPLES
*****************************************************

Set the work_mem setting to 120MB on the master only:

gpconfig -c work_mem -v 120MB --masteronly

Set the max_connections setting to 100 on all segments and 10 on the master:

gpconfig -c max_connections -v 100 -m 10

Comment out all instances of the default_statistics_target configuration parameter, and restore the system default:

gpconfig -r default_statistics_target

List all configuration parameters supported by gpconfig:

gpconfig -l

Show the values of a particular configuration parameter across the system:

gpconfig -s max_connections

*****************************************************
SEE ALSO
*****************************************************

gpstop

Troubleshoot and fix gpfdist process dying abruptly when multiple gpload session starts from Informatica power center

posted Oct 11, 2014, 8:27 AM by Sachchida Ojha

Debugging little more and looking at the log we found that Informatica is launching 3 gpload in parallel and each gpload process is launching have 10 gpfdist processes. The gpload processes and their associated gpfdist processes will all run on the same server that happens to be running the PowerCenter job. Informatica writes to a named pipe and gpfdist reads from the named pipe. We suspect the 30 gpload processes are stressing the kernel or network buffers/devices which is causing the connection fault. Typically when loading data with this many processes we would spread the gpfdist process across multiple servers. We may only need to tweak the kernel OS params but we need the logs to know for sure.

To get the complete debugging click here

Greenplum database maintenance and monitoring - do it yourself script

posted Oct 9, 2014, 8:07 AM by Sachchida Ojha

A custom designed, must have  script for all greenplum DBA. To get the script email us gpdba@greenplumdba.com

==Greenplum database maintenance and monitoring Option ==
##############################################################################################
 Please read the menu options carefully before selection as our menu option may have changed. #
##############################################################################################
Choose one of the following option:
1  - Run Daily Maintenance
2  - Run Weekly Maintenance
3  - Run Catalog Maintenance
4  - Check and fix catalog corruption
5  - DCA Health Report
6  - Database State Monitoring Report
7  - Database Alert Log Monitoring Report
8  - DCA Space Monitoring Report
9  - Missing Table Statistics Report
10 - Bloat Check 
11 - Skew Check 
12 - Check locks on table
13 - Check tables with RANDOM distribution policy
14 - Check GUC parameter value
15 - Backup database using ddboost
16 - Archive log files older than threshold limit
17 - Cleanup orphaned temp schema
18 - Display DCA master Disk Usage
19 - Display DCA master CPU Utilization
20 - Display DCA master Memory Utilization
21 - Schema Space Monitoring Report 
22 - DCA, Database, OS version Report 
23 - Check tables distribution policy
24 - Check table skew
25 - Check running queries
26 - Check resource queue settings
27 - TOP 20 resource intensive SQL 2 days
28 - Most active users
29 - Most used resource queues
##############################################################################################
97 - Help and Support
98 - Show configuration setting
99 - Logout 
##############################################################################################
Enter your option: 

Reading QUERY PLAN output in Greenplum

posted Feb 9, 2014, 12:54 PM by Sachchida Ojha

Query plans are a tree plan of nodes. Each node in the plan represents a single operation, such as table scan, join, aggregation or a sort. Plans should be read from the bottom up as each node feeds rows into the node directly above it. The bottom nodes of a plan are usually table scan operations (sequential, index or bitmap index scans). If the query requires joins, aggregations, or sorts (or other operations on the raw rows) then there will be additional nodes above the scan nodes to perform these operations. The topmost plan nodes are usually the Greenplum Database motion nodes (redistribute, explicit redistribute, broadcast, or gather motions). These are the operations responsible for moving rows between the segment instances during query processing.

*** Read more about different type of motion such as redistribute, explicit redistribute, broadcast, or gather motions, visit www.greenplumdba.com.

The output of EXPLAIN has one line for each node in the plan tree, showing the basic node type plus the following cost estimates that the planner made for the execution of that plan node:

1. cost - measured in units of disk page fetches; that is, 1.0 equals one sequential disk page read. The first estimate is the start-up cost (cost of getting to the first
row) and the second is the total cost (cost of getting all rows). Note that the total cost assumes that all rows will be retrieved, which may not always be the case (if
using LIMIT for example).

2. rows - the total number of rows output by this plan node. This is usually less than the actual number of rows processed or scanned by the plan node, reflecting the
estimated selectivity of any WHERE clause conditions. Ideally the top-level nodes estimate will approximate the number of rows actually returned, updated, or
deleted by the query.

3. width - total bytes of all the rows output by this plan node.

It is important to note that the cost of an upper-level node includes the cost of all its child nodes. The topmost node of the plan has the estimated total execution cost for the plan. This is this number that the planner seeks to minimize. It is also important to realize that the cost only reflects things that the query planner cares about. In particular, the cost does not consider the time spent transmitting result rows to the client.

If a query is performing poorly, looking at its query plan can help identify problem areas. Here are some things to look for:

1. Is there one operation in the plan that is taking exceptionally long? When looking through the query plan, is there one operation that is consuming the majority of the query processing time? For example, if an index scan is taking longer than expected, perhaps the index is out-of-date and needs to be reindexed. You could also temporarily experiment with the enable_ parameters to see if you can force the planner to choose a different (and potentially better) plan by disabling a particular query plan operator for that query.

2. Are the planner’s estimates close to reality? Run an EXPLAIN ANALYZE and see if the number of rows estimated by the planner is close to the number of rows actually returned by the query operation. If there is a huge discrepancy, you may need to collect more statistics on the relevant columns. 

3.Are selective predicates applied early in the plan? The most selective filters should be applied early in the plan so that less rows move up the plan tree. If the query plan is not doing a good job at estimating the selectivity of a query predicate, you may need to collect more statistics on the relevant columns. You can also try reordering the WHERE clause of your SQL statement.

4. Is the planner choosing the best join order? When you have a query that joins multiple tables, make sure that the planner is choosing the most selective join
order. Joins that eliminate the largest number of rows should be done earlier in the plan so that less rows move up the plan tree. If the plan is not choosing the optimal join order, you can set join_collapse_limit=1 and use explicit JOIN syntax in your SQL statement to force the planner to the specified join order. You can also collect more statistics on the relevant join columns.

5. Is the planner selectively scanning partitioned tables? If you are using table partitioning, is the planner selectively scanning only the child tables required to satisfy the query predicates? Do scans of the parent tables return 0 rows (they should, since the parent tables should not contain any data).

6. Is the planner choosing hash aggregate and hash join operations where applicable? Hash operations are typically much faster than other types of joins or aggregations. Row comparison and sorting is done in memory rather than reading/writing from disk. In order for hash operations to be chosen, there has to be sufficient work memory available to hold the number of estimated rows. Try increasing work memory to see if you can get better performance for a given query. If possible run an EXPLAIN ANALYZE for the query, which will show you which plan operations spilled to disk, how much work memory they used, and how much was required to not spill to disk. For example:

Work_mem used: 23730K bytes avg, 23470K bytes max (seg0).
Work_mem wanted: 23659K bytes avg, 23659K bytes max (seg0) to lessen workfile I/O affecting 2 workers.

Note that the bytes wanted message from EXPLAIN ANALYZE is only a hint, based on the amount of data written to work files and is not exact. The minimum work_mem needed could be more or less than the suggested value.

Identifying Statistics Problems in Query Plans
When looking at the query plan for a query using EXPLAIN or EXPLAIN ANALYZE, it helps to know your data in order to identify possible statistics problems. Check the plan for the following indicators of inaccurate statistics:
1. Are the planner’s estimates close to reality? Run an EXPLAIN ANALYZE and see if the number of rows estimated by the planner is close to the number of rows actually returned by the query operation.
2. Are selective predicates applied early in the plan? The most selective filters should be applied early in the plan so that less rows move up the plan tree.
3. Is the planner choosing the best join order? When you have a query that joins multiple tables, make sure that the planner is choosing the most selective join order. Joins that eliminate the largest number of rows should be done earlier in the plan so that less rows move up the plan tree.

Reclaim all expired row space in Greenplum

posted Feb 9, 2014, 12:39 PM by Sachchida Ojha

A VACUUM FULL will reclaim all expired row space, but is a very expensive operation and may take an unacceptably long time to finish on large, distributed Greenplum Database tables. If you do get into a situation where the free space map has overflowed, it may be more timely to recreate the table with a CREATE TABLE AS statement and drop the old table. A VACUUM FULL is not recommended in Greenplum Database.

Option 1.
ALTER TABLE sales SET WITH (REORGANIZE=TRUE);
This leaves any indexes etc. in tact but rewrites the underlying table and, hence, eliminates all dead data.

Option 2
SET gp_auto_stats_mode = none;
CREATE TABLE xyz AS SELECT * FROM abc <ADD YOUR DISTRIBUTION AND STORAGE OPTIONS HERE>;
DROP TABLE abc;
ALTER TABLE xyz RENAME TO abc;
ANALYZE abc;
ALTER TABLE abc ADD PRIMARY KEY (<YOUR_PK_COLUMN_NAME>);

This option writes the data one time.

Option 3
Create temp table t1 as select * from <Table Name>;
Truncate <Table Name>;
set gp_autostats_mode = none;
Insert into <Table Name> select * from t1;
Drop table t1;
analyze <Table Name>;

This option writes the data 2 times.

1-10 of 15