Automating Greenplum Database Parallel Backups with gpcrondump

posted Sep 12, 2012, 1:56 PM by Sachchida Ojha
gpcrondump is a wrapper utility for gp_dump that can be called directly or from a crontab entry. It also lets you back up objects other than your databases and data, such as database roles and server configuration files.

gpcrondump creates the dump files in the data directory of the master and of each segment instance, under <data_directory>/db_dumps/YYYYMMDD. The segment data dump files are compressed using gzip.
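As a rough illustration of the directory layout described above, the following sketch builds the dump path for a given data directory and date. The function name dump_dir_for is made up for this example and is not part of Greenplum; the sample data directory is the one used in the Solaris crontab entry later in this post.

```shell
# Hypothetical helper (not a Greenplum utility): print the dump directory
# for a data directory and a day in YYYYMMDD form. The day defaults to today.
dump_dir_for() {
    printf '%s/db_dumps/%s\n' "$1" "${2:-$(date +%Y%m%d)}"
}

# Example with an explicit date:
dump_dir_for /data/gpdb_p1/gp-1 20120912
# prints /data/gpdb_p1/gp-1/db_dumps/20120912
```

On the master you could then run something like ls -l "$(dump_dir_for "$MASTER_DATA_DIRECTORY")" to check that last night's files exist, assuming MASTER_DATA_DIRECTORY is set in the gpadmin environment.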

To schedule a dump operation using cron

1. On the master, log in as the Greenplum superuser (gpadmin).

2. Define a crontab entry that calls gpcrondump. For example, to schedule a nightly dump of the sales database at one minute past midnight (note that the SHELL is set to /bin/bash and the PATH includes the location of the Greenplum Database management utilities):

Linux Example:

01 0 * * * gpadmin gpcrondump -x sales -c -g -G -a -q >> gp_salesdump.log

Solaris Example (no line breaks):

01 0 * * * SHELL=/bin/bash GPHOME=/usr/local/greenplum-db-4.0.x.x PATH=$PATH:$GPHOME/bin HOME=/export/home/gpadmin MASTER_DATA_DIRECTORY=/data/gpdb_p1/gp-1 /usr/local/greenplum-db/bin/gpcrondump -x sales -c -g -G -a -q >> gp_salesdump.log

3. Create a file named mail_contacts in either the Greenplum superuser’s home directory or in $GPHOME/bin. For example:

$ vi /home/gpadmin/mail_contacts
$ vi /export/home/gpadmin/mail_contacts

4. In this file, type one email address per line. For example:

5. Save and close the mail_contacts file. gpcrondump sends email notifications to the addresses listed in this file.
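If you would rather create the file from a script than from vi, steps 3 to 5 can be sketched as below. This is only an illustration: the function name write_mail_contacts is invented for this example, and the example.com addresses are placeholders, not real contacts.

```shell
# Hypothetical helper (not a Greenplum utility): write one email address
# per line to the given file, replacing any previous contents.
write_mail_contacts() {
    target="$1"
    shift
    : > "$target"                       # create or truncate the file
    for addr in "$@"; do
        printf '%s\n' "$addr" >> "$target"
    done
}

# Example usage with placeholder addresses (uncomment and adjust to use):
# write_mail_contacts /home/gpadmin/mail_contacts dba@example.com oncall@example.com
```

The result is the same as typing the addresses by hand: a plain text file with one address per line.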