Restoring a Greenplum Database with gp_restore

posted Sep 12, 2012, 2:06 PM by Sachchida Ojha
The gp_restore utility recreates the data definitions (schema) and user data in a database from the backup files created by a gp_dump operation. Before you restore, make sure that:

1. You have the backup files created by a gp_dump operation.

2. The backup files reside on the segment hosts in the location where gp_dump created them.

3. The Greenplum Database system is up and running.

4. The Greenplum Database system has exactly the same number of primary segment instances as the system that was backed up with gp_dump.

5. The database you are restoring to has been created in the system.

6. If you used the -s (schema only), -a (data only), --gp-c (compressed), or --gp-d (alternate dump file location) options when running gp_dump, you specify the same options when running gp_restore.
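The requirement to repeat the gp_dump options is easy to overlook, so it can help to assemble the restore command from the options you recorded at dump time. The following is a minimal sketch, not part of Greenplum: build_restore_cmd is a hypothetical helper, the timestamp key 20120912140600 is a made-up value, and the --gp-d=<directory> form is an assumption about how the alternate location is passed. It only prints the command so that you can review it before running it.

```shell
#!/bin/sh
# Hypothetical helper (not a Greenplum utility): print a gp_restore command
# that repeats the options used for the original gp_dump.
# Usage: build_restore_cmd <timestamp_key> <database> [dump options...]
build_restore_cmd() {
    key=$1
    db=$2
    shift 2
    if [ $# -gt 0 ]; then
        # "$*" joins the remaining arguments, e.g. -s, -a, --gp-c,
        # or --gp-d=<directory> (assumed syntax).
        echo "gp_restore --gp-k=$key $* -d $db"
    else
        echo "gp_restore --gp-k=$key -d $db"
    fi
}

# Example: the dump was taken with --gp-c, so the restore repeats --gp-c.
# The timestamp key below is made up for illustration.
build_restore_cmd 20120912140600 database_name --gp-c
# prints: gp_restore --gp-k=20120912140600 --gp-c -d database_name
```

Printing instead of executing keeps the sketch safe to try on any machine; once the printed command looks right, run it from the master.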

The gp_restore utility performs the following actions:

On the master host
1. Runs the SQL DDL commands in the gp_dump_1_<dbid>_<timestamp> file created by gp_dump to recreate the database schema and objects.

2. Creates a log file in the master data directory named gp_restore_status_1_<dbid>_<timestamp>.

3. Launches a gp_restore_agent for each segment instance to be restored. The gp_restore_agent processes run on the segment hosts and report status back to the gp_restore process running on the master host.

On the segment hosts

1. Restores the user data for each segment instance using the gp_dump_0_<dbid>_<timestamp> files created by gp_dump. Each primary and mirror segment instance on a host is restored.

2. Creates a log file for each segment instance named gp_restore_status_0_<dbid>_<timestamp>.

Note that the 14-digit timestamp uniquely identifies the backup job to be restored and is part of the filename of every dump file created by a gp_dump operation. You must pass this timestamp to the gp_restore utility when restoring a database.
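Because the timestamp is embedded in the dump file names, you can read the key off any of the backup files. The following is a minimal sketch under stated assumptions: timestamp_key is a hypothetical helper, and the path /data/master and the file name in the example are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not a Greenplum utility): print the 14-digit
# timestamp key embedded in a dump file name.
timestamp_key() {
    # Strip the directory, then capture a run of 14 digits that follows
    # an underscore; sed -n prints nothing if the pattern does not match.
    key=$(basename "$1" | sed -n 's/^.*_\([0-9]\{14\}\).*$/\1/p')
    if [ -z "$key" ]; then
        echo "no 14-digit timestamp found in: $1" >&2
        return 1
    fi
    echo "$key"
}

# Example with a made-up master dump file path.
timestamp_key /data/master/gp_dump_1_1_20120912140600
# prints: 20120912140600
```

The printed value is what you would pass to gp_restore with --gp-k.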

To restore from a backup created by gp_dump

1. Make sure the backup files created by gp_dump reside on the master host and segment hosts for the Greenplum Database system you are restoring.

2. Make sure the database you are restoring to has been created in the system. For example:

$ createdb database_name

3. From the master, run the gp_restore utility. For example (where --gp-k specifies the timestamp key of the backup job and -d specifies the database to connect to):

$ gp_restore --gp-k=2007103112453 -d database_name