What are GPDB Master Hosts?

posted Sep 12, 2012, 9:00 AM by Sachchida Ojha
The master is the entry point to the Greenplum Database system from the public LAN. It is the database process that accepts client connections and processes the SQL commands issued by users of the system. Users connect to Greenplum Database through the master using PostgreSQL-compatible client programs such as psql or applications that use ODBC. For systems that use automated master server failover, a virtual IP is configured, and client tools should point to this IP.
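As a small illustration of pointing a client at the virtual IP rather than at a specific master host, the sketch below builds a libpq-style keyword/value connection string. The address, database name, and user are hypothetical placeholders, and port 5432 is assumed as the master port. PostgreSQL-compatible drivers generally accept a string of this form, but check the driver you actually use.

```python
def master_conninfo(host, dbname, user, port=5432):
    """Build a libpq-style keyword/value connection string for the master.

    Values containing spaces would need quoting; that is omitted here
    to keep the sketch short.
    """
    params = {"host": host, "port": port, "dbname": dbname, "user": user}
    return " ".join(f"{key}={value}" for key, value in params.items())


# Hypothetical values: 192.0.2.10 stands in for the configured virtual IP.
VIRTUAL_IP = "192.0.2.10"
uri = master_conninfo(VIRTUAL_IP, dbname="sales", user="gpadmin")
print(uri)  # host=192.0.2.10 port=5432 dbname=sales user=gpadmin
```

Because clients only ever see the virtual IP, nothing on the client side needs to change if the standby master takes over.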
The master maintains the system catalog (a set of system tables that contain metadata about the Greenplum Database system itself), but it does not contain any user data. Data resides only on the segments. The master authenticates client connections, processes and plans the incoming SQL commands, distributes the workload among the segments, coordinates the results returned by each segment, and presents the final results to the client program.
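The division of labor described above can be pictured with a toy model. This is only a conceptual sketch, not Greenplum's implementation: the class names, the "sales" table, the distribution key, and the modulo placement rule are all assumptions made for illustration. The master object holds only metadata and combines partial results, while the rows live on the segment objects.

```python
NUM_SEGMENTS = 4  # hypothetical segment count


class Segment:
    """Holds user data rows; the only place data lives in this model."""

    def __init__(self):
        self.rows = []

    def partial_sum(self, column):
        # Each segment computes its share of the query on its own rows.
        return sum(row[column] for row in self.rows), len(self.rows)


class Master:
    """Holds only a toy catalog (metadata) and coordinates the segments."""

    def __init__(self, segments):
        self.segments = segments
        # Hypothetical catalog entry: table name -> distribution key.
        self.catalog = {"sales": {"distribution_key": "id"}}

    def insert(self, row):
        # Simplified placement: integer key modulo the segment count.
        key = row[self.catalog["sales"]["distribution_key"]]
        self.segments[key % len(self.segments)].rows.append(row)

    def average(self, column):
        # Dispatch the work, then combine the partial results.
        partials = [seg.partial_sum(column) for seg in self.segments]
        total = sum(s for s, _ in partials)
        count = sum(n for _, n in partials)
        return total / count if count else None


segments = [Segment() for _ in range(NUM_SEGMENTS)]
master = Master(segments)
for i in range(1, 9):
    master.insert({"id": i, "amount": i * 10})
print(master.average("amount"))  # 45.0
```

The master never stores a row itself. It decides where each row goes and merges what the segments send back.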

Master Redundancy - The Standby Master

The Greenplum DCA also has a standby master host that serves as a backup in case the primary master becomes inoperable. The standby master can be set up to promote itself automatically to primary master in the event of a failure. By default, automatic master server failover is turned off.
The standby master is kept up to date by a transaction log replication process, which runs on the standby master host and keeps the primary and standby master hosts synchronized. If the primary master fails, the log replication process is shut down, and the standby master can be activated in its place. Upon activation, the replicated logs are used to reconstruct the state of the master host as of the last successfully committed transaction.
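The replicate-then-replay sequence can be sketched as a toy model. This is an illustration under simplifying assumptions, not how Greenplum implements log replication: the class and method names are hypothetical, the "log" is a plain list of key/value records, and replication is pulled explicitly rather than running as a background process.

```python
class MasterHost:
    """Toy primary master: every commit is recorded in a log."""

    def __init__(self):
        self.state = {}  # toy catalog state
        self.log = []    # committed transaction log records

    def commit(self, key, value):
        self.log.append((key, value))
        self.state[key] = value


class StandbyHost:
    """Toy standby: collects log records and replays them on activation."""

    def __init__(self):
        self.received = []  # replicated log records, not yet applied
        self.state = {}
        self.replicating = True
        self.active = False

    def replicate_from(self, primary):
        if not self.replicating:
            raise RuntimeError("log replication has been shut down")
        # Copy only the records the standby has not seen yet.
        self.received.extend(primary.log[len(self.received):])

    def activate(self):
        # Stop replication, then replay the log to rebuild the state
        # as of the last committed transaction that was replicated.
        self.replicating = False
        for key, value in self.received:
            self.state[key] = value
        self.active = True


primary = MasterHost()
standby = StandbyHost()
primary.commit("table:sales", "created")
primary.commit("table:sales", "altered")
standby.replicate_from(primary)
standby.activate()  # simulate failover after the primary fails
print(standby.state == primary.state)  # True
```

In this model, any commit that was not replicated before the failure would be missing from the standby after activation, which is why the replication process needs to keep the two hosts closely synchronized.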