Greenplum DBA - Understanding the Greenplum database architecture

Greenplum Database System Architecture
Built to support Big Data analytics, Greenplum Database manages, stores, and analyzes terabytes to petabytes of data. Greenplum claims 10 to 100 times better performance than traditional RDBMS products, which it attributes to its shared-nothing MPP (massively parallel processing) architecture, high-performance parallel dataflow engine, and gNet software interconnect technology. Greenplum Database was designed to run on large clusters of increasingly powerful and inexpensive commodity servers, storage, and Ethernet switches, so customers benefit directly from improvements in commodity hardware. Its main features are:
1. Massively Parallel Processing Architecture for Loading and Query Processing
2. Polymorphic Data Storage-MultiStorage/SSD Support
3. Multi-level Partitioning with Dynamic Partition Elimination
4. Out-of-the-Box Support for Big Data Analytics
5. High Performance gNet™ for Hadoop
6. Analytics and Language Support
7. Dynamic Query Prioritization
8. Self-Healing Fault Tolerance and Online Segment Rebalancing
9. Simpler, Scalable Backup with Data Domain Boost
10. Health Monitoring and Alerting
Components that comprise a Greenplum Database system, and how they work together
  • The Greenplum Master
  • The Greenplum Segments
  • The Greenplum Interconnect
  • Redundancy and Failover in Greenplum Database
  • Parallel Data Loading
  • Management and Monitoring
Management and Monitoring
Management of a Greenplum Database system is performed using a series of command-line utilities, which are located in $GPHOME/bin. Greenplum provides utilities for the following Greenplum Database administration tasks:
  • Installing Greenplum Database on an Array
  • Initializing a Greenplum Database System
  • Starting and Stopping Greenplum Database
  • Adding or Removing a Host
  • Expanding the Array and Redistributing Tables among New Segments
  • Managing Recovery for Failed Segment Instances
  • Managing Failover and Recovery for a Failed Master Instance
  • Backing Up and Restoring a Database (in Parallel)
  • Loading Data in Parallel
  • System State Reporting
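
As a minimal sketch of how an administrator might check which of these utilities are present, the Python snippet below looks for a few of them under $GPHOME/bin. The task-to-utility mapping is an assumption added for illustration (utility names differ between Greenplum releases, and several tasks in the list are omitted), and the function name find_utilities is hypothetical.

```python
import os
from pathlib import Path

# Assumed mapping from administration task to utility name.
# Names vary by Greenplum release; adjust for your installation.
TASK_UTILITIES = {
    "Initializing a Greenplum Database System": ["gpinitsystem"],
    "Starting and Stopping Greenplum Database": ["gpstart", "gpstop"],
    "Expanding the Array and Redistributing Tables among New Segments": ["gpexpand"],
    "Managing Recovery for Failed Segment Instances": ["gprecoverseg"],
    "Managing Failover and Recovery for a Failed Master Instance": ["gpactivatestandby"],
    "Loading Data in Parallel": ["gpfdist", "gpload"],
    "System State Reporting": ["gpstate"],
}


def find_utilities(gphome, tasks=TASK_UTILITIES):
    """Return {task: {utility: bool}}, True when the utility is an
    executable file in <gphome>/bin."""
    bin_dir = Path(gphome) / "bin"
    report = {}
    for task, utilities in tasks.items():
        report[task] = {
            name: (bin_dir / name).is_file() and os.access(bin_dir / name, os.X_OK)
            for name in utilities
        }
    return report


if __name__ == "__main__":
    gphome = os.environ.get("GPHOME")
    if gphome is None:
        print("GPHOME is not set; source greenplum_path.sh first")
    else:
        for task, utilities in find_utilities(gphome).items():
            for name, present in utilities.items():
                status = "found" if present else "missing"
                print(f"{task}: {name} {status}")
```

Running the script on the master host after sourcing the Greenplum environment prints one line per utility, which is a quick way to confirm that the installation matches the task list above.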


Greenplum also provides an optional system monitoring and management tool, Greenplum Command Center, that administrators can install and enable with Greenplum Database. Command Center uses data collection agents on each segment host to collect Greenplum system metrics and store them in a dedicated database. The agents on the segment hosts send their data to the Greenplum master at regular intervals (typically every 15 seconds). Users can query the Command Center database to see query and system metrics. Command Center also has a graphical web-based user interface for viewing these metrics; the interface can be installed separately from Greenplum Database.
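
To illustrate querying the Command Center database, here is a hedged Python sketch that builds a psql command for recent system metrics. The database name gpperfmon, the table system_history, and the column names are assumptions not stated above and may differ by version; the host name mdw and both function names are hypothetical. The sketch only constructs the command and does not connect to anything.

```python
import shlex


def expected_samples_per_host(minutes, interval_seconds=15):
    """Number of samples one host reports in the window, given the
    typical 15-second collection interval mentioned in the text."""
    return (minutes * 60) // interval_seconds


def build_metrics_command(host, minutes=5, database="gpperfmon",
                          table="system_history"):
    """Build a psql argument list that selects recent system metrics.

    The database, table, and column names are assumptions for
    illustration; check them against your Command Center version.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    sql = (
        "SELECT ctime, hostname, cpu_user, cpu_sys, mem_used "
        f"FROM {table} "
        f"WHERE ctime > now() - interval '{int(minutes)} minutes' "
        "ORDER BY ctime DESC;"
    )
    return ["psql", "-h", host, "-d", database, "-c", sql]


if __name__ == "__main__":
    cmd = build_metrics_command("mdw", minutes=5)  # "mdw" is a hypothetical master host
    print(shlex.join(cmd))
    print("samples expected per host:", expected_samples_per_host(5))
```

With a 15-second interval, a five-minute window should hold about 20 samples per host, which gives a rough check that the agents are reporting.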
