How to troubleshoot Greenplum file distribution program (gpfdist) issues

posted Apr 28, 2017, 4:53 PM by Sachchida Ojha
The Greenplum file distribution program (gpfdist) runs on the host where the external data files reside. It points to a given directory on the file host and serves external data files to all Greenplum Database segments in parallel.

If you encounter issues with gpfdist, refer to the items listed below before contacting Greenplum customer support.
Is gpfdist the same version as the database?

$ gpfdist --version
gpfdist version "3.2.1.0 build 7"
$ psql -c 'select version();'
PostgreSQL 8.2.10 (Greenplum Database 3.2.1.0 build 7)
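
The comparison can also be scripted. The following is a minimal sketch, assuming both version lines have the shape shown in the sample output above; `extract_gp_version` and `versions_match` are hypothetical helper names, not part of Greenplum.

```shell
#!/bin/sh
# extract_gp_version (hypothetical helper): pull "X.Y.Z.W build N" out of a
# version line. Assumes the output format shown in the sample above.
extract_gp_version() {
    printf '%s\n' "$1" |
        sed -n 's/.*[^0-9.]\([0-9][0-9.]* build [0-9][0-9]*\).*/\1/p'
}

# versions_match (hypothetical helper): compare the gpfdist version line ($1)
# with the database version line ($2).
# Returns 0 on match, 1 on mismatch, 2 if either line cannot be parsed.
versions_match() {
    fdist_ver=$(extract_gp_version "$1")
    db_ver=$(extract_gp_version "$2")
    if [ -z "$fdist_ver" ] || [ -z "$db_ver" ]; then
        echo "could not parse a version string" >&2
        return 2
    fi
    if [ "$fdist_ver" = "$db_ver" ]; then
        echo "OK: both are $fdist_ver"
    else
        echo "MISMATCH: gpfdist=$fdist_ver database=$db_ver"
        return 1
    fi
}

# Usage, on a host where both gpfdist and psql are available:
#   versions_match "$(gpfdist --version 2>&1)" "$(psql -At -c 'select version();')"
```

The `-At` flags make psql print only the bare version string, and `2>&1` is there because this sketch does not assume which stream gpfdist writes its version to.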

Can you get the file from the ETL host to the Greenplum array using wget or curl?
$ wget http://MACHINE:8080/filename

Notes:
 
Solaris: wget may not be in your PATH, so you may need to add its location first (export PATH=/usr/sfw/bin:/usr/opt/bin:/usr/bin:/opt/sfw/bin:/usr/local/bin:/usr/sbin:/usr/ccs/bin:$PATH)

If you do not have wget, use curl:

curl -s -S 'http://MACHINE:8080/filename'
If you are not able to get the file using wget or curl, address issues with the network before attempting another gpfdist operation.
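
Because every segment host has to reach gpfdist, it can help to repeat the curl check from each host in turn. The following is a rough sketch, not a Greenplum utility: `check_segments` is a hypothetical helper, and it assumes a plain text file with one segment hostname per line, passwordless ssh to each of those hosts, and curl installed on them.

```shell
#!/bin/sh
# check_segments (hypothetical helper): run curl on every host listed in a
# file and report which hosts cannot fetch the given gpfdist URL.
#   $1 = hosts file, one hostname per line (assumed layout)
#   $2 = URL served by gpfdist, e.g. http://MACHINE:8080/filename
# Returns 0 if every host fetched the file, 1 otherwise.
check_segments() {
    hosts_file=$1
    url=$2
    failed=0
    while read -r host; do
        # Skip blank lines in the hosts file.
        [ -z "$host" ] && continue
        # ssh -n stops ssh from consuming the rest of the hosts file on stdin.
        # curl -f returns non-zero on HTTP errors; -o /dev/null discards the body.
        if ssh -n "$host" "curl -s -S -f -o /dev/null '$url'"; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            failed=1
        fi
    done < "$hosts_file"
    return $failed
}

# Usage (seg_hosts.txt is a hypothetical file name):
#   check_segments seg_hosts.txt 'http://MACHINE:8080/filename'
```

Any host reported as FAILED points to a network or name-resolution problem between that host and the ETL host.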

Error messages like 'invalid syntax' or inconsistent results when reading from an external table with a LIMIT clause:
Make sure you are not using 'limit x' in your query. gpfdist hands out data to the backend on a first-come, first-served basis. Unlike COPY or the external table 'file' protocol, where data processing is more sequential, with gpfdist you can never know which segdb will process which chunk of the data, or when. It is essentially a race. If your data contains a bad row, it may or may not show up in your LIMIT query: that depends on whether the segdb that received the chunk containing the bad row processes it before the other segdbs finish their chunks and the executor shuts down because the LIMIT is satisfied.
 
Getting "ERROR: Out of memory" while reading from an external table

If your external table DDL uses error tables for rejected rows, and many bad rows are rejected during your ETL process, you might run into a known memory leak in Single Row Error Handling (SREH).

Affected versions: 3.1.x
This issue is fixed in 3.2.0.0 and later versions.
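
A quick way to test a version string against that range is sketched below. `sreh_leak_affected` is a hypothetical helper. It only encodes the 3.1.x range stated above and makes no claim about releases older than 3.1.

```shell
#!/bin/sh
# sreh_leak_affected (hypothetical helper): succeed (return 0) when the given
# Greenplum version string falls in the 3.1.x range listed as affected.
# Versions outside 3.1.x return 1; this sketch makes no claim about them
# beyond what the article states.
sreh_leak_affected() {
    case "$1" in
        3.1.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Usage:
#   if sreh_leak_affected "3.1.2.0"; then echo "upgrade to 3.2.0.0 or later"; fi
```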

Note: Network connectivity between the ETL host and the segment nodes is required.