Greenplum 4.1.1.0: Segments failed with "stuck spinlock" error during concurrent IO activity on the database.

posted Apr 28, 2017, 4:55 PM by Sachchida Ojha
Error message in pg_log/* on one of the segment hosts. 

Segment data directories can be found using the SQL command "select hostname, fselocation from gp_segment_configuration left outer join pg_filespace_entry on dbid=fsedbid;"
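As a rough sketch, the output of that query can be turned into a list of log directories to inspect on each host. This assumes the query was run with psql in unaligned, tuples-only mode (`psql -A -t`), so each row looks like `hostname|fselocation`, and that each segment's logs live in a `pg_log` directory under its data directory. The function name and the host names in the demo are hypothetical.

```python
import posixpath


def segment_log_dirs(psql_output):
    """Map each 'hostname|fselocation' row to (hostname, <fselocation>/pg_log)."""
    dirs = []
    for row in psql_output.splitlines():
        row = row.strip()
        if not row:
            continue
        hostname, _, location = row.partition("|")
        # The left outer join can yield rows with no filespace entry; skip them.
        if not location:
            continue
        dirs.append((hostname, posixpath.join(location, "pg_log")))
    return dirs


if __name__ == "__main__":
    # Hypothetical rows, shaped like `psql -A -t` output for the query above.
    sample = "sdw1|/data/primary/gpseg0\nsdw2|/data/mirror/gpseg0\n"
    for host, log_dir in segment_log_dirs(sample):
        print(host, log_dir)
```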

2011-05-04 08:28:43.410381 PDT,,,p17723,th-529366144,,,,0,,,seg-1,,,,,"PANIC","XX000","stuck spinlock (0x2aaaed6451d0) detected at faultinjector.c:301 (s_lock.c:39)",,,,,,,0,,"s_lock.c",39,"Stack trace:
1 0xa3328a postgres <symbol not found> (elog.c:454)
2 0xa34ff8 postgres elog_finish (elog.c:1365)
3 0x8d1540 postgres s_lock (s_lock.c:39)
4 0xa6b3dc postgres FaultInjector_InjectFaultIfSet (faultinjector.c:651)
5 0x899163 postgres FileSync (fd.c:251)
6 0xafd228 postgres <symbol not found> (cdbfilerep.c:1720)
7 0xb061ae postgres FileRep_Main (cdbfilerep.c:1344)
8 0x584ce8 postgres AuxiliaryProcessMain (bootstrap.c:483)
9 0x857f4a postgres StartFilerepProcesses (postmaster.c:7062)
10 0x85ae31 postgres doRequestedPrimaryMirrorModeTransitions (primary_mirror_mode.c:1652)
11 0x856f62 postgres PostmasterMain (postmaster.c:2287)
12 0x76521a postgres main (main.c:212)
13 0x342521d994 libc.so.6 __libc_start_main (??:0)
14 0x474f39 postgres memcpy (??:0)

By design, the code acquires a spinlock to look up certain data structures. Under concurrent inserts, the process can take an extended amount of time to acquire the spinlock. The spinlock code issues a PANIC when the spinlock is not acquired within the required time frame.

The cluster can go into change tracking, causing the current transaction to fail with "ERROR: GPDB performed segment reconfiguration", which then triggers a spinlock error in the mirror logs.

When this error happens on a segment, the segment exits violently (PANIC on postmaster) with no reporting messages apart from the PANIC ones. FTS will then notice that the segment went away and transition the peer segment to change-tracking.
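Because the PANIC lines are the only trace the segment leaves, scanning the segment logs for them is a quick way to confirm this failure. The sketch below is one possible way to do that; the regular expression is modeled only on the single PANIC line quoted above, so other builds may format the line differently, and the function name is hypothetical.

```python
import re

# Modeled on the PANIC line quoted in this article; treat it as an assumption,
# not as a specification of the Greenplum log format.
SPINLOCK_RE = re.compile(
    r'^(?P<timestamp>[^,]+),.*"PANIC","XX000",'
    r'"stuck spinlock \((?P<address>0x[0-9a-fA-F]+)\) '
    r'detected at (?P<location>[^ ]+)'
)


def find_stuck_spinlocks(lines):
    """Return one dict (timestamp, address, location) per matching log line."""
    hits = []
    for line in lines:
        match = SPINLOCK_RE.match(line)
        if match:
            hits.append(match.groupdict())
    return hits


if __name__ == "__main__":
    sample = (
        '2011-05-04 08:28:43.410381 PDT,,,p17723,th-529366144,,,,0,,,seg-1,,,,,'
        '"PANIC","XX000","stuck spinlock (0x2aaaed6451d0) detected at '
        'faultinjector.c:301 (s_lock.c:39)",,,,,,,0,,"s_lock.c",39,"Stack trace:'
    )
    for hit in find_stuck_spinlocks([sample]):
        print(hit["timestamp"], hit["address"], hit["location"])
```

In practice the `lines` argument could be an open log file from a segment's pg_log directory, since iterating over a file yields its lines.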

Verify Greenplum Database Version Installed:

To verify the version and build of the installed Greenplum Database, run the command below.
Log in as a known pg/gp user with the environment initialized.

Execute the command psql -c "select version()"

Example:

PostgreSQL 8.2.15 (Greenplum Database 4.1.1.0 build 8) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on May 1 2011 23:08:09
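When checking several clusters, the version string can be tested programmatically. The following is a minimal sketch that extracts the Greenplum version and build from a string shaped like the example above and flags 4.1.1.0, the release this article describes. The function names are hypothetical, and the pattern is inferred from the one example string only.

```python
import re

AFFECTED_VERSION = "4.1.1.0"  # the release discussed in this article

# Inferred from the example output above; other releases may word it differently.
VERSION_RE = re.compile(
    r"Greenplum Database (\d+(?:\.\d+)*)(?: build ([^\s)]+))?"
)


def greenplum_version(version_string):
    """Return (version, build) from a `select version()` string, or None."""
    match = VERSION_RE.search(version_string)
    if match is None:
        return None
    return match.group(1), match.group(2)


def is_affected(version_string):
    """True when the string reports the 4.1.1.0 release."""
    parsed = greenplum_version(version_string)
    return parsed is not None and parsed[0] == AFFECTED_VERSION


if __name__ == "__main__":
    example = (
        "PostgreSQL 8.2.15 (Greenplum Database 4.1.1.0 build 8) "
        "on x86_64-unknown-linux-gnu"
    )
    print(greenplum_version(example), is_affected(example))
```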

Permanent Fix:

This issue does not occur in Greenplum Database 4.1.0 or 4.0.5.1. 

This build of 4.1.1 is being removed from the Download Center, along with the release notes.

Users are encouraged to install or upgrade to Greenplum Database 4.1.0.

Greenplum DCA customers should install or upgrade to Greenplum Database 4.0.5.1, which is the qualified release for the DCA.