Re: patch: fix SSI finished list corruption

Lists: pgsql-hackers
From: Dan Ports <drkp(at)csail(dot)mit(dot)edu>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Subject: patch: fix SSI finished list corruption
Date: 2012-01-07 00:15:25
Message-ID: 20120107001524.GK11222@csail.mit.edu

There's a corner case in the SSI cleanup code that isn't handled
correctly. It can arise when running workloads that consist
mostly (but not 100%) of READ ONLY transactions, and can corrupt the
finished SERIALIZABLEXACT list, potentially causing a segfault. The
attached patch fixes it.

Specifically, when the only remaining active transactions are READ
ONLY, we do a "partial cleanup" of committed transactions because
certain types of conflicts aren't possible anymore. For committed r/w
transactions, we release the SIREAD locks but keep the
SERIALIZABLEXACT. However, for committed r/o transactions, we can go
further and release the SERIALIZABLEXACT too. The problem was with the
latter case: we were returning the SERIALIZABLEXACT to the free list
without removing it from the finished list.

The only real change in the patch is the SHMQueueDelete line, but I
also reworked some of the surrounding code to make it obvious that r/o
and r/w transactions are handled differently -- the existing code felt
a bit too clever.
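
For anyone who wants to see the failure mode in isolation, here is a
minimal standalone sketch in C. It is not the patch and not the code in
predicate.c: SketchXact, queue_delete and partial_cleanup are
hypothetical stand-ins for SERIALIZABLEXACT, SHMQueueDelete and the
partial-cleanup path, and the SIREAD locks are reduced to a flag. It
assumes each transaction has a single link field, so it can be on only
one list at a time, which is why it has to be unlinked from the
finished list before it goes onto the free list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Intrusive circular doubly linked list node, in the spirit of SHM_QUEUE. */
typedef struct QueueNode {
    struct QueueNode *prev;
    struct QueueNode *next;
} QueueNode;

/* Hypothetical stand-in for SERIALIZABLEXACT. */
typedef struct SketchXact {
    QueueNode link;            /* first member, so QueueNode* casts to SketchXact* */
    bool read_only;
    bool holds_siread_locks;   /* models the SIREAD locks as a single flag */
} SketchXact;

static void queue_init(QueueNode *head)
{
    head->prev = head->next = head;
}

static void queue_insert_tail(QueueNode *head, QueueNode *elem)
{
    elem->next = head;
    elem->prev = head->prev;
    head->prev->next = elem;
    head->prev = elem;
}

/* Unlink elem from whatever list it is currently on. */
static void queue_delete(QueueNode *elem)
{
    assert(elem->prev != NULL && elem->next != NULL);
    elem->prev->next = elem->next;
    elem->next->prev = elem->prev;
    elem->prev = elem->next = NULL;
}

static size_t queue_length(const QueueNode *head)
{
    size_t n = 0;
    for (const QueueNode *p = head->next; p != head; p = p->next)
        n++;
    return n;
}

/*
 * Partial cleanup of committed transactions, for the case where only
 * READ ONLY transactions remain active.
 */
static void partial_cleanup(QueueNode *finished, QueueNode *free_list)
{
    QueueNode *cur = finished->next;

    while (cur != finished) {
        QueueNode *next = cur->next;   /* save before possibly unlinking cur */
        SketchXact *xact = (SketchXact *) cur;

        if (xact->read_only) {
            /* r/o: release everything, including the xact itself. */
            xact->holds_siread_locks = false;
            queue_delete(cur);         /* omitting this leaves cur on both lists */
            queue_insert_tail(free_list, cur);
        } else {
            /* r/w: release the locks but keep the xact on the finished list. */
            xact->holds_siread_locks = false;
        }
        cur = next;
    }
}
```

Without the queue_delete call, the finished list's neighbours would
still point at a node whose own pointers have been overwritten by the
free-list insert, so a later walk of the finished list wanders into the
free list.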

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/

Attachment Content-Type Size
ssi-partial-cleanup.patch text/x-diff 0 bytes

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Dan Ports <drkp(at)csail(dot)mit(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Subject: Re: patch: fix SSI finished list corruption
Date: 2012-01-18 16:01:56
Message-ID: 4F16ECF4.7070304@enterprisedb.com

On 07.01.2012 02:15, Dan Ports wrote:
> There's a corner case in the SSI cleanup code that isn't handled
> correctly. It can arise when running workloads that consist
> mostly (but not 100%) of READ ONLY transactions, and can corrupt the
> finished SERIALIZABLEXACT list, potentially causing a segfault. The
> attached patch fixes it.
>
> Specifically, when the only remaining active transactions are READ
> ONLY, we do a "partial cleanup" of committed transactions because
> certain types of conflicts aren't possible anymore. For committed r/w
> transactions, we release the SIREAD locks but keep the
> SERIALIZABLEXACT. However, for committed r/o transactions, we can go
> further and release the SERIALIZABLEXACT too. The problem was with the
> latter case: we were returning the SERIALIZABLEXACT to the free list
> without removing it from the finished list.
>
> The only real change in the patch is the SHMQueueDelete line, but I
> also reworked some of the surrounding code to make it obvious that r/o
> and r/w transactions are handled differently -- the existing code felt
> a bit too clever.

Thanks, committed!

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com