pgsql: Make the visibility map crash-safe.

Lists: pgsql-committerspgsql-hackers
From: Robert Haas <rhaas(at)postgresql(dot)org>
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Make the visibility map crash-safe.
Date: 2011-06-22 03:05:57
Message-ID: E1QZDlN-0000z9-21@gemulon.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

Make the visibility map crash-safe.

This involves two main changes from the previous behavior. First,
when we set a bit in the visibility map, emit a new WAL record of type
XLOG_HEAP2_VISIBLE. Replay sets the page-level PD_ALL_VISIBLE bit and
the visibility map bit. Second, when inserting, updating, or deleting
a tuple, we can no longer get away with clearing the visibility map
bit after releasing the lock on the corresponding heap page, because
an intervening crash might leave the visibility map bit set and the
page-level bit clear. Making this work requires a bit of interface
refactoring.

In passing, a few minor but related cleanups: change the test in
visibilitymap_set and visibilitymap_clear to throw an error if the
wrong page (or no page) is pinned, rather than silently doing nothing;
this case should never occur. Also, remove duplicate definitions of
InvalidXLogRecPtr.

Patch by me, review by Noah Misch.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/503c7305a1e379f95649eef1a694d0c1dbdc674a

Modified Files
--------------
src/backend/access/heap/heapam.c | 263 ++++++++++++++++++++++++++++---
src/backend/access/heap/hio.c | 48 ++++++-
src/backend/access/heap/visibilitymap.c | 110 +++++++------
src/backend/access/transam/transam.c | 5 +-
src/backend/access/transam/xlog.c | 3 -
src/backend/commands/vacuumlazy.c | 8 +-
src/include/access/heapam.h | 2 +
src/include/access/hio.h | 3 +-
src/include/access/htup.h | 10 ++
src/include/access/transam.h | 3 +
src/include/access/visibilitymap.h | 6 +-
src/include/access/xlog_internal.h | 2 +-
12 files changed, 373 insertions(+), 90 deletions(-)


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Robert Haas <rhaas(at)postgresql(dot)org>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Make the visibility map crash-safe.
Date: 2011-06-22 10:55:35
Message-ID: 4E01CA27.6010102@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On 22.06.2011 06:05, Robert Haas wrote:
> Second, when inserting, updating, or deleting
> a tuple, we can no longer get away with clearing the visibility map
> bit after releasing the lock on the corresponding heap page, because
> an intervening crash might leave the visibility map bit set and the
> page-level bit clear.

heap_update seems to still do that.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Robert Haas <rhaas(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Make the visibility map crash-safe.
Date: 2011-06-22 13:12:36
Message-ID: BANLkTin44Ur6sWMYVsGoSz-w_yJbOf5Dkw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Wed, Jun 22, 2011 at 6:55 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> On 22.06.2011 06:05, Robert Haas wrote:
>>
>> Second, when inserting, updating, or deleting
>> a tuple, we can no longer get away with clearing the visibility map
>> bit after releasing the lock on the corresponding heap page, because
>> an intervening crash might leave the visibility map bit set and the
>> page-level bit clear.
>
> heap_update seems to still do that.

Aw, crap. I'll take another look.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Robert Haas <rhaas(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Make the visibility map crash-safe.
Date: 2011-06-23 02:23:39
Message-ID: BANLkTi=gfQodjQGRxPUOcHon=dvW=LZ76Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Wed, Jun 22, 2011 at 9:12 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Jun 22, 2011 at 6:55 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> On 22.06.2011 06:05, Robert Haas wrote:
>>>
>>> Second, when inserting, updating, or deleting
>>> a tuple, we can no longer get away with clearing the visibility map
>>> bit after releasing the lock on the corresponding heap page, because
>>> an intervening crash might leave the visibility map bit set and the
>>> page-level bit clear.
>>
>> heap_update seems to still do that.
>
> Aw, crap.  I'll take another look.

Well, it seems I didn't put nearly enough thought into heap_update().
The fix for the immediate problem looks simple enough - all the code
has been refactored to use the new API, so the calls can be easily be
moved into the critical section (see attached). But looking at this a
little more, I see that heap_update() is many bricks short of a load,
because there are several places where the buffer can be unlocked and
relocked, and we don't recheck whether the page is all-visible after
reacquiring the lock. So I've got some more work to do here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
heap-update-vm-fix.patch application/octet-stream 1.1 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Make the visibility map crash-safe.
Date: 2011-06-23 13:22:40
Message-ID: BANLkTik_oywk8_oZtB9mkOoh1ChkKFtvVA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Wed, Jun 22, 2011 at 10:23 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Well, it seems I didn't put nearly enough thought into heap_update().
> The fix for the immediate problem looks simple enough - all the code
> has been refactored to use the new API, so the calls can be easily be
> moved into the critical section (see attached).  But looking at this a
> little more, I see that heap_update() is many bricks short of a load,
> because there are several places where the buffer can be unlocked and
> relocked, and we don't recheck whether the page is all-visible after
> reacquiring the lock.  So I've got some more work to do here.

See what you think of the attached. I *think* this covers all bases.
It's a little more complicated than I would like, but I don't think
fatally so.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
heap-update-vm-fix-v2.patch application/octet-stream 8.9 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Make the visibility map crash-safe.
Date: 2011-06-27 18:02:20
Message-ID: BANLkTimX2BwkTKbvOOVyaZXsAJEenajm_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Thu, Jun 23, 2011 at 9:22 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Jun 22, 2011 at 10:23 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Well, it seems I didn't put nearly enough thought into heap_update().
>> The fix for the immediate problem looks simple enough - all the code
>> has been refactored to use the new API, so the calls can be easily be
>> moved into the critical section (see attached).  But looking at this a
>> little more, I see that heap_update() is many bricks short of a load,
>> because there are several places where the buffer can be unlocked and
>> relocked, and we don't recheck whether the page is all-visible after
>> reacquiring the lock.  So I've got some more work to do here.
>
> See what you think of the attached.  I *think* this covers all bases.
> It's a little more complicated than I would like, but I don't think
> fatally so.

For lack of comment, committed. It's hopefully at least better than
what was there before, which was clearly several bricks short of a
load.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Make the visibility map crash-safe.
Date: 2011-06-28 15:44:07
Message-ID: BANLkTin0K6Po_hbiNxL42sEDwhVmm69OnQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

2011/6/27 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Thu, Jun 23, 2011 at 9:22 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Wed, Jun 22, 2011 at 10:23 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> Well, it seems I didn't put nearly enough thought into heap_update().
>>> The fix for the immediate problem looks simple enough - all the code
>>> has been refactored to use the new API, so the calls can be easily be
>>> moved into the critical section (see attached).  But looking at this a
>>> little more, I see that heap_update() is many bricks short of a load,
>>> because there are several places where the buffer can be unlocked and
>>> relocked, and we don't recheck whether the page is all-visible after
>>> reacquiring the lock.  So I've got some more work to do here.
>>
>> See what you think of the attached.  I *think* this covers all bases.
>> It's a little more complicated than I would like, but I don't think
>> fatally so.
>
> For lack of comment, committed.  It's hopefully at least better than
> what was there before, which was clearly several bricks short of a
> load.

out of curiosity, does it affect the previous benchmarks of the feature ?

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Make the visibility map crash-safe.
Date: 2011-06-28 16:29:52
Message-ID: BANLkTi=OMvhgvSe6_d985iS7-2dm3LYNpg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Tue, Jun 28, 2011 at 11:44 AM, Cédric Villemain
<cedric(dot)villemain(dot)debian(at)gmail(dot)com> wrote:
> out of curiosity, does it affect the previous benchmarks of the feature ?

I don't think there's much performance impact, because the only case
where we actually have to do any real work is when vacuum comes
through and marks a buffer we're about to update all-visible. It
would probably be good to run some tests to verify that, though. I'll
try to set something up.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Make the visibility map crash-safe.
Date: 2011-06-30 11:50:20
Message-ID: BANLkTi=+VVPrRYu+5N3gugnWEQZ-dsLA5Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Tue, Jun 28, 2011 at 12:29 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Jun 28, 2011 at 11:44 AM, Cédric Villemain
> <cedric(dot)villemain(dot)debian(at)gmail(dot)com> wrote:
>> out of curiosity, does it affect the previous benchmarks of the feature ?
>
> I don't think there's much performance impact, because the only case
> where we actually have to do any real work is when vacuum comes
> through and marks a buffer we're about to update all-visible.  It
> would probably be good to run some tests to verify that, though.  I'll
> try to set something up.

I did some pgbench runs on the 32-core box Nate Boley gave me access
to. I'm not sure that pgbench is the best workload to test this with,
but I gave it a shot. I used these settings:

checkpoint_segments=64
shared_buffers=8GB
synchronous_commit=off
checkpoint_completion_target=0.9

I compare the performance of commit
431ab0e82819b31fcd1e33ecb52c2cd3b4b41da7 (post-patch) with commit
431ab0e82819b31fcd1e33ecb52c2cd3b4b41da7 (pre-patch). I tried 3
15-minute pgbench runs each with one client, eight clients, and
thirty-two clients, on each branch. I used scale factor 100,
reinitializing the entire cluster after each run.

one client (prepatch)
tps = 616.412009 (including connections establishing)
tps = 619.189040 (including connections establishing)
tps = 605.357710 (including connections establishing)

one client (postpatch)
tps = 583.295139 (including connections establishing)
tps = 609.236975 (including connections establishing)
tps = 597.674354 (including connections establishing)

eight clients (prepatch)
tps = 2635.459096 (including connections establishing)
tps = 2468.973962 (including connections establishing)
tps = 2592.889454 (including connections establishing)

eight clients (postpatch)
tps = 2602.226439 (including connections establishing)
tps = 2644.201228 (including connections establishing)
tps = 2725.760364 (including connections establishing)

thirty-two clients (prepatch)
tps = 3889.761627 (including connections establishing)
tps = 4365.501093 (including connections establishing)
tps = 3816.119328 (including connections establishing)

thirty-two clients (postpatch)
tps = 3888.620044 (including connections establishing)
tps = 3614.252463 (including connections establishing)
tps = 4090.430296 (including connections establishing)

I think it's pretty difficult to draw any firm conclusions from this
data. The one and thirty-two core results came out a bit slower; the
eight core results came out a bit faster. But the variation between
branches is less than the inter-run variation on the same branch, so
if there is an effect, this test doesn't clearly capture it.

Having thought about it a bit, I think that if we ran this test for
long enough, we probably could measure a small effect. pgbench is a
very write-intensive test, so anything that writes additional WAL
records is going cause checkpoints to happen more frequently, and
there's no denying that this change increases WAL volume slightly.

On another note, these results are also interesting from a scalability
perspective. The eight-core results are about 4.4x the one-core
results, and the 32-client results are about 6.5x the one-core
results. Needless to say, there's a lot of room for improvement
there.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Make the visibility map crash-safe.
Date: 2011-06-30 22:19:09
Message-ID: 1309472349.31501.1.camel@jdavis-ux.asterdata.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Thu, 2011-06-30 at 07:50 -0400, Robert Haas wrote:
> I compare the performance of commit
> 431ab0e82819b31fcd1e33ecb52c2cd3b4b41da7 (post-patch) with commit
> 431ab0e82819b31fcd1e33ecb52c2cd3b4b41da7 (pre-patch).

I believe that is a copy/paste error.

Regards,
Jeff Davis


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Make the visibility map crash-safe.
Date: 2011-07-03 21:24:27
Message-ID: CA+TgmoZEruPUpFv-_bDzGtuGH2iR=N8=m3vUj9YKGbxgxmA_dA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers pgsql-hackers

On Thu, Jun 30, 2011 at 6:19 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Thu, 2011-06-30 at 07:50 -0400, Robert Haas wrote:
>> I compare the performance of commit
>> 431ab0e82819b31fcd1e33ecb52c2cd3b4b41da7 (post-patch) with commit
>> 431ab0e82819b31fcd1e33ecb52c2cd3b4b41da7 (pre-patch).
>
> I believe that is a copy/paste error.

Yeah, it sure is. I think (but am not 100% certain) that the
post-patch version was actually
21f1e15aafb13ab2430e831a3da7d4d4f525d1ce.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company