Re: XLogInsert scaling, revisited

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: XLogInsert scaling, revisited
Date: 2013-07-08 10:43:27
Message-ID: 20130708104327.GA7242@awork2.anarazel.de
Lists: pgsql-hackers

On 2013-07-08 10:45:41 +0300, Heikki Linnakangas wrote:
> On 01.07.2013 16:40, Andres Freund wrote:
> >On 2013-06-26 18:52:30 +0300, Heikki Linnakangas wrote:
> >>>* Could you document the way slots prevent checkpoints from occurring
> >>> when XLogInsert rechecks for full page writes? I think it's correct -
> >>> but not very obvious at a glance.
> >>
> >>There's this in the comment near the top of the file:
> >>
> >> * To update RedoRecPtr or fullPageWrites, one has to make sure that all
> >> * subsequent inserters see the new value. This is done by reserving all the
> >> * insertion slots before changing the value. XLogInsert reads RedoRecPtr and
> >> * fullPageWrites after grabbing a slot, so by holding all the slots
> >> * simultaneously, you can ensure that all subsequent inserts see the new
> >> * value. Those fields change very seldom, so we prefer to be fast and
> >> * non-contended when they need to be read, and slow when they're changed.
> >>
> >>Does that explain it well enough? XLogInsert holds onto a slot while it
> >>rechecks for full page writes.
> >
> >I am a bit worried about that logic. We're basically reverting to the
> >old logic where xlog writing is an exclusive affair. We will have to wait
> >for all the other queued inserters before we're finished. I am afraid
> >that that will show up latencywise.
>
> A single stall of the xlog-insertion "pipeline" at a checkpoint is hardly
> going to be a problem. I wish PostgreSQL was real-time enough for that to
> matter, but I think we're very very far from that.

Well, the stall won't necessarily be that short. There might be several
backends piling up on every insertion slot and waiting - and thus being put
to sleep by the kernel. I am pretty sure it's easy enough to get stalls in
the range of seconds that way.
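
To make the pattern under discussion concrete, here is a minimal sketch of
"hold every slot to change a rarely-updated value". It is not the patch's
code: the slots are modelled as plain pthread mutexes, and NUM_SLOTS,
InsertControl, inserter_read_redo() and update_redo() are hypothetical names
made up for illustration. The point is only that the updater has to get
through every slot, so it queues behind whoever holds or waits on each one.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_SLOTS 8             /* hypothetical slot count, for illustration */

/* Hypothetical stand-in for the shared insertion state. */
typedef struct
{
    pthread_mutex_t slots[NUM_SLOTS];   /* one mutex models one slot */
    uint64_t        redo_rec_ptr;       /* rarely changed, often read */
    bool            full_page_writes;   /* rarely changed, often read */
} InsertControl;

static void
insert_control_init(InsertControl *ctl, uint64_t redo, bool fpw)
{
    for (int i = 0; i < NUM_SLOTS; i++)
    {
        int rc = pthread_mutex_init(&ctl->slots[i], NULL);

        assert(rc == 0);
        (void) rc;              /* silence unused warning under NDEBUG */
    }
    ctl->redo_rec_ptr = redo;
    ctl->full_page_writes = fpw;
}

/*
 * Inserter side: grab a single slot, then read the shared value while
 * holding it. Cheap and uncontended as long as nobody holds all slots.
 */
static uint64_t
inserter_read_redo(InsertControl *ctl, unsigned slot_no)
{
    pthread_mutex_t *slot = &ctl->slots[slot_no % NUM_SLOTS];
    uint64_t    seen;

    pthread_mutex_lock(slot);
    seen = ctl->redo_rec_ptr;
    /* a real inserter would do its work here, still holding the slot */
    pthread_mutex_unlock(slot);
    return seen;
}

/*
 * Updater side: acquire every slot (always in the same order, so two
 * updaters cannot deadlock), change the value, release. While this runs,
 * every inserter is blocked, and the updater itself waits behind each
 * slot's current holder and anyone already queued on it.
 */
static void
update_redo(InsertControl *ctl, uint64_t new_redo)
{
    for (int i = 0; i < NUM_SLOTS; i++)
        pthread_mutex_lock(&ctl->slots[i]);

    ctl->redo_rec_ptr = new_redo;

    for (int i = NUM_SLOTS - 1; i >= 0; i--)
        pthread_mutex_unlock(&ctl->slots[i]);
}
```

Any inserter that takes a slot after update_redo() returns is guaranteed to
see the new value, which is the property the quoted comment describes. The
cost is the part I'm worried about: the loop in update_redo() is exactly
where the pipeline stalls.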

Sure, there are lots of reasons we don't have all that reliable response
times, but IME the amount of response time jitter is one of the bigger
pain points of postgres. And this feature has a good chance of reducing
that pain noticeably...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
