Re: XLogInsert scaling, revisited

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: XLogInsert scaling, revisited
Date: 2013-06-24 18:01:01
Message-ID: 20130624180101.GJ6471@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2013-06-22 14:32:46 +0300, Heikki Linnakangas wrote:
> Attached is a new version that fixes at least the problem I saw. Not sure if
> it fixes what you saw, but it's worth a try. How easily can you reproduce
> that?

Ok, I started to look at this:

* Could you document the way slots prevent checkpoints from occurring
when XLogInsert rechecks for full page writes? I think it's correct -
but not very obvious on a glance.

* The read of Insert->RedoRecPtr while rechecking whether it's out of
date now is unlocked, is that correct? And why?

* Have you measured whether it works to just keep as many slots as
max_backends requires around and not bothering with dynamically
allocating them to inserters?
That seems to require to keep actually waiting slots in a separate
list which very well might make that too expensive.

Correctness wise the biggest problem for that probably is exlusive
acquiration of all slots CreateCheckpoint() does...

* What about using some sort of linked list of unused slots for
WALInsertSlotAcquireOne? Everytime we're done we put it to the *end*
of the list so it's less likely to have been grabbed by somebody else
so we can reuse it.
a) To grab a new one go to the head of the linked list spinlock it and
recheck whether it's still free. If not, restart. Otherwise, remove
from list.
b) To reuse a previously used slot

That way we only have to do the PGSemaphoreLock() dance if there
really aren't any free slots.

* The queuing logic around slots seems to lack documentation. It's
complex enough to warrant that imo.

* Not a big fan of the "holdingAll" variable name, for a file-global one
that seems a bit too generic.

* Could you add a #define or comment for the 64 used in
XLogInsertSlotPadded? Not everyone might recognize that immediately as
the most common cacheline size.
Commenting on the reason we pad in general would be a good idea as
well.

* Is it correct that WALInsertSlotAcquireOne() resets xlogInsertingAt of
all slots *before* it has acquired locks in all of them? If yes, why
haven't we already reset it to something invalid?

* Is GetXLogBuffer()'s unlocked read correct without a read barrier?

* XLogBytePosToEndRecPtr() seems to have a confusing name to me. At
least the comment needs to better explain that it's named that way
because of the way it's used.
Also, doesn't
seg_offset += fullpages * XLOG_BLCKSZ + bytesleft + SizeOfXLogShortPHD;
potentially point into the middle of a page?

* I wish we could get rid of the bytepos notion, it - while rather
clever - complicates things. But that's probably not going to happen
unless we get rid of short/long page headers :/

Cool stuff!

Greetings,

Andres Freund

PS: Btw, git diff|... -w might be more helpful than not indenting a
block.

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joshua D. Drake 2013-06-24 18:13:43 Re: [9.4 CF 1] The Commitfest Slacker List
Previous Message Andres Freund 2013-06-24 17:59:37 Re: [9.4 CF 1] The Commitfest Slacker List