Re: WIP patch for parallel pg_dump

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Joachim Wieland <joe(at)mcknight(dot)de>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP patch for parallel pg_dump
Date: 2010-12-05 18:28:47
Message-ID: 27542.1291573727@sss.pgh.pa.us
Lists: pgsql-hackers

Greg Smith <greg(at)2ndquadrant(dot)com> writes:
> In addition, Joachim submitted a synchronized snapshot patch that looks
> to me like it slipped through the cracks without being fully explored.
> ...
> The way I read that thread, there were two objections:

> 1) This mechanism isn't general enough for all use-cases outside of
> pg_dump, which doesn't make it wrong when the question is how to get
> parallel pg_dump running

> 2) Running as superuser is excessive. Running as the database owner was
> suggested as likely to be good enough for pg_dump purposes.

IIRC, in old discussions of this problem we first considered allowing
clients to pull down an explicit representation of their snapshot (which
actually is an existing feature now, txid_current_snapshot()) and then
upload that again to become the active snapshot in another connection.
That was rejected on the grounds that you could cause all kinds of
mischief by uploading a bad snapshot; so we decided to think about
providing a server-side-only means to clone another backend's current
snapshot. Which is essentially what Joachim's above-mentioned patch
provides. However, as was discussed in that thread, that approach is
far from being ideal either.

I'm wondering if we should reconsider the pass-it-through-the-client
approach, because if we could make that work it would be more general and
it wouldn't need any special privileges. The trick seems to be to apply
sufficient sanity testing to the snapshot proposed to be installed in
the subsidiary transaction. I think the requirements would basically be:
(1) xmin <= any listed XIDs < xmax
(2) xmin not so old as to cause GlobalXmin to decrease
(3) xmax not beyond current XID counter
(4) XID list includes all still-running XIDs in the given range

One tricky part would be ensuring GlobalXmin doesn't decrease when the
snap is installed, but I think that could be made to work if we take
ProcArrayLock exclusively and insist on observing some other running
transaction with xmin <= proposed xmin. For the pg_dump case this would
certainly hold since xmin would be the parent pg_dump's xmin.
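
As a rough illustration of checks (1) through (4), here is a minimal Python sketch. Nothing in it is PostgreSQL code: the `Snapshot` class, `validate_snapshot()`, and its parameters are hypothetical names, XIDs are treated as plain integers with no wraparound, and check (2) is approximated by the rule just described (some other running transaction must already hold an xmin at or below the proposed one). A real implementation would have to do this while holding ProcArrayLock.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical client-side representation of a snapshot."""
    xmin: int             # oldest XID the snapshot treats as possibly running
    xmax: int             # first XID the snapshot treats as not yet started
    xip: FrozenSet[int]   # XIDs in [xmin, xmax) treated as in progress


def validate_snapshot(snap: Snapshot,
                      next_xid: int,
                      running_xids: Iterable[int],
                      other_xmins: Iterable[int]) -> List[str]:
    """Return the list of violated checks; an empty list means acceptable.

    next_xid     -- assumed current XID counter (next XID to be assigned)
    running_xids -- assumed set of XIDs still running on the server
    other_xmins  -- assumed xmin values advertised by other transactions
    """
    problems = []

    # (1) xmin <= any listed XIDs < xmax
    if not all(snap.xmin <= x < snap.xmax for x in snap.xip):
        problems.append("(1) listed XID outside [xmin, xmax)")

    # (2) xmin must not pull GlobalXmin backwards; approximated here by
    # requiring another running transaction with xmin <= proposed xmin.
    if not any(x <= snap.xmin for x in other_xmins):
        problems.append("(2) no running transaction holds xmin <= proposed xmin")

    # (3) xmax not beyond current XID counter
    if snap.xmax > next_xid:
        problems.append("(3) xmax beyond current XID counter")

    # (4) XID list includes all still-running XIDs in the given range
    in_range = {x for x in running_xids if snap.xmin <= x < snap.xmax}
    missing = in_range - snap.xip
    if missing:
        problems.append("(4) running XIDs missing from list: %s" % sorted(missing))

    return problems


# Example: a child connection re-installing its parent's snapshot.
parent = Snapshot(xmin=100, xmax=110, xip=frozenset({100, 105}))
print(validate_snapshot(parent, next_xid=112,
                        running_xids={100, 105, 111},
                        other_xmins=[100]))
```

In the pg_dump case the parent's own xmin appears in `other_xmins`, which is why check (2) passes there.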

Given the checks stated above, it would be possible for someone to
install a snapshot that corresponds to no actual state of the database,
e.g., it shows some T1 as running and T2 as committed when actually T1
committed before T2. I don't see any simple way for the installation
function to detect that, but I'm not sure whether it matters. The user
might see inconsistent data, but do we care? Perhaps as a safety
measure we should only allow snapshot installation in read-only
transactions, so that even if the xact does observe inconsistent data it
can't possibly corrupt the database state thereby. This'd be no skin
off pg_dump's nose, obviously. Or compromise on "only superusers can
do it in non-read-only transactions".
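
To make the T1/T2 case concrete, here is a small Python sketch of a snapshot that would pass the four checks yet matches no real database state. The `xid_visible()` function is a deliberately simplified, hypothetical visibility rule (it ignores subtransactions, hint bits, and XID wraparound), and the XIDs 101 and 102 are made up for the example.

```python
def xid_visible(xid, xmin, xmax, xip, committed):
    """Simplified rule: a transaction's effects are visible if the snapshot
    treats it as finished (below xmax and not listed in xip) and it committed."""
    if xid >= xmax:
        return False          # started after the snapshot
    if xid in xip:
        return False          # snapshot treats it as still running
    return xid in committed   # finished; visible only if it committed


# Assumed history: T1 (XID 101) committed first, then T2 (XID 102).
# Both have finished, so no still-running XID lies in the range and
# check (4) has nothing to object to.
committed = {101, 102}

# Forged snapshot: lists T1 as in progress but not T2.
forged = dict(xmin=101, xmax=103, xip={101})

sees_t1 = xid_visible(101, committed=committed, **forged)
sees_t2 = xid_visible(102, committed=committed, **forged)
print(sees_t1, sees_t2)
```

Under the assumed history every genuine snapshot sees neither transaction, T1 alone, or both; the forged one sees T2 without T1. A read-only transaction could be confused by that, but it could not write anything based on it.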

Thoughts?

regards, tom lane
