Re: Snapshot synchronization, again...

From: Joachim Wieland <joe(at)mcknight(dot)de>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Snapshot synchronization, again...
Date: 2011-01-19 05:12:39
Message-ID: AANLkTinvmf9yaXQp7BijTkodChUQ5Nb3wvx5riD5BxBM@mail.gmail.com
Lists: pgsql-hackers

Noah, thank you for this excellent review. I have adopted most (almost
all) of your suggestions (except for the documentation, which is still
missing). Please check the updated & attached patch for details.

On Fri, Jan 14, 2011 at 10:13 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> "00000000000000000000" is a valid md5 message digest.  To avoid always accepting
> a snapshot with that digest, we would need to separately store a flag indicating
> whether a given syncSnapshotChksums member is currently valid.  Maybe that hole
> is acceptable, though.

In the end I decided to store the md5 checksums as printable characters in
shmem. We now need a few more bytes per checksum, but whenever a byte is NUL
we know that the entry does not hold a valid checksum.

>> +
>> +     if (!XactReadOnly)
>> +             elog(WARNING, "A snapshot exporting function should be readonly.");
>
> There are legitimate use cases for copying the snapshot of a read-write
> transaction.  Consider a task like calculating some summaries for insert into a
> table.  To speed this up, you might notionally partition the source data and
> have each of several workers calculate each partition.  I'd vote for removing
> this warning entirely.

Warning removed; mentioning this only in the documentation is fine with me.

> InvalidBackendId is -1.  Is InvalidOid (0) in fact also invalid as a backend ID?

Yes, there is a check in utils/init/postinit.c that makes sure that 0
is never a valid backendId. But I have now made this more explicit by
adding a dedicated parse routine for integers.
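
A rough sketch of what such a routine can look like (parseIntFromText and its
exact signature are illustrative, not necessarily what the patch uses):

    #include <stdlib.h>
    #include <string.h>

    /*
     * Read "<key><integer>" at *s, advance *s past the number and return the
     * value.  The caller rejects values it considers invalid, e.g. a
     * backendId of 0 or below.
     */
    static int
    parseIntFromText(const char **s, const char *key)
    {
        size_t  keylen = strlen(key);
        char   *endp;
        long    val;

        if (strncmp(*s, key, keylen) != 0)
            return -1;              /* key missing: let the caller error out */

        val = strtol(*s + keylen, &endp, 10);
        *s = endp;

        return (int) val;
    }

That way the importing side checks explicitly that the parsed backendId is
greater than zero instead of relying on InvalidOid happening to be unusable.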

>> +     while ((xid = parseXactFromText(&s, "xip:")) != InvalidTransactionId)
>> +     {
>> +             if (xid == GetTopTransactionIdIfAny())
>> +                     continue;
>> +             snapshot->xip[i++] = xid;
>
> This works on a virgin GetSnapshotData() snapshot, which always has enough space
> (sized according to max_connections).  A snapshot from CopySnapshot(), however,
> has just enough space for the original usage.  I get a SIGSEGV at this line with
> the attached test script.  If I change things around so xip starts non-NULL, I
> get messages like these as the code scribbles past the end of the allocation:

This has been fixed. xip/subxip are now allocated separately if necessary.
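
The importing code now has to guarantee enough room before filling the
arrays; the idea is roughly the following (the helper name and allocation
details are mine, the patch may do this differently):

    #include "postgres.h"
    #include "storage/procarray.h"      /* GetMaxSnapshotXidCount() and friends */
    #include "utils/snapshot.h"

    /*
     * Make sure a snapshot that came from CopySnapshot() has room for full
     * xip/subxip arrays before imported xids are written into it.
     */
    static void
    EnsureXipSpace(Snapshot snapshot)
    {
        if (snapshot->xip == NULL)
            snapshot->xip = (TransactionId *)
                palloc(GetMaxSnapshotXidCount() * sizeof(TransactionId));
        if (snapshot->subxip == NULL)
            snapshot->subxip = (TransactionId *)
                palloc(GetMaxSnapshotSubxidCount() * sizeof(TransactionId));
    }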

> Though unlikely, the other backend may not even exist by now.  Indeed, that
> could have happened and RecentGlobalXmin advanced since we compared the
> checksum.  Not sure what the right locking is here, but something is needed.

Good catch. What I have done now is to add a second check at the end of
ImportSnapshot(). At that point we have set up the new snapshot and
adjusted MyProc->xmin. If that check succeeds, we are fine; if not, we
throw an error and invalidate the whole transaction.
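
In rough terms the final step looks like this (RecheckExportedSnapshot and
exporterChksumSlot are made-up names for this sketch, and the locking needed
to read the exporter's slot safely is left out):

    #include "postgres.h"

    /*
     * Called at the very end of the import, after the snapshot has been
     * installed and MyProc->xmin has been set: re-read the exporter's
     * checksum slot in shmem and compare it against the imported checksum.
     * If the exporting transaction is gone, abort instead of silently using
     * a snapshot that is no longer protected.
     */
    static void
    RecheckExportedSnapshot(const char *exporterChksumSlot, const char *chksum)
    {
        if (strcmp(exporterChksumSlot, chksum) != 0)
            ereport(ERROR,
                    (errmsg("exported snapshot is no longer valid")));
    }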

>> +      * Instead we must check to not go forward (we might have opened a cursor
>> +      * in this transaction and still have its snapshot registered)
>> +      */
>
> I'm thinking this case should yield an ERROR.  Otherwise, our net result would
> be to silently adopt a snapshot that is neither our old self nor the target.
> Maybe there's a use case for that, but none comes to my mind.

This can happen when you do:

DECLARE foo CURSOR FOR SELECT * FROM bar;
import snapshot...
FETCH 1 FROM foo;

> I guess the use case for pg_import_snapshot from READ COMMITTED would be
> something like "DO $$BEGIN PERFORM pg_import_snapshot('...'); stuff; END$$;".
> First off, would that work ("stuff" use the imported snapshot)?  If it does
> work, we should take the precedent of SET LOCAL and issue no warning.  If it
> doesn't work, then we should document that the snapshot does take effect until
> the next command and probably keep this warning or something similar.

No, this will also give you a new snapshot for every query, no matter
whether it is executed inside or outside of a DO block.

>> + Datum
>> + pg_import_snapshot(PG_FUNCTION_ARGS)
>> + {
>> +     bytea  *snapshotData = PG_GETARG_BYTEA_P(0);
>> +     bool    ret = true;
>> +
>> +     if (ActiveSnapshotSet())
>> +             ret = ImportSnapshot(snapshotData, GetActiveSnapshot());
>> +
>> +     ret &= ImportSnapshot(snapshotData, GetTransactionSnapshot());
>
> Is it valid to scribble directly on snapshots like this?

I figured that previously executed code still holds references to the
existing snapshots, so we cannot simply push a new snapshot on top; we
really need to modify the memory those references point to.

I am also attaching a test script that shows the difference between READ
COMMITTED and SERIALIZABLE in an importing transaction inside a DO block.
It is based on the script you sent.

So thanks again and I'm looking forward to your next review... :-)

Joachim

Attachment Content-Type Size
syncsnapshots.2.diff text/x-patch 16.8 KB
test3.pl text/x-perl 1.5 KB
